

 Architecture and Infrastructure

Paper 215

Reviewed by Oracle Certified Master Korea Community

( http://www.ocmkorea.com http://cafe.daum.net/oraclemanager )

RAC BACKUP AND RECOVERY

INTRODUCTION

The backup and recovery of Oracle9i RAC databases may at first seem to be a redundant exercise, until one remembers that RAC allows for failure at the instance level but does not protect against failure at the disk level or, more commonly, the user level.

Essentially there are two types of recovery that the DBA needs to be concerned with; these are:

•  Physical recovery - this is when a physical insult such as hardware failure or disaster strikes.

•  Instance recovery - this is when the janitor unplugs the server to plug in his vacuum cleaner.

Let's look at these in more detail.

OVERVIEW OF RAC BACKUP AND RECOVERY

Believe it or not, other than a few quirks which we shall discuss in detail, RAC backup and recovery is identical to almost all other Oracle database backup and recovery operations. When you get down to the basic level you are, after all, only backing up a single Oracle9i database.

In most cases an instance failure will be recovered by other RAC instances. We will cover special cases of instance failure at the end of the paper.

The quirks come into play when dealing with a RAC database that uses archive logging. Archive logging introduces an added layer of complexity due to the requirement that your backup isn't complete unless you back up all archive logs from all of the instances in your RAC environment. Luckily, Oracle9i allows the database to archive log to more than one destination, and with a little ingenuity on your part you can ensure all archive logs are available for recovery at all times.
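As a sketch of what that might look like, each instance could be given a second, optional archive destination in its initialization parameters (the instance names and directory paths here are assumptions for illustration only):

ault1.LOG_ARCHIVE_DEST_1 = "LOCATION=/u01/backup/ault_rac1 MANDATORY"
ault1.LOG_ARCHIVE_DEST_2 = "LOCATION=/u05/arch_copy/ault_rac1 OPTIONAL"
ault2.LOG_ARCHIVE_DEST_1 = "LOCATION=/u01/backup/ault_rac2 MANDATORY"
ault2.LOG_ARCHIVE_DEST_2 = "LOCATION=/u05/arch_copy/ault_rac2 OPTIONAL"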

 Within Oracle9i RAC backup and recovery you have a multitude of backup possibilities:

•  Export

•  Cold backup using scripts

•  Hot backup using scripts

•  RMAN backup with a catalog 

•  RMAN backup without a catalog 

•  Using third party tools to perform backup and restore operations

All of these options have their good and bad qualities:

EXPORT 

A database export is a logical copy of the structure and data contained in an Oracle database. You cannot apply archive log information against a database recovered using the import of an export file. This means that an export is a point-in-time recovery of a database. In this way, an export is like a cold backup of a database that is not in archivelog mode.

Exports are useful in that they allow easy restoration of tables and other structures instead of having to bring back entire tablespaces as you would in most other forms of backup and recovery. The import process can also be used to rebuild tables and indexes into more optimal configurations or to place data into new locations. Another benefit is that exports are capable of being copied across platforms; for example, an archive from a WIN2K server can be copied to a Solaris server and applied there.


The drawbacks to exports are that they take a great deal of time to generate (depending on database size), they can only be performed against a running database, and they take a long time to recover (again based on database size). In some versions of Oracle there are also file size limitations.
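As a minimal sketch (the connect string and file names are assumptions), a full export and a later full import might look like:

$ exp system/manager@ault1 FULL=Y FILE=ault_full.dmp LOG=ault_full_exp.log
$ imp system/manager@ault1 FULL=Y FILE=ault_full.dmp LOG=ault_full_imp.log IGNORE=Y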

COLD BACKUP USING SCRIPTS

Cold backup using scripts can be used to back up an Oracle9i RAC database whether it is in archivelog mode or not. A cold backup means that the database is shut down and all files are backed up via a manually created script.

Generally speaking you can always recover from a cold backup unless something happens to your backup media or files. However, unless the database is in archivelog mode, a cold backup is a point-in-time backup. If a database is archive logging (all filled redo logs are copied to an archive log before being reused), then the cold backup can be restored to the server and archive logs applied to bring the database to as near current time status as possible.

The drawbacks to cold backups are that the database must be shut down in order to perform a cold backup, a cold backup can take a long period of time (depending on database size), and the DBA has to manually maintain the backup scripts.
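A minimal cold backup sketch (the file paths are assumptions, and a real script should derive the file list from the data dictionary as discussed in the section on manual scripts below) looks like:

SQL> SHUTDOWN IMMEDIATE;
$ cp /oracle/oradata/ault_rac/*.dbf /u04/backup/cold/
$ cp /oracle/oradata/ault_rac/*.ctl /u04/backup/cold/
$ cp /oracle/oradata/ault_rac/*.log /u04/backup/cold/
SQL> STARTUP;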

HOT BACKUP USING SCRIPTS

A hot backup is a backup taken while the database is operating. A special command places the database's tablespaces into backup mode so that the live files can be copied. Once the copy operation in a hot backup is complete the datafiles are taken out of backup mode.

In order to use hot backup the database must be in archive log mode. Once all datafiles are copied during a hot backup, the archive logs that were generated while the datafiles were in backup mode are copied to the backup location. The datafile backups and the archive logs are then used to recover the database to the exact state it was in at the end of the backup.
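For a single tablespace the sequence is roughly the following (the tablespace name and paths are assumptions):

SQL> ALTER TABLESPACE users BEGIN BACKUP;
$ cp /oracle/oradata/ault_rac/users01.dbf /u04/backup/hot/
SQL> ALTER TABLESPACE users END BACKUP;
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;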

Once the database is at the point it was at the end of the backup, any other archive logs that have been generated since can be applied in order to recover to any point in time between the backup and the last available archive log.

Notice that the database is active and in use during a hot backup. This allows a 24X7 shop to operate without shutting down to back up the database.

The drawbacks to a hot backup are that the database performance can degrade during the backup, all archive logs generated during the backup process must be captured, and the scripting for a hot backup can be quite complex.

A FEW WORDS ON USING MANUAL SCRIPTS

Manual scripts should be generated using SQL and PL/SQL routines against the data dictionary of an operating database. This allows any additions of tablespaces or changes in archive log destinations to be automatically engineered into the script. Most attempts to manually maintain backup scripts ultimately end in disaster as a DBA misses a new tablespace or other structural element and thus it is not backed up.

In RAC the GV$ and DBA_ series of views should be used to create dynamically generated cold and hot backup scripts. Examples of hot and cold backup script generators are available on the Rampant website using the username and password provided with this book. These scripts have been tested under WIN2K and Unix environments but should be thoroughly tested on your own system before relying on them.
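A fragment of such a generator, shown only as a sketch (the backup directory is an assumption and a production script would handle multiple datafiles per tablespace more carefully), might be:

SET HEADING OFF FEEDBACK OFF PAGESIZE 0
SPOOL /tmp/hot_backup.sql
SELECT 'ALTER TABLESPACE ' || tablespace_name || ' BEGIN BACKUP;' || CHR(10) ||
       'HOST cp ' || file_name || ' /u04/backup/hot/' || CHR(10) ||
       'ALTER TABLESPACE ' || tablespace_name || ' END BACKUP;'
  FROM dba_data_files
 ORDER BY tablespace_name;
SPOOL OFF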

RMAN (RECOVERY MANAGER)

Oracle provides the RMAN product to provide the DBA with a comprehensive backup environment. Using RMAN a DBA can perform a multitude of backup types. The abilities of RMAN depend entirely on whether you are using a backup catalog or not. A backup catalog is a set of tables stored in an Oracle database that track backup sets, pieces and databases backed up. If a catalog is not available then RMAN stores the backup data in the control file of the instance it is performing the backup against.

RMAN also allows a block level backup, where only the changed blocks of a tablespace get backed up. For large databases the space savings in using a block level backup can be enormous.
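As a sketch, a block level backup is requested with RMAN's INCREMENTAL clause, typically a periodic level 0 baseline followed by level 1 backups of changed blocks:

BACKUP INCREMENTAL LEVEL 0 DATABASE;  # baseline backup
BACKUP INCREMENTAL LEVEL 1 DATABASE;  # changed blocks only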

Using RMAN with OEM (Oracle Enterprise Manager) allows for the scheduling of backup jobs with automated notification capabilities. Using OEM requires that Oracle Intelligent Agents be running on all RAC nodes. The configuration of Oracle Intelligent Agents can sometimes be complex and the DBA needs to be very careful in the setup of the OEM agents in a RAC environment.


In later sections we will look at various RMAN scenarios in detail.

THIRD PARTY SOLUTIONS

Many providers of SAN and NAS storage also provide backup capabilities with their hardware. By using shadow volumes, complete backups of databases can be maintained with no impact on database operations. Recovery using third party solutions can also be incredibly fast as these solutions often either mirror the entire database to a second location or provide only for backup of changed blocks.

BACKUP OF A RAC DATABASE

As we discussed earlier, the two major modes in a backup scenario are cold, where the database is shut down, and hot, where the database is open and in use. Both of these modes can be used by the Oracle DBA when backing up a RAC database. However, since the major reason for utilizing Oracle RAC is to provide uninterrupted service to your customer base, it makes little sense to use a cold backup with RAC since it would require the shutdown of all instances that are using the Oracle database.

You must also ensure that all archive log files from all instances are gathered into the backup set; miss even a single archive log and your backup's usability is placed in serious question.

USING RMAN FOR BACKUPS

RMAN should be a serious consideration for all RAC backups; by following a few simple requirements a RAC instance can be backed up with little or no intervention from the DBA.

One thing to remember is that the RMAN process can only attach to a single instance in a RAC environment at any one point in time. The connection RMAN establishes is what is known as a utility connection in that it doesn't itself perform any of the backup or restore operations; it is only an information gathering tool for RMAN.

An example of the command line connection to a RAC environment would be (assuming the RAC instances are ault1 and ault2):

$ rman TARGET SYS/kr87m@ault2 CATALOG rman/cat@rcat

The connection string (in this case ault2) can only apply to a single instance, so the entry in the tnsnames.ora for the ault2 connection would be:

ault2 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = OFF)
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = aultlinux2)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = ault)
      (INSTANCE_NAME = ault2)
    )
  )

For RAC there is a special requirement if the instances are using archive logs: a channel connection must be specified for each instance and must resolve to only one instance. For example, using our ault1 and ault2 instances from our previous example:

CONFIGURE DEFAULT DEVICE TYPE TO sbt;
CONFIGURE DEVICE TYPE sbt PARALLELISM 2;
CONFIGURE CHANNEL 1 DEVICE TYPE sbt CONNECT = 'SYS/kr87m@ault1';
CONFIGURE CHANNEL 2 DEVICE TYPE sbt CONNECT = 'SYS/kr87m@ault2';

This configuration only has to be specified once for a RAC environment and should only be changed if nodes are added or removed from the RAC configuration. In this way it is known as a "persistent" configuration and need never be changed for the life of your RAC environment. One requirement on this configuration for RAC is that each of the nodes specified must be open (the database operational) or all must be closed (the database shut down). If even one of the instances specified is not in the same state as the others, the backup will fail.

RMAN is also aware of the node affinity of the various database files and uses the node that has the greatest access to back up the datafiles that the instance has greatest affinity for. Node affinity can however be overridden with manual commands:

BACKUP
  # Channel 1 gets datafiles 1,2,3
  (DATAFILE 1,2,3 CHANNEL ORA_SBT_TAPE_1)
  # Channel 2 gets datafiles 4,5,6,7
  (DATAFILE 4,5,6,7 CHANNEL ORA_SBT_TAPE_2);

The nodes chosen to back up an Oracle RAC cluster must have the ability to see all of the files that require backup. For example:

BACKUP DATABASE PLUS ARCHIVELOG;

requires that the nodes specified have access to all archive logs generated by all instances. This requires some special considerations when configuring the Oracle RAC environment.

 The essential steps for using RMAN in Oracle RAC are:

1.  Configure the snapshot control file location

2.  Configure the control file autobackup feature

3.  Configure the archiving scheme

4.  Change the archive mode of the database (optional)

5.  Monitor the archiver process

First, let's look at the snapshot control file location configuration.

SNAPSHOT CONTROL FILE LOCATION

Identical copies of the control file must be maintained on every node that participates in the RAC backup process. Therefore each node must have an identical directory location for the storage of a snapshot of the current control file taken at the time of the backup. This directory location is referred to as the snapshot control file location. The default snapshot control file location can be shown by issuing the command:

SHOW SNAPSHOT CONTROLFILE NAME;

 To change the default location, for example, to '/u04/backup/ault_rac/snap_ault.cf' the command would be:

CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u04/backup/ault_rac/snap_ault.cf';

Note that this is only specified to a single location, which requires each node to have a '/u04/backup/ault_rac' directory in our configuration for this example. Note that, like the channel specifications shown in the previous section, this is also a persistent global specification.

 The control file can be automatically backed up during each backup operation by specification of the command:

CONFIGURE CONTROLFILE AUTOBACKUP ON;

This automatic control file backup is also a persistent setting and need never be specified again. By using the automatic control file backup, RMAN can restore the control file even after loss of the recovery catalog and the current control files.


ARCHIVE LOGS AND RAC BACKUP USING RMAN

As we have stressed, archive logging, while vital to the backup of any Oracle database, provides special problems when you are backing up a RAC database. All archived logs, no matter what instance has generated them, must be backed up in order to recover using the backup.

 The configuration for archive logs must be carefully set up. For example if ault1 archives a log file to the location:

/u01/backups/archives/ault_rac/ault_log_1_1234.log

Then a duplicate file must be placed on any node that performs backup operations, in a directory named the same. Only one node performs backup operations in a RAC environment; that node must have access to all archived logs.

Another consideration is the setting of the LOG_ARCHIVE_FORMAT parameter; it must be specified identically on all nodes and should include the redo log thread number (which identifies the instance) and the log sequence number.

How archive log locations are configured depends on the type of filesystem you use for your RAC environment. The filesystems available for RAC are:

•  OCFS based filesystems

•  RAW (non-OCFS filesystems)

An OCFS based filesystem setup allows any node to read the archive logs from any other node; in fact all logs can be placed in a centralized area. This is the preferred setup for Oracle RAC. If node ault1 writes a log to /u01/backup/archives/ault_rac, any other RAC instance in an OCFS setup could see the log. This is demonstrated in Figure 1.

Figure 1: Example OCFS Archive Log Layout 

In a RAW filesystem setup the shared drives are configured into raw partitions. Only a single file can be written to a specific raw device, thus raw devices must not be used for archive logs. This means that for a RAC environment that uses a raw configuration the archive logs are written to server-local disks that are not shared. Unless special configuration options such as NFS are used, no other instance can see the archive logs of any other instance in a raw environment. This is demonstrated in Figure 2. Let's examine a technique for overcoming this problem in a raw filesystem environment.


Figure 2: Example RAW Archive Log Layout 

To make all archive log files visible to all other nodes in a raw filesystem configuration, you must name the directories according to which node they service; in our ault1, ault2 configuration this would be:

On node aultlinux1:

/usr/backup/ault_rac1/archives1

On node aultlinux2:

/usr/backup/ault_rac2/archives2

In order to make the archive logs available, the /usr/backup/ault_rac2/archives2 directory would be NFS mounted to the aultlinux1 node and the /usr/backup/ault_rac1/archives1 directory NFS mounted to the aultlinux2 node. To make things even easier, a periodic job could be scheduled that copies the archive log files from the NFS mounts into the local archive destination. By copying the archive logs from the NFS mounted directories into the local archive log location you ensure that even if you lose the NFS mount the database will be recoverable. This setup is illustrated in Figure 3.

Figure 3: Example Use of NFS Mounting 

A Unix utility known as fuser verifies whether a file is open or closed. Even if an archive log is still being written, Unix will allow you to copy it, delete it or anything else. Therefore, using fuser before copying an archive file will ensure all files copied are complete and usable for recovery. Any script you write on a Unix or Linux box for copying archive logs should use this command to verify a log is not being written before attempting to copy it.
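A sketch of such a check, with the directories assumed from the earlier example, might be:

#!/bin/sh
# copy only archive logs that are not currently being written
for f in /usr/backup/ault_rac2/archives2/*.dbf
do
  if fuser "$f" > /dev/null 2>&1
  then
    echo "Skipping $f - still open"
  else
    cp "$f" /usr/backup/ault_rac1/archives1/
  fi
done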

About the only advantage of the raw configuration is that if you have multiple tape drives (one on each node) then the archive log portion of the backup can be parallelized.


If only a single tape drive is available then you must use the NFS mount scheme, and if the NFS mount is lost, you may not be able to fully recover the database should any archive log be lost.

 The initialization parameters should be set similar to:

ault1.LOG_ARCHIVE_DEST_1="LOCATION=/u01/backup/ault_rac1"
ault2.LOG_ARCHIVE_DEST_1="LOCATION=/u01/backup/ault_rac2"

By using the NFS mount scheme, either node can back up the archive logs on the other. On Linux the process to set up an NFS mounted drive is:

1.  Configure the services such that the NFS services are running:

•  NFS

•  NFSLOCK 

•  NETFS

Figure 4 shows the Service Configuration GUI with these services checked.

Figure 4: Example Service Configuration Screen with NFS services selected.

2.  Configure the NFS server on each node. Figure 5 shows the NFS Server GUI configuration screen for Linux.


Figure 5: Example NFS Configuration Screen 

 The configuration for NFS mounts is performed by the root user.

3.  On each RAC node, create the mount point directory exactly as it appears on each remote node. For example in our demo system, for the server aultlinux2 the command to create the mount directory for the archive directory on aultlinux1 would be:

$ mkdir /usr/backup/ault_rac1/archives1

4.  On each RAC node use the mount command to mount the NFS drive(s) from the other nodes, using the mount directory we created in the previous step. For our example setup this would be:

$ mount aultlinux1:/usr/backup/ault_rac1/archives1 /usr/backup/ault_rac1/archives1

Once the directories are cross-mounted we can then continue. 

The configuration of NFS mount points on Solaris 9 would be done using manual commands:

1.  Start the NFS server:

# /etc/init.d/nfs.server start

2.  Set up the shares:

# share -F nfs -o rw=aultsolaris2 /usr/backup/ault_rac1/archives1

3.   Verify the shares are available on the target box:

# dfshares -F nfs aultsolaris1


RESOURCE                                      SERVER        ACCESS  TRANSPORT
aultsolaris1:/usr/backup/ault_rac1/archives1  aultsolaris1  -       -

4.  On the target box, start the NFS client:

# /etc/init.d/nfs.client start

5.  On the target box create the required mount directory:

# mkdir /usr/backup/ault_rac1/archives1

6.  On the target, mount the shared directory:

# mount -F nfs -o rw aultsolaris1:/usr/backup/ault_rac1/archives1 /usr/backup/ault_rac1/archives1

Make sure the mount directory is created with the same owner and group as you want to have access to the directory tree that is cross-mounted.

In Windows2000 you share the drives across the network to achieve the same capabilities.

The LOG_ARCHIVE_FORMAT parameter determines the format of the archive log names that are generated. The LOG_ARCHIVE_FORMAT parameter must be the same on all clustered nodes. The allowed format strings for LOG_ARCHIVE_FORMAT are:

%T -- Thread number, left-zero-padded so LOG_ARCHIVE_FORMAT = ault_%T would be ault_0001

%t -- Thread number, non-left-zero-padded so LOG_ARCHIVE_FORMAT = ault_%t would be ault_1

%S -- Log sequence number, left-zero-padded, so LOG_ARCHIVE_FORMAT = ault_%S would be ault_0000000001

%s -- Log sequence number, non-left-zero-padded, so LOG_ARCHIVE_FORMAT = ault_%s would be ault_1.

The format strings can be combined to show both thread and sequence number. The %T or %t parameters are required for RAC archive logs.
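For example, combining both (the ault prefix is just illustrative):

LOG_ARCHIVE_FORMAT = ault_%t_%s.arc  # produces names such as ault_2_1234.arc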

In order to perform a complete recovery, a database, whether it is a normal database or a RAC database, must use archive logging. In order to turn on archive logging in RAC the following procedure should be used:

1.  Shut down all instances associated with the RAC database.

2.  Choose one instance in the RAC cluster and, in its unique initialization parameter file, set the CLUSTER_DATABASE parameter to FALSE. (Note: If you are using a server parameter file, then prefix the parameter with the SID, for example ault1.CLUSTER_DATABASE=FALSE.)

3.  In the instance parameter file, set the LOG_ARCHIVE_DEST_n, LOG_ARCHIVE_FORMAT and LOG_ARCHIVE_START parameters, for example for our example instances:

ault1.LOG_ARCHIVE_DEST_1 = "LOCATION=/u01/backup/ault_rac1 MANDATORY"
ault2.LOG_ARCHIVE_DEST_2 = "LOCATION=/u01/backup/ault_rac2 MANDATORY"
LOG_ARCHIVE_FORMAT = ault_%T_%s
LOG_ARCHIVE_START = TRUE

4.  Start the instance.

$ sqlplus /nolog
SQL> connect / as sysdba

Connected to an idle instance

SQL> startup


<normal startup messages>

5.  Execute the following command from the SQLPLUS session:

SQL> alter database archivelog;

Command executed.

6.  Shut down the instance.

7.  Edit the instance initialization parameter file or server parameter file to reset CLUSTER_DATABASE to TRUE.

8.  Restart all the instances.

DELETION OF BACKED-UP ARCHIVE LOGS

 After a successful backup the archive logs should be deleted. If you use RMAN, this deletion of archive logs can be automated.

I suggest that only the logs that have actually been backed up be deleted. To achieve the deletion of backed up archive logs, the BACKUP command is issued with either the DELETE INPUT or DELETE ALL INPUT clause following the ARCHIVELOG portion of the command.

To delete just the archive logs that have been backed up, the command for instance ault1 would be:

BACKUP DATABASE PLUS ARCHIVELOG DELETE INPUT;

 To delete all of the archive logs at the archive log destination:

BACKUP DATABASE PLUS ARCHIVELOG DELETE ALL INPUT;

To be absolutely sure that only backed up archive logs are deleted, use the DELETE command with a MAINTENANCE connection in RMAN. In our example with instances ault1 and ault2:

ALLOCATE CHANNEL FOR MAINTENANCE DEVICE TYPE DISK CONNECT 'SYS/kr87m@ault1';
DELETE ARCHIVELOG LIKE '%arc_dest_1%' BACKED UP 1 TIMES TO DEVICE TYPE sbt;
RELEASE CHANNEL;

ALLOCATE CHANNEL FOR MAINTENANCE DEVICE TYPE DISK CONNECT 'SYS/kr87m@ault2';
DELETE ARCHIVELOG LIKE '%arc_dest_1%' BACKED UP 1 TIMES TO DEVICE TYPE sbt;
RELEASE CHANNEL;

Notice the BACKED UP 1 TIMES clause in the above commands. This tells RMAN not to delete the archive logs unless it has a record of them being backed up at least once. The '%arc_dest_1%' token tells what logs to remove and translates into the value specified for LOG_ARCHIVE_DEST_1 for the instance specified in the connection alias (example: @ault1).

RMAN is capable of autolocation of the files it needs to back up. RMAN, through the database synchronization and resync processes, is aware of which files it needs to back up for each node. RMAN can only back up the files it has autolocated for each node on that node.

During recovery, autolocation means that only the files backed up from a specific node will be written to that node.

BACKUP PROCEDURES FOR RMAN AND RAC

We have discussed various pieces of the backup procedures using RMAN. In this section let's tie them into a coherent set of backup procedures. Essentially we will look at RMAN scripts to perform a cluster file system backup with a single drive, a cluster file system backup with multiple drives, a raw file system backup with multiple drives and a raw file system backup with a single drive.


CFS SINGLE TAPE DRIVE BACKUP

Perhaps the simplest RAC backup is the CFS backup using a single tape drive. Basically you configure one backup device, one channel and issue the backup command. After connecting to the RMAN catalog database and the target database you would run the following RMAN commands:

CONFIGURE DEVICE TYPE sbt PARALLELISM 1;
CONFIGURE DEFAULT DEVICE TYPE TO sbt;
BACKUP DATABASE PLUS ARCHIVELOG DELETE INPUT;

The reason the backup set of commands is so simple in a CFS single tape drive backup is that in a CFS type system all nodes can see all drives in the CFS array; this allows any node to back up the entire database.

CFS MULTIPLE TAPE DRIVE BACKUP

If your RAC setup includes a tape drive on each node, or you are backing up to a disk on each node, then you can parallelize the backup (making it go faster) and utilize all the available instances for the backup operation. For our example 2-node RAC setup, after connecting to the RMAN catalog and the target instance, you would issue the commands:

CONFIGURE DEVICE TYPE sbt PARALLELISM 2;
CONFIGURE DEFAULT DEVICE TYPE TO sbt;
CONFIGURE CHANNEL 1 DEVICE TYPE sbt CONNECT 'SYS/kr87m@ault1';
CONFIGURE CHANNEL 2 DEVICE TYPE sbt CONNECT 'SYS/kr87m@ault2';

Note: if you are using a device type of DISK substitute DISK for sbt and specify the path to the backup directory as a part of the CHANNEL configuration, for example:

CONFIGURE CHANNEL 1 DEVICE TYPE disk FORMAT '/u01/backup/ault_rac1/b_%u_%p_%c' CONNECT 'sys/kr87m@ault1';

As we said before, this configuration only has to be specified once unless something happens to your RMAN catalog. Once the configuration is set the command to perform the backup is fairly simple:

BACKUP DATABASE PLUS ARCHIVELOG DELETE INPUT;

 You can also provide for a control file backup.

BACKUP DATABASE INCLUDE CURRENT CONTROLFILE PLUS ARCHIVELOG DELETE INPUT;

RMAN BACKUP RAW FILESYSTEMS TO A SINGLE TAPE DRIVE

The major problem with raw filesystem setups is that each node only sees the archive logs it has produced. In order for the node to see the archive logs from the other nodes, the archive log directories must be NFS mounted to the backup node. You can only specify DELETE ALL INPUT or DELETE ALL if the NFS mounts are read/write.

 The script to perform the backup of a raw filesystem based RAC cluster would resemble:

CONFIGURE DEVICE TYPE sbt PARALLELISM 1;
CONFIGURE DEFAULT DEVICE TYPE TO sbt;
BACKUP DATABASE PLUS ARCHIVELOG DELETE INPUT;

The above commands will only work if the backup node has read/write access to the archive log directories for all nodes in the RAC database cluster.

RMAN BACKUP RAW FILESYSTEMS TO MULTIPLE TAPE DRIVES

CONFIGURE DEVICE TYPE sbt PARALLELISM 2;
CONFIGURE DEFAULT DEVICE TYPE TO sbt;
CONFIGURE CHANNEL 1 DEVICE TYPE sbt CONNECT 'SYS/kr87m@ault1';
CONFIGURE CHANNEL 2 DEVICE TYPE sbt CONNECT 'SYS/kr87m@ault2';


Note: if you are using a device type of DISK substitute DISK for sbt and specify the path to the backup directory as a part of the CHANNEL configuration, for example:

CONFIGURE CHANNEL 1 DEVICE TYPE disk FORMAT '/u01/backup/ault_rac1/b_%u_%p_%c' CONNECT 'sys/kr87m@ault1';

As we said before, this configuration only has to be specified once unless something happens to your RMAN catalog. Once the configuration is set the command to perform the backup is fairly simple:

BACKUP DATABASE PLUS ARCHIVELOG DELETE INPUT;

 You can also provide for a control file backup.

BACKUP DATABASE INCLUDE CURRENT CONTROLFILE PLUS ARCHIVELOG DELETE INPUT;

Note that we use the DELETE INPUT clause in this case; this is because the process doing the backup is local to each node and has read/write access to the archive log directory.

EXAMPLE CONFIGURATION AND BACKUP USING OEM AND RMAN FOR A RAW FILESYSTEM

Let's take a look at actual screen shots of the configuration and execution of a backup in the Linux environment from the Oracle Enterprise Manager interface for RMAN.

First we need to execute a one-time configuration job on our server. Figure 6 shows the OEM Main GUI with the Job definition menu pulled down. You will use this menu to select the Create Job option.

Figure 6: OEM Job Menu 

The selection of the Create Job option invokes the OEM job edit menu as is shown in Figure 7. In this shot you can see we filled in the entries for the job name and the node to which it is to be submitted. In a RAC environment it doesn't matter which node we submit to for the configuration step as long as that node is available and will be used for future backup activities.


Figure 7: OEM Job Edit GUI 

Next, we want to select the Tasks tab and select Run RMAN Script . This is shown in Figure 8.

Figure 8: Job GUI Tasks Screen 

Next, we use the Parameters screen to actually enter the RMAN commands we wish to execute. Notice that we do not have to bracket them with the RUN {…} construct. Figure 9 shows the Job Parameters GUI with the script commands to configure our servers using tape drives.


Figure 9: JOB Parameters Screen Configured for Tape Drives 

One thing to notice in Figure 9 is that with the TAPE type configuration the command for the default device type must use the "TO sbt" form; if we were specifying a type of disk, the "TO" is excluded. Also, with a device type of disk you need to specify the directory where you want the backup files placed; this is done using the FORMAT option on the CONFIGURE commands as shown in Figure 10.

Figure 10: Job Edit GUI Showing FORMAT Clause 

Once the commands are entered into the RMAN Script window, you can Submit, Save to Library, or Submit and Save to Library. I suggest the Submit and Save to Library option in case you have made an error; if the job is not saved, you will have to re-enter it if you have made an error.


When the job is submitted, you can follow its progress by selecting the Jobs icon in the OEM menu tree and then double clicking on the entry that corresponds to your submitted job. A Job tracking screen, similar to the one in Figure 11, will be displayed.

Figure 11: OEM Job Status Screen 

Notice that the screen in Figure 11 is actually called the Edit Jobs screen; however, it is only used to show the job status and no actual editing is allowed. By clicking on the various steps of the job process you can get any output generated to be displayed. An example log from a configuration job is shown in Figure 12.

Recovery Manager: Release 9.2.0.2.0 - Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

RMAN> 2>
connected to target database: AULT (DBID=127952943)
using target database controlfile instead of recovery catalog

RMAN>
RMAN> CONFIGURE DEFAULT DEVICE TYPE TO disk;
old RMAN configuration parameters:
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
new RMAN configuration parameters:
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE DEVICE TYPE disk PARALLELISM 2;
old RMAN configuration parameters:
CONFIGURE DEVICE TYPE DISK PARALLELISM 2;
new RMAN configuration parameters:
CONFIGURE DEVICE TYPE DISK PARALLELISM 2;
new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE CHANNEL 1 DEVICE TYPE disk CONNECT = 'SYS/xenon137@ault1' FORMAT "/usr/backup/ault_rac1/%U";
new RMAN configuration parameters:
CONFIGURE CHANNEL 1 DEVICE TYPE DISK CONNECT 'SYS/xenon137@ault1' FORMAT "/usr/backup/ault_rac1/%U";
new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE CHANNEL 2 DEVICE TYPE disk CONNECT = 'SYSTEM/xenon137@ault2' FORMAT "/usr/backup/ault_rac2/%U";
new RMAN configuration parameters:
CONFIGURE CHANNEL 2 DEVICE TYPE DISK CONNECT 'SYSTEM/xenon137@ault2' FORMAT "/usr/backup/ault_rac2/%U";
new RMAN configuration parameters are successfully stored

RMAN>
RMAN>


RMAN> **end-of-file**

RMAN>

Recovery Manager complete.

Figure 12: Example Log from Configuration Job 

Once the configuration is complete (remember, this only has to be done once unless the configuration changes) we are ready to actually submit our backup job. To submit a backup job we once again invoke the OEM Job editor, and other than the job name (now RAC Backup) the steps are essentially the same until we get to the script entry, where we enter the actual backup commands as shown in Figure 13.

Figure 13: Example Job Screen for Backup Job 

If your NFS mounts are read-only, exclude the DELETE INPUT portion of the command and use a manual deletion. Once the command(s) are entered, use the Submit and Save to Library selection to submit the job. The log file from this job is shown in Figure 14.

Recovery Manager: Release 9.2.0.2.0 - Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

RMAN> 2>
connected to target database: AULT (DBID=127952943)
using target database controlfile instead of recovery catalog
RMAN>
RMAN> BACKUP DATABASE INCLUDE CURRENT CONTROLFILE PLUS ARCHIVELOG DELETE INPUT;

Starting backup at 15-FEB-03
current log archived
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=20 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=20 devtype=DISK
channel ORA_DISK_1: starting archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
input archive log thread=1 sequence=21 recid=7 stamp=486059165
channel ORA_DISK_1: starting piece 1 at 15-FEB-03
channel ORA_DISK_2: starting archive log backupset
channel ORA_DISK_2: specifying archive log(s) in backup set


input archive log thread=2 sequence=23 recid=4 stamp=486046138
input archive log thread=2 sequence=24 recid=6 stamp=486046347
channel ORA_DISK_2: starting piece 1 at 15-FEB-03
channel ORA_DISK_1: finished piece 1 at 15-FEB-03
piece handle=/usr/backup/ault_rac1/0befhb51_1_1 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:09
channel ORA_DISK_1: deleting archive log(s)
archive log filename=/usr/backup/ault_rac1/archives1/1_21.dbf recid=7 stamp=486059165
channel ORA_DISK_1: starting archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
input archive log thread=2 sequence=25 recid=8 stamp=486059165
channel ORA_DISK_1: starting piece 1 at 15-FEB-03
channel ORA_DISK_1: finished piece 1 at 15-FEB-03
piece handle=/usr/backup/ault_rac1/0defhb5a_1_1 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
channel ORA_DISK_1: deleting archive log(s)
archive log filename=/usr/backup/ault_rac1/archives1/2_25.dbf recid=8 stamp=486059165
channel ORA_DISK_2: finished piece 1 at 15-FEB-03
piece handle=/usr/backup/ault_rac2/0cefhb0h_1_1 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:14
channel ORA_DISK_2: deleting archive log(s)
archive log filename=/usr/backup/ault_rac1/archives1/2_23.dbf recid=4 stamp=486046138
archive log filename=/usr/backup/ault_rac1/archives1/2_24.dbf recid=6 stamp=486046347
Finished backup at 15-FEB-03

Starting backup at 15-FEB-03
using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00002 name=/oracle/oradata/ault_rac/ault_rac_raw_undotbs1_200m.dbf
input datafile fno=00005 name=/oracle/oradata/ault_rac/ault_rac_raw_example_140m.dbf
input datafile fno=00010 name=/oracle/oradata/ault_rac/ault_rac_raw_xdb_40m.dbf
input datafile fno=00006 name=/oracle/oradata/ault_rac/ault_rac_raw_indx_25m.dbf
input datafile fno=00009 name=/oracle/oradata/ault_rac/ault_rac_raw_users_25m.dbf
input datafile fno=00004 name=/oracle/oradata/ault_rac/ault_rac_raw_drsys_20m.dbf
channel ORA_DISK_1: starting piece 1 at 15-FEB-03
channel ORA_DISK_2: starting full datafile backupset
channel ORA_DISK_2: specifying datafile(s) in backupset
including current controlfile in backupset
input datafile fno=00001 name=/oracle/oradata/ault_rac/ault_rac_raw_system_411m.dbf
input datafile fno=00003 name=/oracle/oradata/ault_rac/ault_rac_raw_cwlite_20m.dbf
input datafile fno=00007 name=/oracle/oradata/ault_rac/ault_rac_raw_odm_20m.dbf
input datafile fno=00011 name=/oracle/oradata/ault_rac/ault_rac_raw_undotbs2_200m.dbf
input datafile fno=00008 name=/oracle/oradata/ault_rac/ault_rac_raw_tools_10m.dbf
channel ORA_DISK_2: starting piece 1 at 15-FEB-03
channel ORA_DISK_1: finished piece 1 at 15-FEB-03
piece handle=/usr/backup/ault_rac1/0eefhb5h_1_1 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:09
channel ORA_DISK_2: finished piece 1 at 15-FEB-03
piece handle=/usr/backup/ault_rac2/0fefhb11_1_1 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:01:53
Finished backup at 15-FEB-03

Starting backup at 15-FEB-03
current log archived
using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
input archive log thread=1 sequence=22 recid=9 stamp=486059309
channel ORA_DISK_1: starting piece 1 at 15-FEB-03
channel ORA_DISK_2: starting archive log backupset
channel ORA_DISK_2: specifying archive log(s) in backup set
input archive log thread=2 sequence=26 recid=10 stamp=486059309


channel ORA_DISK_2: starting piece 1 at 15-FEB-03
channel ORA_DISK_1: finished piece 1 at 15-FEB-03
piece handle=/usr/backup/ault_rac1/0gefhb9e_1_1 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:00
channel ORA_DISK_1: deleting archive log(s)
archive log filename=/usr/backup/ault_rac1/archives1/1_22.dbf recid=9 stamp=486059309
channel ORA_DISK_2: finished piece 1 at 15-FEB-03
piece handle=/usr/backup/ault_rac2/0hefhb4u_1_1 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:01
channel ORA_DISK_2: deleting archive log(s)
archive log filename=/usr/backup/ault_rac1/archives1/2_26.dbf recid=10 stamp=486059309
Finished backup at 15-FEB-03

RMAN> **end-of-file**

RMAN>

Recovery Manager complete.

Figure 14: Example Backup Log

As you can see, using RMAN, RAC backup is much easier than doing manual scripting. The most difficult part of using OEM and RMAN is getting the intelligent agents properly configured on the RAC nodes. I cover OEM and RAC in another paper.

RECOVERY IN THE RAC ENVIRONMENT

As we said earlier, there are basically two types of failure in a RAC environment: instance and media. Instance failure involves the loss of one or more RAC instances whether due to node failure or connectivity failure. Media failure involves the loss of one or more of the disk assets used to store the database files themselves.

If a RAC database undergoes instance failure, the first available RAC instance that detects the failed instance or instances will perform instance recovery on all failed instances using the failed instances' redo logs and the SMON process of the surviving instance. The redo logs for all RAC instances are located either on an OCFS shared disk asset or on a raw filesystem that is visible to all the other RAC instances. This allows any other node to recover for a failed RAC node in the case of instance failure.

Recovery using redo logs allows committed transactions to be completed. Non-committed transactions are rolled back andtheir resources released.

In over a dozen years of working with Oracle databases I have yet to see an instance failure result in a non-recoverable situation with an Oracle database. Generally speaking an instance failure, in RAC or in normal Oracle, requires no active participation from the DBA other than to restart the failed instance when the node becomes available once again.

If for some reason the recovering instance cannot see the control file, redo logs, or all of the datafiles accessed by the failed instance, an error will be written to the alert log. To verify that all datafiles are available you can use the ALTER SYSTEM CHECK DATAFILES command to validate proper access.
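A quick sketch of such a check from SQL*Plus might be:

SQL> ALTER SYSTEM CHECK DATAFILES;
SQL> SELECT file#, name, status FROM v$datafile WHERE status NOT IN ('ONLINE', 'SYSTEM');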

Instance recovery undergoes nine distinct steps (the Oracle manual only lists eight, but I believe that the actual instance failure should be included):

1.  Normal RAC operation, all nodes available

2.  One or more RAC instances fail

3.  Node failure is detected

4.  GCS (Global Cache Service) reconfigures to distribute resource management to the surviving instances.

5.  The SMON process in the instance which first discovers the failed instance(s) reads the failed instance(s) redo logs to determine which blocks have to be recovered

6.  SMON issues requests for all of the blocks it needs to recover. Once all blocks are made available to the SMON process doing the recovery, all other database blocks are available for normal processing.

7.  Oracle performs roll forward recovery against the blocks, applying all redo log recorded transactions.


8.  Once redo transactions are applied, all undo (rollback) records are applied; this eliminates non-committed transactions.

9.  Database is now fully available to surviving nodes.

Again, instance recovery is automatic and, other than the performance hit to the instances which survive and the disconnection of users who were using the failed instance, basically invisible to the other instances. If you properly utilize RAC failover and transparent application failover (TAF) technologies, the only users that should see a problem are those with in-flight transactions. For a look at what the other instance sees in its alert log during a reconfiguration, look at Figure 15.

Sat Feb 15 16:39:09 2003
Reconfiguration started
List of nodes: 0,
Global Resource Directory frozen
one node partition
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Resources and enqueues cleaned out
Resources remastered 1977
2381 GCS shadows traversed, 1 cancelled, 13 closed
1026 GCS resources traversed, 0 cancelled
3264 GCS resources on freelist, 4287 on array, 4287 allocated
set master node info
Submitted all remote-enqueue requests
Update rdomain variables
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
2381 GCS shadows traversed, 0 replayed, 13 unopened
Submitted all GCS remote-cache requests
0 write requests issued in 2368 GCS resources
2 PIs marked suspect, 0 flush PI msgs
Sat Feb 15 16:39:10 2003
Reconfiguration complete
Post SMON to start 1st pass IR
Sat Feb 15 16:39:10 2003
Instance recovery: looking for dead threads
Sat Feb 15 16:39:10 2003
Beginning instance recovery of 1 threads
Sat Feb 15 16:39:10 2003
Started first pass scan
Sat Feb 15 16:39:11 2003
Completed first pass scan
208 redo blocks read, 6 data blocks need recovery
Sat Feb 15 16:39:11 2003
Started recovery at
Thread 2: logseq 26, block 14, scn 0.0
Recovery of Online Redo Log: Thread 2 Group 4 Seq 26 Reading mem 0
Mem# 0 errs 0: /oracle/oradata/ault_rac/ault_rac_raw_rdo_2_2.log
Recovery of Online Redo Log: Thread 2 Group 3 Seq 27 Reading mem 0
Mem# 0 errs 0: /oracle/oradata/ault_rac/ault_rac_raw_rdo_2_1.log
Sat Feb 15 16:39:12 2003
Completed redo application
Sat Feb 15 16:39:12 2003
Ended recovery at
Thread 2: logseq 27, block 185, scn 0.547931
16 data blocks read, 8 data blocks written, 208 redo blocks read
Ending instance recovery of 1 threads
SMON: about to recover undo segment 11
SMON: mark undo segment 11 as available

Figure 15: Alert Log Entries During Reconfiguration 

One word of caution however: during testing to get the listing in Figure 15 I stumbled upon a rare occurrence of not being able to get the instance up after an instance failure. In the Linux/RAC/raw environment I did a "kill -9" on the SMON process on aultlinux1, and the above was the result for the database (aultlinux2 stayed up and operating and recovered the failed instance). However, when I attempted a restart of the instance on aultlinux1 I received a Linux Error: 24: Too Many Files Open error. This was actually caused by something whacking the spfile link. Once I pointed the instance toward the proper spfile location during startup, it restarted with no problems.

MEDIA RECOVERY IN RAC INSTANCES

Media recovery is only required if a physical insult occurs to the database files that prevents proper access to Oracle data or transaction (redo and undo) files. Our entire discussion of RMAN in the backup section of this paper was to allow you as DBA to make a copy of your Oracle files to allow for recovery if a physical insult occurs.

If you use self-created scripts to back up your Oracle database, I suggest engineering self-generating recovery scripts as well. An example is available on the Rampant website using the username and password provided with this book. By generating a recovery script and storing it with the backup set you ensure that a recovery can be easily performed.

Generally it is wisest to use RMAN or a third party tool to perform backups in complex environments. Any backup and recovery process must be thoroughly tested; an untested backup and recovery plan is no plan at all.

The steps in media recovery depend on what files are lost; however, using RMAN simplifies this in that, once given a recover command, the RMAN process will recover the files, up to a complete database, that need to be recovered.

Once a failed file is detected, you use RMAN, a third party program or your own scripts and procedures to bring the affected file or files back from your backup media and then, if you are in archivelog mode, apply the archived logs to bring the database to a fully recovered state. If you are not using archive logs, loss of any datafile means you can only restore the database to the point in time of the last backup.
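For a single damaged tablespace, a sketch of the RMAN sequence (the tablespace name is an assumption) could be as small as:

RUN
{
  SQL 'ALTER TABLESPACE users OFFLINE IMMEDIATE';
  RESTORE TABLESPACE users;
  RECOVER TABLESPACE users;
  SQL 'ALTER TABLESPACE users ONLINE';
}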

The node that performs recovery must be able to access all files that need to be recovered. This means that the recovery node must be able to access the online redo logs, datafiles, rollback (undo) tablespaces and all archived redo logs from all database instances. This requirement to see all archive logs may require that the NFS mount strategy laid out in the backup section of this paper be used.

USING RMAN TO RECOVER A RAC ENVIRONMENT

Just as in backup, there are two environments that we must consider when performing a RAC recovery; these are:

1.  Recovery in an OCFS environment

2.  Recovery in a raw filesystem environment

RECOVERY IN AN OCFS ENVIRONMENT

In an OCFS environment all nodes can see all files. This ability of an OCFS environment to allow all nodes to see all files greatly simplifies the recovery process. The recovery process using OCFS doesn't require NFS mounting or elaborate archive log copying schemes. For example, in our sample environment using RAC nodes aultlinux1 and aultlinux2, which support instances ault1 and ault2, in an OCFS based recovery the commands could be as simple as:

CONFIGURE DEVICE TYPE sbt PARALLELISM 1;
CONFIGURE DEFAULT DEVICE TYPE TO sbt;
RESTORE DATABASE;
RECOVER DATABASE;

If more than one tape device is available the parallelism of the recovery can be increased, thus reducing recovery time, for example:

CONFIGURE DEVICE TYPE sbt PARALLELISM 2;
CONFIGURE DEFAULT DEVICE TYPE TO sbt;
CONFIGURE CHANNEL 1 DEVICE TYPE sbt CONNECT 'SYS/kr87m@ault1';
CONFIGURE CHANNEL 2 DEVICE TYPE sbt CONNECT 'SYS/kr87m@ault2';
RESTORE DATABASE;
RECOVER DATABASE;

Since Oracle RMAN uses autolocation, the channel connected to each node restores the files backed up by that node.

Remember that the configure commands only have to be issued once.


RECOVERY IN A RAW FILESYSTEM ENVIRONMENT

If you have a raw filesystem environment (I highly suggest that if you can use a clustered filesystem, do so; it is easier to manage, back up and restore) then the archive logs from each of the nodes are not available to any of the other nodes for recovery unless you employ some form of log copy procedure (scripted and run through a system scheduling program) or you have NFS mounted all of the other nodes' archive log locations to the node performing the recovery operation. If the NFS mounting scheme is used and one or more nodes are not available, then only an incomplete recovery can be performed, up to the first unavailable archive log. Even in an NFS scheme I suggest some form of log copy script be used to allow the logs from unavailable nodes to be available for recovery. The frequency of log copying should be based on the required currency of your database. If you can only afford to lose 15 minutes worth of data then you had better have all archive logs available up to the last fifteen-minute interval before the media failure.
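A hypothetical cron entry for such a copy job, run every fifteen minutes on the backup node, might be:

# copy archive logs from the NFS-mounted peer directory into the local archive destination
*/15 * * * * cp -p /usr/backup/ault_rac2/archives2/*.dbf /usr/backup/ault_rac1/archives1/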

Using RMAN, a recovery using a single tape looks identical to the recovery using a single tape in the OCFS environment:

CONFIGURE DEVICE TYPE sbt PARALLELISM 1;
CONFIGURE DEFAULT DEVICE TYPE TO sbt;
RESTORE DATABASE;
RECOVER DATABASE;

However, this assumes all of the archived redo logs are available. If all of the logs are not available then an RMAN script similar to:

RUN
{
  SET UNTIL LOG SEQUENCE 2245 THREAD 2;
  RESTORE DATABASE;
  RECOVER DATABASE;
}

 ALTER DATABASE OPEN RESETLOGS;

would be used to recover up to the first unavailable log (in the example, log sequence 2245 in thread 2 is the first unavailable log). Notice the use of the ALTER command to open the database; the RESETLOGS option resets the archive log sequences and renders previous archive logs unusable for most recovery scenarios.

PARALLEL RECOVERY

Recovery in Oracle9i RAC is automatically parallelized for these three stages of recovery:

1.  Restoration of data files

2.   Application of incremental backups

3.   Application of redo logs

The number of channels configured for a RAC database in RMAN determines the degree of parallelism for data file restoration. In our example configuration we could have two streams restoring data files since we configured two channels. The degree of parallelism for restoration of incremental backups is also dependent on the number of configured channels.

Redo logs are applied by RMAN using the degree of parallelism specified in the initialization parameter RECOVERY_PARALLELISM.

Using manual recovery methods such as SQL*Plus (remember, in Oracle9i there is no server manager program, so all DBA functions are done through SQL*Plus) you can specify values for RECOVERY_PARALLELISM; however, it cannot exceed your setting for PARALLEL_MAX_SERVERS. Using the DEGREE option of the RECOVER command you can also control the degree of parallelism for other recovery operations.
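As a sketch (the values are assumptions), the initialization parameters and a manual parallel recovery could look like:

RECOVERY_PARALLELISM = 4   # must not exceed PARALLEL_MAX_SERVERS
PARALLEL_MAX_SERVERS = 8

SQL> RECOVER DATABASE PARALLEL (DEGREE 4);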

STANDBY DATABASES IN RAC CONFIGURATION

The Oracle9i Dataguard feature has revolutionized the standby database concept. It allows Oracle9i to have both physical and logical standby databases for either normal instances or RAC instances.


In the case of a normal instance, Oracle provides processes on the main node that handle the copying of archive logs to the standby server for a physical standby, and processes that read the changes from the redo logs and apply them as SQL statements in the case of a logical standby. There are two types of Standby Databases supported for RAC: you can either create the Standby database on a Single-Node or on a Clustered-Node system. In Oracle 9i (both releases) the DataGuard Manager does not support RAC, so you have to set up the Dataguard (Standby) configuration manually.

SETTING UP A STANDBY DATABASE FOR A RAC CLUSTER TO A SINGLE-INSTANCE (ONE NODE)

When creating a Standby Database on a single instance you proceed much as for a normal Standby Database creation. The steps to create a physical standby database for a RAC environment are similar to those employed for a regular instance:

1.  Take a full backup (hot or cold) of your Oracle9i RAC database. (As an alternative, if your database is small (<10 gigabytes), place the files in backup mode and transfer them using ftp.)

2.  Create a standby control file from any instance in the RAC system: ALTER DATABASE CREATE STANDBY CONTROLFILE AS '<path>';
3.  Restore the database to the standby node and place the standby control file in the appropriate location.

4.  Copy any required archive logs to the new node

5.  Configure the LOG_ARCHIVE_DEST_n parameter for the RAC instance to the proper standby configuration

6.  I suggest using Metalink NOTE:180031.1: "Creating a Data Guard Configuration" for appropriate Network (TNSNAMES.ORA and LISTENER.ORA) settings.

7.  Next, we have to configure the Log Transport Services. This is done by setting the LOG_ARCHIVE_DEST_n parameter on all primary instances. All the instances have to archive their Logs to the Service of the Single-Node Standby System. The Standby System creates a "pool" of Archive logs and, on the basis of the SCN, it can determine which Archive log from which Thread is the next one to apply. So you have to set the next free LOG_ARCHIVE_DEST_n parameter to the Standby Service. The other settings, such as which process to use for the transfer or the type of transfer (SYNC or ASYNC, ...), depend on your preferred Protection Mode. (See Metalink NOTE:68537.1: Init.ora Parameter "LOG_ARCHIVE_DEST_n" Reference Note.)

8.  Now perform STARTUP MOUNT on the Standby Database and place it in recovery mode (see the sketch following this list).
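A minimal sketch of steps 7 and 8, assuming LOG_ARCHIVE_DEST_2 is free on the primary instances and "stby" is the TNS service name of the single-node standby (the attributes shown are illustrative):

On each primary instance (parameter file):
LOG_ARCHIVE_DEST_2=(SERVICE=stby ARCH)

On the standby instance (SQL*Plus):
STARTUP NOMOUNT
ALTER DATABASE MOUNT STANDBY DATABASE;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;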

CONFIGURATION WHEN THE STANDBY DATABASE IS ALSO A CLUSTER (RAC) SYSTEM

You can also create the Standby System in a second RAC environment. This provides additional confidence, scalability, and performance. If you can utilize two identical systems, then in the case of switchover or failover there should be no performance or availability degradation. Normally you have the same number of Instances (Nodes) on the Standby system as on the Primary System; however, this isn't required.

Essentially you proceed as for a normal Standby Database creation. You take a full backup from your Primary Database and create the Standby controlfile from any of the Instances. Then you prepare the Standby system for the Database (create the RAW devices, if not using OCFS, and logical links, configure your hardware, and install the Oracle software on each Node). After that, you are able to restore the backup of your primary database, including the newly created Standby Controlfile. Then set up the appropriate parameters in the PFILE or SPFILE for EACH instance correctly (see the preceding section, Setting Up a Standby Database For a RAC Cluster To a Single-Instance (One Node)) and also configure the Network parameters (TNSNAMES.ORA, LISTENER.ORA) corresponding to your requirements and settings.

The greatest difference between the multi-node and single-node systems is that we now have multiple Standby instances and only one of them can perform the Recovery.

Every Primary Instance transports its Archive logs to a corresponding Standby Instance. The Standby Instance receiving the Logs then transfers them to the instance performing the Recovery. This is configured by the LOG_ARCHIVE_DEST_n parameter. I also suggest creating Standby Redo logs on the Standby Database for each Standby instance. For clarification, I will show you an example:

We have a Primary RAC Database with two Nodes and two Instances (ault1 and ault2). We also have a Standby RAC environment containing the Standby RAC Database with two Nodes and two Instances (ault3 and ault4). The Primary RAC Database has archive log mode enabled. Each Primary Instance archives its Redo logs to a formatted Partition of the Shared Disk, if a Cluster File System (CFS) is supported for the platform.

Otherwise, the Archive logs are stored on each node's Private Disk area. You must use different formats for naming the Archive logs to prevent overwriting; it is required to use at least %t in LOG_ARCHIVE_FORMAT to prevent this. The %t represents the Thread Number the Log comes from.

 Thus, we have the following settings:

Instance ault1:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac1/archives1)
LOG_ARCHIVE_FORMAT=arc_%s_%t

Instance ault2:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac2/archives2)
LOG_ARCHIVE_FORMAT=arc_%s_%t

Next we add the Log Transport for each Primary RAC Instance to the corresponding Standby RAC Instance. As we have Standby Redo logs created and want maximum performance, we now have these settings:

Instance ault1:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac1/archives1)
LOG_ARCHIVE_DEST_2=(SERVICE=ault3 LGWR ...)
LOG_ARCHIVE_FORMAT=arc_%s_%t

Instance ault2:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac2/archives2)
LOG_ARCHIVE_DEST_2=(SERVICE=ault4 LGWR ...)
LOG_ARCHIVE_FORMAT=arc_%s_%t

Now we designate Instance ault4 as the Recovering Instance (it can be either ault3 or ault4, but for this example we designate ault4). This means that the Archive logs from Instance ault3 have to be transferred to Instance ault4. Instance ault4 also performs the archiving-to-disk process (again on the Shared Disk that is available to both standby instances). The resulting settings for Instances ault3 and ault4 now are:

Instance ault3:

LOG_ARCHIVE_DEST_1=(SERVICE=ault4 ARCH SYNC)

Instance ault4:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac4/archives4)

If all this is done, you can now STARTUP NOMOUNT and ALTER DATABASE MOUNT STANDBY DATABASE on all instances and put the Recovering Instance into recovery mode.
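As a minimal sketch (the DISCONNECT option is illustrative), mount the standby database on both standby instances but start managed recovery only on the recovering instance:

-- on ault3 and ault4
STARTUP NOMOUNT
ALTER DATABASE MOUNT STANDBY DATABASE;
-- on ault4 only
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;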

LOG SHIPPING WHEN THE STANDBY DATABASE IS A SINGLE NODE SYSTEM

In the case when the RAC Standby Configuration is on a Single Node System, the prerequisites are the same. The difference is that all Instances of the Primary Database have to ship their Archive logs to one single system. Due to the information (SCNs) in the Archive logs, the Managed Recovery Process (MRP) on the Standby Database determines the correct order in which to apply the Archive logs from the different Threads. So we now have as the Primary Instances in our example Instances ault1 and ault2. The Single Node Standby Instance is ault3. This results in these settings for LOG_ARCHIVE_DEST_n:

Instance ault1 and ault2:

ault1:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac1/archives1)

LOG_ARCHIVE_DEST_2=(SERVICE=ault3 LGWR ...)

ault2:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac2/archives2)

LOG_ARCHIVE_DEST_2=(SERVICE=ault3 LGWR ...)

You must also create different Standby Redo Log Threads on the Standby Database. Instance ault3 now receives all the Logs and applies them to the Standby Database.
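A minimal sketch of creating Standby Redo Log groups for the two primary threads (the group numbers, file names, and sizes are illustrative and should match your online redo log sizes):

ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 5 ('/usr/backup/ault_rac3/srl_t1_g5.log') SIZE 100M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 GROUP 6 ('/usr/backup/ault_rac3/srl_t2_g6.log') SIZE 100M;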

Instance ault3:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac3/archives3)

CROSS-INSTANCE ARCHIVAL

In addition to the Parameters shown above, you can configure Cross-Instance Archival. This means you can ship your Archive logs from one Primary RAC Instance to another Primary RAC Instance. This can be helpful for gap resolution if one Instance loses Network connectivity or Archive logs are deleted by mistake.

To enable Cross-Instance Archival, you simply take one free LOG_ARCHIVE_DEST_n parameter and configure it for shipping to another Primary Instance. For Cross-Instance Archival only the Archive (ARCH) process is supported and can be used. If we again use our earlier example, the result would be:

Instance ault1:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac1/archives1)
LOG_ARCHIVE_DEST_2=(SERVICE=ault3 LGWR ...)
LOG_ARCHIVE_DEST_3=(SERVICE=ault2 ARCH ...)
LOG_ARCHIVE_FORMAT=arc_%s_%t

If you also use Cross-instance Archival on Instance ault2:

Instance ault2:

LOG_ARCHIVE_DEST_1=(location=/usr/backup/ault_rac2/archives2)
LOG_ARCHIVE_DEST_2=(SERVICE=ault3 LGWR ...)
LOG_ARCHIVE_DEST_3=(SERVICE=ault1 ARCH ...)
LOG_ARCHIVE_FORMAT=arc_%s_%t

ARCHIVE LOG GAP RESOLUTION AND FAL

 There are two Modes of Gap Resolution:

1.  Automatic gap resolution, which is controlled by the Archiver and is enabled automatically. Automatic gap resolution works proactively and will produce some Network overhead.

2.  The FAL (Fetch Archive Log) method, which requests specific Logs again as required from the remote systems. On the Receiving Instance you publish one or more FAL Servers from which it receives its Archive Logs. On the Sending Instance, you publish one or more FAL Clients, which receive Archive Logs from a specified Instance. The publishing is done by the Initialization Parameters FAL_SERVER and FAL_CLIENT.

FAL is not supported for Logical Standby Databases. A typical configuration using our example RAC configuration could be:

Instance ault1 and ault2:

FAL_CLIENT=ault3

Instance ault3 and ault4:

FAL_SERVER=ault1,ault2

When Cross-Instance Archival is configured, Instance ault3 would now receive missing Archive logs from Instance ault2 and Instance ault4 from Instance ault1.

SUMMARY  

Although RAC configurations can be complex, the backup and recovery of RAC need not be a nightmare. Using RMAN, configuration, backup, and recovery in RAC environments are greatly simplified.

Log transport in RAC can be configured to easily support standby or Data Guard configurations.

REFERENCES

Metalink Note 180031.1: Creating a Data Guard Configuration

Metalink Note 68537.1: Init.ora Parameter "LOG_ARCHIVE_DEST_n" Reference Note

Metalink Note 150584.1: Data Guard 9i Setup with Guaranteed Protection Mode

Oracle9i Data Guard Concepts and Administration, Release 2 (9.2), A96653-01

Oracle9i Database Reference, Release 2 (9.2), A96536-01

Oracle9i SQL Reference, Release 2 (9.2), A96540-01

Oracle9i Real Application Clusters Administration, Release 2 (9.2), A96596-01, Chapters 6-8

Metalink Note 203326.1: Data Guard 9i Log Transportation on RAC

Metalink Note 186592.1: Backing Up Archivelogs in a RAC Cluster

Metalink Note 145178.1: RMAN9i: Automatic Channel Allocation and Configuration
