Wednesday, April 27, 2022

Recovery Continuity with Multitenant

Recovery Continuity with Multitenant is something you need to understand as you migrate databases from one CDB to another.






The image above is from a presentation that I recently gave to my internal Oracle team. It was such a big hit (and so eye-opening) that I wanted to make sure I shared the information on my blog.


The first thing to point out is that before we (Oracle) moved to the multitenant architecture, life was simple. Below is my slide showing how databases were moving around as they upgraded.  Regardless of whether it was an out-of-place upgrade, or migrating it to a different host, the DB name stays, and the backups stay contiguous.


But, like many things in life, new ideas came along that changed the way we do things.  Multitenant is one of those things.  Don't get me wrong, multitenant is a great feature giving DBAs a lot more flexibility.  Below are a couple of pictures that show all the wonderful things that multitenant can do.




Above are the 2 slides from my presentation.  These slides are often used to show the benefits of multitenant.  On the last slide I did point out the encryption keys that are used to secure the database with TDE.

The use of Encryption keys is an important point to think about.  With Multitenant (if you think about how it works), the CDB has a different encryption key from the PDB.  If I create an encrypted backup of my CDB, it is encrypted with the CDB key. The backup (and actual datafiles) for my PDB are encrypted with the PDB key.

Below is the next slide I used.  All the information on multitenant talks about how easy it is to unplug/plug (which it is), but ensuring you maintain your recovery window is the hard part.



Database backup and recovery in a multitenant environment

Here are some things to keep in mind in a multitenant environment

  • Pluggable database backup pieces are ALWAYS kept independent of the CDB and other PDBs.  Even with filesperset=1000 and a single channel, each PDB and the CDB will be written to separate backup sets.
  • Pluggable databases can be backed up independently of each other, and of the CDB: “backup pluggable database xxx”.
  • You can perform a point-in-time recovery of a pluggable database independent of other PDBs. This requires local undo: “recover pluggable database until”.
  • Recovering a pluggable database requires a backup of the CDB (for metadata), and backups of the archive logs.
  • All redo transactions for all PDBs are intertwined into a single redo stream. This will not change in the near future. 
  • Flashback can be set at the PDB level
  • You can create restore points within a PDB
When backing up a Multitenant environment, the item to keep in mind is that the RMAN catalog information is stored at the CDB level.  Pluggable databases are part of the CDB, and registration is done at the CDB.


The next image shows what a recovery of the pluggable database looks like. Keep in mind that the datafiles for the pluggable database get restored using the pluggable database backup, but to make them consistent (to "defuzzy" them) the archive logs get restored from the CDB backup.  Remember that in a multitenant environment the redo/archive logs are intertwined at the CDB level.


The next image shows what is typically done to perform a PDB upgrade with unplug/plug. The pluggable database is migrated from 12c to 19c.


Now that the database is migrated, let's look at what happens to the RMAN catalog after the migration to ensure that we have a backup of the pluggable database.



You can see in the image above, that the pluggable database is now associated with the CDB that the pluggable database is plugged into.

Now to go back to the image at the beginning of this post, you can see what it takes to restore and recover the database throughout its lifecycle.

  • Backups that were taken through previous CDBs (for example an archival backup) need to be restored through the CDB they were backed up through.
  • Backups that were taken in the original CDB can only be restored back to the original CDB.
  • Pre-plugin backups provide a gateway between plugging in and when the first backup is taken.
  • Backups to the new CDB will be restored back to the new CDB.
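
The rules above can be modeled with a small Python sketch. This is only an illustration of the reasoning, not something RMAN provides: the `Residency` record, the `restore_source` function, and all dates are hypothetical, and the sketch assumes the first CDB in the history has backups covering the whole time the PDB lived there.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Residency:
    """One period during which the PDB was plugged into a given CDB."""
    cdb: str
    plugged_in: datetime
    first_backup: Optional[datetime]  # first backup taken in this CDB, if any
    unplugged: Optional[datetime]     # None while the PDB is still plugged in


def restore_source(history: List[Residency], target: datetime) -> str:
    """Describe which CDB's backups a restore to `target` has to go through."""
    for i, r in enumerate(history):
        end = r.unplugged or datetime.max
        if not (r.plugged_in <= target <= end):
            continue
        # Until the first backup exists in the new CDB, only the preplugin
        # backups carried over from the previous CDB can start the restore.
        if i > 0 and (r.first_backup is None or target < r.first_backup):
            return f"preplugin backups from {history[i - 1].cdb}, then redo from {r.cdb}"
        return f"backups taken in {r.cdb}"
    raise ValueError("target time is outside the PDB's recorded history")


# Hypothetical timeline: the PDB moves from CDBPROD122 to CDB19C.
history = [
    Residency("CDBPROD122", datetime(2021, 1, 1), datetime(2021, 1, 2),
              datetime(2022, 4, 15, 18, 0)),
    Residency("CDB19C", datetime(2022, 4, 15, 18, 5),
              datetime(2022, 4, 16, 1, 0), None),
]
print(restore_source(history, datetime(2022, 3, 1)))
# backups taken in CDBPROD122
print(restore_source(history, datetime(2022, 4, 15, 20, 0)))
# preplugin backups from CDBPROD122, then redo from CDB19C
print(restore_source(history, datetime(2022, 5, 1)))
# backups taken in CDB19C
```

The point of the model is that every CDB the PDB has lived in stays part of its restore path for as long as backups from that period are still needed.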



Finally some parting thoughts on backups of pluggable databases when migrating.

  • Perform a full backup if possible (ZDLRA makes this easy) with the PDB mounted prior to unplugging. This is the best possible restore point after migrating.
  • Keep the RMAN catalog entries for the old CDB as long as there are valid backup pieces. This could be years for KEEP backups.
  • NOTE – On the ZDLRA you can execute “Pause Database”. This will remove all backups, but leave the RMAN catalog entries.
  • Ensure you have the encryption keys for both CDBs and PDBs for the needed recovery window, which may be years.
  • Keep track of CDB backups, as a PDB might be migrated between multiple CDBs throughout its backup cycle.
  • NEVER delete a CDB backup that needed PDB backups still depend on.
  • NEVER delete any TDE keys or wallets that support needed backups.

Friday, April 15, 2022

Recovery Continuity of your Oracle Database

"Recovery Continuity" should be a critical part of your Oracle Database support plan.
As multitenant Oracle Databases become the standard for database implementations, you need to ensure that you maintain your recovery window even as your pluggable database moves around your environment.

Above is the recommended practice we have all been hearing about to make upgrades of your Oracle Database easier. Unplug from your current CDB (CDBPROD122) and plug into a new CDB (CDB19C) that has the new release.  What you need to think about, however, is this: how am I going to ensure that I can recover my pluggable database to any point in time, all the way through this migration, without a huge amount of downtime?

This is where preplugin backups and some planning come into play.
You can find out more about preplugin backups with some of the links below.
Let's take a look at what I am doing for my pluggable database PDBDWPROD before I migrate it from OLDCDB to NEWCDB.

Pre-unplug


In the picture PDBDWPROD is plugged into CDBPROD122.

In my environment I am testing my PDB (PDBDWPROD)  and it is plugged into OLDCDB,  migrating to NEWCDB.

To ensure that I have a good restore point I am going to perform a full backup of my pluggable database prior to unplugging, and I will also include an archive log backup.

RMAN> backup incremental level 0 pluggable database PDBDWPROD plus archivelog delete input;


Starting backup at 15-APR-22
current log archived
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=426 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Oracle Database Backup Service Library VER=23.0.0.1
channel ORA_SBT_TAPE_1: starting compressed archived log backup set
channel ORA_SBT_TAPE_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=185 RECID=186 STAMP=1102096068
input archived log thread=1 sequence=186 RECID=187 STAMP=1102096071
input archived log thread=1 sequence=187 RECID=188 STAMP=1102096147
input archived log thread=1 sequence=188 RECID=189 STAMP=1102096166
input archived log thread=1 sequence=189 RECID=190 STAMP=1102096288
channel ORA_SBT_TAPE_1: starting piece 1 at 15-APR-22
channel ORA_SBT_TAPE_1: finished piece 1 at 15-APR-22
piece handle=6l0r19t1_213_1_1 tag=TAG20220415T175129 comment=API Version 2.0,MMS Version 23.0.0.1
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:03
channel ORA_SBT_TAPE_1: deleting archived log(s)
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_185_k5mcy4w8_.arc RECID=186 STAMP=1102096068
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_186_k5mcy7xv_.arc RECID=187 STAMP=1102096071
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_187_k5md0m8l_.arc RECID=188 STAMP=1102096147
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_188_k5md16rt_.arc RECID=189 STAMP=1102096166
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_189_k5md50qy_.arc RECID=190 STAMP=1102096288
Finished backup at 15-APR-22

Starting backup at 15-APR-22
using channel ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: starting compressed incremental level 0 datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
input datafile file number=00066 name=/u01/app/oracle/oradata/OLDCDB/PDBDWPROD/sysaux01.dbf
input datafile file number=00065 name=/u01/app/oracle/oradata/OLDCDB/PDBDWPROD/system01.dbf
input datafile file number=00068 name=/u01/app/oracle/oradata/OLDCDB/PDBDWPROD/PDBDWPROD.dbf
input datafile file number=00067 name=/u01/app/oracle/oradata/OLDCDB/PDBDWPROD/undotbs01.dbf
channel ORA_SBT_TAPE_1: starting piece 1 at 15-APR-22
channel ORA_SBT_TAPE_1: finished piece 1 at 15-APR-22
piece handle=6m0r19t5_214_1_1 tag=TAG20220415T175132 comment=API Version 2.0,MMS Version 23.0.0.1
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:15
Finished backup at 15-APR-22

Starting backup at 15-APR-22
current log archived
using channel ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: starting compressed archived log backup set
channel ORA_SBT_TAPE_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=190 RECID=191 STAMP=1102096309
channel ORA_SBT_TAPE_1: starting piece 1 at 15-APR-22
channel ORA_SBT_TAPE_1: finished piece 1 at 15-APR-22
piece handle=6n0r19tm_215_1_1 tag=TAG20220415T175150 comment=API Version 2.0,MMS Version 23.0.0.1
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:03
channel ORA_SBT_TAPE_1: deleting archived log(s)
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_190_k5md5osn_.arc RECID=191 STAMP=1102096309
Finished backup at 15-APR-22

Starting Control File and SPFILE Autobackup at 15-APR-22
piece handle=c-1180802953-20220415-07 comment=API Version 2.0,MMS Version 23.0.0.1
Finished Control File and SPFILE Autobackup at 15-APR-22

RMAN>


Then right before the unplug I am going to execute another archive log backup, immediately followed by the unplug.

RMAN> backup archivelog all delete input;

Starting backup at 15-APR-22
current log archived
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=442 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Oracle Database Backup Service Library VER=23.0.0.1
channel ORA_SBT_TAPE_1: starting compressed archived log backup set
channel ORA_SBT_TAPE_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=191 RECID=192 STAMP=1102096412
input archived log thread=1 sequence=192 RECID=193 STAMP=1102096418
input archived log thread=1 sequence=193 RECID=194 STAMP=1102096424
input archived log thread=1 sequence=194 RECID=195 STAMP=1102096502
channel ORA_SBT_TAPE_1: starting piece 1 at 15-APR-22
channel ORA_SBT_TAPE_1: finished piece 1 at 15-APR-22
piece handle=6p0r1a3n_217_1_1 tag=TAG20220415T175503 comment=API Version 2.0,MMS Version 23.0.0.1
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:07
channel ORA_SBT_TAPE_1: deleting archived log(s)
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_191_k5md8w57_.arc RECID=192 STAMP=1102096412
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_192_k5md926n_.arc RECID=193 STAMP=1102096418
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_193_k5md9893_.arc RECID=194 STAMP=1102096424
archived log file name=/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_194_k5mdcpko_.arc RECID=195 STAMP=1102096502
Finished backup at 15-APR-22

Starting Control File and SPFILE Autobackup at 15-APR-22
piece handle=c-1180802953-20220415-08 comment=API Version 2.0,MMS Version 23.0.0.1
Finished Control File and SPFILE Autobackup at 15-APR-22



Then the unplug

SQL>  alter pluggable database PDBDWPROD  close immediate;

Pluggable database altered.

SQL> ALTER PLUGGABLE DATABASE PDBDWPROD  UNPLUG INTO '/tmp/PDBDWPROD.xml';

Pluggable database altered.

SQL>


Plug


SQL>  create pluggable database PDBDWPROD using  '/tmp/PDBDWPROD.xml' nocopy tempfile reuse KEYSTORE IDENTIFIED BY "change-on-install" ;
Pluggable database created.

SQL> alter pluggable database PDBDWPROD open;
Pluggable database altered.

Update database and set restore point


Now I am going to create some objects in my PDB, set a restore point, and then create a few more objects to ensure I am restoring to a point in time.
SQL>  alter session set container=PDBDWPROD;

Session altered.

SQL> create table bgrenn.postmove as select * from dba_objects ;

Table created.

############################ perform a couple of log switches

SQL>  alter session set container=CDB$ROOT;
Session altered.

SQL> alter system archive log current;
System altered.

SQL> alter system archive log current;
System altered.

SQL>  alter session set container=PDBDWPROD;
Session altered.

############################ create a restore point

SQL> create restore point PDBDWPROD_restore;
Restore point created.

############################  create a second table

SQL> create table bgrenn.postrestorepoint as select * from dba_objects ;
Table created.

############################ perform a couple of log switches

SQL> alter session set container=CDB$ROOT;
Session altered.

SQL> alter system archive log current;
System altered.

SQL> alter system archive log current;
System altered.

SQL> alter system archive log current;
System altered.



Backups available post plugin

Now, using the preplugin commands, I can see the backups that were taken before the migration.

rman> SET PREPLUGIN CONTAINER=PDBDWPROD;
rman> list preplugin backup of pluggable database PDBDWPROD;

 

RMAN>  list preplugin backup of pluggable database PDBDWPROD;

starting full resync of recovery catalog
full resync complete

List of Backup Sets
===================


BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
209     Incr 0  284.50M    SBT_TAPE    00:00:07     15-APR-22
        BP Key: 209   Status: AVAILABLE  Compressed: YES  Tag: TAG20220415T175132
        Handle: 6m0r19t5_214_1_1   Media: objectstorage.us-ashburn-1.oraclecloud.com/n/xxx/oldcdb
  List of Datafiles in backup set 209
  Container ID: 5, PDB Name: PDBDWPROD
  File LV Type Ckp SCN    Ckp Time  Abs Fuz SCN Sparse Name
  ---- -- ---- ---------- --------- ----------- ------ ----
  59   0  Incr 6346380    15-APR-22              NO    /u01/app/oracle/oradata/OLDCDB/PDBDWPROD/system01.dbf
  60   0  Incr 6346380    15-APR-22              NO    /u01/app/oracle/oradata/OLDCDB/PDBDWPROD/sysaux01.dbf
  61   0  Incr 6346380    15-APR-22              NO    /u01/app/oracle/oradata/OLDCDB/PDBDWPROD/undotbs01.dbf
  62   0  Incr 6346380    15-APR-22              NO    /u01/app/oracle/oradata/OLDCDB/PDBDWPROD/PDBDWPROD.dbf


RMAN> list preplugin backup of archivelog all;

List of Backup Sets
===================


BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
208     2.25M      SBT_TAPE    00:00:01     15-APR-22
        BP Key: 208   Status: AVAILABLE  Compressed: YES  Tag: TAG20220415T175129
        Handle: 6l0r19t1_213_1_1   Media: objectstorage.us-ashburn-1.oraclecloud.com/n/xxx/oldcdb

  List of Archived Logs in backup set 208
  Thrd Seq     Low SCN    Low Time  Next SCN   Next Time
  ---- ------- ---------- --------- ---------- ---------
  1    185     6345022    15-APR-22 6345387    15-APR-22
  1    186     6345387    15-APR-22 6345399    15-APR-22
  1    187     6345399    15-APR-22 6345803    15-APR-22
  1    188     6345803    15-APR-22 6345912    15-APR-22
  1    189     6345912    15-APR-22 6346322    15-APR-22

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
210     256.00K    SBT_TAPE    00:00:00     15-APR-22
        BP Key: 210   Status: AVAILABLE  Compressed: YES  Tag: TAG20220415T175150
        Handle: 6n0r19tm_215_1_1   Media: objectstorage.us-ashburn-1.oraclecloud.com/n/xxx/oldcdb

  List of Archived Logs in backup set 210
  Thrd Seq     Low SCN    Low Time  Next SCN   Next Time
  ---- ------- ---------- --------- ---------- ---------
  1    190     6346322    15-APR-22 6346391    15-APR-22

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
212     512.00K    SBT_TAPE    00:00:02     15-APR-22
        BP Key: 212   Status: AVAILABLE  Compressed: YES  Tag: TAG20220415T175503
        Handle: 6p0r1a3n_217_1_1   Media: objectstorage.us-ashburn-1.oraclecloud.com/n/id20skavsofo/oldcdb

  List of Archived Logs in backup set 212
  Thrd Seq     Low SCN    Low Time  Next SCN   Next Time
  ---- ------- ---------- --------- ---------- ---------
  1    191     6346391    15-APR-22 6346585    15-APR-22
  1    192     6346585    15-APR-22 6346593    15-APR-22
  1    193     6346593    15-APR-22 6346601    15-APR-22
  1    194     6346601    15-APR-22 6346663    15-APR-22
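
Before starting the recovery it is worth checking that the preplugin archive logs form an unbroken chain. The Python sketch below does this for the listing above; the tuples are copied from the Thrd/Seq/Low SCN/Next SCN columns, while the function names are hypothetical helpers and a single redo thread is assumed.

```python
from typing import List, Tuple

LogRange = Tuple[int, int, int]  # (sequence, low_scn, next_scn)

# Values copied from the "list preplugin backup of archivelog all" output above.
PREPLUGIN_LOGS: List[LogRange] = [
    (185, 6345022, 6345387),
    (186, 6345387, 6345399),
    (187, 6345399, 6345803),
    (188, 6345803, 6345912),
    (189, 6345912, 6346322),
    (190, 6346322, 6346391),
    (191, 6346391, 6346585),
    (192, 6346585, 6346593),
    (193, 6346593, 6346601),
    (194, 6346601, 6346663),
]


def find_gaps(logs: List[LogRange]) -> List[Tuple[int, int]]:
    """Return (previous_seq, next_seq) pairs where the chain is broken."""
    ordered = sorted(logs)
    gaps = []
    for (seq_a, _, next_a), (seq_b, low_b, _) in zip(ordered, ordered[1:]):
        # Consecutive logs must have consecutive sequence numbers, and each
        # log must start at the SCN where the previous one ended.
        if seq_b != seq_a + 1 or low_b != next_a:
            gaps.append((seq_a, seq_b))
    return gaps


def next_required_scn(logs: List[LogRange]) -> int:
    """SCN at which the first log NOT in the list would have to start."""
    return max(next_scn for _, _, next_scn in logs)


print(find_gaps(PREPLUGIN_LOGS))          # []
print(next_required_scn(PREPLUGIN_LOGS))  # 6346663
```

An empty gap list means the backed-up logs are contiguous from sequence 185 through 194; any redo generated after SCN 6346663 and before the unplug is not in these backups.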



Restore from preplugin

I close my pluggable database and start the restore, adding "from preplugin" to the command in my RMAN session.

RMAN> alter pluggable database PDBDWPROD close;
RMAN> restore pluggable database PDBDWPROD   from preplugin;

RMAN> alter pluggable database PDBDWPROD close;

Statement processed
starting full resync of recovery catalog
full resync complete

RMAN> restore pluggable database PDBDWPROD   from preplugin;

Starting restore at 15-APR-22
using channel ORA_SBT_TAPE_1
using channel ORA_DISK_1

channel ORA_SBT_TAPE_1: starting datafile backup set restore
channel ORA_SBT_TAPE_1: specifying datafile(s) to restore from backup set
channel ORA_SBT_TAPE_1: restoring datafile 00059 to /u01/app/oracle/oradata/OLDCDB/PDBDWPROD/system01.dbf
channel ORA_SBT_TAPE_1: restoring datafile 00060 to /u01/app/oracle/oradata/OLDCDB/PDBDWPROD/sysaux01.dbf
channel ORA_SBT_TAPE_1: restoring datafile 00061 to /u01/app/oracle/oradata/OLDCDB/PDBDWPROD/undotbs01.dbf
channel ORA_SBT_TAPE_1: restoring datafile 00062 to /u01/app/oracle/oradata/OLDCDB/PDBDWPROD/PDBDWPROD.dbf
channel ORA_SBT_TAPE_1: reading from backup piece 6m0r19t5_214_1_1
channel ORA_SBT_TAPE_1: piece handle=6m0r19t5_214_1_1 tag=TAG20220415T175132
channel ORA_SBT_TAPE_1: restored backup piece 1
channel ORA_SBT_TAPE_1: restore complete, elapsed time: 00:00:25
Finished restore at 15-APR-22


Recover from preplugin

Now I am running the recover from preplugin

RMAN> recover pluggable database PDBDWPROD   from preplugin;

Starting recover at 15-APR-22
using channel ORA_SBT_TAPE_1
using channel ORA_DISK_1

starting media recovery

channel ORA_SBT_TAPE_1: starting archived log restore to default destination
channel ORA_SBT_TAPE_1: restoring archived log
archived log thread=1 sequence=190
channel ORA_SBT_TAPE_1: reading from backup piece 6n0r19tm_215_1_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 04/15/2022 18:23:58
ORA-19870: error while restoring backup piece 6n0r19tm_215_1_1
ORA-19827: Restoring preplugin files to a recovery area is not supported.

RMAN>


You can see that it is not going to let me apply the archive logs by restoring them from backup to the local recovery area of my new CDB.

I need to restore the archive logs myself and then catalog them.

The recover output shows that it is looking for sequence 190, so I restored that archive log on my original CDB.


RMAN> restore archivelog sequence 190;

Starting restore at 15-APR-22
starting full resync of recovery catalog
full resync complete
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=449 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Oracle Database Backup Service Library VER=23.0.0.1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=71 device type=DISK

channel ORA_SBT_TAPE_1: starting archived log restore to default destination
channel ORA_SBT_TAPE_1: restoring archived log
archived log thread=1 sequence=190
channel ORA_SBT_TAPE_1: reading from backup piece 6n0r19tm_215_1_1
channel ORA_SBT_TAPE_1: piece handle=6n0r19tm_215_1_1 tag=TAG20220415T175150
channel ORA_SBT_TAPE_1: restored backup piece 1
channel ORA_SBT_TAPE_1: restore complete, elapsed time: 00:00:01
Finished restore at 15-APR-22

RMAN> list archivelog sequence 190;

List of Archived Log Copies for database with db_unique_name OLDCDB
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
8134    1    190     A 15-APR-22
        Name: /u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_190_k5mgd1h2_.arc




Now I need to catalog it as a preplugin archive log to continue the recovery.
I am able to copy the restored archive log to /tmp and catalog it, but I am still missing some pieces. I will continue restoring the rest of the archive logs in the listing, up to sequence 194.
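
The post does not show the catalog command itself. As a hedged sketch, the Python helper below prints one RMAN `catalog preplugin archivelog` command per restored log, assuming the files have already been copied to a staging directory outside the recovery area (here `/tmp`, as in the text). The helper name and its parameters are hypothetical, the copy itself still has to be done at the operating system level, and the generated commands should be reviewed before they are run in the RMAN session connected to the new CDB.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def catalog_commands(restored_logs: Iterable[str], staging_dir: str = "/tmp") -> List[str]:
    """Build one RMAN CATALOG PREPLUGIN ARCHIVELOG command per restored log.

    Only the file name is kept; the directory is replaced by `staging_dir`,
    where the log is assumed to have been copied beforehand.
    """
    commands = []
    for log in restored_logs:
        staged = PurePosixPath(staging_dir) / PurePosixPath(log).name
        commands.append(f"catalog preplugin archivelog '{staged}';")
    return commands


# Path taken from the "list archivelog sequence 190" output above.
restored = [
    "/u01/app/oracle/fast_recovery_area/OLDCDB/archivelog/2022_04_15/o1_mf_1_190_k5mgd1h2_.arc",
]
for command in catalog_commands(restored):
    print(command)
# catalog preplugin archivelog '/tmp/o1_mf_1_190_k5mgd1h2_.arc';
```

File names containing a single quote would break the generated command, so this sketch is only suitable for names like the OMF-style ones shown in this post.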

Now that I have restored and cataloged all the archive logs up to sequence 194, I will continue the recovery.
RMAN>  recover pluggable database PDBDWPROD   from preplugin;

Starting recover at 15-APR-22
using channel ORA_SBT_TAPE_1
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 191 is already on disk as file /tmp/o1_mf_1_191_k5mgw2fr_.arc
archived log for thread 1 with sequence 192 is already on disk as file /tmp/o1_mf_1_192_k5mgw83q_.arc
archived log for thread 1 with sequence 193 is already on disk as file /tmp/o1_mf_1_193_k5mgwlf8_.arc
archived log for thread 1 with sequence 194 is already on disk as file /tmp/o1_mf_1_194_k5mgx0t1_.arc
unable to find archived log
archived log thread=1 sequence=195
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 04/15/2022 18:40:31
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 195 and starting SCN of 6346663
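
When this loop of "recover, find the missing log, restore, catalog" is scripted, the next sequence to fetch can be read straight from the RMAN-06054 message. The Python sketch below parses the message text exactly as it appears above; the `MissingLog` type and `find_missing_log` function are hypothetical names, and other RMAN versions may word the message differently.

```python
import re
from typing import NamedTuple, Optional


class MissingLog(NamedTuple):
    thread: int
    sequence: int
    start_scn: int


# Pattern built from the RMAN-06054 line shown in the output above.
_RMAN_06054 = re.compile(
    r"RMAN-06054: media recovery requesting unknown archived log "
    r"for thread (\d+) with sequence (\d+) and starting SCN of (\d+)"
)


def find_missing_log(rman_output: str) -> Optional[MissingLog]:
    """Return the archived log RMAN is asking for, or None if no RMAN-06054."""
    match = _RMAN_06054.search(rman_output)
    if match is None:
        return None
    thread, sequence, scn = (int(group) for group in match.groups())
    return MissingLog(thread, sequence, scn)


output = (
    "RMAN-03002: failure of recover command at 04/15/2022 18:40:31\n"
    "RMAN-06054: media recovery requesting unknown archived log for thread 1 "
    "with sequence 195 and starting SCN of 6346663\n"
)
print(find_missing_log(output))
# MissingLog(thread=1, sequence=195, start_scn=6346663)
```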



I am finding that there is still one last archivelog (hopefully). This was the redo log that was active while I unplugged my database.
In fact I can see on the source CDB, that it is still the active redo log, so I am going to have to do a log switch to grab a copy of the archive log and catalog it.

SQL> select sequence#,status from v$log;

 SEQUENCE# STATUS
---------- ----------------
       195 CURRENT
       193 INACTIVE
       194 INACTIVE


Now that I have the last archive log, my preplug recovery is completed to the time it was unplugged.

RMAN> recover pluggable database PDBDWPROD   from preplugin;

Starting recover at 15-APR-22
using channel ORA_SBT_TAPE_1
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 195 is already on disk as file /tmp/o1_mf_1_195_k5mhf79h_.arc
media recovery complete, elapsed time: 00:00:01
Finished recover at 15-APR-22



Recover post plugin

Now I can recover to my restore point, and open it up.


RMAN>

RMAN> recover pluggable database PDBDWPROD  until restore point PDBDWPROD_restore;

Starting recover at 15-APR-22
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=4 device type=DISK


starting media recovery

archived log for thread 1 with sequence 101 is already on disk as file /u01/app/oracle/fast_recovery_area/NEWCDB/archivelog/2022_04_15/o1_mf_1_101_k5mf0rr5_.arc
archived log for thread 1 with sequence 102 is already on disk as file /u01/app/oracle/fast_recovery_area/NEWCDB/archivelog/2022_04_15/o1_mf_1_102_k5mf0s8b_.arc
archived log for thread 1 with sequence 103 is already on disk as file /u01/app/oracle/fast_recovery_area/NEWCDB/archivelog/2022_04_15/o1_mf_1_103_k5mf1wof_.arc
archived log for thread 1 with sequence 104 is already on disk as file /u01/app/oracle/fast_recovery_area/NEWCDB/archivelog/2022_04_15/o1_mf_1_104_k5mf1zqm_.arc
archived log for thread 1 with sequence 105 is already on disk as file /u01/app/oracle/fast_recovery_area/NEWCDB/archivelog/2022_04_15/o1_mf_1_105_k5mf22sk_.arc
archived log for thread 1 with sequence 106 is already on disk as file /u01/app/oracle/fast_recovery_area/NEWCDB/archivelog/2022_04_15/o1_mf_1_106_k5mf91fn_.arc
archived log for thread 1 with sequence 107 is already on disk as file /u01/app/oracle/fast_recovery_area/NEWCDB/archivelog/2022_04_15/o1_mf_1_107_k5mf94g2_.arc
archived log for thread 1 with sequence 108 is already on disk as file /u01/app/oracle/fast_recovery_area/NEWCDB/archivelog/2022_04_15/o1_mf_1_108_k5mf9wk2_.arc
media recovery complete, elapsed time: 00:00:01
Finished recover at 15-APR-22

RMAN> alter pluggable database PDBDWPROD open resetlogs;

Statement processed
starting full resync of recovery catalog
full resync complete



Let's make sure the PDB was recovered to the restore point: the table created before the restore point (POSTMOVE) should be there with its data, and the one created afterwards (POSTRESTOREPOINT) should not.

SQL>  alter session set container=PDBDWPROD;

Session altered.

SQL> select table_name from dba_tables where owner='BGRENN';

TABLE_NAME
--------------------------------------------------------------------------------
POSTMOVE

SQL> select count(1) from bgrenn.postmove;

  COUNT(1)
----------
     73610




Conclusion


Preplugin backups provide you with Recovery Continuity, ensuring you can recover your pluggable database after migrating to a new CDB even before you take your first backup.  As you can tell by my example, you want to make sure you take the backup as close as possible to the point in time you are unplugging, to lessen the work to catalog and apply the archive logs.  I would also recommend you take a backup on the new CDB as soon as possible.

Tuesday, March 22, 2022

Backup Anywhere offers Expanded Replication for High Availability and More Flexibility








The previous release of the Zero Data Loss Recovery Appliance software (19.2.1.1.2) includes 3 new exciting features for replication. 

  • Backup Anywhere - Providing the ability to change roles (upstream vs downstream).
  • Read Only replication - Providing seamless migration to a different Recovery Appliance.
  • Request Only Replication - Providing a High Availability option for backups.

Backup Anywhere

Backup Anywhere provides even more options for HADR (High Availability/Disaster Recovery) with the ability to redirect backups and redo to another Recovery Appliance. In addition, Backup Anywhere provides the ability to perform a role reversal, removing the concept of upstream/downstream.  As the name implies, when replicating between two or more Zero Data Loss Recovery Appliances you can switch the Recovery Appliance that is receiving backups from your protected databases.

With Backup Anywhere you configure two Recovery Appliances as a pair and create replication servers that point to each other.  The metadata synchronization ensures that backups are replicated to the paired appliance and that the Recovery Appliance pair stays in sync.

NOTE: In order to use Backup Anywhere you must use the new REPUSER naming convention of REPUSER_FROM_<source>_TO_<destination>.
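
As a small illustration of the naming convention, the Python sketch below builds the two replication user names for a pair of appliances that point to each other. The helper names and the short appliance labels (`RANY`, `RALONDON`) are made up for the example; only the `REPUSER_FROM_<source>_TO_<destination>` pattern comes from the note above.

```python
from typing import Tuple


def repuser_name(source: str, destination: str) -> str:
    """Replication user for backups flowing from `source` to `destination`."""
    return f"REPUSER_FROM_{source}_TO_{destination}".upper()


def paired_repusers(appliance_a: str, appliance_b: str) -> Tuple[str, str]:
    """Both directions are needed because the paired appliances can swap roles."""
    return repuser_name(appliance_a, appliance_b), repuser_name(appliance_b, appliance_a)


print(paired_repusers("RANY", "RALONDON"))
# ('REPUSER_FROM_RANY_TO_RALONDON', 'REPUSER_FROM_RALONDON_TO_RANY')
```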

For my example, the diagram below depicts an architecture with three Zero Data Loss Recovery Appliances. The primary databases in New York send backups to the Recovery Appliance in the New York Data Center. The Recovery Appliance in the New York Data Center replicates backups to the Recovery Appliance in the London Data Center. And finally, the Recovery Appliance in the London Data Center replicates backups to the Recovery Appliance in Singapore.

New York --> London --> Singapore



But what happens if I want to change which Recovery Appliance I am sending my backups to? With Backup Anywhere I can change the Recovery Appliance receiving backups, and the flow of replicated backups will be taken care of automatically.  The Recovery Appliances will seamlessly change the direction of the replication stream based on which Recovery Appliance is currently receiving the backups, and will still ensure backups on the three Zero Data Loss Recovery Appliances are synchronized and available.

Singapore --> London --> New York.


 


Read Only Replication

This is my favorite new feature included in the latest Recovery Appliance release. Read Only allows you to easily migrate your backups to a new Recovery Appliance while leaving the older backups still available.

Replication normally synchronizes the upstream catalog with the downstream catalog AND ensures that backups are replicated to the downstream. With Read Only Replication, only the synchronization occurs.  The upstream Recovery Appliance (typically the new RA) knows about the backups on the downstream Recovery Appliance (the old RA).  If a restore is requested that is not on the upstream Recovery Appliance, the upstream will pull the backup from the downstream.

The most common use case is retiring older pieces of equipment, but Read Only Replication can be used for additional use cases.

  • Migrating backups to a new datacenter
  • Migrating backups for a subset of databases from an overloaded Recovery Appliance to a new Recovery Appliance to balance the workload

 Replace older Recovery Appliance

In this example I want to replace the current Recovery Appliance (ZDLRAOLD) with a new Recovery Appliance (ZDLRANEW).  During this transition period I want to ensure that backups are always available to the protected database.  This example will show the migration of backups from ZDLRAOLD to ZDLRANEW. I am keeping 30 days of backups for my databases and I am starting the migration on September 1.

Step #1 - September 1, configure replication from ZDLRAOLD to ZDLRANEW

Create a replication server from ZDLRAOLD to ZDLRANEW and add the policy(s) for the databases to the replication server.  This will replicate the most current level 0 backup (FULL)  onto ZDLRANEW for all databases without changing the backup location from the protected databases.



Once you have ensured that all databases have replicated a level 0 backup to ZDLRANEW you can remove the replication server from ZDLRAOLD which will stop the replication.

Step #2 - September 2, configure Read Only replication from ZDLRANEW to ZDLRAOLD

Create a replication server from ZDLRANEW to ZDLRAOLD. Add the policies for all databases to the replication server and ensure that the read only flag is set when adding each policy.

 

PROCEDURE add_replication_server (
   replication_server_name IN VARCHAR2,
   protection_policy_name IN VARCHAR2,
   skip_initial_replication IN BOOLEAN DEFAULT FALSE,
   read_only IN BOOLEAN DEFAULT FALSE,
   request_only IN BOOLEAN DEFAULT FALSE);
 

Note: The Read Only flag must be set when adding the policy to the replication server to ensure backups are NOT replicated from ZDLRANEW to ZDLRAOLD.
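
To make the flag concrete, here is a minimal sketch of how the Step #2 call could be assembled from Java as a PL/SQL anonymous block. It assumes the procedure is called through the DBMS_RA package (as the pause and resume calls later on this blog are), and it uses the parameter names from the signature above. The replication server name ZDLRAOLD_REP and the policy name GOLD are hypothetical placeholders, not names from my environment. The sketch only builds the text of the block; executing it would still need a JDBC connection to ZDLRANEW.

```java
// Sketch: build the PL/SQL block for Step #2 (names are hypothetical placeholders).
class ReadOnlyReplicationCall {

    // Double any single quote so the value is safe inside a PL/SQL string literal.
    static String literal(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Uses the parameter names from the add_replication_server signature shown above.
    // The BOOLEAN is written as a literal so no driver-specific BOOLEAN binding is needed.
    static String build(String replicationServer, String policy, boolean readOnly) {
        return "BEGIN dbms_ra.add_replication_server("
             + "replication_server_name => " + literal(replicationServer) + ", "
             + "protection_policy_name => " + literal(policy) + ", "
             + "read_only => " + (readOnly ? "TRUE" : "FALSE")
             + "); END;";
    }

    public static void main(String[] args) {
        // ZDLRAOLD_REP and GOLD are assumed names used only for illustration.
        System.out.println(build("ZDLRAOLD_REP", "GOLD", true));
    }
}
```

The resulting text could be passed to Connection.prepareCall() and executed in the same way as the pause and resume calls.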

 


 

Step #3 - September 3, configure backups from the protected databases to backup to ZDLRANEW.

At this point ZDLRANEW should contain at least 1 full backup for all databases, and the incremental backups will begin on September 3rd.  ZDLRANEW will now contain backups from September 1 (when replication began) until the most current Level 0 virtualized backup taken.  ZDLRAOLD will contain backups from August 4 until September 2nd when protected database backups to ZDLRAOLD were moved to be sent to ZDLRANEW.



Step #4 - September 4+, ZDLRANEW contains all new backups and old backups age off ZDLRAOLD

Below is a snapshot of what the backups would look like 15 days later on September 15th.  Backups are aging off of ZDLRAOLD and ZDLRANEW now contains 15 days of backups.



 

Step #5 - September 15, Restore backups

To restore the protected database using a point in time you would connect the protected database to ZDLRANEW and ZDLRANEW would provide the correct virtual full backup regardless of its location.

1. If the full backup prior to the point in time is on ZDLRANEW, it is restored directly from there.

2. If the full backup is NOT on ZDLRANEW, it is pulled from ZDLRAOLD through ZDLRANEW back to the protected database.

The location of the backups is transparent to the protected database, and ZDLRANEW manages where to restore the backup from.
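
The date arithmetic behind this example can be sketched in a few lines of Java. This is only an illustration of the timeline in this post, not how the Recovery Appliance decides internally. It assumes day-level granularity, a 30-day window that includes the current day, and a hypothetical year (the example only gives month and day).

```java
import java.time.LocalDate;

// Sketch of the Step #5 decision, using the dates from this example.
class RestoreSource {
    static final int RETENTION_DAYS = 30;

    // Returns which appliance would hold the backup for restoreDate, as seen on 'today'.
    // firstBackupOnNew is the date of the first level 0 replicated to ZDLRANEW.
    static String locate(LocalDate restoreDate, LocalDate today, LocalDate firstBackupOnNew) {
        LocalDate oldestRetained = today.minusDays(RETENTION_DAYS - 1);
        if (restoreDate.isAfter(today) || restoreDate.isBefore(oldestRetained)) {
            return "NOT AVAILABLE";          // outside the 30-day window
        }
        if (!restoreDate.isBefore(firstBackupOnNew)) {
            return "ZDLRANEW";               // served directly by the new appliance
        }
        return "ZDLRAOLD via ZDLRANEW";      // pulled from the old appliance
    }

    public static void main(String[] args) {
        int year = 2022; // assumed year, chosen only so the dates can be built
        LocalDate sept1 = LocalDate.of(year, 9, 1);
        LocalDate sept15 = LocalDate.of(year, 9, 15);
        System.out.println(locate(LocalDate.of(year, 9, 10), sept15, sept1));
        System.out.println(locate(LocalDate.of(year, 8, 25), sept15, sept1));
        System.out.println(locate(LocalDate.of(year, 8, 10), sept15, sept1));
    }
}
```

Running the same check with September 30 as the current date shows that every date still inside the window is on ZDLRANEW, which is why ZDLRAOLD can be retired at that point.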



Step #6 - September 30  Retire ZDLRAOLD

At this point the new Recovery Appliance ZDLRANEW contains 30 days of backups and the old Recovery Appliance ZDLRAOLD can be retired.



  

Request Only Mode

 

Request Only Mode is used when Data Guard is present and both the Primary database and the Data Guard database are backing up to a local Recovery Appliance. The two Recovery Appliances synchronize only the metadata; no backup pieces are actively replicated. But, in the event of a prolonged outage of either Recovery Appliance, this feature provides the ability to fill gaps by replicating backups from its paired Recovery Appliance.

To implement this feature, replication servers are configured on both Recovery Appliances, and the policies are added to the replication server specifying REQUEST_ONLY=TRUE.

 

PROCEDURE add_replication_server (
   replication_server_name IN VARCHAR2,
   protection_policy_name IN VARCHAR2,
   skip_initial_replication IN BOOLEAN DEFAULT FALSE,
   read_only IN BOOLEAN DEFAULT FALSE,
   request_only IN BOOLEAN DEFAULT FALSE);
 

Below is my environment, configured and running in normal mode. I have my primary database in San Francisco, and my standby database in New York.  Both databases, Primary and Standby, are backing up to the local Recovery Appliance in their respective data centers.  Request Only Mode is configured between the two Recovery Appliances.



 

To demonstrate what happens when a failure occurs, I will assume that the Recovery Appliance in the SFO datacenter is down for a period of time.  In this scenario, backups can no longer be sent to the SFO Recovery Appliance, but Data Guard Redo Traffic still occurs to the standby database in New York, and the standby database in New York is still backing up locally to the Recovery Appliance in New York.



When the SFO appliance comes back on-line, it will synchronize the backup information with that on the NYC Recovery Appliance.  The SFO appliance will request datafile backups, and any controlfile backups that are older than 48 hours, from the NYC appliance.

NOTE: The assumption is that a new backup will occur locally over the faster LAN and fill any gaps within the last 48 hours. The backups requested from the paired appliance will be transferred over the slower WAN and fill any gaps older than 48 hours.
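
The 48-hour rule described above can be written down as a tiny decision function. This is only my illustration of the rule as stated in the note, not the logic the Recovery Appliance actually runs, and the class and enum names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the gap-fill rule: recent gaps wait for a local backup,
// older gaps are requested from the paired appliance.
class GapFillRule {
    static final Duration LOCAL_WINDOW = Duration.ofHours(48);

    enum Source { LOCAL_BACKUP_OVER_LAN, PAIRED_APPLIANCE_OVER_WAN }

    // missingBackupTime is the time of the backup that is missing locally.
    static Source sourceFor(Instant missingBackupTime, Instant now) {
        Duration age = Duration.between(missingBackupTime, now);
        return age.compareTo(LOCAL_WINDOW) > 0
                ? Source.PAIRED_APPLIANCE_OVER_WAN
                : Source.LOCAL_BACKUP_OVER_LAN;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2022-09-10T00:00:00Z"); // arbitrary example time
        System.out.println(sourceFor(now.minus(Duration.ofHours(72)), now));
        System.out.println(sourceFor(now.minus(Duration.ofHours(24)), now));
    }
}
```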

If Real-Time redo is configured, the protected databases will immediately begin the archived log gap fetch process, and fill any gaps in archive logs on the SFO appliance that are available on the protected databases. The SFO appliance will also check once per hour, over the next 6 hours, for new logs to request from the NYC appliance. This gives the local archive log gap fetch time to run over the LAN, which is faster than replicating logs over the WAN from NYC.

HADR Bonus Feature: Since the SFO appliance recovery catalog is immediately synchronized with the NYC recovery catalog, backup pieces on the NYC Recovery Appliance are available for recovery.  With this capability you have full recovery protection as soon as the catalog synchronization completes.

 



 

 



This ensures that the SFO Recovery Appliance will be able to provide a short Recovery Point Objective without waiting for the next backup job to occur.

All of this happens transparently and quickly returns the Recovery Appliance to the expected level of protection for the database backups.

 

For more details on implementing different replication modes, refer to the Administrator’s Guide.

 

 

 


Tuesday, February 8, 2022

Managing your ZDLRA replication queue remotely

 With the rise of Cyber Crime, more and more companies are looking at an architecture with a second backup copy that is protected with an airgap.   Below is the common architecture that I am seeing.


In this post I will walk through an example of how to implement a simple Java program that performs the tasks necessary to manage the airgap for a ZDLRA that is implemented in a cyber vault (DC1 Vault in the picture).  Feel free to use this as a starting point to automate the process.

Commands

There are 3 commands that I need to be able to execute remotely

  • PAUSE - This will pause the replication server that I configured
  • RESUME - This will resume the replication server that I configured
  • QUERY - This will query the queue on the upstream to determine how much is left in the queue.
First, however, I need to configure the parameters to execute the calls.

Config file (airgap.config).

I created a config file to customize the script for my environment. Below are the parameters that I needed to connect to the ZDLRA and execute the commands.
  • HOST - This is the name of the scan listener on the upstream ZDLRA.
  • PORT - This is the SQL*Net port used to connect to the upstream ZDLRA
  • SERVICE_NAME - Service name of the database on the upstream ZDLRA
  • USERNAME - The username to connect to the upstream database
  • PASSWORD - Password for the user. Feel free to encrypt this in Java.
  • REPLICATION_SERVER - Replication server to manage

Below is what my config file looks like.

airgap.host=oracle-19c-test-tde
airgap.port=1521
airgap.service_name=ocipdb
airgap.username=bgrenn
airgap.password=oracle
airgap.replication_server=replairgap
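
Since every command depends on all six settings, it can help to check the file before connecting. Below is a minimal sketch of such a check. The class name AirgapConfigCheck and the method missingKeys are hypothetical helpers and are not part of airgap.java; the sample in main loads two settings from a string only so that the sketch runs without a file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Sketch: report which required airgap.* settings are missing or empty.
class AirgapConfigCheck {
    static final String[] REQUIRED = {
        "airgap.host", "airgap.port", "airgap.service_name",
        "airgap.username", "airgap.password", "airgap.replication_server"
    };

    static List<String> missingKeys(Properties prop) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = prop.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Properties prop = new Properties();
        // Deliberately incomplete sample, loaded from a string instead of airgap.config.
        prop.load(new StringReader("airgap.host=oracle-19c-test-tde\nairgap.port=1521\n"));
        System.out.println("missing: " + missingKeys(prop));
    }
}
```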


Java code (airgap.java).

Java snippet start

The start of the Java Code will import the functions necessary and set up my class


import java.sql.*;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.FileInputStream;
import java.util.Date;
import java.util.Properties;

// Create an airgap class
public class airgap {

   private Properties prop = new Properties();


Java snippet get properties

The first method will get the airgap properties from the property files so that I can use them in the rest of the methods.

// Create a get_airgap_properties method
  public void get_airgap_properties()
        {
                String fileName = "airgap.config";
                try (FileInputStream fis = new FileInputStream(fileName)) {
                    prop.load(fis);
                } catch (FileNotFoundException ex) {
                    System.out.println("cannot find config file airgap.config");
                } catch (IOException ex) {
                    System.out.println("unknown issue finding config file airgap.config");
                }
        }



Java snippet pause replication server

The code below will connect to the database and execute DBMS_RA.PAUSE_REPLICATION_SERVER


// Create a pause_replication  method
  public void pause_replication()
        {
                try     {
                        //Loading driver
                        Class.forName("oracle.jdbc.driver.OracleDriver");

                        //creating connection
                        Connection con = DriverManager.getConnection
                                        ("jdbc:oracle:thin:@//"+
                                         prop.getProperty("airgap.host")+":"+
                                         prop.getProperty("airgap.port")+"/"+
                                         prop.getProperty("airgap.service_name"),
                                         prop.getProperty("airgap.username"),
                                         prop.getProperty("airgap.password"));

                        CallableStatement cs=con.prepareCall("{call dbms_ra.pause_replication_server(?)}");

                        //Set IN Parameters
                        String in1 = prop.getProperty("airgap.replication_server");
                        cs.setString(1,in1);

                        cs.execute();   //executing the procedure call (no result set is returned)


                        con.close();    //closing connection
                        System.out.println("replication server '"+ prop.getProperty("airgap.replication_server")+"' paused");
                        }
                catch(Exception e)      {
                        e.printStackTrace();
                                        }
        }
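
The same connection string is assembled in each of the three methods. If you extend this program, one option is to build it in a single place. The sketch below shows only the URL construction; AirgapUrl and jdbcUrl are hypothetical names and are not part of the original airgap.java.

```java
import java.util.Properties;

// Sketch: build the thin-driver URL once from the airgap.* settings.
class AirgapUrl {
    // Produces the same string the methods in this post assemble inline.
    static String jdbcUrl(Properties prop) {
        return "jdbc:oracle:thin:@//"
             + prop.getProperty("airgap.host") + ":"
             + prop.getProperty("airgap.port") + "/"
             + prop.getProperty("airgap.service_name");
    }

    public static void main(String[] args) {
        Properties prop = new Properties();
        prop.setProperty("airgap.host", "oracle-19c-test-tde");
        prop.setProperty("airgap.port", "1521");
        prop.setProperty("airgap.service_name", "ocipdb");
        System.out.println(jdbcUrl(prop));
    }
}
```
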



Java snippet resume replication server

The code below will connect to the database and execute DBMS_RA.RESUME_REPLICATION_SERVER


// Create a resume_replication method
  public void resume_replication()
        {
                try     {
                        //Loading driver
                        Class.forName("oracle.jdbc.driver.OracleDriver");

                        //creating connection
                        Connection con = DriverManager.getConnection
                                        ("jdbc:oracle:thin:@//"+
                                         prop.getProperty("airgap.host")+":"+
                                         prop.getProperty("airgap.port")+"/"+
                                         prop.getProperty("airgap.service_name"),
                                         prop.getProperty("airgap.username"),
                                         prop.getProperty("airgap.password"));

                        CallableStatement cs=con.prepareCall("{call dbms_ra.resume_replication_server(?)}");

                        //Set IN Parameters
                        String in1 = prop.getProperty("airgap.replication_server");
                        cs.setString(1,in1);

                        cs.execute();   //executing the procedure call (no result set is returned)


                        con.close();    //closing connection
                        System.out.println("replication server '"+ prop.getProperty("airgap.replication_server")+"' resumed");
                        }
                catch(Exception e)      {
                        e.printStackTrace();
                                        }
        }


Java snippet query replication server

The java code below will query the replication queue in the upstream ZDLRA and return 4 columns
  • REPLICATION SERVER - name of the replication server
  • TASKS QUEUED - Number of tasks in the queue to be replicated
  • TOTAL GB QUEUED - Amount of data in the queue
  • MINUTES IN QUEUE - The number of minutes the oldest replication piece has been in the queue.
The last piece of information can be very useful to tell you how current the replication is. With real-time redo, the queue may never be empty.

// Create a queue_select method
  public void queue_select()
        {
                try     {
                        //Loading driver
                        Class.forName("oracle.jdbc.driver.OracleDriver");

                        //creating connection
                        Connection con = DriverManager.getConnection
                                        ("jdbc:oracle:thin:@//"+
                                         prop.getProperty("airgap.host")+":"+
                                         prop.getProperty("airgap.port")+"/"+
                                         prop.getProperty("airgap.service_name"),
                                         prop.getProperty("airgap.username"),
                                         prop.getProperty("airgap.password"));

                        Statement s=con.createStatement();      //creating statement

                        ResultSet rs=s.executeQuery("select replication_server_name,"+
                                                    "       count(*)  tasks_queued,"+
                                                    "       trunc(sum(total)/1024/1024/1024,0) AS TOTAL_GB_QUEUED,"+
                                                    "       round("+
                                                    "         (cast(current_timestamp as date) - cast(min(start_time) as date))"+
                                                    "             * 24 * 60"+
                                                    "         ) as queue_minutes "+
                                                    "from RA_SBT_TASK "+
                                                    "    join ra_replication_config on (lib_name = SBT_library_name) "+
                                                    "          where archived = 'N' "+
                                                    "group by replication_server_name");   //executing statement

                        System.out.println("Replication Server,Tasks Queued,Total GB Queued,Minutes in Queue");

                        while(rs.next()){
                                System.out.println(rs.getString(1)+","+
                                                   rs.getInt(2)+","+
                                                   rs.getInt(3)+","+
                                                   rs.getString(4));
                                        }

                        con.close();    //closing connection
                        }
                catch(Exception e)      {
                        e.printStackTrace();
                                        }
        }
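
The two calculated columns in the query are simple arithmetic. Subtracting two Oracle dates gives a number of days, so multiplying by 24 * 60 converts it to minutes, and dividing bytes by 1024 three times gives GB. The sketch below mirrors that arithmetic in plain Java so it can be checked without a database. It assumes the total column is in bytes, which is what the division in the query implies; QueueMath is a hypothetical helper name.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Sketch: the same arithmetic as the TOTAL_GB_QUEUED and queue_minutes columns.
class QueueMath {
    // trunc(sum(total)/1024/1024/1024, 0): whole GB, fraction dropped.
    static long totalGbQueued(long totalBytes) {
        return totalBytes / (1024L * 1024L * 1024L);
    }

    // round((now - oldest start_time) * 24 * 60): age of the oldest task in minutes.
    static long queueMinutes(LocalDateTime oldestStart, LocalDateTime now) {
        long seconds = Duration.between(oldestStart, now).getSeconds();
        return Math.round(seconds / 60.0);
    }

    public static void main(String[] args) {
        LocalDateTime oldest = LocalDateTime.of(2022, 2, 8, 10, 0, 0); // arbitrary example
        LocalDateTime now = LocalDateTime.of(2022, 2, 8, 10, 58, 20);
        System.out.println(totalGbQueued(95L * 1024 * 1024 * 1024 + 500) + " GB");
        System.out.println(queueMinutes(oldest, now) + " minutes");
    }
}
```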



Java snippet main section

Below is the main section, and as you can see you can pass one of the 3 parameters mentioned earlier.





  public static void main(String[] args)
        {

         airgap airgap = new airgap();   // Create an airgap object


         airgap.get_airgap_properties();      // Load the settings from airgap.config
         // With no argument, fall through to the usage message in the default branch
         switch(args.length > 0 ? args[0] : "") {

                case "resume":
                        airgap.resume_replication();      // Call the resume_replication() method
                        break;
                case "pause":
                        airgap.pause_replication();      // Call the pause_replication() method
                        break;
                case "query":
                        airgap.queue_select();      // Call the queue_select() method
                        break;
                default:
                         System.out.println("parameter must be one of 'resume','pause' or 'query'");
                        }
        }
}


Executing the Java code (airgap.class).

Now if you take the snippets above and put them in a file named airgap.java, you can compile them into a class file.

javac airgap.java
This creates a class file airgap.class

In order to connect to my oracle database, I downloaded the jdbc driver.

"ojdbc8.jar"

Now I can execute it with the 3 parameters 

$ java -Djava.security.egd=file:/dev/../dev/urandom -cp ojdbc8.jar:. airgap pause
replication server 'replairgap' paused

$ java -Djava.security.egd=file:/dev/../dev/urandom -cp ojdbc8.jar:. airgap resume
replication server 'replairgap' resumed

$ java -Djava.security.egd=file:/dev/../dev/urandom -cp ojdbc8.jar:. airgap query
Replication Server,Tasks Queued,Total GB Queued,Minutes in Queue
ra_replication_config,4,95,58
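
If you automate the airgap window, the comma-separated line printed by the query command is easy to parse. Below is a minimal sketch that reads one output line and decides whether the queue is current enough to pause replication. The class QueueStatus, the method okToPause, and the idea of a minutes threshold are my assumptions for illustration; pick whatever rule fits your environment.

```java
// Sketch: parse one line of "airgap query" output and apply a simple pause rule.
class QueueStatus {
    final String replicationServer;
    final int tasksQueued;
    final int totalGbQueued;
    final int minutesInQueue;

    QueueStatus(String replicationServer, int tasksQueued, int totalGbQueued, int minutesInQueue) {
        this.replicationServer = replicationServer;
        this.tasksQueued = tasksQueued;
        this.totalGbQueued = totalGbQueued;
        this.minutesInQueue = minutesInQueue;
    }

    // Expects: name,tasks,gb,minutes (the data line, not the header line).
    static QueueStatus parse(String csvLine) {
        String[] f = csvLine.trim().split(",");
        if (f.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + csvLine);
        }
        return new QueueStatus(f[0], Integer.parseInt(f[1]),
                               Integer.parseInt(f[2]), Integer.parseInt(f[3]));
    }

    // Hypothetical rule: pause only when the oldest queued piece is recent enough.
    boolean okToPause(int maxMinutesInQueue) {
        return minutesInQueue <= maxMinutesInQueue;
    }

    public static void main(String[] args) {
        QueueStatus status = parse("ra_replication_config,4,95,58");
        System.out.println("ok to pause with a 5 minute limit: " + status.okToPause(5));
    }
}
```

A threshold on minutes is used rather than waiting for zero tasks because, with real-time redo, the queue may never be completely empty.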


It's that easy to create a simple java program that can manage your replication server from within an Airgap.