
Wednesday, September 23, 2020

ZDLRA, Real-time redo and compression

 In this post I will go through what happens to archive logs sent to the ZDLRA through real-time redo.


The most common way to send archivelog backups to a ZDLRA is through real-time redo.

In this method the ZDLRA is treated just like a standby database destination.

The main difference when sending logs to the ZDLRA is that the redo must be sent as the VPC (virtual private catalog) user that is registered to send backups, which is done by setting REDO_TRANSPORT_USER to that account.

The VPC user ID and password are stored in a wallet, and that same wallet is referenced in the RMAN channel configuration parameter.
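As a rough sketch of how those pieces fit together (the wallet path, credential alias, and VPC user name below are placeholders I made up for illustration, not values from this post):

-- Redo must be shipped as the VPC user rather than SYS (placeholder user name RAVPC1):
SQL> ALTER SYSTEM SET REDO_TRANSPORT_USER=RAVPC1 SCOPE=BOTH SID='*';

-- The same wallet holding the VPC credentials is referenced in the RMAN channel configuration
-- (placeholder wallet location and credential alias):
RMAN> CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS
      'SBT_LIBRARY=/u01/app/oracle/lib/libra.so,
       ENV=(RA_WALLET=''location=file:/u01/app/oracle/wallet credential_alias=zdlra-scan:1521/zdlra:dedicated'')';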

There is a great explanation of most of this from my colleague Fernando Simon and you can find it here.

ZDLRA, Real-Time Redo and Zero RPO


What I wanted to go through is the process of sending the logs (real-time), and the process of storing the logs on the ZDLRA.

The first thing to understand is the steps in the process of turning real-time redo into RMAN backupsets.


Step 1 - The redo is captured in real time by the ZDLRA through the use of "shadow logs". Think of "shadow logs" as standby redo logs that are created for each database, and for each redo log that is being captured.  Just like standby redo logs, these are full-size logs. To give you an example, let's say there are 6 databases sending real-time redo to the ZDLRA, and 3 of these are 2-node RAC clusters.  Each database has a redo log size of 20 GB.

On the ZDLRA, these shadow logs are mirrored (to disk) and use storage that is included in the USAGE number for the database.  In my example there will be 9 shadow logs of 20 GB each (one per instance: 3 single-instance databases plus 3 two-node RAC clusters).



Step 2 - When a log switch occurs, a task called BACKUP_ARCH is created. This task is responsible for taking the "shadow log" and turning it into an RMAN backupset containing the archive log.

The RMAN backupset can be compressed based on the protection policy that the database is a member of (it uses BASIC by default, so please change it).

One of the advantages of the ZDLRA is that the compression license is NOT needed to use other degrees of compression.

The suggestions I would make are:

TDE Databases - Put ALL TDE databases in their own policy and set compression to NONE.  TDE archive logs will not compress, and attempting to compress them just adds overhead.

NON-TDE databases - Use LOW compression. This will give you the best combination of compression ratio and elapsed time.
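For reference, the log compression setting lives in the protection policy on the ZDLRA itself. A sketch of changing it is below; the policy name is a placeholder, and the log_compression_algorithm parameter name is from memory, so verify the exact DBMS_RA.UPDATE_PROTECTION_POLICY signature in the Administration Guide for your release.

-- Run as RASYS on the ZDLRA. 'GOLD_NOTDE' is a placeholder policy name,
-- and the parameter name below is an assumption -- check the admin guide.
BEGIN
  DBMS_RA.UPDATE_PROTECTION_POLICY(
    protection_policy_name    => 'GOLD_NOTDE',
    log_compression_algorithm => 'LOW');
END;
/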


Now let's take a look at the tasks to see what I am talking about.


Below is a snippet from the currently running tasks (taken from a SAR report).

TASK_TYPE                 PRIORITY  STATE            CURRENT_COUNT  LAST_EXECUTE_TIME     WORK_TYPE    MIN_CREATION
----------------------  ----------  ---------------  -------------  --------------------  -----------  ------------
BACKUP_ARCH                    120  RUNNING                      7  03-OCT-2019 14:49:08  Work         03-OCT-2019

I can see that there are currently 7 redo logs that have switched and are awaiting processing into backupsets. This number should always be very small.
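If you want to spot-check this outside of a SAR report, something like the query below should get you close. I am assuming the RASYS.RA_TASK view and the TASK_TYPE/STATE columns that the report output above appears to be built on, so adjust it to the views in your release.

-- Run as RASYS on the ZDLRA; view and column names assumed from the SAR output above.
SELECT task_type, state, COUNT(*) AS current_count
  FROM ra_task
 WHERE task_type = 'BACKUP_ARCH'
   AND state     = 'RUNNING'
 GROUP BY task_type, state;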

Below is a snippet from the tasks executed in the last 24 hours (also from a SAR report).

TASK_TYPE               STATE                   CNT     CREATED  MIN_COMPLETION_TIME     MAX_COMPLETION_TIME     OLD_CREATION_TIME
----------------------  ---------------  ----------  ----------  ----------------------  ----------------------  ----------------------
BACKUP_ARCH             COMPLETED             9,591       9,580  02-OCT-2019 18:50:35    03-OCT-2019 14:50:28    02-OCT-2019 18:49:49


This is telling me that there were 9,591 log switches on all my protected databases in the last 24 hours.

From a compression standpoint, PLEASE at least change the current compression setting in your policies and use these recommendations:

TDE - No compression
No TDE - LOW compression.

I point out in my last post why this is so important to get right.




Monday, September 21, 2020

ZDLRA, archivelog log backups and compression

 In this post I will go through what happens with Archive log Backupsets sent to the ZDLRA through log sweeps.


When you implement ZDLRA you have 2 choices in backing up archive logs.

1) Use real-time redo transport (RRT) which is the same mechanism that is used to send archive logs to a standby database.

2) Use traditional log sweeps (RMAN) that pick up the archive logs and send them to the ZDLRA as backupsets.

Today I am going to go through the second option, using RMAN log sweeps.

Before I go into detail please refer to this MOS note to ensure you understand best practice for backing up a database to the ZDLRA.

RMAN best practice recommendations for backing up to the Recovery Appliance (Doc ID 2176686.1)

As of writing this post, the best practice is

backup device type sbt cumulative incremental level 1 filesperset 1 section size 64g database plus archivelog filesperset 32 not backed up; 

When you execute the best practice command, there are 2 pieces to this backup script.

Database Backup - The best practice is filesperset=1 and section size 64G. This ensures that a large datafile backup (big file) is broken up into pieces, and each backup piece contains only a single datafile. This allows the virtualization process to start as soon as each backup piece is received

Archivelog Backup - Best practice is to use filesperset=32 and only backup archivelogs that have not been backed up.

Now to walk through the archive log backup process:

RMAN will create a backupset of 32 archive logs.  This backupset will be sent to the ZDLRA (through the libra.so library) and will be written to physical disk on the ZDLRA.  The RMAN catalog on the ZDLRA will be immediately updated with the location of the backupset.

Since there is no processing done on the ZDLRA once received (beyond what the RMAN client does), the file is written "as is" on the ZDLRA.

So why do I point this out?  As you may know, the ZDLRA compresses datafile backups it receives, but it does not compress archivelog backupsets that arrive through RMAN. If you want an archivelog backupset that came to the ZDLRA through an RMAN log sweep to be compressed, you must perform the compression through RMAN before sending the archive logs.
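For example, a log sweep that sends compressed archive log backupsets (just a one-line sketch here; the full scripts I recommend are at the end of this post) looks like this:

RMAN> backup as compressed backupset filesperset 32 archivelog all not backed up delete input;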

There are a few items to think about before you rush into immediately compressing archive logs.

The first (and probably most important to your company) is that RMAN compression, other than BASIC (which is NOT recommended), requires the ACO (Advanced Compression Option) license.  If the databases you support are NOT licensed for ACO, then you should stop right here and consider using real-time redo.  Real-time redo can use all levels of compression without ACO because the compression is done on the ZDLRA. This will be my next blog post.

#1 - ACO is required for RMAN compression. Use real-time redo to compress on the ZDLRA without the ACO license

The second thing to think about is what level of compression.  Below are some example compression ratios AND timings that have been achieved, to give you an idea of the differences. Of course everyone's data is different, so your mileage could vary.


BASIC - The elapsed time is 5x longer than it is for NOCOMP. I would absolutely not recommend using BASIC compression.

LOW - The elapsed time was actually less than NOCOMP, most likely due to sending less traffic. The compression ratio was roughly 2:1, giving a great balance of similar execution time and reasonable compression.

MEDIUM - The elapsed time was triple (3x) that of LOW or NOCOMP. The compression ratio was slightly better, but not significant.

HIGH - The elapsed time was 24x longer than it is for NOCOMP, and the compression ratio was only slightly better. I would absolutely not recommend using HIGH compression.

#2 - LOW compression offers the best balance between elapsed time, and compression ratio.

While I point out that compression of archive logs is a good thing, there is a BIG CAVEAT to this. The ZDLRA has its own compression of datafile backups.  The ZDLRA compresses each individual block, NOT the backupset. Because of this, RMAN compression of datafile backups is not recommended, and if TDE is implemented it will cause backups not to virtualize.  The 2 items are:

  • The ZDLRA will uncompress the RMAN backupset and recompress the blocks once virtualized.
  • TDE data will not be virtualized since RMAN compression re-encrypts the backupset.

#3 - DO NOT compress datafile backups.

The 4th item associated with the compression of archive log backupsets is replication. The replication of archivelogs on the ZDLRA is a "cascade" of backupsets.  The backupset containing the archive logs is sent to the downstream "as-is".  If you compress the archive logs with RMAN, then they are replicated compressed. The compressed backupsets not only use less network traffic when replicating, they are also stored compressed on the downstream.

#4 - Compression of archive logs means less network traffic with replication.

The 5th item associated with the compression of archive logs is validation on the ZDLRA. Compression of archive logs comes with a slight cost, and this is one of the trade-offs.  The ZDLRA (as you might know) does a "restore validate" of all backups on the ZDLRA on a regular basis (typically once a week).  In order to validate archivelog backupsets, they need to be uncompressed. The uncompression of archivelog backupsets uses CPU on the ZDLRA, and the higher the compression, the greater the overhead of this process. Believe it or not, weekly validation is one of the most intensive tasks performed on the ZDLRA.  Using LOW compression has minimal impact on CPU during validation and is recommended, unless space is at a premium and MEDIUM compression can be tolerated.

NOTE: This can be monitored in the SAR report by looking at the VALIDATE tasks. You should see VALIDATE tasks completing, and when looking at executing tasks, the MIN_CREATION date should be within a day or 2 of when the SAR report was run.  If the MIN_CREATION date is more than a few days old, VALIDATE tasks are not keeping up, and implementing compression will exacerbate the situation.

#5 - Validation requires uncompressing archivelog backupsets, so be careful of too high a level of compression.

The final item associated with the compression of archive logs is the recovery of the database using archivelog backupsets.  During a recovery operation, any archivelogs restored through RMAN will have to be uncompressed. This uncompression may affect recovery time. LOW gives the best tradeoff since the elapsed time to uncompress is minimal.  If the network is saturated, restoring compressed archivelogs (which are typically 50% the size) may actually help with recovery time.

#6 - The DB host will have to uncompress archivelog backupsets during recovery. This may affect recovery time.

Now the question is: how do I put this together to get LOW compression of archive logs AND not compress datafiles?

This is how it can be done.


1) Enable RMAN LOW compression option.
RMAN> CONFIGURE COMPRESSION ALGORITHM 'LOW';


2) Ensure that compressed backupsets are NOT used by default
RMAN> CONFIGURE DEVICE TYPE 'SBT_TAPE' BACKUP TYPE TO BACKUPSET;

3) Daily incremental level 1 Backups.

run
{
backup as compressed backupset filesperset 8 archivelog all not backed up delete input;
backup as backupset cumulative incremental level 1 filesperset 1 section size 128G database;
backup as compressed backupset filesperset 8 archivelog all not backed up delete input;
}

4) Periodic log sweep Backups.

run
{
backup as compressed backupset filesperset 8 archivelog all not backed up delete input;
}


I am hoping this gives you everything you need to know about using RMAN log sweeps with the ZDLRA and you can decide if you want to use compression of archivelogs during those sweeps.






Thursday, July 16, 2020

ZDLRA and TDE wallet location - Part 2

TDE and SEPS security - how do I get there?
If you read my last blog post on TDE and SEPS security you might be asking yourself, how do I get there ?

Many customers use the default location for the TDE wallet (because they are new to TDE) and find that the default location causes conflicts with other Oracle features.

The basic question around this would be:

"All my TDE wallets are in the default location of $ORACLE_HOME/admin/DB_UNIQUE_NAME/wallet
                  or
$ORACLE_BASE/admin/DB_UNIQUE_NAME/wallet,
and I have multiple databases sharing the same $ORACLE_HOME location.
How do I get to a dedicated location for TDE?"

The challenge, especially if you want to use WALLET_LOCATION (which the ZDLRA requires for real-time redo) is how to get from the default to a dedicated location.
The issue is that WALLET_LOCATION overrides the default location, unless a dedicated TDE wallet location is specified.

First -- the SQLNET.ORA file is ONLY read by the database at startup, so any changes made to the sqlnet.ora file take effect when a database instance bounces.  You do want to be careful with the coordination, however, because a database instance can bounce at any time for any number of reasons, so plan carefully.

Now let's start with where to put the TDE wallet files.  There are many options:

1) Leave the wallet files within the $ORACLE_HOME directory using the $ORACLE_SID. 
     PROS - This is less disruptive since it uses a variable already set
     CONS - Wallets have to be moved to a new location with an out-of-place upgrade.
                   You need to copy the wallet to this new location when implementing.
                    In a multi-node RAC cluster the location is different on each node

    STEPS

  • For each database sharing the $ORACLE_HOME ensure there is a wallet subdirectory created on each node for every instance.
  • Copy the wallet files to the appropriate subdirectory for each node and for each instance
  • Update the SQLNET.ORA file to point to $ORACLE_HOME/admin/$ORACLE_SID/tde_wallet
2) Leave the wallet files within the original location in $ORACLE_HOME that uses the $DB_UNIQUE_NAME.
     PROS - You don't have to move the wallet files
     CONS - You need to set a new variable
                    Wallets have to be moved to a new location with an out-of-place upgrade.

    STEPS
  • For ALL databases sharing the same $ORACLE_HOME ensure that the variable $DB_UNIQUE_NAME is set through srvctl (if available). This ensures all nodes in a RAC cluster have the variable set.
  • Ensure all login scripts on all nodes have the variable $DB_UNIQUE_NAME set
  • Update the SQLNET.ORA file to point to the $ORACLE_HOME/admin/$DB_UNIQUE_NAME/wallet
3) Leave (or move) the wallet files within the $ORACLE_BASE directory using the $ORACLE_SID.  

     PROS - This is less disruptive since it uses a variable already set
     CONS - Wallets have to be moved to a new location with an out-of-place upgrade.
                   You need to copy the wallet to this new location when implementing.
                    In a multi-node RAC cluster the location is different on each node

    STEPS

  • For each database sharing the $ORACLE_HOME ensure there is a wallet subdirectory created on each node for every instance within the $ORACLE_BASE/admin directory (unless this was already the default)
  • If necessary, copy the wallet files to the appropriate subdirectory for each node and for each instance
  • Update the SQLNET.ORA file to point to $ORACLE_BASE/admin/$ORACLE_SID/wallet
4) Migrate to $ORACLE_BASE and use $DB_UNIQUE_NAME
     PROS - Once set, you can leave the wallets after out-of-place upgrades
     CONS -  You need to copy the wallet to this new location when implementing.
                    You need to set a variable to be used

    STEPS

  • For each database sharing the $ORACLE_HOME ensure there is a wallet subdirectory created on each node for every $DB_UNIQUE_NAME within the $ORACLE_BASE/admin directory (unless this was already the default)
  • Copy the wallet files to the appropriate subdirectory for each node and for each instance
  • For ALL databases sharing the same $ORACLE_HOME ensure that the variable $DB_UNIQUE_NAME is set through srvctl (if available). This ensures all nodes in a RAC cluster have the variable set.
  • Ensure all login scripts on all nodes have the variable $DB_UNIQUE_NAME set
  • Update the SQLNET.ORA file to point to $ORACLE_BASE/admin/$DB_UNIQUE_NAME/tde_wallet

5) Migrate to ASM (Not available in 11.2) and use $DB_UNIQUE_NAME
     PROS - Once set, you can leave the wallets after out-of-place upgrades
                   You now have a central location for a RAC cluster
     CONS -  You need to copy the wallet to this new location when implementing.
                    You need to set a variable to be used

    STEPS

  • For each database sharing the $ORACLE_HOME ensure there is a wallet subdirectory created in ASM for every $DB_UNIQUE_NAME 
  • Copy the wallet files to the appropriate subdirectory for each database
  • For ALL databases sharing the same $ORACLE_HOME ensure that the variable $DB_UNIQUE_NAME is set through srvctl (if available). This ensures all nodes in a RAC cluster have the variable set.
  • Ensure all login scripts on all nodes have the variable $DB_UNIQUE_NAME set
  • Update the SQLNET.ORA file to point to +DISKGROUP/$DB_UNIQUE_NAME/tde_wallet

It's your choice which path to take.  For me, the best option (if ASM isn't an option) is to put the TDE wallets within $ORACLE_BASE/admin/$DB_UNIQUE_NAME/tde_wallet.  That way, with each out-of-place upgrade I don't have to do anything with the wallet. As long as the sqlnet.ora points to the $ORACLE_BASE location there won't be any changes.
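For reference, a sketch of the sqlnet.ora entry for that preferred layout (assuming $ORACLE_BASE is /u01/app/oracle and the $DB_UNIQUE_NAME variable is set as described above):

ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/admin/$DB_UNIQUE_NAME/tde_wallet)))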


NOTE: for 18c and above, just migrate to WALLET_ROOT, which allows you to set the value for each database individually.

Tuesday, July 14, 2020

ZDLRA and TDE wallet location

TDE and SEPS security



I am seeing TDE used more and more at customers as security concerns increase.
This blog post will go through configuring TDE and SEPS security (which ZDLRA uses) together.
If OID is used also, this post talks about how to combine OID and SEPS.

First off, the solution depends on the version of Oracle you are using.  Depending on your configuration, SEPS security and TDE may use the same wallet location. This is NOT recommended.
Below is the hierarchy of where Oracle expects the TDE wallet to be; as soon as it finds a setting, it stops looking:

ENCRYPTION_WALLET_LOCATION
         WALLET_LOCATION
                    $ORACLE_HOME/admin/$DB_UNIQUE_NAME/wallet
                              $ORACLE_BASE/admin/$DB_UNIQUE_NAME/wallet

**NOTE: unless ENCRYPTION_WALLET_LOCATION is already set,
                 setting the WALLET_LOCATION will break TDE

When using SEPS security it is critical that you properly set the TDE wallet location first.

11.2

First let's talk through 11.2 and the recommendation for TDE encryption wallet. This is the most basic configuration setting.

Best practice is to set the ENCRYPTION_WALLET_LOCATION in the sqlnet.ora.
If there are multiple databases sharing the same $ORACLE_HOME (multi-homing), then the location needs to use a variable.

Single home example.



ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/tde_wallet)))


Multi-Home examples



Example 1 - using the $ORACLE_SID variable for the location



ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/admin/$ORACLE_SID/tde_wallet)))

Example 2 - using a new variable


First, ensure that the variable is set when srvctl is used to restart the databases.

srvctl setenv database -db database_name -env "DB_UNIQUE_NAME=database_name"

Second, ensure the variable is set in any scripts and when logging into the host.

export DB_UNIQUE_NAME=database_name

Then use this variable within the sqlnet.ora

ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/admin/$DB_UNIQUE_NAME/tde_wallet)))

** NOTE: you need to create the directories for all databases sharing that same $ORACLE_HOME even if they don't use TDE or SEPS.


12.1/12.2

The configuration for 12.1 is similar to 11.2 with one exception: 12.1 allows you to use ASM for the location of the wallet in a RAC environment.

Here are the examples of ASM based on the 11.2 information.

Single home example.


ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=+DATA/tde_wallet)))

Multi-Home example


ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=+DATA/$DB_UNIQUE_NAME/tde_wallet)))


18c+

Oracle version 18c adds more functionality for the TDE wallet.

18c introduces a new init parameter for TDE called "WALLET_ROOT". In fact, ENCRYPTION_WALLET_LOCATION is deprecated starting in 18c (see the 18c documentation).





WALLET_ROOT is set to the starting location for the TDE wallets; it is used as the base directory for the CDB wallet, with subdirectories for the PDB wallets.

WALLET_ROOT can either be a local file system (or NAS).

          Example
                           WALLET_ROOT=wallet-root-directory-path

It can also be set to an ASM location

         Example
                           WALLET_ROOT=+disk-group-name/db-unique-name
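A minimal sketch of actually setting it is below. The '+DATA/MYDB' value is just a placeholder; WALLET_ROOT is a static parameter, so it takes an spfile-only change and a restart, and the companion TDE_CONFIGURATION parameter then tells the database to use a file-based keystore under that root (the database looks for the TDE wallet under <WALLET_ROOT>/tde).

SQL> ALTER SYSTEM SET WALLET_ROOT='+DATA/MYDB' SCOPE=SPFILE;
-- bounce the instance, then:
SQL> ALTER SYSTEM SET TDE_CONFIGURATION='KEYSTORE_CONFIGURATION=FILE' SCOPE=BOTH;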

SUMMARY : When implementing the ZDLRA (which uses SEPS security) with an existing TDE implementation, it is critical to ensure that TDE was configured using best practices.  If best practices were not followed, configuring the WALLET_LOCATION may cause wallet issues with databases.





Monday, October 21, 2019

Oracle Active Dataguard - More than just a read-only copy.


NOTE: I updated this on 1/16/20 with additional information.

Anyone that knows me, knows I'm a stickler for properly explaining technical topics and ensuring I use the correct term.

Dataguard vs. Active Dataguard is a topic that drives me crazy sometimes, as people tend to use the two terms interchangeably.

The differences appear to be subtle on the surface, but there are some major differences (other than the obvious) that you might not know about.

What I am hoping you get out of this blog post is: if you have the license for Active Dataguard, then turn on portions of it, even if the application isn't using it for queries. There are more benefits to Active Dataguard than just having a read-only database copy.


Dataguard  -  First let's talk about normal Dataguard.


Basically, a Dataguard database is an exact copy of the primary/protected database that is constantly in recovery.  Redo log information (used for recovery) is automatically sent from the primary/protected database by specifying the name/location of the Dataguard copy as an additional destination for the redo logs.
This is a simple explanation, but Dataguard concepts are that simple.

Dataguard - What Dataguard does for me to protect my primary database.

Dataguard is used for a number of purposes.  The most common of which is to have a "Disaster Recovery" copy of the primary database available in the event of the loss of the primary copy.
For a "Disaster Recovery" copy, best practice is to have this copy be geographically isolated from the primary.  This ensures a disaster to the primary datacenter (flood, earthquake, etc.) doesn't affect the Dataguard copy.

When moving the application to use the Dataguard copy, there are 2 different ways to bring up the dataguard copy as the primary.

1) Switchover - In the case of a switchover, all transactions on the primary are applied to the Dataguard copy before it is opened.  This ensures no data loss. This isn't always possible since transactions on the primary database need to be "drained" and transferred to the Dataguard copy.

2) Failover - In the case of a failover, it is not possible to "drain" transactions from the primary database.  All outstanding transactions that have been received from the primary database are applied on the Dataguard copy, it is opened with "resetlogs", and there is data loss.

Other uses for Dataguard

  • Dataguard can be used to create an up-to-date copy of production for testing/QA, etc.
  • A Dataguard database can be opened for write (snapshot standby) to test code releases, etc., and then flashed back to being a Dataguard copy again.
  • A Dataguard database can have a delay in log apply, essentially providing a time gap, allowing data to be recovered within the time gap in case of user error.

Active Dataguard Option (licensable option) - 

This option now contains many features.

Original feature  -- the Dataguard copy is open as read-only.

  That is how simple it is to use Active Dataguard (if you have the license).  Before starting the apply of redo log information, the database is opened read-only.

  The main advantage of Active Dataguard is that you can now use the DR copy of the database for queries.  This not only offloads activity from the primary to the mostly idle Dataguard copy, it also ensures that there is a readable copy of the data even while the primary is not available (patching, etc.).

What I wanted to go through in this post is all the other features that come with Active Dataguard that you might not realize are included.

Additional Active Dataguard Features.

First I am going to separate the features into 2 requirements.

Features that are available when the database is in mount mode (read-only not required).


  1. Far Sync -  Far Sync allows you to create a shell instance that is used to capture real-time redo from a database and send it on to a standby database.  You can have multiple Far Sync instances for redundancy, and they are typically local to the primary to provide a synchronous destination with very little network lag.
  2. BCT (Block Change Tracking). You can create a BCT file on your standby database, and it will be used for incremental backups. 
  3. Real-time redo - This allows you to cascade redo from the standby to a destination in real time, in the same manner that the primary DB does with standby redo logs.
Features that are available when the database is in read-only mode
  1. Automatic Block repair -  Corrupted blocks on the primary are repaired by automatically applying the "clean" block from the Dataguard copy.
  2. DML redirection.  Occasional updates to the dataguard copy are redirected to the primary database.
  3. Preserve Buffer cache during Role change - When a Dataguard database becomes the primary, the mode change is done without bouncing the database, thus preserving what is in memory.

Additional Active Dataguard Features affecting ZDLRA.



I wanted to call out 2 features specific to Active Dataguard that have an effect on ZDLRA recoverability.

  1. Block Change Tracking File - With Active Dataguard, the BCT file gets updated with the changes and is used for incremental backups. This can be extremely important when using a ZDLRA to back up your Dataguard database.  With the ZDLRA, only incremental backups are performed.  Without an active BCT file, every incremental backup will scan all database blocks.  If you have the license for Active Dataguard, be sure to create a BCT file anyway and ensure it is used (see the example after this list).
  2. Far Sync Support -  You are probably wondering what this has to do with ZDLRA.  The Far Sync support (starting with 12c) is more than just support for Far Sync.  This feature changes when the applied updates are written to the destinations. Prior to 12c, changes were written to the destinations AFTER the log switch.  This meant that downstream Dataguard databases and the ZDLRA only got full archive logs.  With 12c, an Active Dataguard standby, just like the primary, sends changes to its destinations from memory as they are applied.  This can make a big difference in the recovery point objective (RPO) of a Dataguard database backed up to the ZDLRA.  This feature, using real-time redo from a standby database to a ZDLRA, is allowed as an exception under licensing.
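Enabling BCT on the standby is a one-liner. The '+DATA' destination below is a placeholder (with OMF you can point it at a disk group, or omit the clause if db_create_file_dest is set); remember that enabling it on a standby is what requires the Active Dataguard license.

SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE '+DATA';
-- or, if db_create_file_dest is set:
SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING;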

Key takeaways.


Active Dataguard has a couple of great features that the ZDLRA takes advantage of. If you have the license for it, you should turn on BCT, even if the application isn't using any of the other features.


Sunday, April 1, 2018

ZDLRA "Store and Forward " feature

Most people didn't notice, but there was a new feature added to the ZDLRA called "store and forward".

Documentation on how to implement it is in the "ZDLRA Administration Guide" under the topic of implementing high availability strategies. 

Within that you will find the section on
"Managing Temporary Outages with a Backup and Redo Failover Strategy". This section describes what I have called “store and forward”.

ZDLRA offers the customer the ability to send backups (Redo logs and Level 1 backups) to an alternate ZDLRA location.  This provides an efficient HA solution for this information if the primary ZDLRA can't be reached.


Now, in order to explain how Store and Forward works, first let's take a look at the architecture.

  • We have a database we are backing up called "PROTDB"
  • We have 2 different ZDLRAs. Store and Forward requires a minimum of 2 ZDLRA appliances in a datacenter. In this case some of the databases have one ZDLRA as their backup target and the remaining databases have the other ZDLRA as their backup target.
  • For databases backing up to ZDLRA #1 "RA01" will be the preferred ZDLRA that their Level 1 backups and the redo log stream will go to.  ZDLRA #2 "RA02" will be the alternate ZDLRA that Level 1 backups and the redo log stream will go to in the event of an outage communicating with preferred ZDLRA "RA01".
  • The reverse will be true for databases backing up to ZDLRA #2 with the alternate being ZDLRA #1

NOTE: A database has to be unique within a ZDLRA. What this means is that the alternate ZDLRA cannot already be used for replication or to back up a Dataguard copy of the same database.



Now that we have defined the architecture let's go through the pieces that make up the store-and-forward methodology.

First however I will define what I mean by "Upstream" and "downstream".

UPSTREAM - This is the ZDLRA that sends replicated backup copies.  

DOWNSTREAM - This is the ZDLRA that receives the replicated backup copies.

A ZDLRA can act as both an UPSTREAM and a DOWNSTREAM. This is common when a customer has 2 active datacenters.  Each ZDLRA acts as both an Upstream (receiving backups directly) and as a Downstream (receiving replicated backups).

In the store-and-forward methodology, backups are sent to the Downstream as the primary and to the Upstream as the alternate.  This allows backups to replicate from the Alternate (Upstream) to the Primary (Downstream).  This will be explained as you walk through the flow.


Configuring Store-and-Forward




1) Configure "RA01" to be the down stream replicated pair of RA02. 
2) Ensure that the protected database ("PROTDB") is added to policies on both RAs (this process is described in the 12.2 admin guide)
3) Ensure "PROTDB"  has a wallet entries for both RAs, and that it the database is properly registered in both RMAN catalogs (using the admin guide).
3) Configure real-time redo apply using "RA01" as the primary RA and "RA02" as the alternate.

NOTE: Real-time redo isn't mandatory to use, but it makes the switching over of redo a lot easier. I will show how the environment looks with real-time redo.  If you are manually sending archive logs and level 0 backups, the flow will be similar.


Real-time Redo flow


First let's take a look at the configuration for real-time redo.

Below is the configuration for a database with both a primary and an alternate ZDLRA. Working with an alternate destination is well described in this blog post.



Primary ZDLRA (RA01) configuration


LOG_ARCHIVE_DEST_3='SERVICE=<"RA01" string from wallet> VALID_FOR=(ALL_LOGFILES, ALL_ROLES) ASYNC DB_UNIQUE_NAME=<"RA01" ZDLRA DB> NOREOPEN ALTERNATE=LOG_ARCHIVE_DEST_4';

log_archive_dest_state_3=enable;


Alternate ZDLRA (RA02) configuration


LOG_ARCHIVE_DEST_4='SERVICE=<"RA02" string from wallet> VALID_FOR=(ALL_LOGFILES, ALL_ROLES) ASYNC DB_UNIQUE_NAME=<"RA02" ZDLRA DB>';
log_archive_dest_state_4=alternate;


Below is what the flow looks like.

Redo log traffic and backups are sent from "PROTDB" to "RA01".  "RA02" (since it is the upstream pair of "RA01") is aware of the backups in its RMAN catalog.





Now let's take a look at the status of the destinations


SQL> select dest_id, dest_name, status from 
v$archive_dest_status where status <> 'INACTIVE';
DEST_ID DEST_NAME STATUS
---------- --------------------- ---------
 1 LOG_ARCHIVE_DEST_3 VALID
 2 LOG_ARCHIVE_DEST_4 UNKNOWN

You can see that the redo logs are sent to DEST_3 ("RA01") and DEST_4 ("RA02") is not active.



Now let's see what happens when "RA01" can't be reached.




SQL> select dest_id, dest_name, status from 
v$archive_dest_status where status <> 'INACTIVE';
DEST_ID DEST_NAME STATUS
---------- --------------------- ---------
 1 LOG_ARCHIVE_DEST_3 DISABLED
 2 LOG_ARCHIVE_DEST_4 VALID

After the second failed attempt, the original destination is marked as disabled, and the alternate is valid.

Below you can see that the redo logs, and the backups (Level 1) are being sent to "RA02".

"PROTDB" connects to the catalog on "RA02" which is aware of the previous backups and synchronizes its backup information with the control file.

This allows the next Level 1 incremental backup to be aware of the most current virtual full backup on "RA01".

This also allows the redo log stream to continue where it left off with "RA01".  The RMAN catalog on "RA02" is aware of all redo log backups on "RA01" and is able to continue with the next log.
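From the protected database's point of view, this is just an ordinary catalog resync against the alternate's catalog. A sketch (the catalog user and TNS alias below are placeholders) would be:

RMAN> connect catalog ravpc1@ra02
RMAN> resync catalog;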



Now let's see what happens when "RA01" becomes available.


When "RA01" becomes available, you start the replication flow downstream. This will allow all the backups (redo and/or Level 1) to replicate to "RA01", be applied to the RA, and update the RMAN catalog.

Once this is complete, "RA01" will have virtualized any backups, along with storing and cataloging all redo logs captured.



BUT, at this point the primary log destination is still disabled, so we need to re-enable it to move the redo log flow back.



SQL> alter system set log_archive_dest_state_3=enable;
System altered.
SQL> alter system set log_archive_dest_state_4=alternate;
System altered.

Once this is complete, we are back to where we started.




That's it.

Store-and-forward is a great HA solution for capturing real-time redo log information to absorb any hiccups that may occur.