Validation of Oracle backups on ZDLRA is often one of the most overlooked features of the product. With the rise of ransomware, the question "how do I ensure that I have validated Oracle backups" is critical.

I know there are a lot of vendors out there that provide a great solution for most generic backups. But, as you probably know, Oracle Database backups are different from other system backups, and they present unique challenges, which include:

The backup of a large database consists of hundreds, if not thousands, of backup pieces, all of which are necessary to successfully restore the database.
Oracle Database backups won't contain "ransomware signatures" or any easy way of determining if the backup pieces are tainted.
Oracle Database backups are in a proprietary format that can only be validated by performing a "restore validate", which reads and validates the contents of Oracle Database backup pieces.

How ZDLRA provides superior validation

Backups land on flash during ingest

When backup pieces are sent to the ZDLRA during backup, they land on Flash Storage and are quarantined within the ZDLRA waiting to be validated.

Backup pieces are validated

The ZDLRA will then examine arriving backup pieces. The internal metadata is read and the contents of the backup pieces are validated block-by-block. This ensures that before storing the backup pieces, they are confirmed to be Oracle Database backup pieces, containing valid Oracle Blocks.

Backup pieces are stored and virtual full created

Once the backup piece is examined, and the metadata is read, the individual validated blocks are stored on disk compressed. The blocks are indexed, and a virtual full backup is built. The final step in the process is to update the RMAN catalog on the ZDLRA with an entry pointing to the virtual full.
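As a rough illustration of the quarantine, validate, then catalog flow described above, here is a minimal Python sketch. It is a toy model and not the ZDLRA's actual implementation: the CRC32 check is only a stand-in for Oracle's proprietary block validation, and the names (validate_piece, ingest, the catalog list and the quarantine set) are hypothetical.

```python
import zlib
from typing import List, Set, Tuple

# Toy block: (payload, expected CRC32). Real Oracle block checks are
# proprietary; CRC32 is only an assumed stand-in for this sketch.
Block = Tuple[bytes, int]


def validate_piece(blocks: List[Block]) -> bool:
    """Return True only if every block in the piece passes its check."""
    return all(zlib.crc32(payload) == crc for payload, crc in blocks)


def ingest(piece_name: str, blocks: List[Block],
           catalog: List[str], quarantine: Set[str]) -> bool:
    """Quarantine an arriving piece and catalog it only after validation."""
    quarantine.add(piece_name)          # every piece starts out quarantined
    if not validate_piece(blocks):
        return False                    # stays quarantined, never cataloged
    quarantine.discard(piece_name)
    catalog.append(piece_name)          # cataloging is the final step
    return True
```

The point of the ordering is that a piece which fails validation never reaches the catalog, so it can never be picked for a restore.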

Weekly validation for both block content, and restore continuity

On a weekly basis, all backups on the ZDLRA undergo a "restore validate", which confirms that all the backup pieces are valid, usable backup pieces. This is critical with an "incremental forever" strategy to ensure that unchanged blocks are valid. Along with checking the integrity of each backup piece, the ZDLRA also checks for "Restore Continuity". I know this is a term I made up. The idea is that whatever time/SCN you choose within the recovery window, the ZDLRA ensures that ALL backup pieces needed to recover are available. This is similar to performing a "restore preview" for all time periods to ensure that all backup pieces are available for recovery.
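To make the idea of restore continuity a little more concrete, here is a minimal Python sketch of a gap check over archived log backups. It only illustrates the concept and is not how the ZDLRA performs the check; the LogBackup record and the find_continuity_gaps function are hypothetical names, and the SCN values are made up.

```python
from typing import List, NamedTuple, Tuple


class LogBackup(NamedTuple):
    sequence: int
    first_scn: int  # first SCN covered by this log (inclusive)
    next_scn: int   # first SCN NOT covered by this log (exclusive)


def find_continuity_gaps(logs: List[LogBackup],
                         window_start: int,
                         window_end: int) -> List[Tuple[int, int]]:
    """Return the SCN ranges inside the window that no log backup covers."""
    gaps = []
    covered_to = window_start
    for log in sorted(logs, key=lambda entry: entry.first_scn):
        if log.next_scn <= covered_to:
            continue                      # entirely before what we still need
        if log.first_scn > covered_to:
            # Nothing covers the range from covered_to up to this log.
            gaps.append((covered_to, min(log.first_scn, window_end)))
        covered_to = max(covered_to, log.next_scn)
        if covered_to >= window_end:
            break
    if covered_to < window_end:
        gaps.append((covered_to, window_end))
    return gaps
```

An empty result means every SCN in the window can be reached; any returned range is a point in time that could not be recovered to.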

Validation during replication

Replication of backup pieces from one ZDLRA to another takes this process one step further.

Along with all the same validation that occurs when the ZDLRA receives backups from databases, the upstream ZDLRA also catalogs the replicated copy of the backup pieces.

ZDLRA in a Cyber vault

This is where all the pieces come together. The ZDLRA not only utilizes its validated, incremental forever strategy to keep replication traffic to a minimum, but it also ensures that backup pieces are validated PRIOR to cataloging them.

The ZDLRA has a number of advantages in a Cyber vault scenario:

Replication traffic is much smaller than most solutions which require a Weekly Full backup. The ZDLRA uses incremental forever.
Backup pieces are quarantined after arrival in the vault to ensure tainted backups are not included in restore plans. This process is similar to what other vendors do to check for ransomware. The ZDLRA goes one step further by using the proprietary knowledge of Oracle Blocks to ensure all backups, and the blocks within the backups, are valid.
Backups stored within the ZDLRA in the vault are validated on a weekly basis for both content, and continuity to ensure a restore will be successful.
The upstream sending the backup pieces catalogs what backups are in the vault, and can resend any backup pieces if necessary.

I hope this helps you understand better why the ZDLRA provides superior ransomware protection.

One of the key features of the ZDLRA is the ability to capture changes from the database "real-time" just like a standby database does. In this blog post I am going to demonstrate what is happening during this process so that you can get a better understanding of how it works.

Using the GIF above, I will explain what is happening, and then show the process with a demo.

The ZDLRA uses the same process as a standby database. In fact, if you look at the flow of the real-time redo, you will notice the redo blocks are sent to BOTH the local redo log files AND to the staging area on the ZDLRA. The staging area on the ZDLRA acts just like a standby redo log does on a standby database.

As the ZDLRA receives the REDO blocks from the protected database they are validated to ensure that they are valid Oracle Redo block information. This ensures that a man-in-the-middle attack does not change any of the backup information. The validation process also assures that if the database is attacked by ransomware (changing blocks), the redo received is not tainted.

The next part of the process is what happens when a LOG SWITCH occurs. As we all know, when a log switch occurs on a database instance, the contents of the redo log are written to an archive log. With real-time redo, this causes the contents of the redo staging area on the ZDLRA (picture a standby redo log) to become a backup set of an archive log. The RMAN catalog on the ZDLRA is then updated with the internal location of the backup set.
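The log switch behavior described above can be pictured with a small Python model. This is only a sketch of the concept, not the ZDLRA's internals: the RedoStagingModel class, its fields, and the "backupset-N" location strings are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RedoStagingModel:
    """Toy model of a redo staging area (hypothetical, for illustration)."""
    current_sequence: int
    staged_blocks: List[bytes] = field(default_factory=list)
    backup_sets: Dict[str, List[bytes]] = field(default_factory=dict)
    catalog: Dict[int, str] = field(default_factory=dict)

    def receive_redo(self, block: bytes) -> None:
        # Redo arrives in real time and accumulates in the staging area.
        self.staged_blocks.append(block)

    def log_switch(self) -> int:
        # On a log switch, the staged redo becomes an archive log backup set.
        sequence = self.current_sequence
        location = f"backupset-{sequence}"  # made-up internal location
        self.backup_sets[location] = self.staged_blocks
        # The catalog records where the backup set for this sequence lives.
        self.catalog[sequence] = location
        # Staging starts empty for the next sequence.
        self.staged_blocks = []
        self.current_sequence += 1
        return sequence
```

For example, after two calls to receive_redo on a model created with current_sequence=11, a call to log_switch returns 11, leaves the staging area empty, and adds a catalog entry for sequence 11.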

Log switch operation

I am going to go through a demo of what you see happen when this process occurs.

ZDLRA is configured as a redo destination

Below you can see that my database has a "Log archive destination" 3 configured. The destination itself is the database on the ZDLRA (zdl9). Also notice that the log information will be sent for ALL_ROLES, which means it is sent regardless of whether the database is a primary or a standby.

List backup of recent archive logs from RMAN catalog

Before I demonstrate what happens with the RMAN catalog, I am going to list out the current archive log backup. Below you see that the current archive log backed up to the ZDLRA has the "SEQUENCE #10".

Perform a log switch

As you see in the animation at the top of the post, when a log switch occurs, the contents of the redo log in the "redo staging area" are used to create an archive log backup that is stored and cataloged. I am going to perform a log switch to force this process.

List backup of archive logs from RMAN catalog

Now that the log switch occurred, you can see below that there is a new backup set created from the redo staging area.

There are a couple of interesting items to note when you look at the backup set created.

The backup of the archive log is compressed. As part of the policy on the ZDLRA you have the option to have the backup of the archive log compressed when it is created from the "staged redo". This does NOT require the ACO (Advanced Compression) license. The compressed archive log will be sent back to the DB compressed during a restore operation, and the DB host will uncompress it. This is the default option (standard compression) and I recommend changing it. If you decide to compress, then MEDIUM or LOW is recommended. Keep in mind that this may put more workload on the client to uncompress the backup sets, which may affect recovery times. NOTE: When using TDE, there will be little to no compression possible.
The TAG is automatically generated. By looking at the timestamp in the RMAN catalog information, you can see that the TAG is automatically generated using the timestamp to make it unique.
The handle begins with "$RSCN_". This is because the backup piece was generated by the ZDLRA itself, and the handles of these archive log backup sets begin with these characters.

Restore and Recovery using partial log information

Now I am going to demonstrate what happens when the database crashes, and there is no time for the database to perform a log switch.

List the active redo log and current SCN

Below you can see that my currently active redo log is sequence # 12. This is where I am going to begin my test.

Create a table

To demonstrate what happens when the database crashes I am going to create a new table. In the table I am going to store the current date, and the current SCN. Using the current SCN we will be able to determine the redo log that contains the table creation.

Abort the database

As you probably know, if I shut down the database gracefully, the DB will automatically clean out the redo logs and archive their contents. Because I want to demonstrate what happens with a crash, I am going to shut the database down with an ABORT to ensure the log switch doesn't occur. Then I will start the database in mount mode so I can look at the current redo log information.

Verify that the log switch did not occur

Next I am going to look at the REDO Log information and verify that my table creation (SCN 32908369) is still in the active redo log and did not get archived during the shutdown.

Restore the database

Next I am going to restore the database from backup.

Recover the database

This is where the magic occurs, so I am going to show what happens step by step.

Recover using archive logs on disk

The first thing the database does is use the current archive logs for recovery. You can see in the screenshot below that the database is recovered using archive logs on disk up to sequence #11 for thread 1. This contains all the archived changes for this thread, but does not include what is in the REDO log sequence #12. Sequence #12 contains the create table we are interested in.

Recover using partial redo log

This step is where the magic of the ZDLRA occurs. You can see from the screenshot below that the RMAN catalog on the ZDLRA returns the redo log information for Sequence #12 even though it was never archived. The ZDLRA was able to create an archive log backup from the partial contents it had in the Redo Staging area.
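The order in which redo is applied in this demo can be summarized with a short Python sketch. RMAN makes this selection itself; the plan_recovery function and the "disk" and "zdlra" labels are hypothetical and only model the outcome described above, where sequences available on disk are used first and the partial log comes from the ZDLRA.

```python
from typing import List, Set, Tuple


def plan_recovery(first_needed: int,
                  on_disk: Set[int],
                  in_zdlra_catalog: Set[int]) -> List[Tuple[int, str]]:
    """List (sequence, source) pairs in the order redo would be applied."""
    plan = []
    sequence = first_needed
    while True:
        if sequence in on_disk:
            plan.append((sequence, "disk"))    # archive log already on disk
        elif sequence in in_zdlra_catalog:
            plan.append((sequence, "zdlra"))   # backup set held by the ZDLRA
        else:
            break                              # no more redo available
        sequence += 1
    return plan
```

With sequences 10 and 11 on disk and sequences 10 through 12 known to the ZDLRA, the plan ends with sequence 12 coming from the ZDLRA, which matches the partial redo log case in this demo.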

Open the database and display table contents.

This is where it all comes together. Using the partial redo log information from Redo Log sequence #12, you can see that when the database is opened, the table creation transaction is indeed in the database even though the redo did not become an archive log.

Conclusion: I am hoping this post gives you a better idea of how Real-time redo works on the ZDLRA, and how it handles recovering transactions after a database crash.

Bryan's Oracle Blog

Thursday, May 11, 2023

ZDLRA Validation is your best protection against Ransomware