Monday, March 29, 2021

ZDRLA adds smart incremental to be even smarter.

 Recently version 19.1.1.2 of ZDLRA software was released, and one the features is something called "Smart Incremental".  I will walk through how this feature works, and help you understand why features like this are "ZDLRA Only".




I am going to start by walking through how incremental backups become "virtual full backups", and that will give you a better picture of how "smart incremental" is possible.

The most important thing to understand about these features is that the RMAN catalog itself is within the ZDLRA  AND the ZDLRA has the ability to update the RMAN catalog.

How does a normal backup strategy work ? 

That is probably the best place to start.  What DBAs typically do is perform a WFDI (Weekly Full Daily Incremental) backup.  To keep my example simple, I will use the following assumptions.
  • My database contains 3 datafile database. SYSTEM, SYSAUX, USERS, but I will only use the example of backing up datafile users.
  • Each of these 3 datafiles are 50 GB in size
  • I am only performing a differential backup which creates a backup containing the changes since the last backup (full OR incremental).
  • My database is in archivelog  *
* NOTE: With ZDLRA you can back up a nologging database, and still take advantage of virtual fulls. The database needs to be in a MOUNTED state when performing the incremental backup.

If placed in a table the backups for datafile USERS would look this. Checkpoint SCN is the current SCN number of the database at the start of the backup.



If I were to look at what is contained in the RMAN catalog (RC_BACKUP_DATAFILE), I would see the same backup information but I would see the SCN information 2 columns.
  • Incremental change # is the oldest SCN contained in the backupset. This is the starting SCN number of the previous backup, this backup is based on.
  • Checkpoint Change # is  starting SCN number of the backup.  Everything newer than this SCN (including this SCN) needs to be defuzzied.


Normal backup progression (differential)


When performing an incremental RMAN backup of a datafile, the first thing that RMAN does is decide which blocks needs to be backed up. Because you are performing an incremental backup,  you may be backing up all of the blocks, only some of the blocks, or even none of the blocks if the file has not changed.
This is a decision RMAN makes by querying the RMAN catalog entries (or the controlfile entries if you not using an RMAN catalog).

Now let's walk through this decision process.  Each RMAN incremental differential's starting SCN is based on the beginning SCN of the previous backup (except for the full).



By looking at the RMAN catalog (or controlfile), RMAN determines  which blocks need to be contained in each incremental backup.



Normal backup progression (cumulative differential)


Up to release 19.1.1.2, the recommendation was to perform a Cumulative Differential backup. The cumulative differential backup compares the starting SCN number of the last full backup to determine the starting point of the incremental backup (rather than the last incremental backup) .
The advantage of the cumulative over differential, is that a cumulative backups can be applied to the last full and take the place of applying multiple differential backups.  However, cumulative backups are bigger  every day that passes between full backups because they contain all blocks since the last full.

Below is what a cumulative schedule would look like and you can compare this to the differential above.
You can see that each cumulative backups starts with the Checkpoint SCN of the last full to ensure that all blocks changed since the full backup started are captured.



The RMAN catalog entries would look like this.




If you were astute, you would notice a few things happened with the cumulative differential vs the differential.
  • The backup size got bigger every day
  • The time it took to perform the incremental backup got longer
  • The range of SCNs contained in the incremental is larger for a cumulative backup.

ZDLRA backup progression (cumulative differential)

As  you most likely know, one the most important features of the ZDLRA is the ability to create a "virtual full" from an incremental backup.,

If we look at what happens with a cumulative differential (from above), I will fill in the virtual full RMAN catalog entries by shading them light green.

The process of performing backups on the ZDLRA is exactly the same as it is for the above cumulative, but the RMAN catalog looks like this.


What you will noticed by looking at this compared to the normal cumulative process that
  • For every cumulative incremental backup there is a matching virtual full backup  The Virtual full backup appears (from the newly inserted catalog entry) to have beeen taken at the same time, and the same starting SCN number as the cumulative incremental. Virtual full backups, and incremental backups match time, and SCN as catalog entries.
  • The size of the virtual full is 0 since it is virtual and does not take up any space.
  • The completion time for the cumulative incremental backup is the same as the differential backups.  Because the RMAN logic can see the virtual full entry in the catalog, it executes the cumulative incremental EXACTLY as if it is the first differential incremental following a full backup.
Smart Incremental backups -

Now all of this led us to smart incremental backups. Sometimes the cumulative backup process doesn't work quite right.  A few of the reasons this can happen are.

  • You perform a full backup to a backup location other than the ZDLRA. This could be because you are backing up to the ZDLRA for the first time replacing a current backup strategy, or maybe you created a special backup to disk to seed a test environment (Using a keep backup for this will alleviate this issue).  The cumulative incremental backup will compare against the last full regardless of where it was taken (there is exceptions if you always use tags to compare).
  • You implement TDE or rekey the database.  Implementing TDE (Transparent Data Encryption) changes the blocks, but does not change the SCN numbers of the blocks. A new full backup is required.
Previously, you would have to perform a special full backup to correct these issues. In the example below you can see what happens (without smart incremental) to the RMAN catalog if you perform a backup on Thursday at 12:00 to disk to refresh a development environment.



Since the cumulative backups are based on the last full backup, the Thursday - Saturday backups contain all the blocks that have changed since the disk backup started on Thursday at 12:00.
And, since it is cumulative, each days backup is larger, and takes longer.

This is when you would typically have to force a new level 0 backup of the datafile.


What the smart incremental does

Since the RMAN catalog is controlled by the ZDLRA it can correct the problem for you. You no longer need to perform cumulative backups as the ZDLRA can fill in any issues that occur.

In the case of the Full backup to disk, it can "hide" that entry, and continue to correctly perform differential backups. It would "hide" the disk backup that occured, and inform the RMAN client that the last full backup as of Thursday night is NOT the disk backup, but it is the previous virtual full backup.
\


 In the case of the TDE, it can "hide" all of the Level 0 virtual full backups, and the L1 differential backups (which will force a new level 0).





All of this is done without updating the DB client version. All the magic is done within the RMAN catalog on the ZDLRA.

Now isn't that smart ?



Friday, March 26, 2021

ZDLRA leverages the ZFS object store in the newest release

Yes, Using ZFSSA as an on-prem object store with ZDLRA is here, and How to configure Zero Data Loss Recovery Appliance to use ZFS OCI Object Storage as a cloud repository (Doc ID 2761114.1) shows you how.


Above is the diagram from Tim Chien's "Ask Tom" session on the new feature with ZDRA release 19.1.1.2.

 For those how have been reading my blog posts, and wondering why the sudden interest in ZFS as an object store, here is another reason.

The idea behind this is pretty simple,  many customers are looking for an additional tier of storage behind the ZDLRA for 2 reasons
  • They want to extend the the recovery window onto a lower tier of storage. This may include going from a full "any point in time" recovery to a set of "recovery points"
  • They want an archival backup for a long period of time that is a set backup point.  Keep backups are the perfect example of this. With Keep backups you get a self-contained restore point of your choosing.
Now for the magic of how all this works.

1. The first step is to configure your ZFSSA as an OCI object store. As long you are on the latest patched release of OS 8.8, this functionality is available to you.  If you are unfamiliar with how to do this, in previous posts, I have walked through the steps of configuring this. Below are some places to start.


Also, here is the documentation from ZFS.

2. The second step is to configure Key Vault (OKV), which is a licensed product. Key vault is a centralized Encryption Key management system that is used to store the master encryption key for the backups.  OKV is released as a virtual image, that can be installed on physical hardware, or in a virtual environment. the installation is self-contained and walks through a series of questions to finish the configuration.  Easy.
  WHY do I need TDE ?  I'm sure you are asking this question.  The "Copy-to-cloud" functionality of the ZDLRA is being utilized to present ZFS as an "OCI cloud store".  It acts just like an object store in the Oracle Public Cloud.  The only difference is that there is no "ARCHIVE" tier on ZFS.  Since ZFS is considered a "Cloud destination", it follows the Larry rule that "All data in the Cloud is encrypted.". Because of that, the backups going to ZFS will be RMAN encrypted (no license needed for this part).  The ZDLRA uses OKV to store the master keys used to encrypt the RMAN backupsets.

3. The third step is to configure the ZDLRA to utilize OKV as a client, and to point the ZDLRA to your ZFS.
  One of the great things of using this solution is that the process is exactly the same as configuring the ZDLRA to send backups to the Oracle public cloud. This link points to the documentation that makes it clear how to configure this process.

That's all there is to it. The most complicated task is configuring the authentication for the OCI object store on ZFS, as it requires setting up a public and private key.

Now to walk through the workflow.

Backups -- Below is the backup workflow from the presentation.  The ZDLRA creates an RMAN backupset from the backup pieces on the ZDLRA. This backupset is an RMAN encrypted backupset.




One item is NOT mentioned on this slide is compression.  If your Database is using TDE, then the backup cannot be compressed when sent to the ZFS because the ZDLRA does not have the encryption master key for the database.  BUT, if your database is NOT TDE enabled, then you should be using compression when sending the backups to the ZFS. As I've said earlier, the backset is an RMAN encrypted backupset. Because it is already encrypted when sent to ZFS, the ZFS will be unable to compress the backups.  You can find instructions to add compression in the documentation for creating a job template.  There is a setting for the template called

Compression_algorithm=> 
By implementing compression on the ZDLRA you are:
  • Decreasing the size of the backups on the ZFS..
  • Decreasing the networkwork traffic between the ZDLRA and ZFS as the data is compressed before it is sent to ZFS. This can double the throughput for backups and restores.
Keep in mind, that if you restore directly to your database host from the ZFS Object store, the database host will be performing the uncompression.

Restores - Below is the restore workflow. Typically you would utilize the catalog on the ZDLRA and let the ZDLRA be the conduit for uncompressing (if it was compressed when sent to ZFS), and unencrypting it, as the ZDLRA encrypted it.  The ZDLRA already has the credentials for the object store, and it has the Encryption master key available to it from OKV.



Alternately you can restore the backups directly from the ZFS object store.
This would be a 3 step process..

1) You would download the Oracle Database Cloud Backup Module . Once downloaded you would configure the database to utilize the OCI object store. The link above also contains links to documentation for the module, and to a MOS note containing the FAQ.  Keep in mind that in this case you are configuring the Module for the on-premise ZFS (rather than the Oracle public cloud), and the instructions may have to be modified. The table below gives you an idea of the differences.


2) You would catalog the backup pieces. If the RMAN catalog is not available (for some reason) the MOS note mentioned below contains detail on how to list what is in the object store, and how to clean it out.

How to report or delete backup pieces stored in Cloud Object Storage by Database Backup Cloud Service without using RMAN (Doc ID 2360800.1)

The script contained in the MOS note ( odbsrmt.py) should work with a few minor changes to the instructions (since we are talking about an on-prem ZFS).  I will continue to work through the changes and post the results in a future blog post.


3) You would register the restore location as an OKV endpoint (if it isn't already registered), OR you can alternately export the encryption key and create a wallet file.






Conclusion - This is a very exciting addition to the many features that the ZDLRA already provides.

Friday, March 12, 2021

ZFS Object Store - Why are there 3 APIs?

 In talking to others that are new to object stores, there is always a complicated conversation on why there are different API interfaces. I will try to go through the history of object stores and talk about the reason why.


First, I want to say up front I am not going to talk about WebDav.  From what I can find, Web Dav is more of a web page authoring platform.

Next I am going to define a few terms.

OPC - Oracle Public Cloud. This is the Oracle Public Cloud Offering, though there are flavors of the OPC that use the same GUI and interfaces (Cloud@Customer for example). When I refer to OPC, I am talking about anything that uses the standard Oracle Public Cloud BUI interfaces.

OCI - Oracle Cloud Interface.   This is one of the most confusing terms used when talking about the ZFS object store.  For most people, when referring to OCI, they are talking about the interface to the Oracle Public Cloud (OPC) offerings in general. On ZFS, this refers to a specific API for the OPC object store.  When I talk about OCI, I am talking about the object store interface.

OCI Gen 1/OCI Gen 2.  In the history of the Oracle Public Cloud OCI, Generation 1 was Version one of the Oracle Public Cloud.  The object store in the first version utilized the Swift interface (which I will get into later).  Of course, following Generation 1 (Gen 1), there was Generation 2 (Gen 2) which uses a different API.  When I refer to these terms, I am referring to the Object store APIs available in the OPC.

S3 or S3 API. When you think of S3, you are probably thinking AWS. The reality is, AWS built the standard for the cloud object store, but many other vendors offer an object store, on-prem or publicly, that follow the AWS standard.  This is the most commonly used Object Store API standard.

Large objects. This is a special term when talking about an object store. Object stores typically have a limit of 5GB on the size of objects. This made sense in the beginning as object stores where not as widely used for all kinds of objects as they are today.  As data grew the need for Object Stores to handle "Large Objects" becomes clear. When I go through the history, along with features of  object stores, "Large Objects" will refer to any object greater than 5 GB.

Bucket : There may be other terms used to describe a "bucket" or container, but a bucket is the high level identifier where objects are stores.  You can think of it as file drawer, or anything thing else that reminds you that it is a level of separation. This is what I mean  when I refer to a Bucket.

Tenancy : In todays cloud, resources are shared, and each user is a "Tenant" in the Multi-tenant cloud paradigm.  This allows for the sharing of resources while still providing isolation.

Now some history

When the object store world began, there was Swift.  Swift was a simple object store, with a simple interface. OCI Gen 1 uses swift, and ZFS offers Swift as an API interface. Swift was designed more for command line interaction than GUI.  If you look for tools to access a Swift object store you find that "curl", and the "swift" CLI (built in python) are the most common.

Swift  : Below are the highlights of Swift.

  • Swift V1 requires a 2 step authentication, though V2 removed this restriction. A username and password are passed to swift and an authentication token is returned. The token is then used to all subsequent calls.
  • Swift multi-tenancy. Because Swift uses a simple Username/Password authentication (though the idea of tenancy was added later), it does not work well as a shared cloud resource. V1 of swift had no concept of multi-tenancy so every bucket name had to be unique. There was no easy way to tie storage utilization to a specific "tenant", especially when multiple users shared a tenancy.
  • Support of large objects was originally an issue for Swift, and there multiple ways of dealing with the support.  Swift eventually added Dynamic Large Object (DLO) support which allowed for the storage of large objects.  Some vendors/applications using Swift took advantage of DLO, some wrote their own.  The swift CLI for example, uses it's own method of storing large objects by created a "shadow" bucket containing individual pieces (5GB per piece) and then storing a manifest file that tells swift where the objects are. Many other vendors (including Oracle) wrote their own large object support that can only be read by them.
Issues - As you can see there were many issues with Swift and this explains why most vendors have moved away from swift. OCI Gen 1 was a swift V2 interface and it is still available to access object stores. ZFS uses the swift V1 interface.
  • Authentication was difficult
  • no multi-tenancy support for users
  • inability to create cost models based on tenancy
  • no true standard for large objects.
S3 - Along came S3 to provide solutions to these issues.

To solve these issues, AWS came up with a standard that solved these issues and below are the highlights.

  • Authentication is based on Key/passcode. The Key/passcode is uniquely generated for each user of a tenancy.  When accessing the object store, the Key/passcode allows the object store to identify the tenancy and provide the necessary isolation, and of course billing.
  • Large object support was provided.  S3 added the idea of "multi-part uploads". When the client prepares to upload an object, it tells the S3 object store that this is a multi-part upload, along with how many pieces are being uploaded.  This allows the client to break the upload into multi parts (of 5GB or less) and upload each part individually even in parallel. the object store will then join all the parts into a single large object.  The process is reversed for downloads.
  • Having a standard provides for tools like rsync, and cloudberry to be able to synchronize the object store (regardless of vendor) with a file system, upload files through a windows client, even mount the object store a file system through Fuse.

Issues - As you can see many of the issues with swift were corrected, and this is now the most widely used API for an object store.


OCI Gen 2 - Along with the second version of the Oracle Public Cloud came a new API fo the object store.

There was one remaining issue with the AWS API that was solved by this interface. The idea of compartments within a tenancy.  Within a tenancy, the bucket must be unique, but when a new bucket is created, it is created in a compartment. This gives an additional level of organization for objects.


A few of the highlights of OCI Gen 2 are.

  • Authentication uses the RSA public/private key model which is more secure than AWS authentication.
  • The idea of compartments is supported.

Issues - As you can see many of the issues with swift were corrected, and the concept of compartments was added.

  • The only issue I've encountered is the lack of a GUI interface for uploading objects.


SUMMARY :


In summary, on ZFS, all 3 object store are available as separate object stores. pick your object store.

Also to note, in the OPC, all 3 object stores are compatible and can be used to access the same object. This is not the case with ZFS.