When building a cyber vault, one of the most important items to manage is encryption keys. Encrypting your data is a fundamental pillar of ransomware protection, but encryption key management is often forgotten.
Ensuring your Cyber Vault has a current copy of the encryption keys associated with your Oracle Databases is critically important for a successful restore and recovery after an attack.
Starting with the Oracle Key Vault (OKV) 21.11 release, Oracle Key Vault includes a preview of the Key Transfer Across Oracle Key Vault Clusters Integration Accelerator. You can use this feature to transfer security objects from one OKV cluster to another.
You can find more detail on this feature here, and I will describe its benefits in this post.
The diagram below shows a typical cyber vault architecture to protect Oracle Databases with the Zero Data Loss Recovery Appliance (ZDLRA) and OKV.
Transferring keys into a cyber vault with OKV
Encryption Key Architecture
Encryption key management is a critical piece of data protection, and it is important to properly manage your keys. Good cyber protection begins with proper encryption key management.
Local Wallets
With Oracle databases, the default (and simplest) location for encryption keys is a locally managed wallet file. The keys are often stored in an auto-login wallet, which is automatically opened by the database at startup, making key management transparent and simple, but not very secure.
Why aren't wallets secure?
They are often auto-login wallets (cwallet.sso), which allow the database to open the wallet without requiring a password. This wallet file gives full access to the encryption keys.
The wallet file is stored with the database files. A privileged administrator, such as a DBA, has access to both the database and the keys, and can decrypt the data directly from the hosts. They also have the ability to delete the wallet file.
Often the wallet file, including an auto-login wallet, is backed up with the database. This allows anyone who has access to the backups to also decrypt the data.
Securely backing up the wallet file separately from the database is often forgotten, especially when ASM is used as the wallet location. Not having the wallet file when restoring the database makes restoration and recovery impossible.
Steps you can take
Create both a passworded wallet (ewallet.p12) and a local auto-login wallet (cwallet.sso). With a local auto-login wallet, the wallet can only be opened on the host where it was created.
Back up only the passworded wallet (ewallet.p12). The auto-login wallet can always be recreated from the passworded wallet.
Properly store the password for your passworded wallet. You will need the password to rotate your encryption keys and to recreate the auto-login wallet. A hedged example of these commands is shown below.
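As a minimal sketch (the keystore path and password are placeholders, and a 19c database configured with WALLET_ROOT can omit the keystore location), the two wallets can be created with ADMINISTER KEY MANAGEMENT:

ADMINISTER KEY MANAGEMENT CREATE KEYSTORE '/u01/app/oracle/admin/DB1/wallet' IDENTIFIED BY "MyWalletPwd#1";
-- create the LOCAL auto-login wallet (cwallet.sso), tied to this host, from the passworded wallet
ADMINISTER KEY MANAGEMENT CREATE LOCAL AUTO_LOGIN KEYSTORE FROM KEYSTORE '/u01/app/oracle/admin/DB1/wallet' IDENTIFIED BY "MyWalletPwd#1";

Only the ewallet.p12 produced by the first command needs to be backed up; the second command can be re-run at any time to rebuild cwallet.sso.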
Oracle Key Vault (OKV)
The best way to securely manage encryption keys is with OKV.
Why ?
Keys are managed outside of the database and cannot be accessed locally from files on the database host.
Access to keys is granted to a specific database instance.
OKV is clustered for High Availability, and the OKV cluster can be securely backed up.
Key access is audited, allowing early detection of suspicious access.
Encryption Keys in a cyber vault architecture
Below is a diagram of a typical cyber vault architecture using ZDLRA. Because the backups are encrypted, whether the databases use TDE and/or create RMAN-encrypted backups that are sent to the ZDLRA, the keys also need to be available in the vault.
Not only is the cyber vault a separate, independent database and backup architecture, but the vault also contains a separate, independent OKV cluster.
This isolates the cyber vault from any attack on the primary datacenter, including any attack that could compromise encryption key availability.
Encryption Keys in an advanced cyber vault architecture
Below is a diagram of an advanced cyber vault architecture using ZDLRA. Not only are the backups replicated to a separate ZDLRA in the vault, they are also internally replicated to an Isolated Recovery Environment (IRE). In this architecture, the recovery area is further isolated, and the OKV cluster is even further isolated from the primary datacenter. This provides the highest level of protection.
OKV Encryption Key Transfer
This blog post highlights the benefit of the newly released (21.11) OKV feature to allow for the secure transfer of encryption keys.
Periodic rotation of encryption keys is a required practice, and ensuring the current key is available in a cyber vault is challenging.
OKV solves this challenge by providing the ability to transfer any new or changed keys between clusters.
Implementing OKV in a cyber Vault
When building a cyber vault, it is recommended to build an independent OKV cluster. The OKV cluster in the vault is isolated from the primary datacenter and protected by an air gap. The nodes in this cluster are not able to communicate with the OKV cluster outside of the vault.
The OKV cluster in the vault can be created using a full, secure, backup from the OKV cluster in the primary datacenter. The backup can be transferred into the vault, and then restored to the new, independent OKV cluster providing a current copy of the encryption keys.
Keeping OKV managed keys updated in a cyber Vault
The challenge, once an isolated OKV cluster has been created, is keeping the encryption keys within the cluster current as new keys are created. This was typically accomplished by transferring a full backup of OKV into the vault and rebuilding the cluster from that backup.
OKV 21.11 provides the solution with secure Encryption Key Transfer. Leveraging this feature, you can securely transfer just the keys that have recently changed, allowing you to manage independent OKV clusters that are synchronized on a regular basis.
The diagram below shows the flow of the secure Encryption Key Transfer package from the primary OKV cluster into the vault when the air-gap is opened.
This new OKV feature provides a much better way to securely manage encryption keys in a Cyber Vault.
Summary
As ransomware attacks increase, it is critical to protect a backup copy of your critical databases in a cyber vault. It is also critical to protect a copy of your encryption keys to ensure you can recover those databases. OKV provides the architecture for key management in both your primary datacenter and in a cyber vault. The new secure key transfer feature within OKV allows you to synchronize keys across independent OKV clusters.
In this blog post I am sharing a script that I wrote that will give you the list of databases running on a DB node. The information provided by the script is:
DB_UNIQUE_NAME
ORACLE_SID
DB_HOME
WHY
I have been working on a script to automatically configure OKV for all of the Oracle Databases running on a DB host. In order to install OKV in a RAC cluster, I want to ensure the unique OKV software files are in the same location on every host when I set the WALLET_ROOT variable for my database. The optimal location is to put the software under $ORACLE_BASE/admin/${DB_NAME}, which should exist on both single-instance nodes and RAC nodes.
Easy, right?
I thought it would be easy to determine the names of all of the databases on a host so that I could make sure the install goes into the $ORACLE_BASE/admin/${DB_NAME}/okv directory on each DB node.
The first item I realized is that the directory structure under $ORACLE_BASE/admin is actually the DB_UNIQUE_NAME rather than DB_NAME. This allows for 2 different instances of the same DB_NAME (primary and standby) to be running on the same DB node without any conflicts.
Along with determining the DB_UNIQUE_NAME, I wanted to take the following items into account:
A RAC environment with or without srvctl properly configured
A non-RAC environment
Exclude directories under $ORACLE_BASE/admin that are not a DB_UNIQUE_NAME running on the host.
Don't match on ORACLE_SID. The ORACLE_SID name on a DB node can be completely different from the DB_UNIQUE_NAME.
Answer:
After searching around Google and not finding a good answer, I checked with my colleagues. Still no good answer. There were just suggestions like "srvctl config", which would only work on a RAC node where all databases are properly registered.
The way I decided to do this was to:
Identify the possible DB_UNIQUE_NAME entries by looking in $ORACLE_BASE/admin
Match the possible DB_UNIQUE_NAME with ORACLE_SIDs by looking in $ORACLE_BASE/diag/rdbms/${DB_UNIQUE_NAME} to find the ORACLE_SID name. I would only include DB_UNIQUE_NAMEs that exist in this directory structure and have a subdirectory.
Find the possible ORACLE_HOME by matching the ORACLE_SID to /etc/oratab. If there is no entry in /etc/oratab, still include the database.
Script:
Below is the script I came up with, which displays a report of the databases on the host. It can be changed to store the output in a temporary file and read it into a script that loops through the databases.
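The original script is not reproduced here, so the following is a minimal sketch of the same approach; the directory layout follows the steps above, but the variable names, the lowercasing of the diag directory, and the output formatting are my own assumptions rather than the author's script.

#!/bin/bash
# Sketch: list DB_UNIQUE_NAME, ORACLE_SID and DB_HOME for databases on this node.
ORACLE_BASE=${ORACLE_BASE:-/u01/app/oracle}

printf "%-20s %-20s %s\n" "DB_UNIQUE_NAME" "ORACLE_SID" "DB_HOME"

for admin_dir in "$ORACLE_BASE"/admin/*/; do
    db_unique_name=$(basename "$admin_dir")
    diag_dir="$ORACLE_BASE/diag/rdbms/${db_unique_name,,}"
    # only keep admin directories that are really a DB_UNIQUE_NAME on this host
    [ -d "$diag_dir" ] || continue
    for sid_dir in "$diag_dir"/*/; do
        [ -d "$sid_dir" ] || continue
        oracle_sid=$(basename "$sid_dir")
        # match the ORACLE_SID to /etc/oratab to find the ORACLE_HOME (may be missing)
        db_home=$(awk -F: -v sid="$oracle_sid" '$1 == sid {print $2}' /etc/oratab 2>/dev/null)
        printf "%-20s %-20s %s\n" "$db_unique_name" "$oracle_sid" "${db_home:-<not in oratab>}"
    done
done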
Output:
Below is the sample output from the script. You can see that it doesn't require the DB to exist in the /etc/oratab file.
Data sharing has become a big topic recently, and Oracle Cloud has added some new services to allow you to share data from an Autonomous Database. But how do you do this with your on-premises database? In this post I show you how to use ZFS as your data sharing platform.
Data Sharing
Being able to securely share data between applications is a critical feature in today's world. The Oracle Database is often used to consolidate and summarize collected data, but it is not always the platform for doing analysis. The Oracle Database does have the capability to analyze data, but tools such as Jupyter Notebooks, Tableau, Power BI, etc. are typically the favorites of data scientists and data analysts.
The challenge is how to give access to specific pieces of data in the database without providing access to the database itself. The most common solution is to use object storage and pre-authenticated URLs. Data is extracted from the database based on the user and stored in object storage in a sharable format (JSON, for example). With this paradigm, pictured above, you can create multiple datasets that contain a subset of the data specific to the user's needs and authorization. The second part is the use of a pre-authenticated URL. This is a dynamically created URL that allows access to the object without authentication. Because it contains a long string of random characters, and is only valid for a specified amount of time, it can be securely shared with the user.
My Environment
For this post, I started with an environment I had previously configured to use DBMS_CLOUD. My database is a 19.20 database. In that database I used the steps specified in the MOS note and my blog (information can be found here) to use DBMS_CLOUD.
My ZFSSA environment is using 8.8.63, and I did all of my testing in OCI using compute instances.
In preparation for testing I had:
Installed the DBMS_CLOUD packages into my database using MOS note #2748362.1
Downloaded the certificates for my ZFS appliance using my blog post and added them to the wallet.
Added the DNS/IP addresses to the DBMS_CLOUD_STORE table in the CDB.
Created a user in my PDB with authority to use DBMS_CLOUD
Created a user on ZFS to use for my object storage authentication (Oracle).
Configured the HTTP service for OCI
Added my public RSA key from my key pair to the OCI service for authentication.
Created a bucket
NOTE: In order to complete the full test, there were 2 other items I needed to do.
1) Update the ACL to also access port 80. The DBMS_CLOUD configuration sets ACLs to access websites using port 443. During my testing I used port 80 (http vs https).
2) I granted execute on DBMS_CRYPTO to my database user for the testing.
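As a hedged sketch of those two extra steps (the host name matches this post, but the database user name SHARE_USER is a placeholder), the ACL change and the grant could look like this:

BEGIN
  DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
    host       => 'zfs-s3.zfsadmin.vcn.oraclevcn.com',
    lower_port => 80,
    upper_port => 80,
    ace        => xs$ace_type(privilege_list => xs$name_list('http'),
                              principal_name => 'SHARE_USER',
                              principal_type => xs_acl.ptype_db));
END;
/
GRANT EXECUTE ON dbms_crypto TO share_user;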
Step #1 Create Object
The first step was to create an object from a query in the database. This simulates pulling a subset of data (based on the user) and writing it to an object so that it can be shared. To create the object I used the DBMS_CLOUD.EXPORT_DATA procedure. Below is the statement I executed.
BEGIN
  DBMS_CLOUD.EXPORT_DATA(
    credential_name => 'ZFS_OCI2',
    file_uri_list   => 'https://zfs-s3.zfsadmin.vcn.oraclevcn.com/n/zfs_oci/b/db_back/o/shareddata/customer_sales.json',
    format          => '{"type" : "JSON" }',
    query           => 'SELECT OBJECT_NAME,OBJECT_TYPE,STATUS FROM user_objects');
END;
/
In this example:
CREDENTIAL_NAME - Refers to the authentication credential I previously created in my database.
FILE_URI_LIST - The name and location of the object I want to create on the ZFS object storage.
FORMAT - The output is written in JSON format
QUERY - This is the query you want to execute; its results are stored in the object.
As you can see, it would be easy to create multiple objects that contain specific data by customizing the query, and naming the object appropriately.
In order to get the proper name of the object I then selected the list of objects from object storage.
set pagesize 0
SET UNDERLINE =
col object_name format a25
col created format a20
select object_name,to_char(created,'MM/DD/YY hh24:mi:ss') created,bytes/1024 bytes_KB
from dbms_cloud.list_objects('ZFS_OCI2', 'https://zfs-s3.zfsadmin.vcn.oraclevcn.com/n/zfs_oci/b/db_back/o/shareddata/');
customer_sales.json_1_1_1.json 12/19/23 01:19:51 3.17382813
From the output, I can see that my JSON file was named 'customer_sales.json_1_1_1.json'.
Step #2 Create Pre-authenticated URL
The Package I ran to do this is below. I am going to break down the pieces into multiple sections. Below is the full code.
Step #2a Declare variables
The first part of the PL/SQL package declares the variables that will be used in the package. Most of the variables are normal VARCHAR2 variables, but there are a few other variable types that are specific to the packages used to sign and send the URL request.
sType, kType - These are constants used to sign the URL request with RSA SHA-256
utl_http.req, utl_http.resp - These are the request and response types used when accessing the object storage
json_obj - This type is used to extract the URL from the resulting JSON returned from the object storage call.
Step #2b Set variables
In this section of code I set the authentication information along with the host, and the private key part of my RSA public/private key pair.
I also set a variable with the current date time, in the correct GMT format.
NOTE: This date time stamp is compared with the date time on the ZFSSA. It must be within 5 minutes of the ZFSSA date/time or the request will be rejected.
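A minimal sketch of setting that timestamp (the variable name l_date is an assumption): the value needs to be the standard HTTP date format, expressed in GMT.

l_date := to_char(systimestamp at time zone 'GMT',
                  'Dy, DD Mon YYYY HH24:MI:SS',
                  'NLS_DATE_LANGUAGE=ENGLISH') || ' GMT';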
Step #2c Set JSON body
In this section of code, I build the actual request for the pre-authenticated URL. The parameters for this are:
accessType - I set this to "ObjectRead", which allows me to create a URL that points to a specific object. Other options are Write and ReadWrite.
bucketListingAction - I set this to "Deny", which disallows the listing of objects.
bucketName - Name of the bucket
name - A name you give the request so that it can be identified later
namespace/namespaceName - This is the ZFS share
objectName - This is the object name on the share that I want the request to refer to.
timeExpires - This is when the request expires.
NOTE: I did not spend the time to do a lot of customization to this section. You could easily make the object name a parameter that is passed to the package, along with the bucket name, etc. You could also dynamically set the expiration time based on sysdate. For example, you could have the request only be valid for 24 hours by dynamically setting timeExpires.
The last step in this section of code is to create a SHA-256 digest of the JSON "body" that I am sending with the request. I created it using dbms_crypto.hash.
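A minimal sketch of that digest step (the variable names l_body and l_body_digest are assumptions): hash the body with SHA-256 and base64 encode the result for the x-content-sha256 header.

l_body_digest := utl_raw.cast_to_varchar2(
                   utl_encode.base64_encode(
                     dbms_crypto.hash(utl_raw.cast_to_raw(l_body),
                                      dbms_crypto.hash_sh256)));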
Step #2d Create the signing string
This section builds the signing string that will be signed with the private key. The string must be in a very specific format. The string that is built contains:
(request-target): post /oci/n/{share name}/b/{bucket name}/p?compartment={compartment}
date: {date in GMT}
host: {ZFS host}
x-content-sha256: {sha256 digest of the JSON body parameters}
content-type: application/json
content-length: {length of the JSON body parameters}
NOTE: This signing string has to be created with the line feeds between the entries.
The final step in this section is to sign the signing string with the private key.
To sign the string, the DBMS_CRYPTO.SIGN function is used.
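A minimal sketch of that call, assuming the src/private-key/key-type/sign-algorithm argument order and using the sType and kType constants declared earlier (all other variable names are assumptions):

l_signed_raw := dbms_crypto.sign(utl_raw.cast_to_raw(l_signing_string),  -- string to sign
                                 utl_raw.cast_to_raw(l_private_key),     -- RSA private key material
                                 kType,                                  -- key type constant
                                 sType);                                 -- signing algorithm constant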
Step #2e Build the signature from the signing string
This section takes the signed string that was built in the prior step and encodes it in Base64. This section uses the utl_encode.base64_encode function to encode the raw signed value, which is then converted to a VARCHAR2.
Note: The resulting base64 encoded string is broken into 64 character sections. After creating the encoded string, I loop through the string, and combine the 64 character sections into a single string.
This took the most time to figure out.
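A minimal sketch of that encode-and-join step (variable names are assumptions): utl_encode.base64_encode wraps its output every 64 characters, so the embedded line breaks are stripped to produce one continuous signature string.

l_signature := utl_raw.cast_to_varchar2(utl_encode.base64_encode(l_signed_raw));
l_signature := replace(replace(l_signature, chr(13), ''), chr(10), '');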
Step #2f Create the authorization header
This section dynamically builds the authorization header that is sent with the call. This section includes the authentication parameters (tenancy OCID, User OCID, fingerprint), the headers (these must be in the order they are sent), and the signature that was created in the last 2 steps.
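For reference, the finished header follows the standard OCI request-signing layout; a hedged sketch of building it (the OCID and fingerprint variables are placeholders) looks like this:

l_auth_header := 'Signature version="1",keyId="'
              || l_tenancy_ocid || '/' || l_user_ocid || '/' || l_fingerprint
              || '",algorithm="rsa-sha256",headers="(request-target) date host'
              || ' x-content-sha256 content-type content-length",signature="'
              || l_signature || '"';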
Step #2g Send a post request and header information
The next section sends the POST call to the ZFS object storage, followed by each piece of header information. After the header parameters are sent, the JSON body is sent using the utl_http.write_text call.
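A minimal sketch of that sequence with UTL_HTTP (variable names are assumptions; l_url is the request target built above):

l_req := utl_http.begin_request(l_url, 'POST', 'HTTP/1.1');
utl_http.set_header(l_req, 'date',             l_date);
utl_http.set_header(l_req, 'host',             l_host);
utl_http.set_header(l_req, 'x-content-sha256', l_body_digest);
utl_http.set_header(l_req, 'content-type',     'application/json');
utl_http.set_header(l_req, 'content-length',   length(l_body));
utl_http.set_header(l_req, 'authorization',    l_auth_header);
utl_http.write_text(l_req, l_body);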
Step #2h Loop through the response
This section gets the response from the POST call and loops through the response. I am using the json_object_t.parse call to create a JSON type, and then use json_obj.get to retrieve the unique request URI that is created.
Finally I display the resulting URI that can be used to retrieve the object itself.
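A minimal sketch of reading and parsing that response (variable names are assumptions, and the JSON field name for the URI may differ on ZFS; the OCI-native field is accessUri):

l_resp := utl_http.get_response(l_req);
begin
  loop
    utl_http.read_text(l_resp, l_line);        -- raises utl_http.end_of_body when done
    l_response := l_response || l_line;
  end loop;
exception
  when utl_http.end_of_body then
    utl_http.end_response(l_resp);
end;
json_obj := json_object_t.parse(l_response);
dbms_output.put_line('Pre-authenticated URI: ' || json_obj.get_string('accessUri'));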
Documentation
There were a few documents that I found very useful to give me the correct calls in order to build this package.
Signing request documentation: This document gave detail on the parameters needed to send GET or POST requests to object storage. This document was very helpful to ensure that I had created the signature properly.
OCI REST call walk through: This post was the most helpful as it gave an example of a GET call and a PUT call. I was first able to create a GET call using this post, and then I built on it to create the POST call.
This blog post demonstrates a process to create KEEP archival backups dynamically by using backup pieces within a weekly full/daily incremental backup strategy.
First let's go through what a KEEP backup is and how it affects your backup strategy.
A KEEP backup is a self-contained backupset. The archive logs needed to de-fuzzy the database files are automatically included in the backupset.
The archive logs included in the backup are only the archive logs needed to de-fuzzy.
The backup pieces in the KEEP backup (both datafile backups and included archive log pieces) are ignored in the normal incremental backup strategy, and in any log sweeps.
When a recovery window is set in RMAN, KEEP backup pieces are ignored in any "Delete Obsolete" processing.
KEEP backup pieces, once past the "until time" are removed using the "Delete expired" command.
Normal process to create an archival KEEP backup.
Perform a weekly full backup and a daily incremental backup that are deleted using an RMAN recovery window.
Perform archive log backups with the full/incremental backups along with log sweeps. These are also deleted using an RMAN recovery window.
One of these processes is used to create an archival KEEP backup.
A separate full KEEP backup is performed along with the normal weekly full backup
The weekly full backup (and archive logs based on tag) are copied to tape with "backup as backupset" and marked as "KEEP" backup pieces.
Issues with this process
The process of copying the full backup to tape using "backup as backupset" requires 2 copies of the same backup for a period of time. You don't want to wait until the end of retention to copy it to tape.
If the KEEP full backups are stored on disk along with the weekly full backups, you cannot use "backup as backupset"; you must perform a second, separate backup.
Proposal to create a weekly KEEP backup
Problems with simple solution
The basic idea is that you perform a weekly full backup, along with daily incremental backups that are kept for 30 days. After the 30 day retention, just the full backups (along with archive logs to defuzzy) are kept for an additional 30 days.
The most obvious way to do this is to:
Set the RMAN retention to 30 days
Create a weekly full backup that is a KEEP backup with an until time of 60 days in the future.
Create a daily incremental backup that is NOT a KEEP backup.
Create archive log backups as normal.
Allow delete obsolete to remove the "non-KEEP" backups after 30 days.
Unfortunately, when you create an incremental backup and there are only KEEP backups preceding it, the incremental level 1 backup is forced into an incremental level 0 backup. And with delete obsolete, if you look through MOS note "RMAN Archival (KEEP) backups and Retention Policy (Doc ID 986382.1)" you find that the incremental backups and archive logs are kept for 60 days because there is no preceding non-KEEP backup.
Solution
The solution is to use tags, mark the weekly full as a KEEP backup after a week, and use the "delete backup completed before ... tag='xx'" command.
Weekly full backup scripts
run
{
backup archivelog all filesperset=20 tag ARCHIVE_ONLY delete input;
change backup tag='INC_LEVEL_0' keep until time 'sysdate+53';
backup incremental level 0 database tag='INC_LEVEL_0' filesperset=20 plus archivelog filesperset=20 tag='INC_LEVEL_0';
delete backup completed before 'sysdate-61' tag= 'INC_LEVEL_0';
delete backup completed before 'sysdate-31' tag= 'INC_LEVEL_1';
delete backup completed before 'sysdate-31' tag= 'ARCHIVE_ONLY';
}
Daily Incremental backup scripts
run
{
backup incremental level 1 database tag='INC_LEVEL_1' filesperset=20 plus archivelog filesperset=20 tag='INC_LEVEL_1';
}
Archive log sweep backup scripts
run
{
backup archivelog all tag='ARCHIVE_ONLY' delete input;
}
Example
I then took these scripts, and built an example using a 7 day recovery window. My full backup commands are below.
run
{
backup archivelog all filesperset=20 tag ARCHIVE_ONLY delete input;
change backup tag='INC_LEVEL_0' keep until time 'sysdate+30';
backup incremental level 0 database tag='INC_LEVEL_0' filesperset=20 plus archivelog filesperset=20 tag='INC_LEVEL_0';
delete backup completed before 'sysdate-30' tag= 'INC_LEVEL_0';
delete backup completed before 'sysdate-8' tag= 'INC_LEVEL_1';
delete backup completed before 'sysdate-8' tag= 'ARCHIVE_ONLY';
}
First I am going to perform a weekly backup and incremental backups for 7 days to see how the settings affect the backup pieces in RMAN.
for Datafile #1.
File# Checkpoint Time Incr level Incr chg# Chkp chg# Incremental Typ Keep Keep until Keep options Tag
------ ----------------- ---------- --------- --------- --------------- ---- ---------- --------------- ---------------
3 06-01-23 00:00:06 0 0 3334337 FULL NO INC_LEVEL_0
3 06-02-23 00:00:03 1 3334337 3334513 INCR1 NO INC_LEVEL_1
3 06-03-23 00:00:03 1 3334513 3334665 INCR1 NO INC_LEVEL_1
3 06-04-23 00:00:03 1 3334665 3334805 INCR1 NO INC_LEVEL_1
3 06-05-23 00:00:03 1 3334805 3334949 INCR1 NO INC_LEVEL_1
3 06-06-23 00:00:03 1 3334949 3335094 INCR1 NO INC_LEVEL_1
3 06-07-23 00:00:03 1 3335094 3335234 INCR1 NO INC_LEVEL_1
for archive logs
Sequence# First chg# Next chg# Create Time Keep Keep until Keep options Tag
--------- ---------- --------- ----------------- ---- ---------- --------------- ---------------
625 3333260 3334274 15-JUN-23 NO ARCHIVE_ONLY
626 3334274 3334321 01-JUN-23 NO INC_LEVEL_0
627 3334321 3334375 01-JUN-23 NO INC_LEVEL_0
628 3334375 3334440 01-JUN-23 NO ARCHIVE_ONLY
629 3334440 3334490 01-JUN-23 NO INC_LEVEL_1
630 3334490 3334545 02-JUN-23 NO INC_LEVEL_1
631 3334545 3334584 02-JUN-23 NO ARCHIVE_ONLY
632 3334584 3334633 02-JUN-23 NO INC_LEVEL_1
633 3334633 3334695 03-JUN-23 NO INC_LEVEL_1
634 3334695 3334733 03-JUN-23 NO ARCHIVE_ONLY
635 3334733 3334782 03-JUN-23 NO INC_LEVEL_1
636 3334782 3334839 04-JUN-23 NO INC_LEVEL_1
637 3334839 3334876 04-JUN-23 NO ARCHIVE_ONLY
638 3334876 3334926 04-JUN-23 NO INC_LEVEL_1
639 3334926 3334984 05-JUN-23 NO INC_LEVEL_1
640 3334984 3335023 05-JUN-23 NO ARCHIVE_ONLY
641 3335023 3335072 05-JUN-23 NO INC_LEVEL_1
642 3335072 3335124 06-JUN-23 NO INC_LEVEL_1
643 3335124 3335162 06-JUN-23 NO ARCHIVE_ONLY
644 3335162 3335211 06-JUN-23 NO INC_LEVEL_1
645 3335211 3335273 07-JUN-23 NO INC_LEVEL_1
646 3335273 3335311 07-JUN-23 NO ARCHIVE_ONLY
Next I'm going to execute the weekly full backup script that changes the last backup to a keep backup to see how the settings affect the backup pieces in RMAN.
for Datafile #1.
File# Checkpoint Time Incr level Incr chg# Chkp chg# Incremental Typ Keep Keep until Keep options Tag
------ ----------------- ---------- --------- --------- --------------- ---- ---------- --------------- ---------------
3 06-01-23 00:00:06 0 0 3334337 FULL YES 08-JUL-23 BACKUP_LOGS INC_LEVEL_0
3 06-02-23 00:00:03 1 3334337 3334513 INCR1 NO INC_LEVEL_1
3 06-03-23 00:00:03 1 3334513 3334665 INCR1 NO INC_LEVEL_1
3 06-04-23 00:00:03 1 3334665 3334805 INCR1 NO INC_LEVEL_1
3 06-05-23 00:00:03 1 3334805 3334949 INCR1 NO INC_LEVEL_1
3 06-06-23 00:00:03 1 3334949 3335094 INCR1 NO INC_LEVEL_1
3 06-07-23 00:00:03 1 3335094 3335234 INCR1 NO INC_LEVEL_1
3 06-08-23 00:00:07 0 0 3335715 FULL NO INC_LEVEL_0
for archive logs
Sequence# First chg# Next chg# Create Time Keep Keep until Keep options Tag
--------- ---------- --------- ----------------- ---- ---------- --------------- ---------------
625 3333260 3334274 15-JUN-23 NO ARCHIVE_ONLY
626 3334274 3334321 01-JUN-23 YES 08-JUL-23 BACKUP_LOGS INC_LEVEL_0
627 3334321 3334375 01-JUN-23 YES 08-JUL-23 BACKUP_LOGS INC_LEVEL_0
628 3334375 3334440 01-JUN-23 NO ARCHIVE_ONLY
629 3334440 3334490 01-JUN-23 NO INC_LEVEL_1
630 3334490 3334545 02-JUN-23 NO INC_LEVEL_1
631 3334545 3334584 02-JUN-23 NO ARCHIVE_ONLY
632 3334584 3334633 02-JUN-23 NO INC_LEVEL_1
633 3334633 3334695 03-JUN-23 NO INC_LEVEL_1
634 3334695 3334733 03-JUN-23 NO ARCHIVE_ONLY
635 3334733 3334782 03-JUN-23 NO INC_LEVEL_1
636 3334782 3334839 04-JUN-23 NO INC_LEVEL_1
637 3334839 3334876 04-JUN-23 NO ARCHIVE_ONLY
638 3334876 3334926 04-JUN-23 NO INC_LEVEL_1
639 3334926 3334984 05-JUN-23 NO INC_LEVEL_1
640 3334984 3335023 05-JUN-23 NO ARCHIVE_ONLY
641 3335023 3335072 05-JUN-23 NO INC_LEVEL_1
642 3335072 3335124 06-JUN-23 NO INC_LEVEL_1
643 3335124 3335162 06-JUN-23 NO ARCHIVE_ONLY
644 3335162 3335211 06-JUN-23 NO INC_LEVEL_1
645 3335211 3335273 07-JUN-23 NO INC_LEVEL_1
646 3335273 3335311 07-JUN-23 NO ARCHIVE_ONLY
647 3335311 3335652 07-JUN-23 NO ARCHIVE_ONLY
648 3335652 3335699 08-JUN-23 NO INC_LEVEL_0
649 3335699 3335760 08-JUN-23 NO INC_LEVEL_0
650 3335760 3335833 08-JUN-23 NO ARCHIVE_ONLY
Finally I'm going to execute the weekly full backup script that changes the last backup to a keep backup and this time it will delete the older backup pieces to see how the settings affect the backup pieces in RMAN.
for Datafile #1.
File# Checkpoint Time Incr level Incr chg# Chkp chg# Incremental Typ Keep Keep until Keep options Tag
------ ----------------- ---------- --------- --------- --------------- ---- ---------- --------------- ---------------
3 06-01-23 00:00:06 0 0 3334337 FULL YES 15-JUL-23 BACKUP_LOGS INC_LEVEL_0
3 06-08-23 00:00:07 0 0 3335715 FULL YES 15-JUL-23 BACKUP_LOGS INC_LEVEL_0
3 06-09-23 00:00:03 1 3335715 3336009 INCR1 NO INC_LEVEL_1
3 06-10-23 00:00:03 1 3336009 3336183 INCR1 NO INC_LEVEL_1
3 06-11-23 00:00:03 1 3336183 3336330 INCR1 NO INC_LEVEL_1
3 06-12-23 00:00:03 1 3336330 3336470 INCR1 NO INC_LEVEL_1
3 06-13-23 00:00:03 1 3336470 3336617 INCR1 NO INC_LEVEL_1
3 06-14-23 00:00:04 1 3336617 3336757 INCR1 NO INC_LEVEL_1
3 06-15-23 00:00:07 0 0 3336969 FULL NO INC_LEVEL_0
for archive logs
Sequence# First chg# Next chg# Create Time Keep Keep until Keep options Tag
--------- ---------- --------- ----------------- ---- ---------- --------------- ---------------
626 3334274 3334321 01-JUN-23 YES 15-JUL-23 BACKUP_LOGS INC_LEVEL_0
627 3334321 3334375 01-JUN-23 YES 15-JUL-23 BACKUP_LOGS INC_LEVEL_0
647 3335311 3335652 07-JUN-23 NO ARCHIVE_ONLY
648 3335652 3335699 08-JUN-23 YES 15-JUL-23 BACKUP_LOGS INC_LEVEL_0
649 3335699 3335760 08-JUN-23 YES 15-JUL-23 BACKUP_LOGS INC_LEVEL_0
650 3335760 3335833 08-JUN-23 NO ARCHIVE_ONLY
651 3335833 3335986 08-JUN-23 NO INC_LEVEL_1
652 3335986 3336065 09-JUN-23 NO INC_LEVEL_1
653 3336065 3336111 09-JUN-23 NO ARCHIVE_ONLY
654 3336111 3336160 09-JUN-23 NO INC_LEVEL_1
655 3336160 3336219 10-JUN-23 NO INC_LEVEL_1
656 3336219 3336258 10-JUN-23 NO ARCHIVE_ONLY
657 3336258 3336307 10-JUN-23 NO INC_LEVEL_1
658 3336307 3336359 11-JUN-23 NO INC_LEVEL_1
659 3336359 3336397 11-JUN-23 NO ARCHIVE_ONLY
660 3336397 3336447 11-JUN-23 NO INC_LEVEL_1
661 3336447 3336506 12-JUN-23 NO INC_LEVEL_1
662 3336506 3336544 12-JUN-23 NO ARCHIVE_ONLY
663 3336544 3336594 12-JUN-23 NO INC_LEVEL_1
664 3336594 3336639 13-JUN-23 NO INC_LEVEL_1
665 3336639 3336677 13-JUN-23 NO ARCHIVE_ONLY
666 3336677 3336734 13-JUN-23 NO INC_LEVEL_1
667 3336734 3336819 14-JUN-23 NO INC_LEVEL_1
668 3336819 3336857 14-JUN-23 NO ARCHIVE_ONLY
669 3336857 3336906 14-JUN-23 NO ARCHIVE_ONLY
670 3336906 3336953 15-JUN-23 NO INC_LEVEL_0
671 3336953 3337041 15-JUN-23 NO INC_LEVEL_0
672 3337041 3337113 15-JUN-23 NO ARCHIVE_ONLY
Result
For my datafiles, I still have the weekly full backup, and it is a KEEP backup. For my archive logs, I still have the archive logs that were part of the full backup, which are needed to de-fuzzy my backup.
Restore Test
Now for the final test, using the next chg# from the June 1st archive logs: 3334375.
RMAN> restore database until scn=3334375;
Starting restore at 15-JUN-23
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=259 device type=DISK
...
channel ORA_DISK_1: piece handle=/u01/ocidb/backups/da1tiok6_1450_1_1 tag=INC_LEVEL_0
channel ORA_DISK_1: restored backup piece 1
...
channel ORA_DISK_1: reading from backup piece /u01/ocidb/backups/db1tiola_1451_1_1
channel ORA_DISK_1: piece handle=/u01/ocidb/backups/db1tiola_1451_1_1 tag=INC_LEVEL_0
channel ORA_DISK_1: restored backup piece 1
RMAN> recover database until scn=3334375;
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=627
channel ORA_DISK_1: reading from backup piece /u01/ocidb/backups/dd1tiom8_1453_1_1
channel ORA_DISK_1: piece handle=/u01/ocidb/backups/dd1tiom8_1453_1_1 tag=INC_LEVEL_0
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archived log file name=/u01/app/oracle/product/19c/dbhome_1/dbs/arch1_627_1142178912.dbf thread=1 sequence=627
media recovery complete, elapsed time: 00:00:00
Finished recover at 15-JUN-23
RMAN> alter database open resetlogs;
Statement processed
One of the key features of the ZDLRA is the ability to capture changes from the database "real-time" just like a standby database does. In this blog post I am going to demonstrate what is happening during this process so that you can get a better understanding of how it works.
If you look at the GIF above, I will explain what is happening, and show what happens with a demo of the process.
The ZDLRA uses the same process as a standby database. In fact, if you look at the flow of the real-time redo you will notice the redo blocks are sent to BOTH the local redo log files AND to the staging area on the ZDLRA. The staging area on the ZDLRA acts just like a standby redo log does on a standby database.
As the ZDLRA receives the redo blocks from the protected database, they are validated to ensure that they are valid Oracle redo block information. This ensures that a man-in-the-middle attack cannot change any of the backup information. The validation process also ensures that if the database is attacked by ransomware (changing blocks), the redo received is not tainted.
The next thing that happens during the process is the logic when a LOG SWITCH occurs. As we all know, when a log switch occurs on a database instance, the contents of the redo log are written to an archive log. With real-time redo, this causes the contents of the redo staging area on the ZDLRA (picture a standby redo log) to become a backup set of an archive log. The RMAN catalog on the ZDLRA is then updated with the internal location of the backup set.
Log switch operation
I am going to go through a demo of what you see happen when this process occurs.
ZDLRA is configured as a redo destination
Below you can see that my database has a "Log archive destination" 3 configured. The destination itself is the database on the ZDLRA (zdl9), and also notice that the log information will be sent for ALL_ROLES, which sends the log information regardless of whether it is a primary or a standby database.
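A hedged sketch of what such a destination setting can look like (the service string and attributes are placeholders; a real ZDLRA configuration also involves the VPC user, wallet, and REDO_TRANSPORT_USER setup):

alter system set log_archive_dest_3='SERVICE="zdlra-scan:1521/zdl9" ASYNC NOREGISTER VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=zdl9' scope=both;
alter system set log_archive_dest_state_3='ENABLE' scope=both;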
List backup of recent archive logs from RMAN catalog
Before I demonstrate what happens with the RMAN catalog, I am going to list out the current archive log backups. Below you see that the most recent archive log backed up to the ZDLRA has sequence #10.
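A command along these lines (a hedged sketch; the time window is arbitrary) produces that listing from the RMAN catalog:

RMAN> list backup of archivelog from time 'sysdate-1';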
Perform a log switch
As you see in the animation at the top of the post, when a log switch occurs, the contents of the redo log in the "redo staging area" are used to create an archive log backup that is stored and cataloged. I am going to perform a log switch to force this process.
List backup of archive logs from RMAN catalog
Now that the log switch occurred, you can see below that there is a new backup set created from the redo staging area.
There are a couple of interesting items to note when you look at the backup set created.
The backup of the archive log is compressed. As part of the policy on the ZDLRA you have the option to have the backup of the archive log compressed when it is created from the "staged redo". This does NOT require the ACO (Advanced Compression) license. The compressed archive log will be sent back to the DB compressed during a restore operation, and the DB host will uncompress it. This is the default option (standard compression), and I recommend changing it. If you decide to compress, then MEDIUM or LOW is recommended. Keep in mind that this may put more workload on the client to uncompress the backup sets, which may affect recovery times. NOTE: When using TDE, there will be little to no compression possible.
The TAG is automatically generated. By looking at the timestamp in the RMAN catalog information, you can see that the TAG is automatically generated using the timestamp to make it unique.
The handle begins with "$RSCN_". This is because the backup piece was generated by the ZDLRA itself, and archive log backup sets generated this way will begin with these characters.
Restore and Recovery using partial log information
Now I am going to demonstrate what happens when the database crashes, and there is no time for the database to perform a log switch.
List the active redo log and current SCN
Below you can see that my currently active redo log is sequence # 12. This is where I am going to begin my test.
Create a table
To demonstrate what happens when the database crashes I am going to create a new table. In the table I am going to store the current date, and the current SCN. Using the current SCN we will be able to determine the redo log that contains the table creation.
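A hedged sketch of such a test table (the table name is a placeholder):

create table redo_test as
  select sysdate create_date,
         (select current_scn from v$database) create_scn
  from dual;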
Abort the database
As you probably know, if I shut down the database gracefully, the DB will automatically clean out the redo logs and archive their contents. Because I want to demonstrate what happens with a crash, I am going to shut the database down with an ABORT to ensure the log switch doesn't occur. Then I start the database in mount mode so I can look at the current redo log information.
Verify that the log switch did not occur
Next I am going to look at the REDO Log information and verify that my table creation (SCN 32908369) is still in the active redo log and did not get archived during the shutdown.
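A query along these lines (a hedged sketch) shows that the sequence holding SCN 32908369 is still CURRENT and has not been archived:

select group#, thread#, sequence#, status, archived, first_change#
from v$log
order by sequence#;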
Restore the database
Next I am going to restore the database from backup.
Recover the database
This is where the magic occurs, so I am going to show what happens step by step.
Recover using archive logs on disk
The first step the database does is to use the current archive logs to recover the database. You can see in the screenshot below that the database recovers the database using archive logs on disk up to sequence #11 for thread 1. This contains all the changes for this thread, but does not include what is in the REDO log sequence #12. Sequence #12 contains the create table we are interested in.
Recover using partial redo log
This step is where the magic of the ZDLRA occurs. You can see from the screen shot below that the RMAN catalog on the ZDLRA returns the redo log information for Sequence #12 even though it was never archived. The ZDLRA was able to create an archive log backup from the partial contents it had in the Redo Staging area.
Open the database and display table contents.
This is where it all comes together. Using the partial redo log information from Redo Log sequence #12, you can see that when the database is opened, the table creation transaction is indeed in the database even though the redo did not become an archive log.
Conclusion: I am hoping this post gives you a better idea of how real-time redo works on the ZDLRA, and how it handles recovering transactions after a database crash.
File Retention Lock was recently released on ZFSSA and I wanted to take the time to explain how to set the retention time and view the retention of locked files.
Below is an example of what happens. You can see that the files are locked until January 1st, 2025.
The best place to start for information on how this works is by looking at my last blog post on authorizations.
First I will go through the settings that are available at the share/project level.
Grace period
The grace period is used to automatically lock a file when there have not been updates to the file for this period of time.
If the automatic file retention grace period is "0" seconds, then the default retention is NOT in effect.
NOTE: Even with a grace period of "0" seconds, files can be locked by manually setting a retention period.
Also, once a grace period is set (> "0"), it cannot be increased or disabled.
Finally, if you set the grace period to a long period (to ensure all writes to a file are completed), you can lock the file by removing the write bit. This does the same thing as expiring the grace period.
Below is an example
chmod ugo-w *
Running the "chmod" will remove the write bit, and immediate cause all files to lock.
Default retention
The most common method to implement file retention is by using the default retention period. This causes the file to be locked for the default retention when the grace period expires for a file.
Note that the file is locked as of the time the grace period expires. For example, if I have a grace period of 1 day (because I want the ability to clean up a failed backup) and a default file retention period of 14 days, the file will be locked for 14 days AFTER the 1 day grace period. The lock on the file will expire 15 days after the file was last written to.
In the example above you can see that all files created on this share are created with a default retention of 1 day (24 hours).
NOTE: If the grace period is not > "0", these settings will be ignored and files will not be locked by default.
Minimum/Maximum File retention
The second set of settings you see in the image above are the "minimum file retention period" and the "maximum file retention period".
These control the retention settings on files, which follow the rules below.
The default retention period for files MUST be at least the minimum file retention period, and not greater than the maximum file retention period.
If the retention date is set manually on a file, the retention period must fall within the minimum and maximum retention period.
Display current Lock Expirations.
In order to display the lock expiration on Linux, the first thing you need to do is change the share/project setting "Update access time on read" to off. Through the CLI this is "set atime=false" on the share.
Once this setting is made, the client will display the lock time as the "atime". In my example at the top of the blog, you can see that by executing "ls -lu" the file lock time is displayed.
NOTE: You can also use the find command to search for files using the "atime". This will allow you to find all of the locked files.
Below is an example of using the find command to list files that have an lock expiration time in the future.
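A sketch of such a find command (assuming GNU findutils on the client; the mount point is a placeholder): files whose access time is later than "now" are the ones still under retention.

find /mnt/zfs_share -type f -newerat "now" -ls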
It is possible to set a specific date/time that a file is locked until. You can even set the retention date on a file that is currently locked (it must be a date beyond the current lock date).
NOTE: If you try to change the retention date on a specific file, the new retention date has to be greater than the current retention date (and less than or equal to the maximum file retention period). This makes sense: you cannot lower the retention period for a locked file.
Now how do you manually set the retention date? Below is an example of how it is set for a file.
Setting File retention lock
There are 3 steps that are needed to lock the file with a specific lock expiration date.
1. Touch the file and set the access date. This can be done with
"-a" to change the access date/time
"-d" or "-t" to specify the date format
2. Remove the write bit with chmod ugo-w
3. Execute a chmod to make the file read-only.
Below is an example where I am taking a file that does not contain retention and setting the date to January 1, 2025.
First I am going to create a file and touch it, setting the atime to a future date.
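A hedged sketch of those commands (the file name is a placeholder):

echo "backup piece" > backup_file.bak
touch -a -t 202501010000 backup_file.bak     # set the atime to January 1, 2025 00:00
chmod ugo-w backup_file.bak                  # remove the write bit to lock the file
ls -lu backup_file.bak                       # display the lock expiration as the atime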