Wednesday, December 2, 2020

Advanced Compression and TDE

 This blog post details what happens when you use ACO (Advanced Compression Option) to mitigate the effects of TDE (Transparent Data Encryption).




I started researching this topic as I thought about what happens when ACO is implemented on a table.

The first thought on implementing ACO is that it will mitigate the effects of TDE by reducing the size of the data, and thus reducing the size of the backup RMAN creates (without using RMAN compression). Most backup strategies rely on compression, and TDE data will not compress. Implementing ACO to reduce the database size reduces the backup size and mitigates the effect of TDE.

This all sounds good, and at first glance makes sense.

Of course there is a lot more to ACO than just mitigating the loss of compression in my backup strategy.

  • The size of the level 0 is smaller, so my restore time can be reduced AND the amount of storage needed for my level 0 is lessened.
  • I am reading fewer blocks for the same amount of data. Fewer physical reads typically means faster queries (for queries that perform disk reads).

If I am able to achieve 2x compression through ACO, my backup size is reduced to 1/2 the size it would be without ACO, right?

Well.... there are a few things to think about.

First - Indexes may be a large part of your database. I have seen applications where the indexes take up more space than the data itself.  Also, index key compression works on the prefix (leading) columns, so multi-column indexes benefit the most; a single-column unique index has no prefix to compress.


Second - Let's take a closer look at how ACO works with my table. What happens with my level 0 backup is pretty clear: I would expect to get at least a 2x compression ratio. What I wanted to explore is what happens to my daily incrementals and archive logs.

Here is the documentation that best describes it.  Below is a picture that shows what happens with OLTP compression; this is what I looked at closely.


What caught my eye is that the block starts out uncompressed. As the block fills up, a background process compresses the rows and the resulting block is compressed. The thoughts I had were:

  • That's great for my incremental backup size.  More data in each block means fewer blocks to back up.
  • With my redo, the row is initially inserted into the block at the current SCN, and the background process then changes the block and moves that row. The movement of the row has to be captured in the redo, resulting in more redo.
Now to put this to the test and see what happens.

I started by creating my own copy of DBA_OBJECTS and continued to replicate the data until it was ~100MB. This seemed like a good size to work with. In order to facilitate updates to this dataset (since there were duplicates), I added a column containing the rownum. This gives each row a unique key from 1 to the number of rows.

My table is named "bgrenn.myobjects"
My tablespace is ENCRYPTED_DATA and was created with the ENCRYPTION clause.
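The post doesn't show the exact statements I used to seed the test table, so here is a rough sketch of one way to build it; treat the table and tablespace names above as given and everything else as an assumption.

-- Copy DBA_OBJECTS into the encrypted tablespace, then keep doubling it until ~100MB
create table bgrenn.myobjects tablespace encrypted_data as select * from dba_objects;

insert into bgrenn.myobjects select * from bgrenn.myobjects;   -- repeat until the segment is ~100MB
commit;

-- Add a unique key (1 .. number of rows) to drive the update tests
alter table bgrenn.myobjects add (myrownum number);
update bgrenn.myobjects set myrownum = rownum;
commit;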

Step 1 - I created a table to hold the results of the changes so I can analyze them.  Below is the definition (if you want to follow along at home).

create table bgrenn.tests(test_type       varchar(10),
                          test_table      varchar(20),
                          test_number     integer,
                          test_start_seq# number,
                          test_end_seq#   number,
                          rows_affected   number,
                          blocks_changed  number,
                          incr_size       number,
                          archive_size    number,
                          total_size      number) tablespace encrypted_data;
Step 2 - I created 2 empty tables in the encrypted tablespace. One defined with OLTP compression, one without.

create table bgrenn.uncompressed tablespace encrypted_data as select * from bgrenn.myobjects where 1=0;
create table bgrenn.compressed tablespace encrypted_data  ROW STORE COMPRESS ADVANCED as select * from bgrenn.myobjects where 1=0;

INSERTS


Step 3 - Execute my procedure to insert into the uncompressed table, committing every 100 rows. At the end of the procedure I insert into my "tests" table the information about this action.

DECLARE
   TYPE mytest_cur IS REF CURSOR;
   commit_count number := 0;
   insert_count number := 0;
   start_seq_number number;
   end_seq_number number;
   blocks_changed number;
   start_scn number;
   archive_size number;
  intable bgrenn.myobjects%ROWTYPE;
  CURSOR c1 IS SELECT
     * from bgrenn.myobjects;

BEGIN
   OPEN c1;
   execute immediate 'alter system archive log current';
   select max(sequence#) into start_seq_number from v$log;
   select current_scn into start_scn from v$database;
   LOOP
      FETCH c1 INTO intable;
      EXIT WHEN c1%NOTFOUND;
       insert into bgrenn.uncompressed values intable ;
       if commit_count=100 then
     commit;
     commit_count:=0;
     end if;
    commit_count := commit_count + 1;
    insert_count := insert_count + 1;
   END LOOP;
   dbms_output.put_line('# of rows inserted = ' || insert_count);
   execute immediate 'alter system archive log current';
   dbms_stats.gather_table_stats('BGRENN','UNCOMPRESSED');
   select sum(blocks) into blocks_changed from dba_segments where owner= 'BGRENN' and segment_name in ( 'UNCOMPRESSED','UNCOMPRESSED_IDX');
   select count(distinct dbms_rowid.rowid_block_number(rowid)) into blocks_changed from bgrenn.uncompressed where ora_rowscn >= start_scn;
   select max(sequence#) - 1 into end_seq_number from v$log;
   select sum(blocks*block_size) into archive_size from v$archived_log where sequence#>=start_seq_number and sequence# <=end_seq_number;
   insert into bgrenn.tests values('Insert','Uncompressed',1,start_seq_number,end_seq_number,insert_count,blocks_changed,blocks_changed*8192,archive_size,blocks_changed*8192+archive_size);
   commit;
   CLOSE c1;
END;
/

Step 4 - Execute my procedure to insert into the compressed table committing every 100 rows. 



set serveroutput on;
DECLARE
   TYPE mytest_cur IS REF CURSOR;
   commit_count number := 0;
   insert_count number := 0;
   start_seq_number number;
   end_seq_number number;
   blocks_changed number;
   archive_size number;
   start_scn number;
  intable bgrenn.myobjects%ROWTYPE;
  CURSOR c1 IS SELECT
     * from bgrenn.myobjects;

BEGIN
   OPEN c1;
   execute immediate 'alter system archive log current';
   select max(sequence#) into start_seq_number from v$log;
   select current_scn into start_scn from v$database;
   LOOP
      FETCH c1 INTO intable;
      EXIT WHEN c1%NOTFOUND;
       insert into bgrenn.compressed values intable ;
       if commit_count=100 then
     commit;
     commit_count:=0;
     end if;
    commit_count := commit_count + 1;
    insert_count := insert_count + 1;
   END LOOP;
   dbms_output.put_line('# of rows inserted = ' || insert_count);
   execute immediate 'alter system archive log current';
   dbms_stats.gather_table_stats('BGRENN','COMPRESSED');
   select sum(blocks) into blocks_changed from dba_segments where owner= 'BGRENN' and segment_name in ( 'COMPRESSED','COMPRESSED_IDX');
   select count(distinct dbms_rowid.rowid_block_number(rowid)) into blocks_changed from bgrenn.compressed where ora_rowscn >= start_scn;
   select max(sequence#) - 1 into end_seq_number from v$log;
   select sum(blocks*block_size) into archive_size from v$archived_log where sequence#>=start_seq_number and sequence# <=end_seq_number;
   insert into bgrenn.tests values('Insert','Compressed',1,start_seq_number,end_seq_number,insert_count,blocks_changed,blocks_changed*8192,archive_size,blocks_changed*8192+archive_size);
   commit;
   CLOSE c1;
END;
/

Now that we have done our inserts, let's take a look and see what happened.
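For reference, the listing below can be produced with a simple query against the bgrenn.tests table created in Step 1 (a sketch, using only columns from that table definition):

select test_type, test_table, rows_affected, incr_size, archive_size, total_size
from bgrenn.tests
order by test_type, test_table desc;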

TEST_TYPE  TEST_TABLE           ROWS_AFFECTED  INCR_SIZE ARCHIVE_SIZE TOTAL_SIZE
---------- -------------------- ------------- ---------- ------------ ----------
Insert     Uncompressed                585640   96075776    282468864  378544640
Insert     Compressed                  585640   50331648    476089856  526421504

I inserted the same number of rows into both tables, in fact I inserted the same exact rows.

When looking at the size of the incremental backup (these will become part of the level 0 backup) the comparison is.
    Compressed      - 50331648    
    Uncompressed - 96075776    
    Compression ratio  1.9X
Perfect! The incremental backup of my compressed table is about 1/2 the size of that of the uncompressed table. This makes up for the loss of compression in my backup.

Now let's take a look at the Archive Log Backup. These will be kept for the retention window.

Compressed        -    476089856  
Uncompressed    -    282468864  
Compression ratio    .60x

Wow, it's less than 1. The archive logs for compressed data are almost double what they are for uncompressed.

Finally let's take a look at the Total Change Backup size.

Compressed        -    526421504  
Uncompressed    -    378544640
Compression ratio    .71x

INSERTS - The daily backup size for inserts is bigger for compressed data.

UPDATES


Step 5 - Execute my procedure to update 1% of the data in the uncompressed table, committing every 100 rows. 
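Note: the procedures in Steps 5 and 6 insert into a helper table, bgrenn.myblockchanges, whose definition isn't shown in the post. A minimal definition (my assumption, based on how the table is used) would be:

create table bgrenn.myblockchanges (block_number number) tablespace encrypted_data;

-- truncate it between runs so each test only counts its own changed blocks
truncate table bgrenn.myblockchanges;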


DECLARE
   commit_count number := 0;
   update_count number := 0;
   max_updates number := 0;
   max_rows number := 0;
   random_row number :=0;
   start_seq_number number;
   end_seq_number number;
   blocks_changed number;
   archive_size number;
   start_scn number;
BEGIN
   select current_scn into start_scn from v$database;
   select count(1) into max_rows from bgrenn.uncompressed;
   max_updates := max_rows * .01;
  execute immediate 'alter system archive log current';
   select max(sequence#) into start_seq_number from v$log;

   LOOP
      EXIT WHEN update_count > max_updates;
       select trunc(dbms_random.value(1,max_rows),0) into random_row from dual;
       UPDATE bgrenn.uncompressed SET SUBOBJECT_NAME= OBJECT_NAME WHERE myrownum = random_row;
       insert into bgrenn.myblockchanges select dbms_rowid.rowid_block_number(rowid) from  bgrenn.uncompressed where myrownum = random_row;
       if commit_count=100 then
     commit;
     commit_count:=0;
     end if;
    commit_count := commit_count + 1;
    update_count := update_count + 1;
   END LOOP;
   execute immediate 'alter system archive log current';
   dbms_stats.gather_table_stats('BGRENN','UNCOMPRESSED');
   select count(distinct block_number) into blocks_changed from bgrenn.myblockchanges;
   select count(distinct dbms_rowid.rowid_block_number(rowid)) into blocks_changed from bgrenn.uncompressed where ora_rowscn >= start_scn;
   select max(sequence#) - 1 into end_seq_number from v$log;
   select sum(blocks*block_size) into archive_size from v$archived_log where sequence#>=start_seq_number and sequence# <=end_seq_number;
   insert into bgrenn.tests values('Update','Uncompressed',1,start_seq_number,end_seq_number,update_count,blocks_changed,blocks_changed*8192,archive_size,blocks_changed*8192+archive_size);
   commit;
   dbms_output.put_line('# of rows updated = ' || update_count);

 END;
/

Step 6 - Execute my procedure to update 1% of the data in the compressed table, committing every 100 rows. 

DECLARE
   commit_count number := 0;
   update_count number := 0;
   max_updates number := 0;
   max_rows number := 0;
   random_row number :=0;
   start_seq_number number;
   end_seq_number number;
   blocks_changed number;
   archive_size number;
   start_scn number;
BEGIN
   select current_scn into start_scn from v$database;
   select count(1) into max_rows from bgrenn.compressed;
   max_updates := max_rows * .01;
  execute immediate 'alter system archive log current';
   select max(sequence#) into start_seq_number from v$log;

   LOOP
      EXIT WHEN update_count > max_updates;
       select trunc(dbms_random.value(1,max_rows),0) into random_row from dual;
       UPDATE bgrenn.compressed SET SUBOBJECT_NAME= OBJECT_NAME WHERE myrownum = random_row;
       insert into bgrenn.myblockchanges select dbms_rowid.rowid_block_number(rowid) from  bgrenn.compressed where myrownum = random_row;
       if commit_count=100 then
     commit;
     commit_count:=0;
     end if;
    commit_count := commit_count + 1;
    update_count := update_count + 1;
   END LOOP;
   execute immediate 'alter system archive log current';
   dbms_stats.gather_table_stats('BGRENN','COMPRESSED');
   select count(distinct block_number) into blocks_changed from bgrenn.myblockchanges;
   select max(sequence#) - 1 into end_seq_number from v$log;
  select count(distinct dbms_rowid.rowid_block_number(rowid)) into blocks_changed from bgrenn.compressed where ora_rowscn >= start_scn;
    select sum(blocks*block_size) into archive_size from v$archived_log where sequence#>=start_seq_number and sequence# <=end_seq_number;
   insert into bgrenn.tests values('Update','Compressed',1,start_seq_number,end_seq_number,update_count,blocks_changed,blocks_changed*8192,archive_size,blocks_changed*8192+archive_size);
   commit;
   dbms_output.put_line('# of rows updated = ' || update_count);

 END;
/

Now that we have done our updates, let's take a look and see what happened.

TEST_TYPE  TEST_TABLE           ROWS_AFFECTED  INCR_SIZE ARCHIVE_SIZE TOTAL_SIZE
---------- -------------------- ------------- ---------- ------------ ----------
Update     Uncompressed                  5857   37699584      5346816   43046400
Update     Compressed                    5857   30203904     11993600   42197504

I updated the same number of rows in both tables.

When looking at the size of the incremental backup (these will become part of the level 0 backup) the comparison is.
    Compressed      - 30,203,904         
    Uncompressed -  37,699,584      
    Compression ratio  1.2X

Perfect! The incremental backup size of my compressed table isn't much different from that of the uncompressed data. This isn't surprising: with a 1% change rate, the same block is unlikely to be updated more than once. The same number of updates = about the same number of blocks backed up.


Now let's take a look at the Archive Log Backup. These will be kept for the retention window.

Compressed        -    11,993,600   
Uncompressed    -      5,346,816     
Compression ratio    .44x

Wow, it's less than 1/2. The archive logs for compressed data are about double what they are for uncompressed.

Finally let's take a look at the Total Change Backup size.

Compressed        -     42197504
Uncompressed    -     43046400
Compression ratio    1x

UPDATES - The daily backup size for updates is about the same for both compressed data and uncompressed data.

SUMMARY.


Implementing ACO does have a large number of benefits, one of which is decreasing the size of a full backup. When implementing TDE, this can help mitigate the loss of compression in your backup strategy.
However, if you have a high change rate and a long retention window, the backups may end up being the same size as they were before, and could possibly be bigger. This is especially relevant in a backup strategy that includes deduplication/incremental merge/virtual fulls.





Friday, November 6, 2020

Cloud restores to a RAC cluster with RMAN.

 This post is about an RMAN command you probably never thought much about, "Autolocate". I know I never did until I started testing restores from a cloud store.


Now let's see what it is and what it does.

First let's see what the documentation says it does.

"RMAN automatically performs autolocation of all files that it must back up or restore. If you use the noncluster file system local archiving scheme, then a node can only read the archived redo logs that were generated by an instance on that node. RMAN never attempts to back up archived redo logs on a channel it cannot read.

During a restore operation, RMAN automatically performs the autolocation of backups. A channel connected to a specific node only attempts to restore files that were backed up to the node. For example, assume that log sequence 1001 is backed up to the drive attached to node1, while log 1002 is backed up to the drive attached to node2. If you then allocate channels that connect to each node, then the channel connected to node1 can restore log 1001 (but not 1002), and the channel connected to node2 can restore log 1002 (but not 1001)."

After reading this, you are probably wondering why this matters when restoring from a cloud store.

To show you, I will walk through what happens during the "autolocate" process.

First, as you would guess, the "autolocation" occurs before any restore operations can start.

Also, this only comes into play when restoring to multiple nodes in the RAC cluster using channels allocated to the different nodes.

To show why it's important to understand it, I will walk through the test case that I had.

EXAMPLE environment:

MYDB - I have a very large Database, 100 TB composed of 8,000 individual datafiles.

DB Host - I have 8 nodes in my RAC cluster

BACKUP - I backed up to a cloud store with filesperset 1 and a section size of 32G. Since my datafiles were all 32G, they would be individual pieces even with a different filesperset.  My backup is composed of ~8,000 individual backup pieces.

RESTORE - In order to improve my restore performance I configured my restore across all 8 nodes.


What happens:

What the "autolocate" does, by default, is it validates that each backup piece is available from each node.  This is a serial process for each backup, AND for each node.

This turned out to be a slow process due to the number of validations that needed to be performed. For my example that meant 8,000 x 8 = 64,000 validations.  

Also, I found that this serial process took a lot of time.  Even though each validation takes a fraction of a second, the total time for the validation becomes significant.  In my test case, 8 pieces/second were being validated.

This added up because below is what was happening.

START RESTORE :  00:00

    Node 1 - validate 500 pieces : 01:02

    ......

    Node 1 - validate 5000 pieces : 10:25

    ....

    Node 1 - validate 8000 pieces : 16:40

    Node 2 - start validation : 16:40

    ....

    ....

     Node 8 - validate 8000 pieces : 2:13:20

BEGIN Restoring files.


So how do you get around this issue? If you are sure that all backup pieces you are restoring are available from every node in your cluster, you can turn autolocate off at the beginning of your restore operation.

RMAN> set autolocate off;

During my testing, it bypassed the validation step, and started restoring the database within a few seconds.
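To give a feel for where this fits, here is a hedged sketch of a multi-node restore with autolocation turned off; the channel parms and connect strings are placeholders, not the commands from my actual test.

RMAN> set autolocate off;

RMAN> run {
        allocate channel ch1 device type sbt parms '<media manager parms>' connect '<user>/<password>@node1';
        allocate channel ch2 device type sbt parms '<media manager parms>' connect '<user>/<password>@node2';
        restore database;
        recover database;
      }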

This is something to keep in mind, if you see a gap in time between starting a restore on a RAC cluster, and when it starts assigning datafiles to channels.

Tuesday, October 13, 2020

ZFS Appliance - Your on-premise cloud store

 This is the first in a series of posts about using ZFS as a cloud store for your S3 bucket, or your OCI (Oracle Cloud Infrastructure) bucket in your datacenter.



I am going to walk through in the next few posts how to utilize a ZFS appliance as a cloud store.

There may be a number of reasons why you might want to do this. Maybe you want to learn more about using cloud buckets for storage, or maybe you are already familiar with the concept and want to jump right in.

As my starting point I am going to go through how I configured this in my home office lab so you can too. That is, provided you have enough hardware sitting around to set it up.


Step 1 : Download VirtualBox. 

If you haven't worked with VirtualBox yet, now is a good time to get started.  In order to follow along and emulate what I am explaining, you are going to need at least 1 VirtualBox environment for the ZFS simulator, and a second environment (virtual or bare metal) for a DB server.  VirtualBox allows you to download an image (which includes the software install) and start it up immediately.  This saves a lot of time and trouble in downloading and configuring an environment (patches, libraries, settings, etc.).

Step 2 : Download Database image, and ZFS image


At a bare minimum, to go through my demo you will need 2 environments. The ZFS environment must be a virtual image, but the DB environment is up to you. Below are the 2 Virtual Box images to get you started.

VB 1 : Database Virtual Box Appliance / Virtual Machine. DB Version 19.3 (as of writing)
VB 2 : Oracle ZFS Storage Simulator. Version 8.8 (as of writing)

Step 3: Patch ZFS image to latest release.


This can be done by downloading the latest patch from My Oracle Support and applying it to the running ZFS image.  This is critical for the ZFS image, as there have been recent patches to make these features work seamlessly.

At the end of these 3 steps, you will have a ZFS simulator running at least release 8.26 (the current release as of writing), and a database running release 19.3 (mine is running 19.6).

FULL DISCLOSURE : Since I use my DB host for a lot of other testing, I am actually executing my demos on a bare metal environment using OEL (Oracle Enterprise Linux) and  DB version 19.6.

In this first post I am going to go through the steps to configure my ZFS simulator to present itself with both an S3 bucket and an OCI bucket.

From there I will write 2 more posts on setting up security, connecting to the buckets, and backing up my database.

For documentation, these are a few of the places I found to start with.

Step 4: Create a storage pool and share.

To make it easy, I created one big storage pool "mypool" with double parity. I used 10 disks, and the pool is around 35 GB, which is enough to perform some simple tests with. I also created a single share "myshare" using all the default values.

Step 5: Enable S3 and OCI services.

I logged onto the ZFS storage console through my web browser as root, using the IP address I assigned during the first startup and the password I entered during the first startup of the VB image.

Start by going to the services page for the appliance with "Configuration --> Services"
On this page you will notice that the "HTTP" service (under Data services) is greyed out and disabled.
Click on the "power button" to enable the service.



Once the service starts, it will appear green like other online Data Services.
Now click on the "HTTP" itself bringing up the configuration screen.

First let's enable S3 for the appliance.  

Note: under HTTP, I clicked on the S3 tab bringing up the screen below.   
Click on "Enable S3" and press the "APPLY" button in the top right under "Logs".


Next let's enable OCI for the appliance.

Note: under HTTP, I clicked on the OCI tab, bringing up the screen below.   
Click on "Enable OCI" and press the "APPLY" button in the top right under "Logs".


Now that I have enabled S3 and OCI at the appliance level, I need to enable them for my share. This can be done at different levels: at the Project, or at the Share.  To make things easy (since I only have 1 project and 1 share), I enabled it at the project level.
You can see below that I changed both the "S3 API Mode" and the "OCI API Mode" from "None" to "Read/Write" and applied the changes.





You can also see the URL from the project level

S3    : http://10.0.0.115/s3/v1/export/
OCI : http://10.0.0.115/oci/n/export/

When I look at the share and the protocol setting, I can see the HTTP settings are inherited from the project, and my S3 and OCI bucket URLs are shown on the page.


You can also see the URL from the share level (inherited from the project)

S3    : http://10.0.0.115/s3/v1/export/myshare
OCI : http://10.0.0.115/oci/n/export/myshare

That's all there is to configuring my ZFS appliance to act as a cloud store for both S3 buckets and OCI buckets.
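As a preview of where this series is heading (the follow-up posts cover the security setup and the actual backups), an RMAN channel pointed at one of these buckets generally looks something like the sketch below. The library path and config file name are placeholders; the file referenced by OPC_PFILE is where the endpoint and credential settings will live.

RMAN> run {
        allocate channel c1 device type sbt
          parms 'SBT_LIBRARY=/home/oracle/lib/libopc.so, ENV=(OPC_PFILE=/home/oracle/config/myshare_opc.ora)';
        backup current controlfile;
      }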

My next 2 posts (I am reserving the spots here to add the links later) will cover.




Tuesday, September 29, 2020

ZDLRA - DISK_RESERVED_SPACE checkup

 I wanted to go through some very basic items to think about regarding the disk reserved space for databases backed up to a ZDLRA.


There are a couple of posts that have been written on this by both myself and Sudhakar Kotagiri.

One of the key items to concentrate on is the DISK_RESERVED_SPACE for each database.  A simple explanation of DISK_RESERVED_SPACE: this setting represents the amount of space you set aside, on a database-by-database basis, to keep enough backups to support the recovery window goal.

A simple example of the DISK_RESERVED_SPACE is.....

My Full backup takes up 40 TB.  Keeping 14 days of recovery (which is my recovery window goal) takes up an additional 1 TB/day.

In this example, I need 54 TB of storage to keep 14 days of recovery.

For this database, I would set the reserved space to 56.5 TB to ensure I have an extra ~5% of space available to handle any unexpected peaks.

Easy right? The value of RECOVERY_WINDOW_SPACE in the RA_DATABASE view gives you the space needed to support the recovery window.
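A quick sanity-check query along these lines (a sketch, using those same RA_DATABASE columns) shows each database's needed space next to what it has been given:

-- Compare what each database needs (recovery_window_space) to what it was given
-- (disk_reserved_space), and suggest needed space + 5% headroom.
select db_unique_name,
       recovery_window_space,
       disk_reserved_space,
       trunc(recovery_window_space * 1.05,0) suggested_reserved_space
from ra_database
order by db_unique_name;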


But.. the reason I called this a checkup is that I wanted to make sure some thought is given to the setting. If your database is changing (which it almost always is), then this needs to be reviewed and adjusted.


Below are some simple rules of thumb for what to think about when adjusting DISK_RESERVED_SPACE:


  • Stable, mature databases - If your database is mature and the workload is consistent, the DISK_RESERVED_SPACE should be 5-10% larger than the RECOVERY_WINDOW_SPACE. This setting should be reviewed at least 4 times a year to be sure it is accurate.
  • Actively changing databases - If the database has a changing workload (maybe it is still growing, or maybe new features are being added to the application), the DISK_RESERVED_SPACE should be set 5-10% larger than RECOVERY_WINDOW_SPACE, plus a percentage for growth. This should be reviewed monthly (at a minimum), or whenever a big growth spurt is planned.
  • Databases with peaks - For some businesses, there may be databases that have peaks. Maybe they support "Black Friday", or maybe they support huge sales around the Super Bowl.  The DISK_RESERVED_SPACE should be 5-10% larger than the RECOVERY_WINDOW_SPACE needed during the peak.  This will ensure that the space is available when the peak comes.
  • TDE databases - When a database migrates to TDE, there is a time period where storage is needed for both the pre-TDE backups and the post-TDE backups.  You need to adjust the DISK_RESERVED_SPACE to take this into account.  NOTE: Staggering the TDE migrations can keep you from running out of DISK_RESERVED_SPACE for all databases at once.
  • Databases with ILM - If you have databases performing ILM activities, this affects the backup space needed.  A simple example would be a database whose historical data is moved to an HCC tablespace when it becomes inactive.  Space needs to be reserved in DISK_RESERVED_SPACE to hold the old structure, the new structure, and the archive logs created during this change.

My suggestion to simplify this is to use PROTECTION POLICIES.  Each type of database can be in its own protection policy.  Review the DISK_RESERVED_SPACE at the appropriate time for each policy.
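If you do group databases by protection policy, a rollup like the sketch below makes the periodic review quick. This assumes RA_DATABASE exposes the policy name as POLICY_NAME; check the column list on your release.

select policy_name,
       count(*)                   db_count,
       sum(recovery_window_space) total_needed,
       sum(disk_reserved_space)   total_reserved
from ra_database
group by policy_name
order by policy_name;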

It's that easy. :)


Thursday, September 24, 2020

ZDLRA - How to do a storage checkup

 One of the items that comes up with the ZDLRA is a storage checkup.  The DBAs want to know more detail about the storage utilization of each database.



Once you understand the above concepts, you realize that there are 2 major pieces that affect the storage utilization for a database.

1) How much space a level 0 backup takes.  Since the ZDLRA virtualizes full backups, each database has at least 1 copy of each block on the ZDLRA.  It would be only 1 copy if the block never changes, or it could be 30 copies of the same block if it changes every day (like system tablespace data). What you are interested in is the size of 1 full backup.

2) The amount of storage 1 day of changes takes up (on average).  This would be the stored size of an incremental backup (if you perform an incremental every day), plus the stored size of the archive logs for a day of workload.

By combining these 2 pieces you can then calculate how much storage is needed for X number of days of backups.
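Put as a back-of-the-envelope formula (the numbers below are made up purely for illustration):

--  space needed for N days of backups  ~=  level_0_size + (N * one_day_space)
--
--  e.g. a 2,000 GB level 0 with 300 GB/day of change, kept for 14 days:
--       2,000 + (14 * 300) = 6,200 GB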

Now how do I do this? Below is the query I use, and I will explain the columns in it.

select db_unique_name,
       trunc(size_estimate,0) estimated_db_size,
       recovery_window_goal,
       trunc(space_usage,0) space_usage,
       trunc(estimate_zero_day_space - ((estimate_seven_day_space - estimate_one_day_space)/6),0) level_0_size,
       trunc((estimate_seven_day_space - estimate_one_day_space)/6,1) one_day_space,
       trunc(recovery_window_space,0) recovery_window_space,
       disk_reserved_space,
       estimate_rwg_space
from
(select db_unique_name,
        space_usage,
        extract(day from recovery_window_goal) recovery_window_goal,
        dbms_ra.estimate_space (DB_UNIQUE_NAME,numtodsinterval(1,'day')) estimate_zero_day_space,
        dbms_ra.estimate_space (DB_UNIQUE_NAME,numtodsinterval(1,'day')) estimate_one_day_space,
        dbms_ra.estimate_space (DB_UNIQUE_NAME,numtodsinterval(7,'day')) estimate_seven_day_space,
        dbms_ra.estimate_space (DB_UNIQUE_NAME,recovery_window_goal) estimate_rwg_space,
        recovery_window_space,
        disk_reserved_space,
        size_estimate
 from ra_database);


What's returned:

DB_UNIQUE_NAME        - DB name
RECOVERY_WINDOW_GOAL  - How long backups are kept (days)
SPACE_USAGE           - How much space (GB) the DB is using in total
LEVEL_0_SIZE          - Estimated size (GB) of just the full backup
ONE_DAY_SPACE         - Estimated space usage (GB) for a single day of backups
RECOVERY_WINDOW_SPACE - How much space is needed to support the recovery window goal
DISK_RESERVED_SPACE   - How much space is set aside for backups
ESTIMATED_DB_SIZE     - How big the database (GB) is estimated to be
ESTIMATE_RWG_SPACE    - The space (GB) needed for the recovery window as reported by RA_DATABASE, which may not match the value calculated from the other columns


Now let's take a look at what I can do with this.

This is an example, where I summarize the space utilization for a couple of  ZDLRAs.



And here I looked at the detail for the databases.


This report (just above this) gives me some really useful information.

  • I can see DB02 has a really big change rate. The database is only about 2.5 TB, but it is using 14 TB of storage.
  • I can see the disk_reserved_space is way too small for DB01.  DB01 needs about 15 TB to keep its recovery window goal, but the disk_reserved_space is only 500 GB.
  • DB03 looks good.  The change rate is not significant, and disk_reserved_space is set a little higher than the RECOVERY_WINDOW_SPACE.


Now finally, I was able to use the one_day_space and graph out the space utilization for each day's recovery window.
This graph shows each day of RWG and its storage needs, the currently USED space, and the USABLE space.  I can see that even though my used space is close to the usable space, there is still room for growth. I can also use this to see what happens to my storage utilization if I change the RWG to 10 days.

I highly recommend periodically doing a health check on your storage utilization, and review the disk_reserved_space.

I hope this gives some information you can use to take a closer look.

**NOTE** this query is accurate as of 9/30/20.  It might need to be adjusted for future releases.

Also, it is only as accurate as the data. The longer a database has been backing up with a consistent workload, the better the estimate.

 

Wednesday, September 23, 2020

ZDLRA, Real-time redo and compression

 In this post I will go through what happens to archive logs sent to the ZDLRA through real-time redo.


The most common way to send archivelog backups to a ZDLRA is through real-time redo.

In this method the ZDLRA is treated just like a standby database destination.

The main difference with sending logs to the ZDLRA is that the logs need to be sent (via REDO_TRANSPORT_USER) as the VPC (virtual private catalog) account that is registered to send backups.

This is done by using a wallet containing the VPC user ID and password, which is included in the channel configuration parameter.

There is a great explanation of most of this from my colleague Fernando Simon and you can find it here.

ZDLRA, Real-Time Redo and Zero RPO


What I wanted to go through is the process of sending the logs (real-time), and the process of storing the logs on the ZDLRA.

The first thing to understand is the steps in the process of turning real-time redo into RMAN backupsets.


Step 1 - The redo is captured in real time by the ZDLRA through the use of "shadow logs". Think of "shadow logs" as standby redo logs that are created for each database, and for each redo log that is being captured.  Just like standby redo logs, these are full-size logs. To give you an example, let's say there are 6 databases sending real-time redo to the ZDLRA, and 3 of these are 2-node RAC clusters.  Each database has a redo log size of 20 GB.

On the ZDLRA, these shadow logs are mirrored (to disk) and will use storage which is included in the USAGE number for the database.  In my example there will be 9 shadow logs: 1 for each of the 3 single-instance databases plus 2 for each of the 3 two-node RAC clusters, or roughly 180 GB of shadow log space before mirroring.



Step 2 - When a log switch occurs, a task called BACKUP_ARCH is created. This task is responsible for taking the "shadow log" and turning it into an RMAN backupset containing the log.

The RMAN backupset can be compressed (and it uses BASIC by default, please change it) based on the policy that the Database is a member of.

One of the advantages of the ZDLRA is that the compression license is NOT needed to use other degrees of compression.

The suggestions I would make are:

TDE databases - Put ALL TDE databases in their own policy and set compression to NONE.  TDE archive logs will not compress, and trying to compress them just adds overhead.

Non-TDE databases - Use LOW compression. This will give you the best combination of compression ratio and elapsed time.


Now let's take a look at the tasks to see what I am talking about.


Below is a snippet from the currently running tasks (taken from a SAR report).

TASK_TYPE                 PRIORITY  STATE            CURRENT_COUNT  LAST_EXECUTE_TIME     WORK_TYPE    MIN_CREATION
----------------------  ----------  ---------------  -------------  --------------------  -----------  ------------
BACKUP_ARCH                    120  RUNNING                      7  03-OCT-2019 14:49:08  Work         03-OCT-2019

I can see that there are currently 7 redo logs that have switched and are awaiting processing to become backupsets. This number should always be very small.

Below is a snippet from the tasks executed in the last 24 hours (also from a SAR report).

TASK_TYPE               STATE                   CNT     CREATED  MIN_COMPLETION_TIME     MAX_COMPLETION_TIME     OLD_CREATION_TIME
----------------------  ---------------  ----------  ----------  ----------------------  ----------------------  ----------------------
BACKUP_ARCH             COMPLETED             9,591       9,580  02-OCT-2019 18:50:35    03-OCT-2019 14:50:28    02-OCT-2019 18:49:49


This is telling me that there were 9,591 log switches on all my protected databases in the last 24 hours.

From a compression standpoint: PLEASE at least change the current compression setting in your policies, and use the recommendations below.

TDE - No compression
No TDE - LOW compression.

I pointed out in my last post why this is so important to get right.




Monday, September 21, 2020

ZDLRA, archivelog log backups and compression

 In this post I will go through what happens with Archive log Backupsets sent to the ZDLRA through log sweeps.


When you implement ZDLRA you have 2 choices in backing up archive logs.

1) Use real-time redo transport (RRT) which is the same mechanism that is used to send archive logs to a standby database.

2) Use traditional log sweeps (RMAN) that pick up the archive logs and send them to the ZDLRA as backupsets.

Today I am going to go through the second option, using RMAN log sweeps.

Before I go into detail please refer to this MOS note to ensure you understand best practice for backing up a database to the ZDLRA.

RMAN best practice recommendations for backing up to the Recovery Appliance (Doc ID 2176686.1)

As of writing this post, the best practice is

backup device type sbt cumulative incremental level 1 filesperset 1 section size 64g database plus archivelog filesperset 32 not backed up; 

When you execute the best practice command, there are 2 pieces to this backup script.

Database Backup - The best practice is filesperset=1 and section size 64G. This ensures that a large datafile backup (big file) is broken up into pieces, and each backup piece contains only a single datafile. This allows the virtualization process to start as soon as each backup piece is received

Archivelog Backup - Best practice is to use filesperset=32 and only backup archivelogs that have not been backed up.

Now to walk through the archive log backup process:

RMAN will create a backupset of 32 archive logs.  This backupset will be sent to the ZDLRA (through the libra.so library) and will be written to physical disk on the ZDLRA.  The RMAN catalog on the ZDLRA will be immediately updated with the location of the backupset.

Since there is no processing done on the ZDLRA once received (beyond what the RMAN client does), the file is written "as is" on the ZDLRA.

So why do I point this out? As you may know, the ZDLRA compresses the datafile backups it receives, but it does not compress archivelog backupsets. If you want your archivelog backupsets compressed (when they come to the ZDLRA through an RMAN log sweep), you must perform the compression through RMAN before sending the archive logs.

There are a few items to think about before you rush into immediately compressing archive logs.

The first of these (and probably the most important to your company) is that RMAN compression, other than BASIC (which is NOT recommended), requires the ACO (Advanced Compression Option) license.  If the databases you support are NOT licensed for ACO, then you should stop right here and consider using real-time redo.  Real-time redo can use all levels of compression without ACO because the compression is done on the ZDLRA. This will be the topic of my next blog post.

#1 - ACO is required for RMAN compression. Use real-time redo to compress on the ZDLRA without the ACO license

The second thing to think about is what level of compression.  Below are some example compression ratios AND timings that have been achieved, to give you an idea of the differences. Of course everyone's data is different, so your mileage may vary. This does give you an idea, however.


BASIC - The elapsed time is 5x longer than it is for NOCOMP. I would absolutely not recommend using BASIC compression.

LOW - The elapsed time was actually less than NOCOMP, most likely due to sending less traffic. The compression ratio was roughly 2:1, giving a great balance of similar execution time and reasonable compression.

MEDIUM - The elapsed time was triple (3x) that of LOW or NOCOMP. The compression ratio was slightly better, but not significantly.

HIGH - The elapsed time was 24x longer than it is for NOCOMP, and the compression ratio was only slightly better. I would absolutely not recommend using HIGH compression

#2 - LOW compression offers the best balance between elapsed time, and compression ratio.

As much as I point out that compression of archive logs is a good thing, there is a BIG CAVEAT. The ZDLRA has its own compression of datafile backups.  The ZDLRA compression is of each individual block, NOT the backupset. Because of this, RMAN compression of datafiles is not recommended, and if TDE is implemented it will cause backups not to virtualize.  The 2 items are:

  • The ZDLRA will uncompress the RMAN backupset and recompress the blocks once virtualized.
  • TDE data will not be virtualized since RMAN compression re-encrypts the backupset.

#3 - DO NOT compress datafile backups.

The 4th item associated with the compression of archive log backupsets is replication. The replication of archivelogs on the ZDLRA is a "cascade" of backupsets.  The backupsets containing the archive logs are sent to the downstream "as-is".  If you compress the archive logs with RMAN, then they get replicated compressed. The compressed backupsets not only use less network traffic when replicating, they are also stored on the downstream compressed.

#4 - Compression of archive logs means less network traffic with replication.

The 5th item associated with the compression of archive logs is validation on the ZDLRA. Compression of archive logs comes with a slight cost, and this is one of the trade-offs.  The ZDLRA (as you might know) does a "restore validate" of all backups on the ZDLRA on a regular basis (typically once a week).  In order to validate archivelog backupsets, they need to be uncompressed. The uncompression of archivelog backupsets uses CPU on the ZDLRA, and the higher the compression, the greater the overhead of this process. Believe it or not, weekly validation is one of the most intensive tasks performed on the ZDLRA.  Using LOW compression has minimal impact on CPU during validation and is recommended, unless space is at a premium and MEDIUM compression can be tolerated.

NOTE: This can be monitored in the SAR report by looking at the VALIDATE task. You should see VALIDATE tasks completing, and when looking at executing tasks, the MIN_CREATION date should be within a day or 2 of executing the SAR report.  If the MIN_CREATION date is more than a few days old, VALIDATE tasks are not keeping up, and implementing compression will exacerbate the situation.

#5 - Validation requires uncompressing archivelog backupsets, so be careful of too high a level of compression.

The final item associated with the compression of archive logs is recovery of the database using archivelog backupsets.  During a recovery operation, any archivelogs restored through RMAN will have to be uncompressed. This uncompression may affect recovery time. LOW gives the best tradeoff since the elapsed time to uncompress is minimal.  If the network is saturated, restoring compressed archivelogs (which are typically 50% of the size) may actually help recovery time.

#6 - The DB host will have to uncompress archivelog backupsets during recovery. This may affect recovery time.

Now the question is: how do I put this together to get LOW compression of archive logs AND not compress datafiles?

This is how it can be done.


1) Set the RMAN compression algorithm to LOW.
RMAN> CONFIGURE COMPRESSION ALGORITHM 'LOW';


2) Ensure that compressed backupsets are NOT used by default
RMAN> CONFIGURE DEVICE TYPE 'SBT_TAPE' BACKUP TYPE TO BACKUPSET;

3) Daily incremental level 1 Backups.

run
{
backup as compressed backupset filesperset 8 archivelog all not backed up delete input;
backup as backupset cumulative incremental level 1 filesperset 1 section size 128G database;
backup as compressed backupset filesperset 8 archivelog all not backed up delete input;
}

4) Periodic log sweep Backups.

run
{
backup as compressed backupset filesperset 8 archivelog all not backed up delete input;
}


I am hoping this gives you everything you need to know about using RMAN log sweeps with the ZDLRA and you can decide if you want to use compression of archivelogs during those sweeps.






Thursday, July 16, 2020

ZDLRA and TDE wallet location - Part 2

TDE and SEPS security - how do I get there?
If you read my last blog post on TDE and SEPS security, you might be asking yourself: how do I get there?

Many customers use the default location for the TDE wallet (because they are new to TDE) and find that the default location causes conflicts with other Oracle features.

The basic question around this would be.

"all my TDE wallets are in the default location of $ORACLE_HOME/admin/DB_UNQUE_NAME/wallet 
                  or 
$ORACLE_BASE/admin/DB_UNQUE_NAME/wallet
and  I have multiple databases sharing the same $ORACLE_HOME location 
how do I get to a dedication location for TDE?

The challenge, especially if you want to use WALLET_LOCATION (which the ZDLRA requires for real-time redo), is how to get from the default to a dedicated location.
The issue is that WALLET_LOCATION overrides the default TDE location unless a dedicated TDE wallet location is specified.

First -- the SQLNET.ORA file is ONLY read by the database at startup. Any changes made to the sqlnet.ora file become effective when a database instance bounces.  You do want to be careful with the coordination, however, because a database instance can bounce at any time for any number of reasons, so plan carefully.

Now let's start with where to put the TDE wallet files.  There are many options:

1) Leave the wallet files within the $ORACLE_HOME directory, using $ORACLE_SID. 
     PROS - This is less disruptive since it uses a variable that is already set.
     CONS - Wallets have to be moved to a new location with an out-of-place upgrade.
                   You need to copy the wallet to this new location when implementing.
                   In a multi-node RAC cluster the location is different on each node.

    STEPS

  • For each database sharing the $ORACLE_HOME ensure there is a wallet subdirectory created on each node for every instance.
  • Copy the wallet files to the appropriate subdirectory for each node and for each instance
  • Update the SQLNET.ORA file to point to $ORACLE_HOME/admin/$ORACLE_SID/tde_wallet
2) Leave the wallet files in the original location within $ORACLE_HOME that uses $DB_UNIQUE_NAME.
     PROS - You don't have to move the wallet files.
     CONS - You need to set a new variable.
                    Wallets have to be moved to a new location with an out-of-place upgrade.

    STEPS
  • For ALL databases sharing the same $ORACLE_HOME ensure that the variable $DB_UNIQUE_NAME is set through srvctl (if available). This ensures all nodes in a RAC cluster have the variable set.
  • Ensure all login scripts on all nodes (including the login script) have the variable $DB_UNIQUE_NAME set
  • Update the SQLNET.ORA file to point to the $ORACLE_HOME/admin/$DB_UNIQUE_NAME/wallet
3) Leave (or move) the wallet files within the $ORACLE_BASE directory, using $ORACLE_SID.  

     PROS - This is less disruptive since it uses a variable that is already set.
     CONS - Wallets have to be moved to a new location with an out-of-place upgrade.
                   You need to copy the wallet to this new location when implementing.
                   In a multi-node RAC cluster the location is different on each node.

    STEPS

  • For each database sharing the $ORACLE_HOME ensure there is a wallet subdirectory created on each node for every instance within the $ORACLE_BASE/admin directory (unless this was already the default)
  • If necessary, copy the wallet files to the appropriate subdirectory for each node and for each instance
  • Update the SQLNET.ORA file to point to $ORACLE_BASE/admin/$ORACLE_SID/wallet
4) Migrate to $ORACLE_BASE and use $DB_UNIQUE_NAME.
     PROS - Once set, you can leave the wallets in place after out-of-place upgrades.
     CONS - You need to copy the wallet to this new location when implementing.
                    You need to set a variable to be used.

    STEPS

  • For each database sharing the $ORACLE_HOME ensure there is a wallet subdirectory created on each node for every $DB_UNIQUE_NAME within the $ORACLE_BASE/admin directory (unless this was already the default)
  • Copy the wallet files to the appropriate subdirectory for each node and for each instance
  • For ALL databases sharing the same $ORACLE_HOME ensure that the variable $DB_UNIQUE_NAME is set through srvctl (if available). This ensures all nodes in a RAC cluster have the variable set.
  • Ensure all login scripts on all nodes (including the login script) have the variable $DB_UNIQUE_NAME set
  • Update the SQLNET.ORA file to point to $ORACLE_BASE/admin/$DB_UNIQUE_NAME/tde_wallet

5) Migrate to ASM (not available in 11.2) and use $DB_UNIQUE_NAME.
     PROS - Once set, you can leave the wallets in place after out-of-place upgrades.
                   You now have a central location for the RAC cluster.
     CONS - You need to copy the wallet to this new location when implementing.
                    You need to set a variable to be used.

    STEPS

  • For each database sharing the $ORACLE_HOME ensure there is a wallet subdirectory created in ASM for every $DB_UNIQUE_NAME 
  • Copy the wallet files to the appropriate subdirectory for each database
  • For ALL databases sharing the same $ORACLE_HOME ensure that the variable $DB_UNIQUE_NAME is set through srvctl (if available). This ensures all nodes in a RAC cluster have the variable set.
  • Ensure all login scripts on all nodes (including the login script) have the variable $DB_UNIQUE_NAME set
  • Update the SQLNET.ORA file to point to +DISKGROUP/$DB_UNIQUE_NAME/tde_wallet

It's your choice which path to take.  For me, the best option (if ASM isn't an option) is to put the TDE wallets within $ORACLE_BASE/admin/$DB_UNIQUE_NAME/tde_wallet.  That way, with each out-of-place upgrade I don't have to do anything with the wallet. As long as the sqlnet.ora points to the $ORACLE_BASE location there won't be any changes.
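As a rough sketch of what that ends up looking like in the sqlnet.ora (the seps_wallet directory name is my own assumption; it just needs to be different from the TDE location, and $DB_UNIQUE_NAME must be set in the environment as described above):

# SEPS wallet used by WALLET_LOCATION (ZDLRA real-time redo)
WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/admin/$DB_UNIQUE_NAME/seps_wallet)))

# Dedicated TDE wallet location, so WALLET_LOCATION no longer overrides it
ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/admin/$DB_UNIQUE_NAME/tde_wallet)))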


NOTE: for 18c and above, just migrate to WALLET_ROOT, which allows you to set the value for each database individually.

Tuesday, July 14, 2020

ZDLRA and TDE wallet location

TDE and SEPS security



I am seeing TDE used more and more at customers as security concerns increase.
This blog post will go through configuring TDE and SEPS security (which ZDLRA uses) together.
If OID is used also, this post talks about how to combine OID and SEPS.

First off, the solution depends on the version of Oracle you are using.  Depending on your configuration, SEPS security and TDE may use the same wallet location. This is NOT recommended.
Below is the hierarchy of where Oracle looks for the TDE wallet. As soon as it finds a setting, it stops:

ENCRYPTION_WALLET_LOCATION
         WALLET_LOCATION
                    $ORACLE_HOME/admin/$DB_UNIQUE_NAME/wallet
                              $ORACLE_BASE/admin/$DB_UNIQUE_NAME/wallet

**NOTE: unless ENCRYPTION_WALLET_LOCATION is already set,
                 setting the WALLET_LOCATION will break TDE

When using SEPS security it is critical that you properly set the TDE wallet location first.

11.2

First let's talk through 11.2 and the recommendation for TDE encryption wallet. This is the most basic configuration setting.

Best practice is to set ENCRYPTION_WALLET_LOCATION in the sqlnet.ora.
If there are multiple databases sharing the same $ORACLE_HOME (multi-homing), then the location needs to use a variable.

Single home example.



ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/tde_wallet)))


Multi-Home examples



Example 1 - using the $ORACLE_SID variable for the location



ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/admin/$ORACLE_SID/tde_wallet)))

Example 2 - using a new variable


First ensure that the variable is set when srvctl is used to restart the databases.

srvctl setenv database -db database_name -env "DB_UNIQUE_NAME=database_name"

Second ensure the variable is set in any scripts and when logging into the host.

export DB_UNIQUE_NAME=database_name

Then use this variable within the sqlnet.ora

ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=/u01/app/oracle/admin/$DB_UNIQUE_NAME/tde_wallet)))

** NOTE: you need to create the directories for all databases sharing that same $ORACLE_HOME even if they don't use TDE or SEPS.


12.1/12.2

The configuration for 12.1 is similar to 11.2 with one exception: 12.1 allows you to use ASM for the location of the wallet in a RAC environment.

Here are the examples of ASM based on the 11.2 information.

Single home example.


ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=+DATA/tde_wallet)))

Multi-Home example


ENCRYPTION_WALLET_LOCATION=
 (SOURCE=
  (METHOD=FILE)
   (METHOD_DATA=
    (DIRECTORY=+DATA/$DB_UNIQUE_NAME/tde_wallet)))


18c+

Oracle version 18c adds more functionality for the TDE wallet.

18c introduces a new init parameter for TDE called "WALLET_ROOT". In fact, the ENCRYPTION_WALLET_LOCATION sqlnet.ora parameter is being deprecated (see below from the 18c docs).





WALLET_ROOT points to the starting location of the TDE wallet, and Oracle uses it as the base directory for the CDB wallet and for per-PDB wallet subdirectories.

WALLET_ROOT can either be a local file system (or NAS) location.

          Example
                           WALLET_ROOT=wallet-root-directory-path

It can also be set to an ASM location

         Example
                           WALLET_ROOT=+disk-group-name/db-unique-name
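As a rough sketch of setting this up (names are placeholders; this assumes WALLET_ROOT is a static parameter requiring a restart, and that TDE_CONFIGURATION is used to select a file-based keystore under it):

alter system set wallet_root='+DATA/MYDB' scope=spfile;
-- bounce the instance, then:
alter system set tde_configuration='KEYSTORE_CONFIGURATION=FILE' scope=both;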

SUMMARY : When implementing the ZDLRA (which uses SEPS security) with an existing TDE implementation, it is critical to ensure that TDE was configured using best practices.  If best practices were not followed, configuring the WALLET_LOCATION may cause wallet issues with databases.





Wednesday, May 13, 2020

Sharing an RMAN Catalog with multiple users on ZDLRA


One of the topics that comes up when implementing ZDLRA is the RMAN feature that allows multiple users to share an RMAN catalog while still being isolated from each other.
This is NOT a ZDLRA-specific feature, but I find that most customers have never used it until they move to a shared BaaS (Backup as a Service) environment like ZDLRA.

Here is the basic explanation.  I am a DBA in the ACME corporation. The ACME corporation has 2 divisions.
The white division and the black division.


I work for the black division of ACME.
Since the ACME corporation has a shared infrastructure, there is a single ZDLRA and thus a single RMAN catalog for all database backups.

As you can guess, my division (black) wants to be isolated from the white division. We don't want the white division to be able to access the backups for any database that my division supports.  The white division feels the same way about my division.

Now this is where the idea of VPC users in the RMAN catalog comes in.  First, here is some documentation on it so you can do your own reading.

For my example I have 2 databases.

  • OEMDB (White division maintained)
  • RMAN19C (Black division maintained)

The ZDLRA already has a catalog created, and in order to manage VPC users, you need to use the Command Line Interface (racli).  

A note on naming: on the ZDLRA the users are referred to as VPC (Virtual Private Catalog) users, while the documentation for this feature refers to it as a VPD (Virtual Private Database) model.

In order to demonstrate how this works, I am going to walk through the steps with just RMAN.

Step # 1 - Create an RMAN user in a database to own the RMAN catalog.



SQL> create user rman identified by oracle
            temporary tablespace temp
            default tablespace rman_data
            quota unlimited on rman_data;

grant recovery_catalog_owner to rman;
  2    3    4
User created.

SQL> SQL>
Grant succeeded.


Step #2 - create the catalog for the RMAN user.


[oracle@oracle-server admin]$ rman target / catalog rman/oracle

Recovery Manager: Release 19.0.0.0.0 - Production on Wed May 13 08:10:23 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: OEMDB (DBID=700723428)
connected to recovery catalog database

RMAN> create catalog tablespace rman_data;

recovery catalog created

RMAN>



Step #3 - Enable the VPD model within the recovery catalog. The VPD model is not available by default.



SQL> @/$ORACLE_HOME/rdbms/admin/dbmsrmanvpc.sql -vpd rman

Checking the operating user... Passed

Granting VPD privileges to the owner of the base catalog schema RMAN

========================================
VPD SETUP STATUS:
VPD privileges granted successfully!
Connect to RMAN base catalog and perform UPGRADE CATALOG.

Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0


Step #4 - Upgrade the RMAN catalog.


I didn't notice this at first, but the last message at the end of the script indicates that you need to execute UPGRADE CATALOG.

-vpd command grants required privileges to support VPD protected catalog.
Connect to RMAN base catalog and perform UPGRADE CATALOG after the VPD
privileges are granted.


After UPGRADE CATALOG is performed for the base catalog schemas a cleanup
of VPC schemas has to take place for that the RMAN base catalog schema
names have to be supplied as command line parameters.  Up to 10 schema
names can be supplied per script execution.  When -all is specified the
script attempts to detect the RMAN base catalog schemas automatically
and perform the upgrade.

Now to upgrade the catalog.

[oracle@oracle-server admin]$ rman catalog rman/oracle

Recovery Manager: Release 19.0.0.0.0 - Production on Wed May 13 08:22:18 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to recovery catalog database

RMAN> upgrade catalog;

recovery catalog owner is RMAN
enter UPGRADE CATALOG command again to confirm catalog upgrade

RMAN> upgrade catalog;

recovery catalog upgraded to version 19.06.00.00.00
DBMS_RCVMAN package upgraded to version 19.06.00.00
DBMS_RCVCAT package upgraded to version 19.06.00.00.



Step #5 - Create my 2 VPC users within my RMAN catalog which is now VPD enabled.



SQL> create user black identified by oracle
            temporary tablespace temp;

User created.

SQL> grant create session to black;

Grant succeeded.

SQL> create user white identified by oracle
            temporary tablespace temp;

User created.

SQL> grant create session to white;

Grant succeeded.




Now I have 2 choices for registering my 2 databases.

  • I can reserve "REGISTER DATABASE" for the catalog owner (RMAN) only, and have the owner grant each VPC user access to its databases (see the sketch after this list)
  • I can grant the VPC users the authority to register their own databases
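For the first option, a minimal sketch of what that would look like, assuming the catalog owner (RMAN) registers OEMDB itself and then uses the standard RMAN "GRANT CATALOG FOR DATABASE" command to expose it to the "white" VPC user (connect string reused from this post):

[oracle@oracle-server ~]$ rman target / catalog rman/oracle

RMAN> register database;
RMAN> grant catalog for database oemdb to white;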

Step #6 - I have decided to grant the VPC users the ability to register their own databases.



[oracle@oracle-server ~]$ rman catalog rman/oracle

Recovery Manager: Release 19.0.0.0.0 - Production on Wed May 13 08:25:29 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to recovery catalog database

RMAN> grant register database to black;

Grant succeeded.

RMAN> grant register database to white;

Grant succeeded.



NOTE : When I first ran the commands I received the error below.

RMAN-07543: recovery catalog does not have VPD support enabled

If you see this error, it is most likely because you didn't execute "UPGRADE CATALOG" after enabling the VPD model.

Step #7 - Now let's register the databases with separate VPC users and back up a datafile.


I am going to register database OEMDB as the "white" VPC user and back up datafile 1.


[oracle@oracle-server ~]$ rman target / catalog white/oracle@zdlra_sf
Recovery Manager: Release 19.0.0.0.0 - Production on Wed May 13 08:39:57 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: OEMDB (DBID=700723428)
connected to recovery catalog database

RMAN> register database;

database registered in recovery catalog
starting full resync of recovery catalog
full resync complete

RMAN> backup incremental level 0 datafile 1;

Starting backup at 05/13/20 08:40:29
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=398 device type=DISK
channel ORA_DISK_1: starting incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/home/oracle/app/oracle/oradata/OEMDB/datafile/o1_mf_system_h4llc7kt_.dbf
channel ORA_DISK_1: starting piece 1 at 05/13/20 08:40:30
channel ORA_DISK_1: finished piece 1 at 05/13/20 08:41:15
piece handle=/home/oracle/app/oracle/fast_recovery_area/OEMDB/backupset/2020_05_13/o1_mf_nnnd0_TAG20200513T084030_hcqtoz3z_.bkp tag=TAG20200513T084030 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:45
Finished backup at 05/13/20 08:41:15

Starting Control File and SPFILE Autobackup at 05/13/20 08:41:15
piece handle=/home/oracle/app/oracle/fast_recovery_area/OEMDB/autobackup/2020_05_13/o1_mf_s_1040287283_hcqtqnbm_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 05/13/20 08:41:26



I am going to register database RMAN19C as the "black" VPC user and back up datafile 1.

[oracle@oracle-server ~]$ rman target / catalog black/oracle@zdlra_sf
Recovery Manager: Release 19.0.0.0.0 - Production on Wed May 13 08:42:49 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: RMAN19C (DBID=4290218304)
connected to recovery catalog database

RMAN> register database;

database registered in recovery catalog
starting full resync of recovery catalog
full resync complete

RMAN> backup incremental level 0 datafile 1;

Starting backup at 05/13/20 08:43:12
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=274 device type=DISK
channel ORA_DISK_1: starting incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/home/oracle/app/oracle/oradata/RMAN19C/datafile/system.dbf
channel ORA_DISK_1: starting piece 1 at 05/13/20 08:43:13
channel ORA_DISK_1: finished piece 1 at 05/13/20 08:44:08
piece handle=/home/oracle/app/oracle/fast_recovery_area/RMAN19C/backupset/2020_05_13/o1_mf_nnnd0_TAG20200513T084313_hcqtv1fc_.bkp tag=TAG20200513T084313 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:55
Finished backup at 05/13/20 08:44:08

Starting Control File and SPFILE Autobackup at 05/13/20 08:44:08
piece handle=/home/oracle/app/oracle/fast_recovery_area/RMAN19C/autobackup/2020_05_13/o1_mf_s_1040287449_hcqtwtmj_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 05/13/20 08:44:12



Step #8 - Now let's see what happens when I try to cross VPC user boundaries within the catalog.


NOTE - Below are the 2 databases and their DBIDs. For each VPC user I will try to access the opposite database from RMAN using its DBID.


white VPC user    ---->  OEMDB (DBID=700723428)
black VPC user    ---->  RMAN19C (DBID=4290218304)
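As a quick aside before the DBID test: another way to confirm what each VPC user can see is to connect to the catalog as that user and list the registered databases. Assuming the same connect strings used above, "white" should only see OEMDB and "black" should only see RMAN19C:

[oracle@oracle-server admin]$ rman catalog white/oracle@zdlra_sf

RMAN> list db_unique_name all;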

First let's connect as "white" to a new instance and try to set the DBID for each of the databases. You can see that "white" can only see the metadata for OEMDB.

[oracle@oracle-server admin]$ rman target / catalog white/oracle@zdlra_sf

Recovery Manager: Release 19.0.0.0.0 - Production on Wed May 13 08:50:39 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database (not started)
connected to recovery catalog database

RMAN> set dbid=4290218304;

executing command: SET DBID
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of set command at 05/13/2020 08:51:15
RMAN-06063: DBID is not found in the recovery catalog

RMAN> set dbid=700723428;

executing command: SET DBID
database name is "OEMDB" and DBID is 700723428

RMAN>


Now let's connect as "black" to a new instance and try to set the DBID for each of the databases. You can see that "black" can only see the metadata for RMAN19C.

[oracle@oracle-server admin]$ rman target / catalog black/oracle@zdlra_sf

Recovery Manager: Release 19.0.0.0.0 - Production on Wed May 13 08:54:20 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database (not started)
connected to recovery catalog database

RMAN> set dbid=4290218304;

executing command: SET DBID
database name is "RMAN19C" and DBID is 4290218304

RMAN> set dbid=700723428;

executing command: SET DBID
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of set command at 05/13/2020 08:55:01
RMAN-06063: DBID is not found in the recovery catalog


CONCLUSION : Implementing the VPD model in an RMAN catalog allows multiple VPC users to share a single RMAN catalog while keeping each user's backup metadata isolated within that catalog.

This feature is leveraged by the ZDLRA to allow a single ZDLRA to be shared across a corporation, but still allow isolation by internal organizations.
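On the appliance side, the Recovery Appliance administrator also controls which protected databases each VPC user can access. As a hedged sketch (using this post's names, and assuming the DBMS_RA.GRANT_DB_ACCESS procedure with its username and db_unique_name parameters), granting the "black" user access to RMAN19C would look something like:

SQL> exec dbms_ra.grant_db_access(username => 'BLACK', db_unique_name => 'RMAN19C');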

Final Note : As Multitenant becomes the standard, support for isolation will start to appear at the PDB level also. This will allow different VPC users to manage PDBs within the same CDB.  This is dependent on the version of RMAN/DB/ZDLRA.