Wednesday, December 2, 2020

Advanced Compression and TDE

 This blog post details what happens when you utilize ACO to mitigate the effects of TDE.




I started researching this topic as I thought about what happens when ACO is implemented on a table.

The first thought on Implementing ACO is that it will mitigate the effects of TDE by reducing the size of the data, and thus reducing the size of the backup RMAN creates (without using RMAN compression). Most backup strategies use compression, and TDE data will not encrypt. Implementing ACO to reduce the database size, reduces the backup size and mitigates the effect of TDE.

This all sounds good, and at first glance makes sense.

Of course there is a lot more to ACO, then just mitigating the loss of compression on my backup strategy.

  • The size of the level 0 is smaller, thus my restore time can be reduced AND the amount of storage needed for my Level 0 is lessened.
  • I am reading fewer blocks for the same amount of data. Less physical reads typically means faster queries (that perform disk reads).

If I am able to achieve 2x compression through ACO, my backup size is reduced to be 1/2 the size it would be without ACO right ?

Well .... There are few things to think about.

First - Indexes may be a large part of your database. I have seen applications where the indexes take up more space than the data itself.  Also, you can only compress indexes that have more than 1 column. In a multi-column index, the prefix columns can be compressed. Single column indexes cannot be compressed.


Second- Let's take a close look at how ACO works with my table. What happens with my level 0 backup is pretty clear. I would expect to get at least a 2x compression ratio. What I wanted to explore is what happens to my daily incrementals and archive logs ?

Here is the documentation that best describes it.  Below is a picture that shows what happens with OLTP compression. This is where I looked close at.


What caught my eye in this is that the block starts out initially uncompressed. As the block gets full, a background process will compress the rows and the resulting block will be compressed. The thoughts I had were

  • That's great for my incremental backup size.  More data in each block means less blocks to backup.
  • With my redo, the row is initially inserted in the block at the current scn, and the batch process then changes the block and that row is moved. The movement of the row needs to be captured in the redo, resulting in more redo.
Now to put this to the test and see what happens.

I started by creating my own copy of DBA_OBJECTS and continued to replicate the data until it was ~ 100MB. This seemed like a good size to work with. In order to facility updates to this dataset (since there were duplicates), I added a column that is the rownum. This gives each row a unique key from 1 to the number of rows.

My table is named "bgrenn.myobjects"
My tablespace is ENCRYPTED_DATA and was created with the ENCRYPTION clause.

Step 1 - I created a table to hold the results of the changes to analyze.  Below is the definition (if you want to follow along at home).

create table bgrenn.tests(test_type varchar(10),
              test_table varchar(20),
              test_number integer,
              test_start_seq# number,
              test_end_seq# number,
                          rows_affected number,
                          blocks_changed number,
                          incr_size number,
                          archive_size number,
                          total_size number) tablespace encrypted_data;
Step 2 - I created 2 empty tables in the encrypted tablespace. One defined with OLTP compression, one without.

create table bgrenn.uncompressed tablespace encrypted_data as select * from bgrenn.myobjects where 1=0;
create table bgrenn.compressed tablespace encrypted_data  ROW STORE COMPRESS ADVANCED as select * from bgrenn.myobjects where 1=0;

INSERTS


Step 3
- Execute my procedure to insert into the uncompressed table committing every 100 rows. At the end of the procedure I insert into my "tests" table the information about this action.

DECLARE
   TYPE mytest_cur IS REF CURSOR;
   commit_count number := 0;
   insert_count number := 0;
   start_seq_number number;
   end_seq_number number;
   blocks_changed number;
   start_scn number;
   archive_size number;
  intable bgrenn.myobjects%ROWTYPE;
  CURSOR c1 IS SELECT
     * from bgrenn.myobjects;

BEGIN
   OPEN c1;
   execute immediate 'alter system archive log current';
   select max(sequence#) into start_seq_number from v$log;
   select current_scn into start_scn from v$database;
   LOOP
      FETCH c1 INTO intable;
      EXIT WHEN c1%NOTFOUND;
       insert into bgrenn.uncompressed values intable ;
       if commit_count=100 then
     commit;
     commit_count:=0;
     end if;
    commit_count := commit_count + 1;
    insert_count := insert_count + 1;
   END LOOP;
   dbms_output.put_line('# of rows inserted = ' || insert_count);
   execute immediate 'alter system archive log current';
   dbms_stats.gather_table_stats('BGRENN','UNCOMPRESSED');
   select sum(blocks) into blocks_changed from dba_segments where owner= 'BGRENN' and segment_name in ( 'UNCOMPRESSED','UNCOMPRESSED_IDX');
   select count(distinct dbms_rowid.rowid_block_number(rowid)) into blocks_changed from bgrenn.uncompressed where ora_rowscn >= start_scn;
   select max(sequence#) - 1 into end_seq_number from v$log;
   select sum(blocks*block_size) into archive_size from v$archived_log where sequence#>=start_seq_number and sequence# <=end_seq_number;
   insert into bgrenn.tests values('Insert','Uncompressed',1,start_seq_number,end_seq_number,insert_count,blocks_changed,blocks_changed*8192,archive_size,blocks_changed*8192+archive_size);
   commit;
   CLOSE c1;
END;
/

Step 4 - Execute my procedure to insert into the compressed table committing every 100 rows. 



set serveroutput on;
DECLARE
   TYPE mytest_cur IS REF CURSOR;
   commit_count number := 0;
   insert_count number := 0;
   start_seq_number number;
   end_seq_number number;
   blocks_changed number;
   archive_size number;
   start_scn number;
  intable bgrenn.myobjects%ROWTYPE;
  CURSOR c1 IS SELECT
     * from bgrenn.myobjects;

BEGIN
   OPEN c1;
   execute immediate 'alter system archive log current';
   select max(sequence#) into start_seq_number from v$log;
   select current_scn into start_scn from v$database;
   LOOP
      FETCH c1 INTO intable;
      EXIT WHEN c1%NOTFOUND;
       insert into bgrenn.compressed values intable ;
       if commit_count=100 then
     commit;
     commit_count:=0;
     end if;
    commit_count := commit_count + 1;
    insert_count := insert_count + 1;
   END LOOP;
   dbms_output.put_line('# of rows inserted = ' || insert_count);
   execute immediate 'alter system archive log current';
   dbms_stats.gather_table_stats('BGRENN','COMPRESSED');
   select sum(blocks) into blocks_changed from dba_segments where owner= 'BGRENN' and segment_name in ( 'COMPRESSED','COMPRESSED_IDX');
   select count(distinct dbms_rowid.rowid_block_number(rowid)) into blocks_changed from bgrenn.compressed where ora_rowscn >= start_scn;
   select max(sequence#) - 1 into end_seq_number from v$log;
   select sum(blocks*block_size) into archive_size from v$archived_log where sequence#>=start_seq_number and sequence# <=end_seq_number;
   insert into bgrenn.tests values('Insert','Compressed',1,start_seq_number,end_seq_number,insert_count,blocks_changed,blocks_changed*8192,archive_size,blocks_changed*8192+archive_size);
   commit;
   CLOSE c1;
END;
/

Now that we have done our inserts, let's take a look and see what happened.

TEST_TYPE  TEST_TABLE           ROWS_AFFECTED  INCR_SIZE ARCHIVE_SIZE TOTAL_SIZE
---------- -------------------- ------------- ---------- ------------ ----------
Insert     Uncompressed                585640   96075776    282468864  378544640
Insert     Compressed                  585640   50331648    476089856  526421504

I inserted the same number of rows into both tables, in fact I inserted the same exact rows.

When looking at the size of the incremental backup (these will become part of the level 0 backup) the comparison is.
    Compressed      - 50331648    
    Uncompressed - 96075776    
    Compression ratio  1.9X
Perfect ! My incremental backup size of my compressed table size is about 1/2 of that of uncompressed data. This makes up for the loss of compression in my backup.

Now let's take a look at the Archive Log Backup. These will be kept for the retention window.

Compressed        -    476089856  
Uncompressed    -    282468864  
Compression ratio    .60x

Wow, it's less than 1. The archive logs for compressed data is almost double what they are for uncompressed.

Finally let's take a look at the Total Change Backup size.

Compressed        -    526421504  
Uncompressed    -    378544640
Compression ratio    .71x

INSERTS - The daily backup size for inserts is bigger for compressed data.

UPDATES


Step 5 - Execute my procedure to update 1% of the data in  the uncompressed table committing every 100 rows. 


DECLARE
   commit_count number := 0;
   update_count number := 0;
   max_updates number := 0;
   max_rows number := 0;
   random_row number :=0;
   start_seq_number number;
   end_seq_number number;
   blocks_changed number;
   archive_size number;
   start_scn number;
BEGIN
   select current_scn into start_scn from v$database;
   select count(1) into max_rows from bgrenn.uncompressed;
   max_updates := max_rows * .01;
  execute immediate 'alter system archive log current';
   select max(sequence#) into start_seq_number from v$log;

   LOOP
      EXIT WHEN update_count > max_updates;
       select trunc(dbms_random.value(1,max_rows),0) into random_row from dual;
       UPDATE bgrenn.uncompressed SET SUBOBJECT_NAME= OBJECT_NAME WHERE myrownum = random_row;
       insert into bgrenn.myblockchanges select dbms_rowid.rowid_block_number(rowid) from  bgrenn.uncompressed where myrownum = random_row;
       if commit_count=100 then
     commit;
     commit_count:=0;
     end if;
    commit_count := commit_count + 1;
    update_count := update_count + 1;
   END LOOP;
   execute immediate 'alter system archive log current';
   dbms_stats.gather_table_stats('BGRENN','UNCOMPRESSED');
   select count(distinct block_number) into blocks_changed from bgrenn.myblockchanges;
   select count(distinct dbms_rowid.rowid_block_number(rowid)) into blocks_changed from bgrenn.uncompressed where ora_rowscn >= start_scn;
   select max(sequence#) - 1 into end_seq_number from v$log;
   select sum(blocks*block_size) into archive_size from v$archived_log where sequence#>=start_seq_number and sequence# <=end_seq_number;
   insert into bgrenn.tests values('Update','Uncompressed',1,start_seq_number,end_seq_number,update_count,blocks_changed,blocks_changed*8192,archive_size,blocks_changed*8192+archive_size);
   commit;
   dbms_output.put_line('# of rows updated = ' || update_count);

 END;
/

Step 6 - Execute my procedure to update 1% of the data in  the compressed table committing every 100 rows. 

DECLARE
   commit_count number := 0;
   update_count number := 0;
   max_updates number := 0;
   max_rows number := 0;
   random_row number :=0;
   start_seq_number number;
   end_seq_number number;
   blocks_changed number;
   archive_size number;
   start_scn number;
BEGIN
   select current_scn into start_scn from v$database;
   select count(1) into max_rows from bgrenn.compressed;
   max_updates := max_rows * .01;
  execute immediate 'alter system archive log current';
   select max(sequence#) into start_seq_number from v$log;

   LOOP
      EXIT WHEN update_count > max_updates;
       select trunc(dbms_random.value(1,max_rows),0) into random_row from dual;
       UPDATE bgrenn.compressed SET SUBOBJECT_NAME= OBJECT_NAME WHERE myrownum = random_row;
       insert into bgrenn.myblockchanges select dbms_rowid.rowid_block_number(rowid) from  bgrenn.compressed where myrownum = random_row;
       if commit_count=100 then
     commit;
     commit_count:=0;
     end if;
    commit_count := commit_count + 1;
    update_count := update_count + 1;
   END LOOP;
   execute immediate 'alter system archive log current';
   dbms_stats.gather_table_stats('BGRENN','COMPRESSED');
   select count(distinct block_number) into blocks_changed from bgrenn.myblockchanges;
   select max(sequence#) - 1 into end_seq_number from v$log;
  select count(distinct dbms_rowid.rowid_block_number(rowid)) into blocks_changed from bgrenn.compressed where ora_rowscn >= start_scn;
    select sum(blocks*block_size) into archive_size from v$archived_log where sequence#>=start_seq_number and sequence# <=end_seq_number;
   insert into bgrenn.tests values('Update','Compressed',1,start_seq_number,end_seq_number,update_count,blocks_changed,blocks_changed*8192,archive_size,blocks_changed*8192+archive_size);
   commit;
   dbms_output.put_line('# of rows updated = ' || update_count);

 END;
/

Now that we have done our updates, let's take a look and see what happened.

TEST_TYPE  TEST_TABLE           ROWS_AFFECTED  INCR_SIZE ARCHIVE_SIZE TOTAL_SIZE
---------- -------------------- ------------- ---------- ------------ ----------
Update     Uncompressed                  5857   37699584      5346816   43046400
Update     Compressed                    5857   30203904     11993600   42197504

I updated the same number of rows into both tables.

When looking at the size of the incremental backup (these will become part of the level 0 backup) the comparison is.
    Compressed      - 30,203,904         
    Uncompressed -  37,699,584      
    Compression ratio  1.2X

Perfect ! My incremental backup size of my compressed table size isn't much different than that of uncompressed data. This isn't surprising. with a 1% change rate, the same block might not be updated more than once. Same number updates = same number of blocks backed up.


Now let's take a look at the Archive Log Backup. These will be kept for the retention window.

Compressed        -    11,993,600   
Uncompressed    -      5,346,816     
Compression ratio    .44x

Wow, it's less than 1/2. The archive logs for compressed data is about double what they are for uncompressed.

Finally let's take a look at the Total Change Backup size.

Compressed        -     42197504
Uncompressed    -     43046400
Compression ratio    1x

UPDATES - The daily backup size for updates is about the same for both compressed data and uncompressed data.

SUMMARY.


Implementing ACO does have a large number of benefits. One of which is decreasing the size of a full backup. When implementing TDE, this can help mitigate the loss of compression in your backup strategy.
However . If you have a high change rate, and a long retention window, the backups may end up being the same size as they were before, and could possibly be bigger. This is especially relevant in a backup strategy that includes deduplication/incremental merge/virtual fulls.