Tuesday, August 31, 2021

TDE–How to implement TDE in your database and what to think about (part 4)

 In this post, I am going to share some lessons learned from implementing "Restore as encrypted" on a large database with over 500,000 objects.

 


The error we received when trying to open our database was:

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-25153: Temporary Tablespace is Empty
Process ID: 133196
Session ID: 1769 Serial number: 6805

    

And in the alert log we saw:

      Parallel first-pass transaction recovery timed out. Switching to serial recovery.
Undo initialization recovery: Parallel FPTR failed: start:685625075 end:685692452 diff:67377 ms (67.4 seconds)
2021-08-27T10:02:39.567998-04:00
Undo initialization recovery: err:0 start: 685625075 end: 685693406 diff: 68331 ms (68.3 seconds)
2021-08-27T10:02:43.015891-04:00
[339055] Successfully onlined Undo Tablespace 17.
Undo initialization online undo segments: err:0 start: 685693406 end: 685696854 diff: 3448 ms (3.4 seconds)
Undo initialization finished serial:0 start:685625075 end:685697235 diff:72160 ms (72.2 seconds)
Dictionary check beginning
2021-08-27T10:02:44.819881-04:00
TT03 (PID:360221): Sleep 80 seconds and then try to clear SRLs in 6 time(s)
2021-08-27T10:02:54.759120-04:00
Tablespace 'PSAPTEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
2021-08-27T10:02:55.826700-04:00
Errors in file /u02/app/oracle/diag/rdbms/bsg/BSG1/trace/BSG1_ora_339055.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-25153: Temporary Tablespace is Empty
2021-08-27T10:02:55.826827-04:00
Errors in file /u02/app/oracle/diag/rdbms/bsg/BSG1/trace/BSG1_ora_339055.trc:
ORA-00604: error occurred at recursive SQL level 1

    


What we found is that there is some work the database has to do when opening for the first time after encrypting tablespaces offline.

Background:

Any data written to disk for an object that resides in an encrypted tablespace is also encrypted. This means that if an object resides in an encrypted tablespace, the following data is encrypted as well:

  • TEMP - If an object resides in an encrypted tablespace, any sort data spilled to the TEMP tablespace is encrypted. This includes joins to other tables. Any piece of encrypted data in an on-disk sort operation causes the whole sorted data set to be encrypted.
  • UNDO - If an object resides in an encrypted tablespace, the blocks stored in UNDO are encrypted.
  • REDO/Archive - If an object resides in an encrypted tablespace, the changes to that object are encrypted in the redo stream (including redo sent through the network to a standby database).
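
If you want to confirm which tablespaces (and therefore which objects) fall into this category, the standard dictionary views show it. This is just a minimal sketch using views that ship with the database:

SQL> SELECT tablespace_name, encrypted
     FROM   dba_tablespaces
     ORDER  BY tablespace_name;

-- V$ENCRYPTED_TABLESPACES adds the encryption algorithm for each encrypted tablespace
SQL> SELECT ts#, encryptionalg
     FROM   v$encrypted_tablespaces;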

How this happens:


The way the database manages encryption is to internally mark an object as encrypted so that it can ensure the object's data stays encrypted on disk.
Now back to "restore as encrypted". Since we restored the database and then encrypted the tablespaces, the database needs to mark all the objects in the "newly encrypted" tablespaces as encrypted.
This is part of the database open operation. The open operation sorts through the internal object metadata to determine which objects now reside in "newly encrypted" tablespaces.
There are a few things to be aware of around this process.
  1. It requires a sort of the object metadata. Because of this you may need a much bigger SORT_AREA_SIZE or PGA_AGGREGATE_TARGET. The larger setting is only needed to open the database for the first time after encrypting, but this was the cause of the issue I was seeing (see the sketch after this list).
  2. It may take some time, possibly a lot of time, depending on the # of objects.
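
One way to address item #1 is shown below. This is only a minimal sketch; the 16G figure is a placeholder, so size it for your own object count.

-- Raise PGA while the instance is still mounted, then retry the open.
-- 16G is a placeholder value; it can be lowered again once the first open completes.
SQL> ALTER SYSTEM SET pga_aggregate_target = 16G SCOPE=BOTH;
SQL> ALTER DATABASE OPEN;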

How to mitigate it:


Since we know this is going to happen, there are a few ways to mitigate it.

  1. Empty out your recycle bin to limit the # of objects to update.
  2. Proactively increase your PGA (or sort_area_size) for opening the database for the first time after encrypting.
  3. Encrypt the database in sections rather than every tablespace at once, to decrease the # of objects that get marked as encrypted in a single open. After encrypting a tablespace, open the database, shut it down, and move on to the next tablespace. NOTE: this may not be practical.
  4. Encrypt the tablespaces online, as this marks the objects as encrypted while the processing of each tablespace completes.
  5. Check the number of objects that will need to be updated. This can be done by looking at the TAB$ internal table, matching TS# to the tablespaces that will be encrypted (see the sketch after this list).
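
As a rough sketch of that last check, something like the following counts the TAB$ entries per tablespace. The tablespace names in the IN list are placeholders for the ones you plan to encrypt.

-- Count TAB$ entries for the tablespaces being encrypted.
-- 'PSAPSR3' and 'PSAPSR3USR' are placeholder names - substitute your own.
SQL> SELECT t.name AS tablespace_name, COUNT(*) AS table_count
     FROM   sys.tab$ o, sys.ts$ t
     WHERE  o.ts# = t.ts#
     AND    t.name IN ('PSAPSR3', 'PSAPSR3USR')
     GROUP  BY t.name;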

NOTE:


Remember that a standby database may also have this same issue when it is opened read only.
Also, it is possible to have the primary database encrypted and the standby database unencrypted, or the opposite if you encrypt your standby database first. Going from encrypted --> unencrypted or unencrypted --> encrypted and then opening the database will cause this update of metadata to occur.

Saturday, August 14, 2021

Using rclone to download Objects from OCI

 I previously created a post that walked through how to configure rclone to easily access objects within the Oracle Cloud Object Store.


Object Store access with rclone


This post is going to go a little deeper on how to quickly download objects from the OCI object store onto your host.

In my example, I needed to download RMAN disk backup files that were copied to the Object Store in OCI.

I have over 10 TB of RMAN backup pieces, so I am going to create an ACFS mount point to store them on.


1) Create ACFS mount point

Creating the mount point is made up of multiple small steps that are documented here. This is a link to the 19c documentation so note it is subject to change over time.

  • Use ASMCMD to create a volume on the data disk group. I sized mine at 20 TB for the backup pieces; the example commands below show a 20G volume.
- Start ASMCMD connected to the Oracle ASM instance. You must be a user in the OSASM operating system group.

                    - Create the volume "volume1" on the "data" disk group

                    ASMCMD [+] > volcreate -G data -s 20G volume1

  • Use ASMCMD to list the volume information.  NOTE: my volume device is /dev/asm/volume1-123
                             
ASMCMD [+] > volinfo -G data volume1
Diskgroup Name: DATA

         Volume Name: VOLUME1
         Volume Device: /dev/asm/volume1-123
         State: ENABLED
     ... 

SQL> SELECT volume_name, volume_device FROM V$ASM_VOLUME 
     WHERE volume_name ='VOLUME1';

VOLUME_NAME        VOLUME_DEVICE
-----------------  --------------------------------------
VOLUME1            /dev/asm/volume1-123


  • Create the file system with mkfs from the volume "/dev/asm/volume1-123"
$ /sbin/mkfs -t acfs /dev/asm/volume1-123
mkfs.acfs: version                   = 19.0.0.0.0
mkfs.acfs: on-disk version           = 46.0
mkfs.acfs: volume                    = /dev/asm/volume1-123
mkfs.acfs: volume size               = 21474836480  (   20.00 GB )
mkfs.acfs: Format complete.
  • Register the file system with srvctl
# srvctl add filesystem -device /dev/asm/volume1-123 -path /acfsmounts/acfs2 \
       -user oracle -mountowner oracle -mountgroup dba -mountperm 755
NOTE: This will mount the filesystem on /acfsmounts/acfs2

  • Start the filesystem with srvctl
$ srvctl start filesystem -device /dev/asm/volume1-123

  • Change the ownership to oracle

chown -R oracle:dba /acfsmounts/acfs2
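
As an optional sanity check (not part of the documented steps), you can confirm the file system is mounted and registered before loading it up:

# verify the mount point and its size
$ df -h /acfsmounts/acfs2

# verify Grid Infrastructure sees the file system as running
$ srvctl status filesystem -device /dev/asm/volume1-123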

2) Use rclone to view objects

The next step is to look at the objects I want to copy to my new ACFS file system. The format for accessing the object store in these commands is
 "rclone {command} {connection name}:{bucket/partial object name - optional}".


NOTE: For all examples my connection name is oci_s3 

I am going to start with the simplest command, listing buckets (lsd).

NOTE: We are using the S3 interface to view the objects in the namespace. There is a single namespace for the entire tenancy. With OCI there is the concept of "compartments", which can be used to separate applications and users. The S3 interface does not have this concept, which means that all buckets in the tenancy are visible.
  • rclone lsd - This is the simplest command to list the buckets, and as I noted previously, it lists all buckets, not just my bucket.
       ./rclone lsd oci_s3:
          -1 2021-02-22 15:33:06        -1 Backups
          -1 2021-02-16 21:31:05        -1 MyCloudBucket
          -1 2020-09-23 22:21:36        -1 Test-20200923-1719
          -1 2021-07-20 20:03:27        -1 ZDM_bucket
          -1 2020-11-23 23:47:03        -1 archive
          -1 2021-01-21 13:03:33        -1 bsgbucket
          -1 2021-02-02 15:35:18        -1 bsgbuckets3
          -1 2021-03-03 11:42:13        -1 osctransfer
          -1 2021-03-19 19:57:16        -1 repo
          -1 2021-01-21 19:35:24        -1 short_retention
          -1 2020-11-12 13:41:48        -1 jsmithPublicBucket
          -1 2020-11-04 14:10:33        -1 jsmith_top_bucket
          -1 2020-11-04 11:43:55        -1 zfsrepl
          -1 2020-09-25 16:56:01        -1 zs-oci-bucket

If I want to list what is within my bucket (bsgbucket), I can list that bucket. In this case rclone treats the flat structure of the object names as if it were a file system, and lists only the top-level "directories" within my bucket.

./rclone lsd oci_s3:bsgbucket
           0 2021-08-14 23:58:02        -1 file_chunk
           0 2021-08-14 23:58:02        -1 sbt_catalog


  • rclone tree - This command will list what is within my bucket as a tree structure.
[opc@rlcone-test rclone]$ ./rclone tree oci_s3:bsgbucket
/
├── expdat.dmp
├── file_chunk
│   └── 2985366474
│       └── MYDB
│           └── backuppiece
│               └── 2021-06-14
│                   ├── DTA_BACKUP_MYDB_4601d1ph_134_1_1
│                   │   └── yHqtjSE51L3B
│                   │       ├── 0000000001
│                   │       └── metadata.xml
│                   └── DTA_BACKUP_MYDB_4d01d1uq_141_1_1
│                       └── lS9Sdnka2nD0
│                           ├── 0000000001
│                           └── metadata.xml
└── sbt_catalog
    ├── DTA_BACKUP_MYDB_4601d1ph_134_1_1
    │   └── metadata.xml
    └── DTA_BACKUP_MYDB_4d01d1uq_141_1_1
        └── metadata.xml


  • rclone lsl - This command will list what is within my bucket as a long listing with more detail.
[opc@rlcone-test rclone]$ ./rclone lsl oci_s3:bsgbucket
   311296 2021-01-21 13:04:05.000000000 expdat.dmp
337379328 2021-06-14 19:48:45.000000000 file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4601d1ph_134_1_1/yHqtjSE51L3B/0000000001
     1841 2021-06-14 19:48:45.000000000 file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4601d1ph_134_1_1/yHqtjSE51L3B/metadata.xml
 36175872 2021-06-14 19:49:10.000000000 file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4d01d1uq_141_1_1/lS9Sdnka2nD0/0000000001
     1840 2021-06-14 19:49:10.000000000 file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4d01d1uq_141_1_1/lS9Sdnka2nD0/metadata.xml
     1841 2021-06-14 19:48:46.000000000 sbt_catalog/DTA_BACKUP_MYDB_4601d1ph_134_1_1/metadata.xml
     1840 2021-06-14 19:49:10.000000000 sbt_catalog/DTA_BACKUP_MYDB_4d01d1uq_141_1_1/metadata.xml


3) Use rclone to copy the objects to my local file system.


There are 2 commands you can use to copy the files from the object store to the local file system.
  • copy - This is as you expect. It copies the files to the local file system and overwrites any local copy.
  • sync - This synchronizes the local file system with the objects in the object store, and will not copy down an object if an identical local copy already exists.

In my case I am going to use the sync command. This will allow me to restart copying the objects, and it will skip any objects that were previously copied successfully.

Below is the command I am using to copy (synchronize) the objects from my bucket in the object store (oci_s3:bsgbucket) to the local filesystem (/home/opc/acfs).

./rclone -vv sync -P --multi-thread-streams 12 --transfers 64  oci_s3:bsgbucket   /home/opc/acfs

To break down the command:

  • -vv  This option to rclone gives me "verbose" output so I can see more of what is being copied as the command is executed.
  • -P  This option to rclone gives me feedback on how much of the object has downloaded so far to help me monitor it.
  • --multi-thread-streams 12 This option to rclone breaks larger objects into chunks to increase the concurrency.
  • --transfers 64 This option to rclone allows for 64 concurrent transfers to occur. This increases the download throughput.
  • oci_s3:bsgbucket - This is the source to copy/sync from.
  • /home/opc/acfs - This is the destination to copy/sync to.

Finally, this is what the command looks like while it is executing.

[opc@rlcone-test rclone]$  ./rclone -vv sync -P --multi-thread-streams 12 --transfers 64  oci_s3:bsgbucket   /home/opc/acfs
2021/08/15 00:15:32 DEBUG : rclone: Version "v1.56.0" starting with parameters ["./rclone" "-vv" "sync" "-P" "--multi-thread-streams" "12" "--transfers" "64" "oci_s3:bsgbucket" "/home/opc/acfs"]
2021/08/15 00:15:32 DEBUG : Creating backend with remote "oci_s3:bsgbucket"
2021/08/15 00:15:32 DEBUG : Using config file from "/home/opc/.config/rclone/rclone.conf"
2021/08/15 00:15:32 DEBUG : Creating backend with remote "/home/opc/acfs"
2021-08-15 00:15:33 DEBUG : sbt_catalog/DTA_BACKUP_MYDB_4601d1ph_134_1_1/metadata.xml: md5 = 505fc1fdce141612c262c4181a9122fc OK
2021-08-15 00:15:33 INFO  : sbt_catalog/DTA_BACKUP_MYDB_4601d1ph_134_1_1/metadata.xml: Copied (new)
2021-08-15 00:15:33 DEBUG : expdat.dmp: md5 = f97060f5cebcbcea3ad6fadbda136f4e OK
2021-08-15 00:15:33 INFO  : expdat.dmp: Copied (new)
2021-08-15 00:15:33 DEBUG : Local file system at /home/opc/acfs: Waiting for checks to finish
2021-08-15 00:15:33 DEBUG : Local file system at /home/opc/acfs: Waiting for transfers to finish
2021-08-15 00:15:33 DEBUG : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4601d1ph_134_1_1/yHqtjSE51L3B/0000000001: Starting multi-thread copy with 2 parts of size 160.875Mi
2021-08-15 00:15:33 DEBUG : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4601d1ph_134_1_1/yHqtjSE51L3B/0000000001: multi-thread copy: stream 2/2 (168689664-337379328) size 160.875Mi starting
2021-08-15 00:15:33 DEBUG : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4601d1ph_134_1_1/yHqtjSE51L3B/0000000001: multi-thread copy: stream 1/2 (0-168689664) size 160.875Mi starting
2021-08-15 00:15:33 DEBUG : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4d01d1uq_141_1_1/lS9Sdnka2nD0/metadata.xml: md5 = 0a8eccc1410e1995e36fa2bfa0bf7a70 OK
2021-08-15 00:15:33 INFO  : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4d01d1uq_141_1_1/lS9Sdnka2nD0/metadata.xml: Copied (new)
2021-08-15 00:15:33 DEBUG : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4601d1ph_134_1_1/yHqtjSE51L3B/metadata.xml: md5 = 505fc1fdce141612c262c4181a9122fc OK
2021-08-15 00:15:33 INFO  : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4601d1ph_134_1_1/yHqtjSE51L3B/metadata.xml: Copied (new)
2021-08-15 00:15:33 DEBUG : sbt_catalog/DTA_BACKUP_MYDB_4d01d1uq_141_1_1/metadata.xml: md5 = 0a8eccc1410e1995e36fa2bfa0bf7a70 OK
2021-08-15 00:15:33 INFO  : sbt_catalog/DTA_BACKUP_MYDB_4d01d1uq_141_1_1/metadata.xml: Copied (new)
2021-08-15 00:15:33 INFO  : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4d01d1uq_141_1_1/lS9Sdnka2nD0/0000000001: Copied (new)
2021-08-15 00:15:34 DEBUG : file_chunk/2985366474/MYDB/backuppiece/2021-06-14/DTA_BACKUP_MYDB_4601d1ph_134_1_1/yHqtjSE51L3B/0000000001: multi-thread copy: stream 1/2 (0-168689664) size 160.875Mi finished
Transferred:      333.398Mi / 356.554 MiByte, 94%, 194.424 MiByte/s, ETA 0s
Transferred:            6 / 7, 86%
Elapsed time:         2.0s
Transferring:

NOTE: it broke up the larger object into chunks, and you can see that it downloaded 2 chunks simultaneously.  At the end you can see the file that it was in the middle of transferring.
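
Once the sync finishes, an optional extra step (not shown in my run above) is rclone's check command, which compares the sizes and checksums of the local files against the objects in the bucket:

./rclone check oci_s3:bsgbucket /home/opc/acfs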

Conclusion.

rclone is a great alternative to the OCI CLI for managing and downloading your objects. It has more intuitive commands (like "rclone ls"), and the best part is that it doesn't require Python or special privileges to install.

Thursday, August 5, 2021

Adding immutability to buckets in the Oracle Cloud Object Store

 I am going to demonstrate a new feature of the object store that you might not have known about. The feature is "Retention Lock", and it is used to protect the objects in a bucket.



Let me first start with a few  links to get you started and then I will demonstrate how to use this feature.


In order to add a retention lock to a bucket you create a rule for the individual bucket.

Below is a screen shot of where you will find the retention rules, and the "Create Rule" button. Also note that I highlighted the "Object Versioning" attribute of the bucket.

NOTE: You cannot add a retention lock to a bucket that has "Object Versioning" enabled, and you cannot disable "Object Versioning" once it has been enabled. You MUST suspend "Object Versioning" before adding any retention rules to your bucket.



 There are 3 types of retention locks and below I will describe them and show you how to implement them. They are listed from least restrictive to most restrictive.


DATA GOVERNANCE

Data Governance is a time-based lock based on the modified time of EACH OBJECT in the bucket.

The Retention can be set in "days" or "years".

Below is what the settings look like for data governance. You choose "Time-Bound" for the rule type and ensure that you do not "enable retention rule lock".



With Data Governance you can both increase and decrease the duration of the retention lock.

Below you can see after the lock was created, the rule is not locked.



REGULATORY COMPLIANCE

Regulatory Compliance is similar to Data Governance with the exception that the duration can only be increased.
The retention lock of the individual objects, just like Data Governance, is based on when the individual object was last modified.
Another key difference is that when you "enable retention rule lock", you also set when the rule itself becomes locked. The default is 14 days out, and it cannot be set to less than 14 days.
The delay of 14 days is a "cooling off period" that gives you 14 days to test before the rule takes effect. This is because once the cooling off period ends, the retention time cannot be shortened.
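
If you prefer scripting to the console, the OCI CLI also exposes retention rules. The sketch below is only an outline from memory, and the exact flag names are assumptions, so verify them with "oci os retention-rule create --help" before relying on it. The bucket name is a placeholder.

# Hedged sketch - flag names are assumptions, confirm with the CLI help.
# Creates a 30-day time-bound (Data Governance style) rule on the bucket.
oci os retention-rule create \
    --bucket-name my_bucket \
    --display-name "30 day governance lock" \
    --time-amount 30 \
    --time-unit DAYS
# Adding --time-rule-locked with a future timestamp would make it a
# Regulatory Compliance style rule that cannot be shortened once locked.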


Below is the screen shot of creating a retention rule for Regulatory Compliance. Note that the retention rule lock MUST be enabled to ensure the duration cannot be shortened.


It also asked me to confirm the "lock date" before the rule is created.




Below are the rules that are set after both of these steps.


NOTE: I now have 2 rules. I have the original rule that will lock the objects for 30 days (this can be changed as needed), and I also have a Regulatory Compliance rule that will lock the objects for 1 day. The Regulatory Compliance rule will not take effect until 14 days from today.


LEGAL HOLD

The final type of retention is a legal hold.  A legal hold will put a retention lock on the WHOLE bucket. All objects in the bucket are locked and cannot be modified/deleted until the hold is removed. There is no ending time period for a legal hold.

Below is how you create a legal hold.



SUMMARY

You can create the 3 types of retention locks, and you can even layer them. Below you can see that I have 3 locks. The Legal Hold rule will lock everything, but that can be removed, leaving the 2 remaining rules. I can remove the Data Governance rule, but the Regulatory Compliance rule is the most restrictive. Once the 14-day lock delay (or whatever you set) has passed, that rule cannot be changed.


Now when I go to delete an object that is protected by a retention rule, I get an error. Below is an example of what you will see.