Monday, February 24, 2020

What are those processes in V$DATAGUARD_PROCESS on the ZDLRA?


Starting with Oracle 12.2, there is now more detail available in the V$DATAGUARD_PROCESS view on the downstream database.  Keep in mind that the "downstream database" is not necessarily a standby database; it can also be a Far Sync database or a ZDLRA receiving real-time redo.

Below is a table of the processes that I've seen in this view on a ZDLRA, and I will go through what each one of them is.

NAME ACTION       CLIENT_ROLE        COUNT(*)
---- ------------ ---------------- ----------
rfs  IDLE         async SRL single         10
rfs  IDLE         async SRL multi          20
rfs  IDLE         async ORL single         30
rfs  IDLE         async ORL multi          40
rfs  IDLE         gap manager             100
rfs  IDLE         archive gap              10



First off, I want to point out what happens in a RAC environment.  As you can imagine, each thread (node) in a RAC environment independently sends its redo to the downstream.  The same thing happens with a standby database sending its redo to a downstream.
For example, if my primary database is a 4-node RAC cluster and I have a single-node standby database, the redo sent to a ZDLRA from the primary and the redo sent from the standby will each arrive as 4 independent streams.
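The rule in that example can be written down as a tiny Python sketch. The function name and its parameters are hypothetical, made up purely for illustration; the only thing the sketch encodes is the point above, that the number of streams follows the number of primary threads and not the number of instances on the sender.

```python
def redo_streams(primary_threads, sender_instances):
    """Return how many independent redo streams a sender ships downstream.

    Hypothetical helper: sender_instances is accepted only to show that it
    does not change the answer. The stream count follows the number of
    redo threads (RAC nodes) on the primary.
    """
    if primary_threads < 1 or sender_instances < 1:
        raise ValueError("thread and instance counts must be at least 1")
    return primary_threads


# 4-node RAC primary sending its own redo to the ZDLRA.
from_primary = redo_streams(primary_threads=4, sender_instances=4)

# Single-node standby of that same primary, also sending redo to the ZDLRA.
from_standby = redo_streams(primary_threads=4, sender_instances=1)

print(from_primary, from_standby)
```

Both calls return the same number, which is why a single-node standby can still show up on the ZDLRA as several rfs processes.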

Before walking through each of these process types, I will first define the terms.
Because I am seeing multiple types, this example has to come from a ZDLRA (rather than a standby database), so I will define the terms from a ZDLRA standpoint.


SRL – Standby Redo Logs are sending the changes. These are standby databases.
ORL – Online Redo Logs are sending the changes. These are primary databases.

Multi – The protected database is using the same log buffer to write both archive logs and redo to the ZDLRA (and possibly another standby).
Single – The protected database is using separate buffers for redo to the ZDLRA. The ZDLRA could not keep up with the archive/standby database.

Async – Real-time redo processes.
Gap manager – Process that keeps track of any gaps in redo, and can send archive logs to fill the gap.
Archive gap – Additional processes that send over any gaps in redo. The Gap manager will send over the first archive log, but archive gap processes will be started in parallel with the Gap manager if more than 1 archive log is needed. The number of these processes is controlled by the LOG_ARCHIVE_MAX_PROCESSES parameter on the protected databases, which has a default value of 4.

So for my example, this is what we are seeing:

  • “Async SRL single” processes of 10 is telling us that there are 10 standby threads being applied and sending real-time redo using the same buffer on the protected database that the archive redo process is using.
  • “Async SRL multi” processes of 20 is telling us that there are 20 standby threads being applied and sending real-time redo that could not keep up with the archiver process and spawned an additional buffer.
  • A total of 30 combined processes for "Async SRL" is telling us that there are 30 threads sending real-time redo in total.
**** Notice I used the term "threads" above.  The number of threads is the number of RAC nodes on the primary database, regardless of how many instances on the standby are applying redo.


  • “Async ORL single” processes of 30 is telling us that there are 30 primary database instances sending real-time redo using the same buffer on the protected database that the archive redo process is using.
  • “Async ORL multi” processes of 40 is telling us that there are 40 primary database instances sending real-time redo that could not keep up with the archiver process and spawned an additional buffer.
  • A total of 70 combined processes for "Async ORL" is telling us there are 70 primary instances sending real-time redo.
**** Notice I used the term "instances" above.  The number of instances is the number of RAC nodes on the primary database rather than the number of databases.
  • “Gap manager” processes of 100 is telling us that there are 100 processes monitoring the async processes.  Notice that the total number of "Gap manager" processes matches the total number of async processes.
  • “Archive gap” processes are temporary and should be terminated once they have been idle for 2 minutes after completing their work. We should only see these processes if the ZDLRA falls behind in collecting redo.
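
The arithmetic behind those totals is easy to check with a short Python sketch. The counts are copied from the example table; the `ROWS` list and the `summarize` function are hypothetical names used only for this illustration, not part of any Oracle tooling.

```python
from collections import defaultdict

# (CLIENT_ROLE, COUNT(*)) pairs copied from the example table.
# NAME is "rfs" and ACTION is "IDLE" on every row, so they are left out.
ROWS = [
    ("async SRL single", 10),
    ("async SRL multi", 20),
    ("async ORL single", 30),
    ("async ORL multi", 40),
    ("gap manager", 100),
    ("archive gap", 10),
]


def summarize(rows):
    """Total the async rows by redo source (SRL or ORL); pass other roles through."""
    totals = defaultdict(int)
    for client_role, count in rows:
        parts = client_role.split()
        if parts[0].lower() == "async":
            totals["async " + parts[1]] += count  # "async SRL" or "async ORL"
            totals["async total"] += count
        else:
            totals[client_role] += count
    return dict(totals)


totals = summarize(ROWS)
print(totals["async SRL"])  # standby threads sending real-time redo
print(totals["async ORL"])  # primary instances sending real-time redo
print(totals["async total"] == totals["gap manager"])  # one gap manager per async process
```

With the numbers from the table this prints 30, 70 and True. If the last check came back False on a real system, that would be worth a closer look, assuming the one-to-one relationship described above holds.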

I hope this helps explain what the processes are that are in use to manage the real-time redo.