Today I had to convert a physical standby to a snapshot standby using the DG Broker. The steps themselves are very simple and completed with no errors. However, while performing a health check of the databases I noticed a slight issue.
Messages in the Primary DB alert log were showing this:
<pre>Tue Jul 15 10:25:56 2014
FAL[server, ARC5]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance PPMOBB1 - Archival Error. Archiver continuing.
Tue Jul 15 10:25:56 2014
FAL[server, ARCj]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance PPMOBB1 - Archival Error. Archiver continuing.</pre>
And messages in the Standby DB alert log were showing this:
<pre>Tue Jul 15 09:24:23 2014
Creating archive destination file : +FRA (248306 blocks)
Tue Jul 15 09:24:23 2014
Creating archive destination file : +FRA (157408 blocks)</pre>
In order to fix this and allow the Primary DB to send the archivelogs to the snapshot standby, I had to set the max_connections property from 4 to 1 within the DG Broker config:
<pre>dgmgrl /
DGMGRL> edit database 'RPMOBB' set property max_connections=1;</pre>
After setting this, the snapshot standby started to receive the missing archivelogs and then stayed in sync thereafter.
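As a sketch of how this change can be scripted rather than typed interactively, the broker commands can be wrapped in a small shell function and piped into DGMGRL. This is a hedged example, not the exact session from above: 'RPMOBB' is the standby name used in this post, and MaxConnections is the broker property being set (showing it first lets you confirm the old value of 4 before changing it).

```shell
#!/bin/sh
# Build the DGMGRL commands to inspect and then lower the MaxConnections
# broker property on a standby database. The function only prints the
# commands, so they can be reviewed before being run for real.
max_connections_fix() {
    db="$1"
    # Show the current value first, then drop it to 1.
    printf "show database '%s' 'MaxConnections';\n" "$db"
    printf "edit database '%s' set property MaxConnections=1;\n" "$db"
}

# Review the generated commands; to apply them, pipe into the broker:
#   max_connections_fix RPMOBB | dgmgrl -silent /
max_connections_fix RPMOBB
```

Printing the commands first is a cheap safety net: the same function output can be eyeballed, logged, and then piped straight into dgmgrl.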
The solution was found in the MOS note Archive logs Shipping Skipped Intermittently, Standby Fails to Resolve Gap Automatically (Doc ID 1366234.1). However, the issue is supposedly fixed in 11.2, and I am running 184.108.40.206, so that appears not to be the case.
Since the company I work for has deployed 6-node RAC clusters as its license/utilisation consolidation platform, I thought it wise to invest some time in scripting a simplified solution to ease the patching process.
As the cluster is Admin Managed, the DBAs make a conscious decision as to which nodes each instance is allowed to run on and, depending on the criticality of the database, how many instances it is allowed to run. The DBAs then manually disable the remaining instances to stop CRS from starting them on restarts.
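That manual disabling step can be sketched in shell. The database name PPMOBB and the instance list below are assumptions for illustration only; the loop prints the srvctl commands rather than running them, so the result can be checked first.

```shell
#!/bin/sh
# Print the srvctl commands that disable the instances a database is NOT
# meant to run, so CRS will not start them after a node restart.
# Instance names here are hypothetical placeholders.
disable_spare_instances() {
    db="$1"
    shift
    for inst in "$@"; do
        echo srvctl disable instance -d "$db" -i "$inst"
    done
}

# Demo: database PPMOBB should only run instances 1 and 2,
# so instances 3-6 get disabled.
disable_spare_instances PPMOBB PPMOBB3 PPMOBB4 PPMOBB5 PPMOBB6
```

To apply the change, pipe the output to sh as the Oracle software owner; srvctl enable instance reverses it when the node layout changes.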
Before I wrote these scripts, we were relying on crsctl stop crs, with the default shutdown option set to transactional for each database. However, it became apparent while tailing an instance alert log that CRS actually aborts the instances and ignores the default shutdown option, meaning sessions connected to the instances CRS was aborting would end up being killed, potentially causing issues for the applications.
The scripts are a quick-and-dirty stop-gap at the moment but do the job required.
The workflow is broken down into the following steps and you must have User Equivalence configured between all nodes for these to work:
- Runs Pre-Checks as SYSDBA on all running databases.
- Sets all databases' default shutdown mode to transactional. This is available from 11g onwards and waits for all running transactions to either commit or roll back.
- Generates a dynamic script to stop all the instances running on the particular node you wish to stop for patching.
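The final step above can be sketched as a small generator function. This is a hedged outline, not the actual scripts: in the real workflow the node-to-instance mapping would come from srvctl status database for each database, whereas here it is a small inline table (node, db_unique_name, instance) used purely as an assumption. Note it stops each instance with -o transactional, avoiding the abort behaviour seen with crsctl stop crs.

```shell
#!/bin/sh
# Given a node name, read "node db_unique_name instance" lines on stdin
# and emit one transactional srvctl stop command per instance running
# on that node.
gen_stop_script() {
    node="$1"
    awk -v n="$node" '$1 == n {
        printf "srvctl stop instance -d %s -i %s -o transactional\n", $2, $3
    }'
}

# Demo: emit the stop commands for node rac1 from a sample layout.
gen_stop_script rac1 <<'EOF'
rac1 PPMOBB PPMOBB1
rac1 TESTDB TESTDB1
rac2 PPMOBB PPMOBB2
EOF
```

Redirecting the output to a file and making it executable gives the per-node stop script described above, which can then be reviewed before the patching window.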
Work In Progress