DRM hang causes frequent RAC Instances Reconfiguration
(Doc ID 1528362.1)
Last updated on DECEMBER 24, 2023
Applies to:
Oracle Database - Enterprise Edition - Version 11.1.0.7 to 11.2.0.3 [Release 11.1 to 11.2]
Oracle Database - Enterprise Edition - Version 11.2.0.4 to 11.2.0.4 [Release 11.2]
Oracle Database Cloud Schema Service - Version N/A and later
Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
Symptoms
- RAC instances freeze during DRM for 100 seconds or more.
- The DB alert log shows that all RAC instances undergo reconfiguration at the same time, but there are no instance crashes.
Node 1 DB Alert Log:
Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sat Jul 14 14:17:04 2012
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Jul 14 14:17:04 2012
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Sat Jul 14 14:17:13 2012
minact-scn: Master returning as live inst:2 has inc# mismatch instinc:70 cur:72 errcnt:0

Node 2 DB Alert Log:
Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 2)
Global Resource Directory frozen
Communication channels reestablished
Sat Jul 14 14:17:04 2012
* domain 0 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sat Jul 14 14:17:04 2012
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Jul 14 14:17:04 2012
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Sat Jul 14 14:18:03 2012
Submitted all GCS remote-cache requests
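The "same time, no crash" pattern in the two alert logs above can be checked mechanically. The following is a minimal sketch, not an Oracle-supplied tool: the function names `reconfigs` and `same_event` are hypothetical, the sample strings are copied from the excerpts above, and treating an instance list that still contains both instances as "no instance left the cluster" is an assumption about how to read the log.

```python
import re

# Header lines copied from the Node 1 and Node 2 alert log excerpts above.
NODE1 = ("Sat Jul 14 14:17:04 2012 Reconfiguration started "
         "(old inc 70, new inc 72) List of instances: 1 2 (myinst: 1)")
NODE2 = ("Sat Jul 14 14:17:04 2012 Reconfiguration started "
         "(old inc 70, new inc 72) List of instances: 1 2 (myinst: 2)")

# \s+ between fields lets the pattern match whether the log text is on
# one line or broken across several lines.
PATTERN = re.compile(
    r"(\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"Reconfiguration started \(old inc (\d+), new inc (\d+)\)\s+"
    r"List of instances:\s+([\d ]+?)\s*\(myinst: (\d+)\)"
)

def reconfigs(alert_text):
    """Extract every reconfiguration header found in an alert log text."""
    events = []
    for ts, old, new, members, me in PATTERN.findall(alert_text):
        events.append({
            "time": ts,
            "old_inc": int(old),
            "new_inc": int(new),
            "instances": [int(i) for i in members.split()],
            "myinst": int(me),
        })
    return events

def same_event(a, b):
    """True if two nodes logged the same reconfiguration at the same second."""
    keys = ("time", "old_inc", "new_inc", "instances")
    return all(a[k] == b[k] for k in keys)

n1 = reconfigs(NODE1)[0]
n2 = reconfigs(NODE2)[0]
simultaneous = same_event(n1, n2)
print(simultaneous, n1["instances"])
```

For real use, the full alert log of each instance would be read from disk and passed to `reconfigs`; the sketch compares only the first event from each node.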
- The LMON trace shows that the DRM quiesce step hangs:
*** 2012-07-14 14:14:51.187
CGS recovery timeout = 85 sec
Begin DRM(231) (swin 1)
* drm quiesce
*** 2012-07-14 14:17:03.752
* Request pseudo reconfig due to drm quiesce hang
2012-07-14 14:17:03.752735 : kjfspseudorcfg: requested with reason 5(DRM Quiesce step stall)
*** 2012-07-14 14:17:03.766
kjxgmrcfg: Reconfiguration started, type 6
CGS/IMR TIMEOUTS:
CSS recovery timeout = 31 sec (Total CSS waittime = 65)
IMR Reconfig timeout = 75 sec
CGS rcfg timeout = 85 sec
kjxgmcs: Setting state to 70 0.
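The length of the quiesce stall can be read off the trace by subtracting the timestamp in force at "Begin DRM" from the timestamp in force at the pseudo reconfiguration request. The sketch below does that arithmetic on the lines shown above. It is an illustration only: `quiesce_stalls` is a hypothetical helper, and it assumes each `***` line stamps the trace lines that follow it.

```python
import re
from datetime import datetime

# Lines copied from the LMON trace excerpt above.
TRACE = """\
*** 2012-07-14 14:14:51.187
CGS recovery timeout = 85 sec
Begin DRM(231) (swin 1)
* drm quiesce
*** 2012-07-14 14:17:03.752
* Request pseudo reconfig due to drm quiesce hang
"""

STAMP = re.compile(r"^\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")
BEGIN = re.compile(r"^Begin DRM\((\d+)\)")

def quiesce_stalls(trace_text):
    """Return (drm_id, seconds) for each DRM that ended in a quiesce-hang request."""
    stalls = []
    current = None    # most recent '***' timestamp seen so far
    drm_id = None     # id of the DRM currently in progress
    drm_start = None  # timestamp in force when that DRM began
    for line in trace_text.splitlines():
        m = STAMP.match(line)
        if m:
            current = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            continue
        m = BEGIN.match(line)
        if m:
            drm_id, drm_start = int(m.group(1)), current
            continue
        if "drm quiesce hang" in line and drm_start and current:
            stalls.append((drm_id, (current - drm_start).total_seconds()))
            drm_id = drm_start = None
    return stalls

stalls = quiesce_stalls(TRACE)
for drm, secs in stalls:
    print(f"DRM {drm}: quiesce stalled for {secs:.3f} s")
```

On the excerpt above this reports a stall of about 132.6 seconds for DRM 231, consistent with the "100 seconds or more" freeze listed under Symptoms.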
- AWR top wait events are "gcs resource directory to be unfrozen" and "gc remaster".
Changes
Large Buffer Cache
Cause