DRM hang causes frequent RAC Instances Reconfiguration (Doc ID 1528362.1)

Last updated on JULY 26, 2016

Applies to:

Oracle Database - Enterprise Edition - Version 11.1.0.7 to 11.2.0.3 [Release 11.1 to 11.2]
Oracle Database - Enterprise Edition - Version 11.2.0.4 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.

Symptoms

- RAC instances freeze during DRM for 100 seconds or more.

- DB Alert log shows that all RAC instances undergo reconfiguration at the same time, but there are no instance crashes

Node 1 DB Alert Log:

Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sat Jul 14 14:17:04 2012
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Jul 14 14:17:04 2012
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Sat Jul 14 14:17:13 2012
minact-scn: Master returning as live inst:2 has inc# mismatch instinc:70 cur:72 errcnt:0

Node 2 DB Alert Log:

Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 2)
Global Resource Directory frozen
Communication channels reestablished
Sat Jul 14 14:17:04 2012
* domain 0 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sat Jul 14 14:17:04 2012
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Jul 14 14:17:04 2012
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Sat Jul 14 14:18:03 2012
Submitted all GCS remote-cache requests
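
The "same time, no instance crash" pattern in the two alert logs above can be checked mechanically. The following is a minimal sketch, not an Oracle-supplied tool: the function name `parse_reconfig` and the assumed line layout (timestamp line, then "Reconfiguration started", then the member list two lines later) are illustrative assumptions based only on the excerpts shown here.

```python
import re
from datetime import datetime

# First four lines of each node's alert log excerpt, copied from above.
NODE1_ALERT = """\
Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 1)
"""
NODE2_ALERT = """\
Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 2)
"""

TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # assumes English day/month names
RECONFIG_RE = re.compile(r"Reconfiguration started \(old inc (\d+), new inc (\d+)\)")
MEMBERS_RE = re.compile(r"^([\d ]+)\(myinst: (\d+)\)")

def parse_reconfig(alert_text):
    """Return details of the first reconfiguration found in an alert log excerpt."""
    lines = alert_text.splitlines()
    for i, line in enumerate(lines):
        m = RECONFIG_RE.search(line)
        if not m or i < 1 or i + 2 >= len(lines):
            continue
        members = MEMBERS_RE.match(lines[i + 2])
        if members is None:
            continue
        return {
            "started": datetime.strptime(lines[i - 1].strip(), TS_FORMAT),
            "old_inc": int(m.group(1)),
            "new_inc": int(m.group(2)),
            "instances": [int(x) for x in members.group(1).split()],
            "myinst": int(members.group(2)),
        }
    return None

n1 = parse_reconfig(NODE1_ALERT)
n2 = parse_reconfig(NODE2_ALERT)
# Same start time and same full member list on both nodes:
# a cluster-wide reconfiguration in which no instance left the cluster.
same_time = n1["started"] == n2["started"]
same_members = n1["instances"] == n2["instances"]
print(same_time, same_members, n1["instances"])
```

Both nodes report the reconfiguration at 14:17:04 with instances 1 and 2 still listed, which matches the symptom described: a reconfiguration without an instance eviction or crash.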

- LMON trace shows that the DRM quiesce step hangs:

*** 2012-07-14 14:14:51.187
 CGS recovery timeout = 85 sec
Begin DRM(231) (swin 1)
* drm quiesce

*** 2012-07-14 14:17:03.752
* Request pseudo reconfig due to drm quiesce hang
2012-07-14 14:17:03.752735 : kjfspseudorcfg: requested with reason 5(DRM Quiesce step stall)

*** 2012-07-14 14:17:03.766
kjxgmrcfg: Reconfiguration started, type 6
CGS/IMR TIMEOUTS:
 CSS recovery timeout = 31 sec (Total CSS waittime = 65)
 IMR Reconfig timeout = 75 sec
 CGS rcfg timeout = 85 sec
kjxgmcs: Setting state to 70 0.
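
The length of the stall can be read off the trace timestamps. The sketch below is illustrative only (the function name `drm_quiesce_stalls` is hypothetical, and it assumes the `*** <timestamp>` markers and message texts appear exactly as in the excerpt above). It measures the time from "Begin DRM" to the pseudo reconfiguration request.

```python
import re
from datetime import datetime

# Relevant lines of the LMON trace excerpt, copied from above.
LMON_TRACE = """\
*** 2012-07-14 14:14:51.187
 CGS recovery timeout = 85 sec
Begin DRM(231) (swin 1)
* drm quiesce

*** 2012-07-14 14:17:03.752
* Request pseudo reconfig due to drm quiesce hang
"""

TS_RE = re.compile(r"^\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")
BEGIN_DRM_RE = re.compile(r"^Begin DRM\((\d+)\)")

def drm_quiesce_stalls(trace_text):
    """Return (drm_id, seconds) for each DRM that ended in a quiesce-hang pseudo reconfig."""
    stalls = []
    current_ts = None  # most recent '*** <timestamp>' marker seen
    drm_id = None
    drm_start = None
    for line in trace_text.splitlines():
        m = TS_RE.match(line)
        if m:
            current_ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            continue
        m = BEGIN_DRM_RE.match(line)
        if m:
            drm_id, drm_start = int(m.group(1)), current_ts
            continue
        if "pseudo reconfig due to drm quiesce hang" in line and drm_start is not None:
            stalls.append((drm_id, (current_ts - drm_start).total_seconds()))
            drm_id = drm_start = None
    return stalls

stalls = drm_quiesce_stalls(LMON_TRACE)
for drm, seconds in stalls:
    print(f"DRM {drm} stalled in quiesce for {seconds:.3f} s")
```

For the excerpt above this reports DRM 231 stalling for 132.565 seconds (14:14:51.187 to 14:17:03.752), consistent with the freeze of 100 seconds or more listed under Symptoms.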

- AWR top wait events are "gcs resource directory to be unfrozen" and "gc remaster"

Changes

Large Buffer Cache

Cause
