ocssd.bin High CPU Usage, Instance Crashes With ORA-29702 or ORA-29770 or ORA-29701 With "gipcWait failed with 16"
(Doc ID 1287709.1)
Last updated on FEBRUARY 24, 2022
Applies to:
Oracle Database - Enterprise Edition - Version 11.2.0.1 to 11.2.0.2 [Release 11.2]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
Symptoms
Symptom1:
Instance terminates with the following symptoms:
- Instance alert.log:
LMON (ospid: 26356) has not moved for 25 sec .. : waiting for event 'CSS operation: data update' for 23 secs with wait_id 6238251.
LMON (ospid: 26356) waits for event 'CSS operation: data update' for 176 secs.
..
ORA-29770: global enqueue process LMON (OSID 26356) is hung for more than 150 seconds
..
ERROR: Some process(s) is not making progress.
LMHB (ospid: 26364) is terminating the instance.
Please check LMHB trace file for more details.
Please also check the CPU load, I/O load and other system properties for anomalous behavior
ERROR: Some process(s) is not making progress.
LMHB (ospid: 26364): terminating the instance due to error 29770
- Call stack for LMON
sgipcwWaitHelper sgipcwWait gipcWaitOsd gipcInternalWait gipcWaitF clsssRecvMsg clsssServerRPC clssgsUpdateData clssgsupdateprivate kgxgnupdpridata kjxgrc_hb_write kjxgrDD_hb_write kjxgrrcfgchk kjxggpoll kjfmact kjfcln ksbrdp
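To check an alert.log for this pattern, the LMON wait lines can be extracted and compared against the 150-second limit named in the ORA-29770 message. The following is a minimal sketch, not an Oracle-supplied tool: the regular expression is modeled only on the lines quoted above, and the function name lmon_waits is hypothetical. Other releases may word the message differently.

```python
import re

# Pattern modeled on the alert.log lines quoted above (an assumption;
# the wording may differ between releases).
LMON_WAIT = re.compile(
    r"LMON \(ospid: (\d+)\) waits for event '([^']+)' for (\d+) secs"
)

def lmon_waits(alert_log_text):
    """Return (ospid, event, seconds) for each LMON wait line found."""
    return [
        (int(m.group(1)), m.group(2), int(m.group(3)))
        for m in LMON_WAIT.finditer(alert_log_text)
    ]

sample = (
    "LMON (ospid: 26356) waits for event 'CSS operation: data update' for 176 secs.\n"
    "ORA-29770: global enqueue process LMON (OSID 26356) is hung for more than 150 seconds\n"
)

for ospid, event, secs in lmon_waits(sample):
    # 150 is the limit quoted in the ORA-29770 message above.
    flag = "over 150s" if secs > 150 else "under 150s"
    print(ospid, event, secs, flag)
```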
Symptom2:
A server process (foreground process, parallel process, etc.) reports error ORA-29701.
- Instance alert.log:
..
ERROR: unrecoverable error ORA-29701 raised in ASM I/O path; terminating process 1911 ..
- Trace file (in this case racdb1_p009_1911.trc)
..
2011-01-24 11:22:49.041: [ CSSCLNT]clssscConnect: gipcWait failed with 16 (12)
2011-01-24 11:22:52.498: [ CSSCLNT]clsssInitNative: connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_<HOSTNAME1>_)) failed, rc 16
kgxgncin: CLSS init failed with status 3
kgxgncin: return status 3 (1311719766 SKGXN not av) from CLSS
NOTE: kfmsInit: ASM failed to initialize group services
Error ORA-29701 signaled at
..
Call stack
.. kfmsInit kfmsSlvReg kfmdSlvOpPriv kfmdWriteSubmitted kfk_process_an_ioq kfk_submit_io ..
Symptom3:
LMON reports error ORA-29702
- Instance alert.log:
ORA-29702: error occurred in Cluster Group Service operation
LMON (ospid: 25581): terminating the instance due to error 29702
- lmon trace
The LMON trace shows a reconfiguration just before the crash:
*** 2011-10-12 01:17:27.418
kjxgmrcfg: Reconfiguration started, type 1
CGS/IMR TIMEOUTS:
CSS recovery timeout = 31 sec (Total CSS waittime = 65)
IMR Reconfig timeout = 75 sec
CGS rcfg timeout = 85 sec
kjxgmcs: Setting state to 6 0.
*** 2011-10-12 01:17:27.446
Name Service frozen
kjxgmcs: Setting state to 6 1.
kjxgmmeminfo: instance 2 info has a wrong size (112 120)
kjxggpoll: substate 1 action failed
kjxggpoll: change poll time to 50 ms
Error: Cluster Group Service encounters an error (2)
LMON caught an error 29702 in the main loop
error 29702 detected in background process
ORA-29702: error occurred in Cluster Group Service operation
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+461<-kjzdssdmp()+267<-kjzduptcctx()+232<-kjzdicrshnfy()+53<-ksuitm()+1325<-ksbrdp ..
----- End of Abridged Call Stack Trace -----
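The three symptom signatures above can be matched against alert.log or trace file text with a simple substring check. This is a rough sketch under stated assumptions: the marker strings are copied from the excerpts in this note, the names SIGNATURES and match_symptoms are hypothetical, and a match only suggests that a log resembles one of the symptoms. It does not confirm the cause.

```python
# Marker strings copied from the log excerpts in this note.
# Any one marker is treated as a hit for its symptom (a simplifying assumption).
SIGNATURES = {
    "Symptom1": [
        "ORA-29770",
        "terminating the instance due to error 29770",
    ],
    "Symptom2": [
        "ORA-29701",
        "gipcWait failed with 16",
    ],
    "Symptom3": [
        "ORA-29702",
        "terminating the instance due to error 29702",
    ],
}

def match_symptoms(log_text):
    """Return the sorted names of symptoms whose markers appear in log_text."""
    return sorted(
        name
        for name, markers in SIGNATURES.items()
        if any(marker in log_text for marker in markers)
    )

trace_line = "[ CSSCLNT]clssscConnect: gipcWait failed with 16 (12)"
alert_line = "LMON (ospid: 25581): terminating the instance due to error 29702"

print(match_symptoms(trace_line))
print(match_symptoms(alert_line))
```

Because Symptom2 spans both the alert.log and a process trace file, it may help to run the check over each file separately.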
Common Symptoms
- In most cases, ocssd.bin shows high CPU utilization.
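One way to spot this is to filter process listing output for ocssd.bin. The sketch below assumes text captured from a command such as `ps -eo pid,pcpu,comm` on Linux, with a header line followed by PID, CPU percentage, and command name columns. The function name ocssd_cpu and the 50 percent threshold are hypothetical choices for illustration, not values from this note.

```python
def ocssd_cpu(ps_output, threshold=50.0):
    """Return (pid, pcpu) for ocssd.bin rows at or above threshold.

    Assumes three whitespace-separated columns (PID, %CPU, COMMAND)
    preceded by one header line, as `ps -eo pid,pcpu,comm` prints on Linux.
    """
    hits = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[2].strip() == "ocssd.bin":
            pid, pcpu = int(parts[0]), float(parts[1])
            if pcpu >= threshold:  # threshold is an illustrative value
                hits.append((pid, pcpu))
    return hits

# Made-up sample output for illustration only.
sample_ps = (
    "  PID %CPU COMMAND\n"
    " 4321 97.5 ocssd.bin\n"
    " 4400  1.2 oracle\n"
)

print(ocssd_cpu(sample_ps))
```

On Linux, the CPU percentage reported by ps is averaged over the lifetime of the process, so a sampling tool such as top may give a better picture of a recent spike.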
Changes
Cause