
Instance Stuck Due To Stuck RCBG Background Process (Doc ID 2908933.1)

Last updated on MAY 15, 2023

Applies to:

Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A to N/A
Information in this document applies to any platform.

Symptoms

Severe performance degradation was observed. The number of average active sessions climbed sharply during the problem period.

The wait chain information shows that sessions on other instances were being blocked by the RCBG process on instance #2. One of the predominant blocking sessions in the wait chain was from instance #2, and it corresponds to the Result Cache Background (RCBG) process.
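
A wait chain like the one described above can usually be confirmed while the problem is occurring with a query such as the following (an illustrative sketch against V$WAIT_CHAINS, not taken from this incident):

-- Illustrative only: list current wait chains and their blockers across instances.
SELECT chain_id, instance, sid, sess_serial#,
       blocker_instance, blocker_sid, blocker_sess_serial#,
       wait_event_text, in_wait_secs, num_waiters
FROM   v$wait_chains
ORDER  BY chain_id, in_wait_secs DESC;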

The database hung, blocked by the RCBG process, which was spinning and consuming high CPU.

The RCBG process was holding a latch while cleaning up result cache dependent objects, as shown in the process state dump below.

(post info) last post received: 0 0 47
last post received-location: ksr2.h LINE:695 ID:ksrpublish
last process to post me: 0xc19aec1e8 1 6
last post sent: 0 0 48
last post sent-location: ksr2.h LINE:699 ID:ksrmdone
last process posted by me: 0xc19aec1e8 1 6
waiter on post event: 0
(latch info) hold_bits=0x4 ud_influx=0x0
Holding:
0x6005ab40 'Result Cache: RC Latch' (level=2, SGA latch)
Holder Location: qesrc0.h LINE:3086 ID:Result Cache: Serialization25:
Holder Slot: slot=2, efd=4, pdb=1
Holder Context: 1
Latch State:
state=busy (exclusive) [value=0x1000000000000062]
holder orapid=98, ospid=82322
wlstate=free [value=0]
waiters [orapid (seconds since: put on list, posted, alive check)]:
510 (400, *, 400)
714 (400, *, 199)
269 (400, *, 368)
651 (400, *, 9)
516 (400, *, 250).  >>>>>>>>>> RCBG holding latch

and is blocked by
=> Oracle session identified by:
{
instance: 1 (cccw04.cccw041)
os id: 82322
process id: 98, oracle@cw02exa01vm01 (RCBG)
session id: 2947
session serial #: 34829
pdb id: 1 (CDB$ROOT)
}
which is not in a wait:
{
last wait: 9 min 40 sec ago
blocking: 3637 sessions
current sql: <none>
short stack:
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()
<-qesrcRO_Invalidate()+14<-qesrcDO_Invalidate()+1191<-qesrcRacMsg_INV()+517
<-qesrcRCBG_Main()+1320<-ksbrdp()+1079<-opirip()+609<-opidrv()+602<-sou2o()+145
<-opimai_real()+202<-ssthrdmain()+417<-main()+262<-__libc_start_main()+245
wait history:
1. event: 'wait for unread message on broadcast channel'
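
While the hang is in progress, the holder of the 'Result Cache: RC Latch' can also be cross-checked from SQL, provided the instance is still responsive enough to query (an illustrative sketch, not part of the original trace):

-- Illustrative only: identify sessions currently holding a result cache latch.
SELECT h.inst_id, h.sid, h.pid, h.name AS latch_name, s.program, s.status
FROM   gv$latchholder h
JOIN   gv$session s
  ON   s.inst_id = h.inst_id
 AND   s.sid     = h.sid
WHERE  h.name LIKE 'Result Cache%';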

A large amount of contention built up behind the RCBG process session over a number of minutes.

Final Blocker is Session ID 2 serial# 14330 on instance 2
which is 'not in a wait' for 384 seconds

The result cache was flushed.
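
Flushing the server result cache is normally done with the DBMS_RESULT_CACHE package; the exact commands used in this incident are not recorded in the note, but a typical sequence is:

-- Illustrative only: report on and then flush the server result cache.
SET SERVEROUTPUT ON
EXEC DBMS_RESULT_CACHE.MEMORY_REPORT;   -- show current result cache usage
EXEC DBMS_RESULT_CACHE.FLUSH;           -- release all result cache memory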

Attempts to terminate sessions were unsuccessful; PMON itself was having problems acquiring the latches needed to clean up the sessions. Eventually the problematic instance had to be forcefully aborted in order to terminate it.
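
The session termination attempts described above would use syntax similar to the following; the SID, serial#, and instance values below are placeholders, not values from this incident:

-- Illustrative syntax only: terminate a blocking session on a specific RAC instance.
ALTER SYSTEM KILL SESSION '1234,56789,@2' IMMEDIATE;
-- If PMON cannot acquire the latches needed to clean the session up, the kill may
-- never complete; in this incident the affected instance ultimately had to be
-- aborted (for example SHUTDOWN ABORT, or srvctl stop instance with the abort option).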

Cause
