HEAPDUMP Causes Long 'latch: shared pool' Waits and Instance Crash
(Doc ID 2552004.1)
Last updated on JULY 20, 2024
Applies to:
Oracle Database - Enterprise Edition - Version 11.2.0.4 and later
Information in this document applies to any platform.
Symptoms
- Oracle RDBMS version 11.2.0.4.
- To diagnose high 'KGLH0' usage, a heapdump was taken at level 536870914 (0x20000002):
oradebug dump heapdump 536870914
- The dump caused system-wide 'latch: shared pool' waits for a long time, and the instance crashed:
PMON failed to acquire latch, see PMON dump
Thu Jun 06 05:57:46 2019
NOTE: ASMB terminating
Errors in file <instanceid>_asmb_36831748.trc:
Thu Jun 06 05:57:46 2019
System state dump requested by (instance=1, osid=36831748 (ASMB)), summary=[abnormal instance termination].
System State dumped to trace file <instanceid>_diag_29820756_20190606055746.trc
Thu Jun 06 05:57:47 2019
ORA-1092 : opitsk aborting process
Thu Jun 06 05:57:47 2019
ORA-1092 : opitsk aborting process
Thu Jun 06 05:57:48 2019
ORA-1092 : opitsk aborting process
Thu Jun 06 05:57:49 2019
License high water mark = 2645
Instance terminated by ASMB, pid = 36831748
USER (ospid: 36702012): terminating the instance
Instance terminated by USER, pid = 36702012
Thu Jun 06 05:58:38 2019
- The heapdump process held the child shared pool latch for a long time while dumping, which blocked many other sessions:
holding (efd=5) 0x70000000010d7b0 Child shared pool level=7 child#=1
Location from where latch is held: kgh.h LINE:6787 ID:kghdmp:new: heap descriptor
Context saved from call: 504403158265866304
state=busy [holder orapid=169] wlstate=free [value=0]
waiters [orapid (seconds since: put on list, posted, alive check)]:
174 (70, 1559857648, 0) <===========70 seconds.
175 (70, 1559857648, 0)
178 (70, 1559857648, 0)
181 (65, 1559857648, 0)
182 (65, 1559857648, 0)
184 (65, 1559857648, 0)
40 (43, 1559857648, 0)
63 (39, 1559857648, 0)
185 (35, 1559857648, 0)
186 (9, 1559857648, 0)
187 (9, 1559857648, 0)
190 (5, 1559857648, 0)
189 (5, 1559857648, 0)
192 (5, 1559857648, 0)
waiter count=14
ksedsts()+240<-ksdxfstk()+44<-ksdxcb()+3432<-sspuser()+116<-__sighandler()<-_sigsetmask()+124<-sigthreadmask()+16<-slaac_int()+696<-slrac()+20<-kghdmpch()+2908<-kghiexdmp()+692<-kghidmp_new()+3068
<-kghdmp_new()+1364<-ksmhdm()+200<-ksdxdmp()+544<-ksdxfdmp()+132<-ksdxen_int()+3364<-ksdxen()+28<-opiodr()+908<-ttcpip()+1028<-opitsk()+1612<-opiino()+940<-opiodr()+908<-opidrv()+1132<-sou2o()+136
<-opimai_real()+560<-ssthrdmain()+276<-main()+204<-__start()+112
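The waiter list in the latch dump above can be summarised with a short script. The following is a minimal sketch, not an Oracle tool: it assumes the first number inside the parentheses is the "seconds since put on list" value named in the dump header, and the names parse_waiters and WAITER_DUMP are hypothetical.

```python
import re

# Waiter lines copied from the latch dump above. Per the dump header, each line is:
#   orapid (seconds since: put on list, posted, alive check)
WAITER_DUMP = """\
174 (70, 1559857648, 0) <===========70 seconds.
175 (70, 1559857648, 0)
178 (70, 1559857648, 0)
181 (65, 1559857648, 0)
182 (65, 1559857648, 0)
184 (65, 1559857648, 0)
40 (43, 1559857648, 0)
63 (39, 1559857648, 0)
185 (35, 1559857648, 0)
186 (9, 1559857648, 0)
187 (9, 1559857648, 0)
190 (5, 1559857648, 0)
189 (5, 1559857648, 0)
192 (5, 1559857648, 0)
"""

# Anchored at the start of the line only, so trailing annotations are ignored.
WAITER_RE = re.compile(r"^\s*(\d+)\s*\((\d+),\s*(\d+),\s*(\d+)\)")


def parse_waiters(text):
    """Return one dict per waiter line (hypothetical helper)."""
    waiters = []
    for line in text.splitlines():
        match = WAITER_RE.match(line)
        if match:
            orapid, secs_on_list, _posted, _alive = map(int, match.groups())
            waiters.append({"orapid": orapid, "secs_on_list": secs_on_list})
    return waiters


waiters = parse_waiters(WAITER_DUMP)
longest = max(waiters, key=lambda w: w["secs_on_list"])
print("waiter count:", len(waiters))
print("longest waiter: orapid", longest["orapid"], "for", longest["secs_on_list"], "seconds")
```

The count printed by the script should agree with the "waiter count" line in the dump, which is a quick check that no waiter line was missed.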
EXPECTED BEHAVIOR
-----------------------
Because the database version is 11.2.0.4, it should already include the fix for Bug 14053795 - DIAGENH: LOW RISK WAY TO GET HEAPDUMP DATA ON BUSY SYSTEMS.
On 11.2.0.4, heapdump should acquire and release the shared pool latch repeatedly, once for each batch of 1024 chunks, rather than hold it once for a long time.
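The batched behavior described here can be illustrated with a generic lock. This is only a conceptual sketch and not Oracle's implementation: threading.Lock stands in for the shared pool latch, a Python list stands in for the heap chunks, and dump_chunks and BATCH_SIZE are hypothetical names. It also ignores the fact that a real heap can change while the latch is released between batches.

```python
import threading

BATCH_SIZE = 1024  # chunks processed per latch hold, as stated in the text above


def dump_chunks(chunks, latch, emit, batch_size=BATCH_SIZE):
    """Dump all chunks, taking the latch once per batch instead of once overall.

    Returns the number of times the latch was acquired.
    """
    acquisitions = 0
    for start in range(0, len(chunks), batch_size):
        with latch:  # short hold: covers at most batch_size chunks
            acquisitions += 1
            for chunk in chunks[start:start + batch_size]:
                emit(chunk)
        # The latch is released here, so waiting sessions can acquire it
        # before the next batch begins.
    return acquisitions


latch = threading.Lock()        # stand-in for the child shared pool latch
dumped = []                     # stand-in for the trace file
n = dump_chunks(list(range(5000)), latch, dumped.append)
print("latch acquisitions:", n, "chunks dumped:", len(dumped))
```

With 5000 chunks the latch is taken five times for short periods, whereas the symptom in this note corresponds to a single hold that spans the entire dump.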
Cause