- alter database datafile resize hang on row cache lock in 2 nodes RAC environment.
- Hanganalyze trace shows that server process 37438 on the instance 1 was waiting for 'row cache lock', and it was the blocker for many of the other chains.
Chains most likely to have caused the hang:
[a] Chain 1 Signature: 'row cache lock'<='row cache lock'
Chain 1 Signature Hash: 0xb78e3290
[b] Chain 2 Signature: 'row cache lock'<='row cache lock'
Chain 2 Signature Hash: 0xb78e3290
[c] Chain 3 Signature: 'row cache lock'<='row cache lock'
Chain 3 Signature Hash: 0xb78e3290
... ...
Chain 1:
------------------------------------------------------------------------------
-
Oracle session identified by:
{
... ...
and is blocked by
=> Oracle session identified by:
{
instance: 1 (<DB Name>.<Instance Name>)
os id: 37438
process id: 70, oracle@<Hostname> (TNS V1-V3)
session id: 576
session serial #: 16638
}
which is waiting for 'row cache lock' with wait info:
{
p1: 'cache id'=0x6
p2: 'mode'=0x0
p3: 'request'=0x5
time in wait: 109 min 57 sec
timeout after: never
wait id: 183
blocking: 14 sessions
current sql: alter database datafile '<Path>/<Data File Name>.dbf' resize 1579M
short stack:
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-semtimedop()+10<-skgpwwait()+200<-ksliwat()+2280<-kslwaitctx()+197<-kjusuc()+9058
<-ksipgetctxia()+5359<-ksixpgetia()+167<-kqrInstanceLock()+863<-kqrGetClusterLock()+381<-kqrLockAndPinPo()+1509<-kqrpre1()+2960<-kqrpre()+21<-ktsfx_getfdesc
()+219<-ktsfxrsz()+3773<-dbsrsz()+2136<-adbdrv_options()+31218<-opiexe()+23533<-opiosq0()+4766<-kpooprx()+404<-kpoal8()+857<-opiodr()+1229<-ttcpip()+1239<-
opitsk()+1940<-opiino()+941<-opiodr()+1229<-opidrv()
... ...
}
and may or may not be blocked by another session.
- System state dump trace of instance 1 shows that the server process of datafile resizing was trying to get a row cache object at level X on the cache id 6 (dc_files)
SO: 0x2ccc4fd88, type: 102, owner: 0x3aa77f168, flag: INIT/-/-/-/0x00if: 0x1 c: 0x1
proc=0x3f8bc7d08, name=row cache enqueues, file=kqr.h LINE:2319,pg=0, conuid=0
row cache enqueue: count=1 session=0x410d67b18 object=0x3761f5e68,request=X
savepoint=0x21
type=MULTI-INSTANCE instance locked=F handle=0x3eb917e70
row cache parent object: addr=0x3761f5e68 cid=6(dc_files) conid=0 conuid=0
hash=567275a7 typ=61 transaction=(nil) flags=00000002 inc=1, pdbinc=1
own=0x3761f5f38[0x3761f5f38,0x3761f5f38] wat=0x3761f5f48[0x2ccc4fe48,0x22c5ef180] mode=N req=X
status=VALID/-/-/-/-/-/-/-/-
instance lock=QG 567275a7 538a681
data=
00000a82 00000e60 00000670 00000002 00000395 00000000 0000000002052ef3
00000e4c e5400002 00000000 00000000 00000100 567275a7 761f5e6800000003
761f6ab8 00000003 7eec7b68 00000003 00000000 00000000
- There are many processes trying to newly get the row cache lock QG 567275a7 538a681 at X. However, no one was holding the row cache lock at S or X on the instance.
$ grep -B2 'QG 567275a7 538a681' <Instance1_SID>_diag_7221.trc | grep 'mode=' |head -5
own=0x3761f5f38[0x3761f5f38,0x3761f5f38]
wat=0x3761f5f48[0x2ccc4fe48,0x22c5ef180] mode=N req=X
own=0x3761f5f38[0x3761f5f38,0x3761f5f38]
wat=0x3761f5f48[0x2ccc4fe48,0x22c5ef180] mode=N req=X
own=0x3761f5f38[0x3761f5f38,0x3761f5f38]
wat=0x3761f5f48[0x2ccc4fe48,0x22c5ef180] mode=N req=X
own=0x3761f5f38[0x3761f5f38,0x3761f5f38]
wat=0x3761f5f48[0x2ccc4fe48,0x22c5ef180] mode=N req=X
own=0x3761f5f38[0x3761f5f38,0x3761f5f38]
wat=0x3761f5f48[0x2ccc4fe48,0x22c5ef180] mode=N req=X
$ grep -B2 'QG 567275a7 538a681' <Instance1_SID>_diag_7221.trc | grep 'mode=' | grep -v 'mode=N'
$
- Hang Manager Trace of instance 1 shows that the GES enqueue at the head of the convert queue was convertint from KJUSERNL to KJUSEREX.
============================================================
SO: 0x3eb917e70, type: 42, owner: 0x3e13ef4b0, flag: INIT/-/-/-/0x00 if: 0x1 c: 0x1
proc=0x3f8bc7d08, name=ges enqueues, file=kjl.h LINE:673, pg=0, conuid=0
----------enqueue 0x3eb917e70------------------------
lock version : 47953
Owner inst : 1
grant_level : KJUSERNL <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
req_level : KJUSEREX <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
bast_level : KJUSERNL
notify_func : none
resp : 0x3f27607d8
procp : 0x3e13ef4b0
pid : 37438
proc version : 8282
oprocp : (nil)
opid : 37438
group lock owner : 0x410d67b18
possible pid : 37438 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
xid : 46000-0001-00008683
dd_time : 0.0 secs
dd_count : 0
timeout : 0.0 secs
On_timer_q? : N
On_dd_q? : N
sec since mv2grQ : N/A
lock_state : OPENING CONVERTING
ast_flag : 0x0
flags : 0x0
Open Options : KJUSERDEADLOCK
Convert options :
History : NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > REF_RES > GR2CVT
Msg_Seq : 0x0
res_seq : 7811
valblk : 0x0400000000000000c4e1161100000000 .
----------resource 0x3f27607d8----------------------
resname : [0x567275a7][0x538a681],[QG][ext 0x0,0x0][domid 0x0]
lmdid : 0
rht group : 0
rht ptr : 0x401996950
rht bucket idx: 9128
domain ptr : (nil)
hash mask : x7
Local inst : 1
dir_inst : 1
master_inst : 1
hv idx : 121
hv last r.inc : 0
current inc : 4
hv status : 0
hv master inst: 1
open options : deadlock detection=N, cached=N, varvblk=N, slock=Y
Held mode : KJUSERNL
Cvt mode : KJUSERNL
Next Cvt mode : KJUSERNL
msg_seq : (nil)
res_seq : 7811
grant_bits : KJUSERNL
grant mode : KJUSERNL KJUSERCR KJUSERCW KJUSERPR KJUSERPW KJUSEREX
count : 14 0 0 0 0 0
val_state : KJUSERVS_NOVALUE
valblk : 0x200000003000000080aa9086fc7f0000 0
access_inst : 1
vbreq_state : 0
state : x0
resp : 0x3f27607d8
entry : DIR=Y, MASTER=Y
On Scan_q? : N
On Cache? : N
On Remote_q? : N
history : x26
frozen : 0
Total accesses: 12015557
Imm. accesses: 12006909
SbcastID : 0 (0x0)
global Xreqid : 1 (0x1)
exclusive bit : off
bast instance : 32767
recent acts : GRANT_NL=N, GRANT_PR=N, GRANT_EX=N, GRANT_UNK=N, BAST=N, NEED_NOTIFY=N, PROC_FCR=Y, READY_GRANT=N
verify replies: 1
okXmap insts : 1
slknodes/w vb : x0
Granted_locks : 0
Cvting_locks : 14 value_block: 20 00 00 00 30 00 00 00 80 aa 90 86 fc 7f 00 00
GRANTED_Q:
CONVERT_Q:
lp 0x3eb917e70 gl KJUSERNL rl KJUSEREX rp 0x3f27607d8 [0x567275a7][0x538a681],[QG][ext 0x0,0x0][domid 0x0]
master 1 gl owner 0x410d67b18 possible pid 37438 xid 46000-0001-00008683 rseq 7811 mseq 0 bast none
history NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > REF_RES > GR2CVT
convert opt flags 0x0 lp 0x3ebc6a568 gl KJUSERNL rl KJUSERPR rp 0x3f27607d8 [0x567275a7][0x538a681],[QG][ext 0x0,0x0][domid 0x0]
master 1 gl owner 0x410eea4f8 possible pid 4323 xid 62000-0001-0000B11C rseq 7811 mseq 0 bast none
history NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > REF_RES > GR2CVT
convert opt flags 0x0 lp 0x403177708 gl KJUSERNL rl KJUSERPR rp 0x3f27607d8 [0x567275a7][0x538a681],[QG][ext 0x0,0x0][domid 0x0]
master 1 gl owner 0x3e8f5ef18 possible pid 5519 xid 64000-0001-0000A01A rseq 7811 mseq 0 bast none
history NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > REF_RES > GR2CVT
convert opt flags 0x0 lp 0x3f3f325f0 gl KJUSERNL rl KJUSERPR rp 0x3f27607d8 [0x567275a7][0x538a681],[QG][ext 0x0,0x0][domid 0x0]
... ...
convert opt flags 0x0 lp 0x3f3cb8240 gl KJUSERNL rl KJUSERPR rp 0x3f27607d8 [0x567275a7][0x538a681],[QG][ext 0x0,0x0][domid 0x0]
master 1 gl owner 0x3f8df4408 possible pid 27561 xid 50000-0001-00008B80 rseq 7811 mseq 0 bast none
history NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > REF_RES > GR2CVT
============================================================
- The LMD0 trace on instance 2 shows that an owner 0x408f8a130 was holding the GES enqueue [0x567275a7][0x538a681],[QG] at KJUSERPR and the 'sec since mv2grQ' for the GES enqueue was 568878 seconds ( 6 days 14 hours 1 minutes 18 seconds )
============================================================
*** 2021-12-12T09:06:41.333823+08:00
ENQUEUE DUMP REQUEST: from 1 spnum 115 on [0x567275a7][0x538a681],[QG][ext
0x0,0x0][domid 0x0] for reason 2 mtype 0
DUMP LOCAL BLOCKER/HOLDER: block level 3 res [0x567275a7][0x538a681],[QG][ext
0x0,0x0][domid 0x0]
----------resource 0xb81c4628----------------------
resname : [0x567275a7][0x538a681],[QG][ext 0x0,0x0][domid 0x0]
lmdid : 0
rht group : 0
rht ptr : 0x401996950
rht bucket idx: 9128
domain ptr : (nil)
hash mask : x7
Local inst : 2
dir_inst : 1
master_inst : 1
hv idx : 121
hv last r.inc : 0
current inc : 4
hv status : 0
hv master inst: 1
open options : deadlock detection=N, cached=N, varvblk=N, slock=Y
Held mode : KJUSERPR
Cvt mode : KJUSERNL
Next Cvt mode : KJUSEREX
msg_seq : (nil)
res_seq : 29525
grant_bits : KJUSERPR
grant mode : KJUSERNL KJUSERCR KJUSERCW KJUSERPR KJUSERPW KJUSEREX
count : 0 0 0 1 0 0
val_state : KJUSERVS_NOVALUE
valblk : 0xc092401200000000b29a7a1100000000 .@z
access_inst : 2
vbreq_state : 0
state : x0
resp : 0xb81c4628
entry : DIR=N, MASTER=N
On Scan_q? : N
On Cache? : N
On Remote_q? : N
history : x26
frozen : 0
Total accesses: 11490356
Imm. accesses: 11459398
SbcastID : 0 (0x0)
request uid : 511901697 (0x1e830001)
exclusive bit : off
bast instance : 1
Granted_locks : 1
Cvting_locks : 0
value_block: c0 92 40 12 00 00 00 00 b2 9a 7a 11 00 00 00 00
GRANTED_Q:
lp 0x3ebd37858 gl KJUSERPR rp 0xb81c4628 [0x567275a7][0x538a681],[QG][ext
0x0,0x0][domid 0x0]
master 1 gl owner 0x408f8a130 possible pid 36352 xid 33000-0002-0001046B
rseq 29525 mseq 0 bast none
history NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE >
REF_RES > LOC_AST
open opt KJUSERDEADLOCK flags 0x1 sec since mv2grQ 567143
CONVERT_Q:
----------enqueue 0x3ebd37858------------------------
lock version : 63612361
Owner inst : 2
grant_level : KJUSERPR
req_level : KJUSERPR
bast_level : KJUSERNL
notify_func : none
resp : 0xb81c4628
procp : 0x3e141e760
pid : 54239
proc version : 63002
oprocp : (nil)
opid : 54239
group lock owner : 0x408f8a130
possible pid : 36352
xid : 33000-0002-0001046B
dd_time : 0.0 secs
dd_count : 0
timeout : 0.0 secs
On_timer_q? : N
On_dd_q? : N
sec since mv2grQ : 568878
lock_state : GRANTED
ast_flag : 0x0
flags : 0x1
Open Options : KJUSERDEADLOCK
Convert options :
History : NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE > REF_RES > LOC_AST
Msg_Seq : 0x0
res_seq : 29525
valblk : 0x00000000e5070c000c00090023002400 .#$
----------enqueue 0x402c807e8------------------------
lock version : 290354633
Owner inst : 2
grant_level : KJUSERNL
req_level : KJUSERPR
bast_level : KJUSERNL
notify_func : none
resp : 0xb81c4628
procp : 0x3e1405c10
pid : 50164
proc version : 75518
oprocp : (nil)
opid : 50164
group lock owner : 0x3f0c9d0a8
possible pid : 50164
xid : 70000-0002-0005316A
dd_time : 0.0 secs
dd_count : 0
timeout : 0.0 secs
On_timer_q? : N
On_dd_q? : N
sec since mv2grQ : N/A
lock_state : OPENING CONVERTING
ast_flag : 0x0
flags : 0x0
Open Options : KJUSERDEADLOCK
Convert options :
History : NONE > NONE > NONE > NONE > NONE > NONE > NONE > NONE >
NONE > REF_RES > GR2CVT
Msg_Seq : 0x0
res_seq : 29525
valblk : 0x00000000e5070c000c00090023002400
============================================================
- The CLMN process on the instance 2 tried to clean it up at 2021-12-05T19:34:37
============================================================
kqrReleaseLock cleaning lock handle 0x3ebd37858
*** 2021-12-05T19:34:37.077629+08:00
2021-12-05 19:34:37.077*:kjlsod(): abort cleanup of lockp 0x3ebd37858, lock
has !kss-owner, it must have been freed or reused as group-lock
2021-12-05 19:34:37.077*:kjlsod(): possible new owner: procp 0x3e141e760,
orapid 0x3e8c0b4d0, ospid 53255
============================================================
Sessions killed or exit abnormally before.