DB/ASM hangs while adding disks into diskgroup (Doc ID 2230457.1)

Last updated on FEBRUARY 08, 2017

Applies to:

Oracle Database - Enterprise Edition - Version 10.2.0.3 to 11.1.0.7 [Release 10.2 to 11.1]
Information in this document applies to any platform.

Symptoms

ASM disks hung is occurred  while adding into diskgroup.

ASM alert log:

Sat Feb 4 23:10:44 2017
SQL> ALTER DISKGROUP DBDATA_GSDB ADD DISK 'ORCL:ASM1213','ORCL:ASM1214','ORCL:ASM1215','ORCL:ASM1216' REBALANCE POWER 11
Sat Feb 4 23:10:44 2017
NOTE: reconfiguration of group 4/0x202d1a93 (DBDATA_GSDB), full=1
Sat Feb 4 23:10:44 2017
NOTE: initializing header on grp 4 disk ASM1213
NOTE: initializing header on grp 4 disk ASM1214
NOTE: initializing header on grp 4 disk ASM1215
NOTE: initializing header on grp 4 disk ASM1216
NOTE: cache opening disk 55 of grp 4: ASM1213 label:ASM1213
NOTE: cache opening disk 56 of grp 4: ASM1214 label:ASM1214
NOTE: cache opening disk 57 of grp 4: ASM1215 label:ASM1215
NOTE: cache opening disk 58 of grp 4: ASM1216 label:ASM1216
NOTE: PST update: grp = 4
Sat Feb 4 23:44:02 2017
GES: Potential blocker (pid=9962) on resource DG-00000000-00000000;
enqueue info in file /ora01/app/oracle/admin/+ASM/udump/+asm1_ora_5375.trc and DIAG trace file

Rebalance is not occurring from gv$asm_operation

SQL> select * from gv$asm_operation;

INST_ID GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES
---------- ------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
1 4 DSCV WAIT 0 0 0 0 0 0
1 4 RFRSH WAIT 0 0 0 0 0 0
3 4 DSCV WAIT 0 0 0 0 0 0
3 4 RFRSH WAIT 0 0 0 0 0 0
2 4 DSCV WAIT 0 0 0 0 0 0
2 4 RFRSH WAIT 0 0 0 0 0 0

 

 RBAL trace show the following message repeatedly.

kfgbDoorBellBast: BAST release invoked, gn=4
kfgbDoorBellBast: BAST released, gn=4
kfgbTryFn: failed to acquire DD.0.0 in 6 for kfgbDiscoverNow (of group 4/0x202d1a93)
kfgbTryFn: failed to acquire DD.0.0 in 6 for kfgbDiscoverNow (of group 4/0x202d1a93)
kfgbTryFn: failed to acquire DD.0.0 in 6 for kfgbDiscoverNow (of group 4/0x202d1a93) 

 Instances are being blocked by GMON, which is waiting for CSSD:

+asm2_diag_16243.trc:

*** 2017-02-04 23:42:45.428

Dumping diagnostic information for ospid 16293:
OS pid = 16293
loadavg : 4.10 3.81 4.11
memory info: free memory = 0.00M
swap info: free = 0.00M alloc = 0.00M total = 0.00M
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
0 S oracle 16293 1 0 76 0 - 86990 - 2016 ? 00:00:13 asm_gmon_+ASM2
(no debugging symbols found)
Using host libthread_db library "/lib64/tls/libthread_db.so.1".
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 182921290944 (LWP 16293)]
0x0000003c1e9c08ef in poll () from /lib64/tls/libc.so.6
#0 0x0000003c1e9c08ef in poll () from /lib64/tls/libc.so.6
#1 0x0000000004ca2f53 in ntevpque ()
#2 0x0000000004ca0505 in ntevque ()
#3 0x0000000004c75af2 in nsevwait ()
#4 0x0000002a956d32c1 in clsc_nswait (ugblm=0x66c8280, cxdli=0x2,
cxdc=0xffffffffffffffff, poll=-1) at clsc.c:5269
#5 0x0000002a956d046b in clsc_select_ext (ugblm=0x66c8280, hdlr=0x2,
evpoll=-1, noreap=-1) at clsc.c:6653
#6 0x0000002a956ceaf6 in clsc_receive_wait (con=0x66c8280, msgbuf=0x2,
msgbufl=4294967295, to_ms=4294967295, pollint_ms=1, block=1) at clsc.c:4501
#7 0x0000002a956ce6fb in clscreceive (con=0x66c8280, msgbuf=0x2,
msgbufl=4294967295, to_ms=4294967295, pollint_ms=1, flags=1) at clsc.c:3119
#8 0x0000002a956b5166 in clsssRecvMsg (ctx=0x66c8280, conm=0x2,
resptyp=4294967295, respMsg=0xffffffffffffffff, respMsgl=1, block=1)
at clsss.c:2226
#9 0x0000002a956b6758 in clsssServerRPC (ctx=0x66c8280, conm=0x2,
reqMsg=0xffffffffffffffff, resptyp=4294967295, respMsg=0x1, respMsgl=1,
wait_ms=10000) at clsss.c:2105
#10 0x0000002a956af0a0 in clssgsGetMemberPublicInfo (ctx=0x66c8280,
GroupName=0x2 <Address 0x2 out of bounds>, MemberNum=4294967295,
Info=0xffffffffffffffff, InfoBufSize=1, InfoSize=0x1) at clssgs.c:2591
#11 0x0000002a956aedf4 in clssgspubdata (ose=0x66c8280, ctx=0x2,
grpname=0xffffffffffffffff <Address 0xffffffffffffffff out of bounds>,
memno=4294967295, pub_data=0x1, bufsz=1, len=0x7fbfffdfa0, flags=1)
at clssgs.c:1175

#12 0x0000000003a647ed in kgxgnpbdata ()
#13 0x0000000002524791 in kfmsReconfPriv ()
#14 0x00000000025234d8 in kfmdbgTimeout ()
#15 0x000000000251f246 in kfmdbgDriver ()
#16 0x00000000011d2bd1 in ksbcti ()
#17 0x00000000011d2604 in ksbabs ()
#18 0x00000000011d9212 in ksbrdp ()
#19 0x000000000238363c in opirip ()
#20 0x0000000000746096 in opidrv ()
#21 0x000000000074451e in sou2o ()
#22 0x000000000070a575 in opimai_real ()
#23 0x000000000070a42c in main ()
*** 2017-02-04 23:42:48.398
----------------------------------------
SO: 0x6b926f28, type: 2, owner: (nil), flag: INIT/-/-/0x00
(process) Oracle pid=14, calls cur/top: 0x6b49d5e8/0x6b49d5e8, flag: (6) SYSTEM
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 56
last post received-location: kjatb: wake up enqueue blocker              <<<<<<<<
last process to post me: 6b922fe8 1 6
last post sent: 0 0 25
last post sent-location: ksasnr
last process posted by me: 6b927ef8 8 0
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 0x6b9e7558
O/S info: user: oracle, term: UNKNOWN, ospid: 16293
OSD pid info: Unix process pid: 16293, image: oracle@lnx506 (GMON)         <<<<<<

 

+asm1_diag_21464.trc:

SO: 0x6b44c930, type: 4, owner: 0x6b92a680, flag: INIT/-/-/0x00
(session) sid: 427 trans: 0x6aba5ad8, creator: 0x6b92a680, flag: (80000041) USR/- BSY/-/-/-/-/-
DID: 0000-0015-002B070A, short-term DID: 0000-0000-00000000
txn branch: (nil)
oct: 0, prv: 0, sql: (nil), psql: 0x66eb0ea8, user: 0/SYS
O/S info: user: oracle, term: pts/1, ospid: 9948, machine: lnx505
program: sqlplus@lnx505 (TNS V1-V3)
application name: sqlplus@lnx505 (TNS V1-V3), hash value=0
waiting for 'rdbms ipc reply' blocking sess=0x(nil) seq=1025 wait_time=0 seconds since wait started=0
from_process=e, timeout=7ffffe71, =0
Dumping Session Wait History
for 'rdbms ipc reply' count=1 wait_time=1318743
from_process=e, timeout=7ffffe71, =0
for 'rdbms ipc reply' count=1 wait_time=1953899
from_process=e, timeout=7ffffe73, =0
for 'rdbms ipc reply' count=1 wait_time=1953899
from_process=e, timeout=7ffffe75, =0

 

This below output can be checked from the sql attached on this note - asm_blocking.sql

BLOCKER WAITER TY ID1 ID2 LMODE REQUEST
-------- --------- -- ---------- ---------- ---------- ----------
1/138 3/90 DD 0 0 6 6
1/138 3/94 DD 0 0 6 6
1/138 3/92 DD 0 0 6 6
1/138 3/97 DD 0 0 6 6
1/138 3/101 DD 0 0 6 6
1/138 3/91 DD 0 0 6 6
1/138 3/95 DD 0 0 6 6
1/138 3/114 DD 0 0 6 6
2/138 3/94 DD 0 0 6 6
2/138 3/92 DD 0 0 6 6
2/138 3/91 DD 0 0 6 6
2/138 3/90 DD 0 0 6 6
2/138 3/114 DD 0 0 6 6
2/138 3/101 DD 0 0 6 6
2/138 3/97 DD 0 0 6 6
2/138 3/95 DD 0 0 6 6
3/138 3/90 DD 0 0 6 6
3/138 3/91 DD 0 0 6 6
3/138 3/101 DD 0 0 6 6
3/138 3/97 DD 0 0 6 6
3/138 3/92 DD 0 0 6 6
3/138 3/114 DD 0 0 6 6
3/138 3/95 DD 0 0 6 6
3/138 3/94 DD 0 0 6 6

24 rows selected.

ASM1 :

BLOCKER WAITER TY ID1 ID2 LMODE REQUEST
-------- --------- -- ---------- ---------- ---------- ----------
1/427 1/412 DG 0 0 6 4
1/427 1/424 DG 0 0 6 4
1/427 1/428 DG 0 0 6 4
1/427 1/400 DG 0 0 6 4
1/427 1/402 DG 0 0 6 4

Note that ID1 and ID2 is "0", i.e DD-00000000-00000000 and LMODE is "6" which is exclusive mode.

 

I SID S# PID USERNAME
---------- ---------- ---------- ---------- ------------------------------
CS_PID STATUS
------------------------- --------
PROGRAM OSUSER
------------------------------------------------ ------------------------------
WAIT_EVENT WAITSEC P1
------------------------------ ---------- ----------
P2
----------------------------------------
SQL
--------------------------------------------------------------------------------
enq: DG - contention 9 1145503748

 

Cause

Sign In with your My Oracle Support account

Don't have a My Oracle Support account? Click to get started

My Oracle Support provides customers with access to over a
Million Knowledge Articles and hundreds of Community platforms