Large ACFS Volumn: AsmResilver1 using high amount of CPU
(Doc ID 2071755.1)
Last updated on JULY 29, 2021
Applies to:
Oracle Database - Enterprise Edition - Version 11.2.0.2 and laterInformation in this document applies to any platform.
Symptoms
ASM resilvering processes consuming high CPU.
Top output:
Node 1:
15441 root 20 0 0 0 0 R 82 0.0 16579:52 asmResilver1
Node 2:
44838 root 20 0 0 0 0 R 80 0.0 16340:52 asmResilver1
Node:1 Clock: '09-02-15 09.09.24' SerialNo:238905
----------------------------------------
SYSTEM:
#pcpus: 4 #vcpus: 48 cpuht: Y chipname: AMD cpu: 4.1 cpuq: 0 physmemfree: 39126824;';3:Time=09-02-15 09.09.24, Available memory (physmemfree 39126824 KB + swapfree 16780188 KB) on node l9804022 is Too Low (< 10% of total-mem + total-swap)' physmemtotal: 1045233392 mcache: 175610424 swapfree: 16780188 swaptotal: 16780284 ior: 117887 iow: 118716 ios: 401 swpin: 0 swpout: 0 pgin: 39227 pgout: 39560 netr: 2457.645 netw: 2519.615 procs: 1593 procsoncpu: 5 rtprocs: 123 rtprocsoncpu: 4294967295 #fds: 81024 #sysfdlimit: 6815744 #disks: 1265 #nics: 7 nicErrors: 0
TOP CONSUMERS:
topcpu: 'asmResilver1(15441) 80.24' topprivmem: 'java(37808) 1369928' topshm: 'asm_mmnl_+ASM1(8494) 127648' topfd: 'multipathd(32656) 1016' topthread: 'java(30594) 217'
PROCESSES:
name: 'klogd' pid: 24290 #procfdlimit: 1024 cpuusage: 0.0 privmem: 304 shm: 360 #fd: 2 #threads: 1 priority: 20 nice: 0
name: 'asmResilver1' pid: 15441 #procfdlimit: 1024 cpuusage: 80.24 privmem: 0 shm: 0 #fd: 2 #threads: 1 priority: 20 nice: 0
File_name :: node1_CHMOS
node1:/proc/15441 # pgrep -l Resilver
15441 asmResilver1
node1:/proc/15441 # cat stack
[] asmPhysIoRequest+0x35a/0x3a1 [oracleadvm]
[] common_interrupt+0xe/0x13
[] Asm_getNextRegion+0x109/0x14a [oracleadvm]
[] Asm_resilverVolume+0x4e6/0xefe [oracleadvm]
[] ASM_LOG+0x61/0x66 [oracleadvm]
[] Asm_startRecovery+0xa11/0xce8 [oracleadvm]
[] Asm_volRecovery+0x133/0x17c [oracleadvm]
[] Asm_volRecovery+0x0/0x17c [oracleadvm]
[] KsKthreadRun+0x79/0xb0 [oracleoks]
[] kernel_thread_helper+0x4/0x10
[] KsKthreadRun+0x0/0xb0 [oracleoks]
[] kernel_thread_helper+0x0/0x10
[] 0xffffffffffffffff
V 17182507.715/150915205543 asmResilver2[45549] Asm_resilverVolume: vol ACFS_VOL1 336415 IOs, 336414 regions, 839 secs
K 17182507.715/150915205543 oks_comm[27033] KsRecv: recv failed -104
K 17182507.715/150915205543 oks_comm[27033] odlm_comm_worker, KsRecv returns -4, comm thread exiting
K 17182507.715/150915205543 oks_rbld[16327] [Oracle OKS] rebuild: client: 1, step: 10/10, elapsed: 14
K 17182507.715/150915205543 oks_rbld[16327] kcss_rbld_start: thread 0xffff887f7e9370c0 ch 0xffffffffa05bc948 client id 0
K 17182507.715/150915205543 oks_rbld[16327] odlm_comm_close: Shutdown chan 2 to node 2
K 17182507.715/150915205543 oks_rbld[16326] rcbp 0xffffffffa05bca40 step 1, cur_leader 0x1 incarn 0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigStart: reason=5 seqnum=0x1e
K 17182507.715/150915205543 oks_rbld[16326] [Oracle OKS] rebuild: client: 0, step: 1/2, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16326] rcbp 0xffffffffa05bca40 step 2, cur_leader 0x1 incarn 0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigDone: reason=5 seqnum=0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigDone: reason=5 seqnum=0x1e
V 17182507.715 asmResilver2[27039] Asm_startRecovery: vol ACFS_VOL1 recovery mastered elsewhere.
V 17182507.715 asmResilver2[27039] Asm_startRecovery: vol ACFS_VOL1 mirror recovery thread exiting
K 17182507.715/150915205543 oks_rbld[16326] [Oracle OKS] rebuild: client: 0, step: 2/2, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16326] kcss_rbld_start: thread 0xffff886bb2536b40 ch 0xffffffffa05bca18 client id 2
K 17182507.715/150915205543 oks_rbld[16326] odlm_comm_close: Shutdown chan 3 to node 2
K 17182507.715/150915205543 oks_rbld[16328] rcbp 0xffffffffa05bcac0 step 2, cur_leader 0x1 incarn 0x1e
F 17182507.715/150915205543 oks_rbld[16328] OfsCompleteRecoverySetup: incarn 0x1e reason 5
F 17182507.715/150915205543 oks_rbld[16328] OfsCompleteRecoverySetup: Exit status 0x0
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] rebuild: client: 2, step: 2/3, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16328] rcbp 0xffffffffa05bcac0 step 3, cur_leader 0x1 incarn 0x1e
F 17182507.715/150915205543 oks_rbld[16328] OfsEnableInterfaces: ncarn 0x1e reason 5 stall 1
F 17182507.715/150915205543 oks_rbld[16328] OfsEnableInterfaces: Interfaces enabled
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] rebuild: client: 2, step: 3/3, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] Cluster Membership change setup complete
K 17182507.715/150915205543 oks_rbld[16328] odlm_comm_close: Shutdown chan 4 to node 2
V 17182507.987 asmResilver1[27061] Asm_acqRecovery: DGA.ACFS_VOL1 flags(0x4177) value block VALID
V 17182507.987 asmResilver1[27061] Asm_startRecovery: read valid valb 1
V 17182508.123 asmResilver1[27061] Asm_startRecovery: recovery needed.
File_name :: oks_RV3C.log (acfsData_node1_20150921_1327)
----------------------------------------
SYSTEM:
#pcpus: 4 #vcpus: 48 cpuht: Y chipname: AMD cpu: 4.1 cpuq: 0 physmemfree: 39126824;';3:Time=09-02-15 09.09.24, Available memory (physmemfree 39126824 KB + swapfree 16780188 KB) on node l9804022 is Too Low (< 10% of total-mem + total-swap)' physmemtotal: 1045233392 mcache: 175610424 swapfree: 16780188 swaptotal: 16780284 ior: 117887 iow: 118716 ios: 401 swpin: 0 swpout: 0 pgin: 39227 pgout: 39560 netr: 2457.645 netw: 2519.615 procs: 1593 procsoncpu: 5 rtprocs: 123 rtprocsoncpu: 4294967295 #fds: 81024 #sysfdlimit: 6815744 #disks: 1265 #nics: 7 nicErrors: 0
TOP CONSUMERS:
topcpu: 'asmResilver1(15441) 80.24' topprivmem: 'java(37808) 1369928' topshm: 'asm_mmnl_+ASM1(8494) 127648' topfd: 'multipathd(32656) 1016' topthread: 'java(30594) 217'
PROCESSES:
name: 'klogd' pid: 24290 #procfdlimit: 1024 cpuusage: 0.0 privmem: 304 shm: 360 #fd: 2 #threads: 1 priority: 20 nice: 0
name: 'asmResilver1' pid: 15441 #procfdlimit: 1024 cpuusage: 80.24 privmem: 0 shm: 0 #fd: 2 #threads: 1 priority: 20 nice: 0
File_name :: node1_CHMOS
node1:/proc/15441 # pgrep -l Resilver
15441 asmResilver1
node1:/proc/15441 # cat stack
[] asmPhysIoRequest+0x35a/0x3a1 [oracleadvm]
[] common_interrupt+0xe/0x13
[] Asm_getNextRegion+0x109/0x14a [oracleadvm]
[] Asm_resilverVolume+0x4e6/0xefe [oracleadvm]
[] ASM_LOG+0x61/0x66 [oracleadvm]
[] Asm_startRecovery+0xa11/0xce8 [oracleadvm]
[] Asm_volRecovery+0x133/0x17c [oracleadvm]
[] Asm_volRecovery+0x0/0x17c [oracleadvm]
[] KsKthreadRun+0x79/0xb0 [oracleoks]
[] kernel_thread_helper+0x4/0x10
[] KsKthreadRun+0x0/0xb0 [oracleoks]
[] kernel_thread_helper+0x0/0x10
[] 0xffffffffffffffff
V 17182507.715/150915205543 asmResilver2[45549] Asm_resilverVolume: vol ACFS_VOL1 336415 IOs, 336414 regions, 839 secs
K 17182507.715/150915205543 oks_comm[27033] KsRecv: recv failed -104
K 17182507.715/150915205543 oks_comm[27033] odlm_comm_worker, KsRecv returns -4, comm thread exiting
K 17182507.715/150915205543 oks_rbld[16327] [Oracle OKS] rebuild: client: 1, step: 10/10, elapsed: 14
K 17182507.715/150915205543 oks_rbld[16327] kcss_rbld_start: thread 0xffff887f7e9370c0 ch 0xffffffffa05bc948 client id 0
K 17182507.715/150915205543 oks_rbld[16327] odlm_comm_close: Shutdown chan 2 to node 2
K 17182507.715/150915205543 oks_rbld[16326] rcbp 0xffffffffa05bca40 step 1, cur_leader 0x1 incarn 0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigStart: reason=5 seqnum=0x1e
K 17182507.715/150915205543 oks_rbld[16326] [Oracle OKS] rebuild: client: 0, step: 1/2, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16326] rcbp 0xffffffffa05bca40 step 2, cur_leader 0x1 incarn 0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigDone: reason=5 seqnum=0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigDone: reason=5 seqnum=0x1e
V 17182507.715 asmResilver2[27039] Asm_startRecovery: vol ACFS_VOL1 recovery mastered elsewhere.
V 17182507.715 asmResilver2[27039] Asm_startRecovery: vol ACFS_VOL1 mirror recovery thread exiting
K 17182507.715/150915205543 oks_rbld[16326] [Oracle OKS] rebuild: client: 0, step: 2/2, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16326] kcss_rbld_start: thread 0xffff886bb2536b40 ch 0xffffffffa05bca18 client id 2
K 17182507.715/150915205543 oks_rbld[16326] odlm_comm_close: Shutdown chan 3 to node 2
K 17182507.715/150915205543 oks_rbld[16328] rcbp 0xffffffffa05bcac0 step 2, cur_leader 0x1 incarn 0x1e
F 17182507.715/150915205543 oks_rbld[16328] OfsCompleteRecoverySetup: incarn 0x1e reason 5
F 17182507.715/150915205543 oks_rbld[16328] OfsCompleteRecoverySetup: Exit status 0x0
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] rebuild: client: 2, step: 2/3, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16328] rcbp 0xffffffffa05bcac0 step 3, cur_leader 0x1 incarn 0x1e
F 17182507.715/150915205543 oks_rbld[16328] OfsEnableInterfaces: ncarn 0x1e reason 5 stall 1
F 17182507.715/150915205543 oks_rbld[16328] OfsEnableInterfaces: Interfaces enabled
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] rebuild: client: 2, step: 3/3, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] Cluster Membership change setup complete
K 17182507.715/150915205543 oks_rbld[16328] odlm_comm_close: Shutdown chan 4 to node 2
V 17182507.987 asmResilver1[27061] Asm_acqRecovery: DGA.ACFS_VOL1 flags(0x4177) value block VALID
V 17182507.987 asmResilver1[27061] Asm_startRecovery: read valid valb 1
V 17182508.123 asmResilver1[27061] Asm_startRecovery: recovery needed.
File_name :: oks_RV3C.log (acfsData_node1_20150921_1327)
Volumn was sized almost 3.5 Tb
Doing recovery of volumn repeatedly.
Changes
Cause
To view full details, sign in with your My Oracle Support account. |
|
Don't have a My Oracle Support account? Click to get started! |
In this Document
Symptoms |
Changes |
Cause |
Solution |
References |