Large ACFS Volumn: AsmResilver1 using high amount of CPU (Doc ID 2071755.1)

Last updated on OCTOBER 28, 2015

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.2 and later
Information in this document applies to any platform.

Symptoms

ASM resilvering processes consuming high CPU. 

Top output:

Node 1:
15441 root      20   0     0    0    0 R     82  0.0  16579:52 asmResilver1


Node 2:
44838 root      20   0     0    0    0 R     80  0.0  16340:52 asmResilver1

 

 

  

Node:1 Clock: '09-02-15 09.09.24' SerialNo:238905 
----------------------------------------

SYSTEM:
#pcpus: 4 #vcpus: 48 cpuht: Y chipname: AMD cpu: 4.1 cpuq: 0 physmemfree: 39126824;';3:Time=09-02-15 09.09.24, Available memory (physmemfree 39126824 KB + swapfree 16780188 KB) on node l9804022 is Too Low (< 10% of total-mem + total-swap)' physmemtotal: 1045233392 mcache: 175610424 swapfree: 16780188 swaptotal: 16780284 ior: 117887 iow: 118716 ios: 401 swpin: 0 swpout: 0 pgin: 39227 pgout: 39560 netr: 2457.645 netw: 2519.615 procs: 1593 procsoncpu: 5 rtprocs: 123 rtprocsoncpu: 4294967295 #fds: 81024 #sysfdlimit: 6815744 #disks: 1265 #nics: 7 nicErrors: 0

TOP CONSUMERS:
topcpu: 'asmResilver1(15441) 80.24' topprivmem: 'java(37808) 1369928' topshm: 'asm_mmnl_+ASM1(8494) 127648' topfd: 'multipathd(32656) 1016' topthread: 'java(30594) 217'

PROCESSES:

name: 'klogd' pid: 24290 #procfdlimit: 1024 cpuusage: 0.0 privmem: 304 shm: 360 #fd: 2 #threads: 1 priority: 20 nice: 0
name: 'asmResilver1' pid: 15441 #procfdlimit: 1024 cpuusage: 80.24 privmem: 0 shm: 0 #fd: 2 #threads: 1 priority: 20 nice: 0

File_name :: node1_CHMOS


node1:/proc/15441 # pgrep -l Resilver
15441 asmResilver1
node1:/proc/15441 # cat stack
[] asmPhysIoRequest+0x35a/0x3a1 [oracleadvm]
[] common_interrupt+0xe/0x13
[] Asm_getNextRegion+0x109/0x14a [oracleadvm]
[] Asm_resilverVolume+0x4e6/0xefe [oracleadvm]
[] ASM_LOG+0x61/0x66 [oracleadvm]
[] Asm_startRecovery+0xa11/0xce8 [oracleadvm]
[] Asm_volRecovery+0x133/0x17c [oracleadvm]
[] Asm_volRecovery+0x0/0x17c [oracleadvm]
[] KsKthreadRun+0x79/0xb0 [oracleoks]
[] kernel_thread_helper+0x4/0x10
[] KsKthreadRun+0x0/0xb0 [oracleoks]
[] kernel_thread_helper+0x0/0x10
[] 0xffffffffffffffff



V 17182507.715/150915205543 asmResilver2[45549] Asm_resilverVolume: vol ACFS_VOL1 336415 IOs, 336414 regions, 839 secs
K 17182507.715/150915205543 oks_comm[27033] KsRecv: recv failed -104
K 17182507.715/150915205543 oks_comm[27033] odlm_comm_worker, KsRecv returns -4, comm thread exiting
K 17182507.715/150915205543 oks_rbld[16327] [Oracle OKS] rebuild: client: 1, step: 10/10, elapsed: 14
K 17182507.715/150915205543 oks_rbld[16327] kcss_rbld_start: thread 0xffff887f7e9370c0 ch 0xffffffffa05bc948 client id 0
K 17182507.715/150915205543 oks_rbld[16327] odlm_comm_close: Shutdown chan 2 to node 2
K 17182507.715/150915205543 oks_rbld[16326] rcbp 0xffffffffa05bca40 step 1, cur_leader 0x1 incarn 0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigStart: reason=5 seqnum=0x1e
K 17182507.715/150915205543 oks_rbld[16326] [Oracle OKS] rebuild: client: 0, step: 1/2, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16326] rcbp 0xffffffffa05bca40 step 2, cur_leader 0x1 incarn 0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigDone: reason=5 seqnum=0x1e
V 17182507.715 oks_rbld[16326] AsmReconfigDone: reason=5 seqnum=0x1e
V 17182507.715 asmResilver2[27039] Asm_startRecovery: vol ACFS_VOL1 recovery mastered elsewhere.
V 17182507.715 asmResilver2[27039] Asm_startRecovery: vol ACFS_VOL1 mirror recovery thread exiting
K 17182507.715/150915205543 oks_rbld[16326] [Oracle OKS] rebuild: client: 0, step: 2/2, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16326] kcss_rbld_start: thread 0xffff886bb2536b40 ch 0xffffffffa05bca18 client id 2
K 17182507.715/150915205543 oks_rbld[16326] odlm_comm_close: Shutdown chan 3 to node 2
K 17182507.715/150915205543 oks_rbld[16328] rcbp 0xffffffffa05bcac0 step 2, cur_leader 0x1 incarn 0x1e
F 17182507.715/150915205543 oks_rbld[16328] OfsCompleteRecoverySetup: incarn 0x1e reason 5
F 17182507.715/150915205543 oks_rbld[16328] OfsCompleteRecoverySetup: Exit status 0x0
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] rebuild: client: 2, step: 2/3, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16328] rcbp 0xffffffffa05bcac0 step 3, cur_leader 0x1 incarn 0x1e
F 17182507.715/150915205543 oks_rbld[16328] OfsEnableInterfaces: ncarn 0x1e reason 5 stall 1 
F 17182507.715/150915205543 oks_rbld[16328] OfsEnableInterfaces: Interfaces enabled
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] rebuild: client: 2, step: 3/3, elapsed: 0
K 17182507.715/150915205543 oks_rbld[16328] [Oracle OKS] Cluster Membership change setup complete
K 17182507.715/150915205543 oks_rbld[16328] odlm_comm_close: Shutdown chan 4 to node 2
V 17182507.987 asmResilver1[27061] Asm_acqRecovery: DGA.ACFS_VOL1 flags(0x4177) value block VALID
V 17182507.987 asmResilver1[27061] Asm_startRecovery: read valid valb 1
V 17182508.123 asmResilver1[27061] Asm_startRecovery: recovery needed.



File_name :: oks_RV3C.log (acfsData_node1_20150921_1327)

  
Volumn was sized almost 3.5 Tb
Doing recovery of volumn repeatedly.


Cause

Sign In with your My Oracle Support account

Don't have a My Oracle Support account? Click to get started

My Oracle Support provides customers with access to over a
Million Knowledge Articles and hundreds of Community platforms