ACFS filesystem on DOM U hung after DOM 0 patching due to network-level issue (Doc ID 2661360.1)

Last updated on MAY 29, 2020

Applies to:

Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Information in this document applies to any platform.

Symptoms

The ACFS filesystem at the DOM U level hung after DOM 0 level patching.

OS commands such as df also hung at the OS level.
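One way to confirm this symptom is to search the system log for the kernel's hung-task warnings. The following is a minimal sketch, not an Oracle-supplied tool: the function name find_hung_tasks is hypothetical, and it assumes syslog-style lines shaped like the excerpt in this note.

```python
import re

# Matches the kernel hung-task warning, for example:
# "INFO: task acfssnap1:59698 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return a (task name, pid, seconds) tuple for each hung-task warning.

    `lines` is any iterable of log lines; this helper is illustrative only.
    """
    hits = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits
```

Passing an open file handle for the system log (the path, commonly /var/log/messages, is an assumption) returns one tuple per warning, which makes it easy to see whether ACFS-related tasks such as acfssnap1 are the ones blocked.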


Mar 27 03:49:29 xxxxxxxxkernel: [28560.798166] INFO: task acfssnap1:59698 blocked for more than 120 seconds.
Mar 27 03:49:29 xxxxxxxxkernel: [28560.803956] Tainted: P O 4.1.12-124.30.1.el6uek.x86_64 #2
Mar 27 03:49:29 xxxxxxxxkernel: [28560.809533] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815686] acfssnap1 D 0000000000018480 0 59698 2 0x00000080
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815691] ffff8835107bb6e8 0000000000000046 ffff8835107bb708 ffff88360e5a7000
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815693] 0000000000000000 ffff8835107bc000 0000000000000000 0000000000000001
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815696] ffff8835107bb7d8 0000000000000001 ffff8835107bb708 ffffffff816efd27
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815698] Call Trace:
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815707] [] schedule+0x37/0x90
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815719] [] KsWaitEvent_OSD+0x3c9/0x570 [oracleoks]
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815724] [] ? wait_woken+0x90/0x90
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815734] [] odlm_wait_on_event+0xd8/0x180 [oracleoks]
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815741] [] ? KsAcquireSpinLock_debug+0x67/0x120 [oracleoks]
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815752] [] convert_one_remote+0x58c/0xab0 [oracleoks]
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815763] [] convert_lock+0x423/0xc7e [oracleoks]
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815772] [] odlm_cvt+0x1df/0x230 [oracleoks]
Mar 27 03:49:29 xxxxxxxxkernel: [28560.815795] [] ? ofs_dlm_remove_defered_close_lock+0x1a0/0x1a0 [oracleacfs]

V 4806519.060 blkid[330152] asmIoctl_int: Invalid ioctl type for dev 26 (type:83,opcode:49,cmd:0x5331)

K 4806591.900/200409132510 oks_conn[307197] odlm_comm_conn_thread: new sock 0x67edf300, peer ip addr (IPv4) 192.168.xxx.n.6
K 4806591.900/200409132510 oks_comm[331623] odlm_comm_worker: start; sock ffff882d67edf300 msg ffff882be80d4000 reply ffff882be80d4180, remote 192.168.xxx.n.6
K 4806591.900/200409132510 oks_comm[331623] kcss_rbld_do_step: RBLD_NOT_READY sending node 3 client id 1 rbld_seq_num 0xa5 incarn 0xa4
K 4806591.900/200409132510 oks_comm[331623] odlm_process_request: from csid 0x7200000003 status 5
K 4806591.984/200409132510 apx_acfs_+apx1[307158] kcss_membership_update: Version: 2
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00008: Cluster Membership Change starting - Incarnation 165.
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00023: Cluster membership node list:
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00024: Node 1 (Interconnect address: 192.168.xxx.n.1) ecp501vm01
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00024: Node 2 (Interconnect address: 192.168.xxx.n.3) ecp502vm01
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00024: Node 3 (Interconnect address: 192.168.xxx.n.5) ecp503vm01 <<-----
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00024: Node 4 (Interconnect address: 192.168.xxx.n.7) ecp504vm01
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00024: Node 5 (Interconnect address: 192.168.xxx.n.9) ecp505vm01
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00024: Node 6 (Interconnect address: 192.168.xxx.n.11) ecp506vm01
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00024: Node 8 (Interconnect address: 192.168.xxx.n.15) ecp508vm01
U 4806591.984/200409132510 apx_acfs_+apx1[307158] OKSK-00025: Cluster membership node count: 7, Local Node Number: 1, Rebuild Master: 3
K 4806591.984/200409132510 apx_acfs_+apx1[307158] odlm_stall_all: reason 5

U 4806591.984 apx_acfs_+apx1[307158] ADVMK-0013: Cluster reconfiguration started.
K 4806591.984/200409132510 apx_acfs_+apx1[307158] odlm_comm_memb_update: incarn 0xa5
K 4806591.984/200409132510 apx_acfs_+apx1[307158] Remote node 2 ipAddr (IPv4) 192.168.xxx.n; port 21486 csid 0xa500000002
V 4806591.984/200409132510 UsmVolMonitor[26238] PARTIAL dump RECONFIG - node number 1 of 7 active
V 4806591.984/200409132510 UsmVolMonitor[26238] AsmBlogRoot root: ACTIVE <> SHUTDOWN rac=1 flex=1 reg=4 seq=165 vbg=0x0 ctl=0x3c07 dgs=1 fl=0x59 aux=0/0
K 4806591.984/200409132510 apx_acfs_+apx1[307158] Remote node 3 ipAddr (IPv4) 192.168.xxx.n; port 21486 csid 0x7200000003
K 4806592.901/200409132511 oks_comm[331623] rcbp 0xffffffffc0551940 step 0, cur_leader 0x3 incarn 0xa5
K 4806592.901/200409132511 oks_comm[331623] odlm_stall_all: reason 5
K 4806592.922/200409132511 oks_conn[307197] odlm_comm_conn_thread: new sock 0x67ede900, peer ip addr (IPv4) 192.168.xxx.n <<-----
K 4806592.922/200409132511 oks_comm[331641] odlm_comm_worker: start; sock ffff882d67ede900 msg ffff8828666d4000 reply ffff8828666d4180, remote 192.168.xxx.n.4
K 4806593.644/200409132511 apx_acfs_+apx1[307158] KsConnect: IPv4 Connect() failed: rc -113, addr 192.168.xxx.n <<----
K 4806593.644/200409132511 apx_acfs_+apx1[307158] odlm_comm_memb_update: Connect failed err 4 node 3 retrying
K 4806593.902/200409132512 oks_comm[331623] rcbp 0xffffffffc0551940 step 0, cur_leader 0x3 incarn 0xa5
K 4806593.902/200409132512 oks_comm[331623] odlm_stall_all: reason 5
K 4806593.903/200409132512 oks_comm[331623] rcbp 0xffffffffc0551940 step 1, cur_leader 0x3 incarn 0xa5
K 4806593.903/200409132512 oks_comm[331623] rcbp 0xffffffffc0551940 step 2, cur_leader 0x3 incarn 0xa5
K 4806593.904/200409132512 oks_comm[331623] rcbp 0xffffffffc0551940 step 3, cur_leader 0x3 incarn 0xa5
F 4806593.904/200409132512 oks_comm[331623] OfsMarkForRecovery: Node 165 reason 8
F 4806593.904/200409132512 oks_comm[331623] OfsMarkForRecovery: Exit
K 4806593.904/200409132512 oks_comm[331623] odlm_init_rbld: 0 files removed from cache
K 4806593.964/200409132512 oks_comm[331623] rcbp 0xffffffffc0551940 step 4, cur_leader 0x3 incarn 0xa5
K 4806593.964/200409132512 oks_comm[331623] rcbp 0xffffffffc0551940 step 5, cur_leader 0x3 incarn 0xa5
K 4806593.965/200409132512 oks_conn[307197] odlm_comm_conn_thread: new sock 0x67ed9680, peer ip addr (IPv4) 192.168.xxx.n
K 4806593.965/200409132512 oks_comm[331699] odlm_comm_worker: start; sock ffff882d67ed9680 msg ffff882865e8c000 reply ffff882865e8c180, remote 192.168.xxx.n
K 4806593.965/200409132512 oks_comm[331623] odlm_rbld_dir: 5 threads created
K 4806594.101/200409132512 oks_conn[307197] odlm_comm_conn_thread: new sock 0x67edd000, peer ip addr (IPv4) 192.168.xxx.n
K 4806594.101/200409132512 oks_comm[331702] odlm_comm_worker: start; sock ffff882d67edd000 msg ffff882d678d8780 reply ffff882d678d8900, remote 192.168.xxx.n
K 4806594.111/200409132512 oks_conn[307197] odlm_comm_conn_thread: new sock 0x67edfd00, peer ip addr (IPv4) 192.168.xxx.n
K 4806594.112/200409132512 oks_comm[331703] odlm_comm_worker: start; sock ffff882d67edfd00 msg ffff8835bb580180 reply ffff8835bb580000, remote 192.168.xxx.n
K 4806594.114/200409132512 oks_conn[307197] odlm_comm_conn_thread: new sock 0x67ede400, peer ip addr (IPv4) 192.168.xxx.n
K 4806594.114/200409132512 oks_comm[331704] odlm_comm_worker: start; sock ffff882d67ede400 msg ffff88374a294a80 reply ffff88374a294c00, remote 192.168.xxx.n
K 4806594.119/200409132512 oks_conn[307197] odlm_comm_conn_thread: new sock 0x67ed9e00, peer ip addr (IPv4) 192.168.xxx.n
K 4806594.119/200409132512 oks_comm[331705] odlm_comm_worker: start; sock ffff882d67ed9e00 msg ffff8828666d4300 reply ffff8828666d4480, remote 192.168.xxx.n
K 4806594.119/200409132512 oks_conn[307197] odlm_comm_conn_thread: new sock 0x67edfa80, peer ip addr (IPv4) 192.168.xxx.n
K 4806594.119/200409132512 oks_comm[331706] odlm_comm_worker: start; sock ffff882d67edfa80 msg ffff8835d8e34480 reply ffff8835d8e34300, remote 192.168.xxx.n
K 4806596.650/200409132514 apx_acfs_+apx1[307158] KsConnect: IPv4 Connect() failed: rc -113, addr 192.168.xxx.n <<----
K 4806596.650/200409132514 apx_acfs_+apx1[307158] odlm_comm_memb_update: Connect failed err 4 node 3 retrying
K 4806596.650/200409132514 OKS rbld[331701] KsConnect: IPv4 Connect() failed: rc -113, addr 192.168.xxx.n <<----
K 4806596.650/200409132514 OKS rbld[331701] odlm_comm_send_setup: connect to 192.168.xxx.n.5 failed: 4
K 4806596.650/200409132514 OKS rbld[331701] odlm_comm_send: setup for chan 1 failed err -1
K 4806596.650/200409132514 OKS rbld[331701] snd_credir_msg: dircsid 0x7200000003 hashval 0xd90304 rsid domain 0x2000000000000000
K 4806596.650/200409132514 OKS rbld[331701] snd_credir_msg : send_status -1
K 4806596.650/200409132514 OKS rbld[331701] odlm_rebuild_dir: snd_credir_req failed rsbp 0xffff882869d1d320 dircsid 0x7200000003
K 4806596.650/200409132514 OKS rbld[331701] odlm_walk_hash: err 1
K 4806596.650/200409132514 OKS rbld[331701] odlm_rbld_directory: status 1
K 4806596.650/200409132514 oks_comm[331623] [Oracle OKS]: Rebuild step 5 : Num rsbs 16 Num messages sent to director 7
K 4806596.650/200409132514 oks_comm[331623] kcss_rbld_do_step: step failed with status 1
K 4806596.650/200409132514 oks_comm[331623] odlm_process_request: from csid 0x7200000003 status 6
K 4806599.656/200409132517 apx_acfs_+apx1[307158] KsConnect: IPv4 Connect() failed: rc -113, addr 192.168.xxx.n <<----
K 4806599.656/200409132517 apx_acfs_+apx1[307158] odlm_comm_memb_update: Connect failed err 4 node 3 retrying
K 4806600.988/200409132519 oks_comm[331623] KsRecv: recv failed -104
K 4806600.988/200409132519 oks_comm[331623] odlm_comm_worker: KsRecv failed -4, comm thread exitingremote node 3
K 4806600.988/200409132519 oks_comm[331623] odlm_comm_worker: Socket Shutdown, remote 192.168.xxx.n
K 4806602.662/200409132520 apx_acfs_+apx1[307158] KsConnect: IPv4 Connect() failed: rc -113, addr 192.168.xxx.n <<----
K 4806602.662/200409132520 apx_acfs_+apx1[307158] odlm_comm_memb_update: Connect failed err 4 node 3 retrying
K 4806605.668/200409132523 apx_acfs_+apx1[307158] KsConnect: IPv4 Connect() failed: rc -113, addr 192.168.xxx.n <<----
K 4806605.668/200409132523 apx_acfs_+apx1[307158] odlm_comm_memb_update: Connect failed err 4 node 3
K 4806605.668/200409132523 apx_acfs_+apx1[307158] kcss_membership_update: Unexpected membership update (2)
V 4806819.277 blkid[354647] asmIoctl_int: Invalid ioctl type for dev 18 (type:83,opcode:49,cmd:0x5331)

File name: acfs.log.0
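The return codes in this trace look like negated Linux errno values. Under that reading, rc -113 is EHOSTUNREACH (no route to host) and recv failed -104 is ECONNRESET, which fits the network-level issue named in the title. The sketch below counts the KsConnect failures per address. It is illustrative only: summarize_connect_failures and ERRNO_NAMES are hypothetical names, and the errno interpretation is an assumption rather than something this note states.

```python
import re
from collections import Counter

# Linux errno numbers (from <errno.h> on Linux). Treating the driver's
# "rc -N" as a negated errno is an assumption, not stated in the note.
ERRNO_NAMES = {
    104: "ECONNRESET",
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
    113: "EHOSTUNREACH",
}

CONNECT_FAIL = re.compile(
    r"KsConnect: IPv4 Connect\(\) failed: rc -(?P<rc>\d+), addr (?P<addr>\S+)"
)


def summarize_connect_failures(lines):
    """Count KsConnect failures keyed by (address, errno name)."""
    counts = Counter()
    for line in lines:
        match = CONNECT_FAIL.search(line)
        if match:
            rc = int(match.group("rc"))
            name = ERRNO_NAMES.get(rc, f"errno {rc}")
            counts[(match.group("addr"), name)] += 1
    return counts
```

Running this over the lines of acfs.log.0 gives a quick view of which interconnect address is unreachable and how often the connect was retried.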

 

 

Cause


