Command hang under OCFS2 mount point (Doc ID 2357209.1)

Last updated on AUGUST 04, 2018

Applies to:

Linux OS - Version Oracle Linux 5.0 and later
Linux x86-64

Symptoms

The cluster has three nodes.
On one node, listing the files under the OCFS2 directory "i_o2e_domain" hung, as shown in the session below.
[root@*** ]# ls
nodea nodeb i_o2e_domain lost+found
[root@*** ]#
[root@*** ]# ls nodea
nodemanager.log nodemanager.log.20141210194055 nodemanager.log.20141215143257
nodemanager.log.1 nodemanager.log.20141212163554 nodemanager.log.20141215154840
nodemanager.log.1.lck nodemanager.log.20141212182353 nodemanager.log.20141215161007
nodemanager.log.20141208185424 nodemanager.log.20141212200151 nodemanager.log.20150203231158
nodemanager.log.20141209003824 nodemanager.log.20141214175629 nodemanager.log.20150306213532
nodemanager.log.20141209182303 nodemanager.log.20141214180724 nodemanager.log.lck
[root@*** ]#
[root@*** ]# ls nodeb
nodemanager.log nodemanager.log.20141212163554 nodemanager.log.20141215154839
nodemanager.log.1 nodemanager.log.20141212182353 nodemanager.log.20141215161007
nodemanager.log.20141208185423 nodemanager.log.20141212200151 nodemanager.log.20150203231158
nodemanager.log.20141209003821 nodemanager.log.20141214175632 nodemanager.log.20150306213532
nodemanager.log.20141209182301 nodemanager.log.20141214180723 nodemanager.log.lck
nodemanager.log.20141210194039 nodemanager.log.20141215143257
[root@*** ]#
[root@*** ]# ls i_o2e_domain
==== ls COMMAND HUNG HERE ======
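
A command that hangs like this is usually blocked in uninterruptible sleep (state D), which is the state the hung-task reports in the log below show for logrotate. As a minimal sketch, assuming a Linux /proc filesystem and Python on the node, the following lists processes currently in state D from a second terminal. The function names parse_stat and uninterruptible_tasks are illustrative and not part of any Oracle tool.

```python
import os


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' rather than on spaces.
    """
    open_idx = text.index("(")
    close_idx = text.rindex(")")
    pid = int(text[:open_idx])
    comm = text[open_idx + 1:close_idx]
    state = text[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            # The process exited during the scan, or the file was unreadable.
            continue
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

A process that stays in this list across repeated runs is a candidate for the kind of hang described here. A process that appears only briefly is most likely just waiting on ordinary I/O.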

The issue was verified in the log file, as shown below:

332 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 shutdown, state 8
333 o2net: No longer connected to node nodea.xx.com (num 0) at 172.24.12.10:7777
334 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 shutdown, state 7
335 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 shutdown, state 7
336 o2net: Connected to node nodea.xx.com (num 0) at 172.24.12.10:7777
337 o2dlm: Node 0 joins domain 1A0A33E02167489EB6312549DABB3CF1 ( 0 1 2 ) 3 nodes
338 o2dlm: Node 0 joins domain B4E689FBBFA24533B415C8241B6E12EF ( 0 1 2 ) 3 nodes
339 INFO: task logrotate:10807 blocked for more than 120 seconds.
340 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
341 logrotate D ffff8807c6784b80 0 10807 10805 0x00000000
342 ffff8800c42a1848 0000000000000282 ffff8800c42a1768 0000000000011180
343 ffff8800c5602600 ffff88075d9781c0 ffff8800c42a17b8 ffffffff81041129
344 004c710dfeca9dcc 0021cec82a6664d6 ffffffffe271bda1 ffff8800c42a17b8
345 Call Trace:
346 [<ffffffff81041129>] ? pvclock_clocksource_read+0x29/0x80
347 [<ffffffff810410c1>] ? pvclock_get_nsec_offset+0x11/0x50
348 [<ffffffff81041129>] ? pvclock_clocksource_read+0x29/0x80
349 [<ffffffff8125824b>] ? radix_tree_lookup+0xb/0x10
350 [<ffffffff810d7447>] ? irq_to_desc+0x17/0x20
351 [<ffffffff810d9ffe>] ? irq_get_irq_data+0xe/0x10
352 [<ffffffff81508625>] schedule+0x45/0x60
353 [<ffffffff81508a7c>] schedule_timeout+0x19c/0x2e0
354 [<ffffffff81011f50>] ? xen_smp_send_reschedule+0x10/0x20
355 [<ffffffff81056eab>] ? ttwu_queue_remote+0x4b/0x60
356 [<ffffffff8150a43e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
357 [<ffffffff81068dc7>] ? try_to_wake_up+0x137/0x180
358 [<ffffffff8150842d>] wait_for_common+0x10d/0x160
359 [<ffffffff81068e10>] ? try_to_wake_up+0x180/0x180
360 [<ffffffff8150855d>] wait_for_completion+0x1d/0x20
361 [<ffffffffa068c05a>] ocfs2_wait_for_mask+0x1a/0x30 [ocfs2]
362 [<ffffffffa06931e6>] __ocfs2_cluster_lock+0x186/0x740 [ocfs2]
363 [<ffffffffa06ef109>] ? __ocfs2_set_buffer_uptodate+0xe9/0x280 [ocfs2]
364 [<ffffffff8150a39e>] ? _raw_spin_lock+0xe/0x20
365 [<ffffffffa06a0a02>] ? ocfs2_inode_cache_unlock+0x12/0x20 [ocfs2]
366 [<ffffffffa069393d>] ocfs2_inode_lock_full_nested+0x19d/0x480 [ocfs2]
367 [<ffffffffa0705e71>] ? ocfs2_xattr_set+0x151/0x890 [ocfs2]
368 [<ffffffff8150a39e>] ? _raw_spin_lock+0xe/0x20
369 [<ffffffff8115cd89>] ? kmem_cache_alloc_trace+0xc9/0x1a0
370 [<ffffffffa0705e71>] ocfs2_xattr_set+0x151/0x890 [ocfs2]
371 [<ffffffffa01c723c>] ? do_get_write_access+0x1ec/0x4c0 [jbd2]
372 [<ffffffffa070826d>] ocfs2_set_acl+0x19d/0x1b0 [ocfs2]
373 [<ffffffffa0708371>] ocfs2_acl_chmod+0xf1/0x100 [ocfs2]
374 [<ffffffffa069e155>] ocfs2_setattr+0x6b5/0x9e0 [ocfs2]
375 [<ffffffff8115b44a>] ? kfree+0x11a/0x250
376 [<ffffffff8150a43e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
377 [<ffffffff81187b8b>] notify_change+0x18b/0x2f0
378 [<ffffffff8116aef4>] sys_fchmod+0xd4/0x110
379 [<ffffffff81177c76>] ? putname+0x36/0x50
380 [<ffffffff81512b62>] system_call_fastpath+0x16/0x1b
381 INFO: task logrotate:10807 blocked for more than 120 seconds.
382 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
383 logrotate D ffff8807c6784b80 0 10807 10805 0x00000000
384 ffff8800c42a1848 0000000000000282 ffff8800c42a1768 0000000000011180
385 ffff8800c5602600 ffff88075d9781c0 ffff8800c42a17b8 ffffffff81041129
386 004c710dfeca9dcc 0021cec82a6664d6 ffffffffe271bda1 ffff8800c42a17b8
387 Call Trace:
388 [<ffffffff81041129>] ? pvclock_clocksource_read+0x29/0x80
389 [<ffffffff810410c1>] ? pvclock_get_nsec_offset+0x11/0x50
390 [<ffffffff81041129>] ? pvclock_clocksource_read+0x29/0x80
391 [<ffffffff8125824b>] ? radix_tree_lookup+0xb/0x10
392 [<ffffffff810d7447>] ? irq_to_desc+0x17/0x20
393 [<ffffffff810d9ffe>] ? irq_get_irq_data+0xe/0x10
394 [<ffffffff81508625>] schedule+0x45/0x60
395 [<ffffffff81508a7c>] schedule_timeout+0x19c/0x2e0
396 [<ffffffff81011f50>] ? xen_smp_send_reschedule+0x10/0x20
397 [<ffffffff81056eab>] ? ttwu_queue_remote+0x4b/0x60
398 [<ffffffff8150a43e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
399 [<ffffffff81068dc7>] ? try_to_wake_up+0x137/0x180
400 [<ffffffff8150842d>] wait_for_common+0x10d/0x160
401 [<ffffffff81068e10>] ? try_to_wake_up+0x180/0x180
402 [<ffffffff8150855d>] wait_for_completion+0x1d/0x20
403 [<ffffffffa068c05a>] ocfs2_wait_for_mask+0x1a/0x30 [ocfs2]
404 [<ffffffffa06931e6>] __ocfs2_cluster_lock+0x186/0x740 [ocfs2]
405 [<ffffffffa06ef109>] ? __ocfs2_set_buffer_uptodate+0xe9/0x280 [ocfs2]
406 [<ffffffff8150a39e>] ? _raw_spin_lock+0xe/0x20
407 [<ffffffffa06a0a02>] ? ocfs2_inode_cache_unlock+0x12/0x20 [ocfs2]
408 [<ffffffffa069393d>] ocfs2_inode_lock_full_nested+0x19d/0x480 [ocfs2]
409 [<ffffffffa0705e71>] ? ocfs2_xattr_set+0x151/0x890 [ocfs2]
410 [<ffffffff8150a39e>] ? _raw_spin_lock+0xe/0x20
411 [<ffffffff8115cd89>] ? kmem_cache_alloc_trace+0xc9/0x1a0
412 [<ffffffffa0705e71>] ocfs2_xattr_set+0x151/0x890 [ocfs2]
413 [<ffffffffa01c723c>] ? do_get_write_access+0x1ec/0x4c0 [jbd2]
414 [<ffffffffa070826d>] ocfs2_set_acl+0x19d/0x1b0 [ocfs2]
415 [<ffffffffa0708371>] ocfs2_acl_chmod+0xf1/0x100 [ocfs2]
416 [<ffffffffa069e155>] ocfs2_setattr+0x6b5/0x9e0 [ocfs2]
417 [<ffffffff8115b44a>] ? kfree+0x11a/0x250
418 [<ffffffff8150a43e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
419 [<ffffffff81187b8b>] notify_change+0x18b/0x2f0
420 [<ffffffff8116aef4>] sys_fchmod+0xd4/0x110
421 [<ffffffff81177c76>] ? putname+0x36/0x50
422 [<ffffffff81512b62>] system_call_fastpath+0x16/0x1b
423 INFO: task logrotate:10807 blocked for more than 120 seconds.
424 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
425 logrotate D ffff8807c6784b80 0 10807 10805 0x00000000
426 ffff8800c42a1848 0000000000000282 ffff8800c42a1768 0000000000011180
427 ffff8800c5602600 ffff88075d9781c0 ffff8800c42a17b8 ffffffff81041129
428 004c710dfeca9dcc 0021cec82a6664d6 ffffffffe271bda1 ffff8800c42a17b8
429 Call Trace:
430 [<ffffffff81041129>] ? pvclock_clocksource_read+0x29/0x80
431 [<ffffffff810410c1>] ? pvclock_get_nsec_offset+0x11/0x50
432 [<ffffffff81041129>] ? pvclock_clocksource_read+0x29/0x80
433 [<ffffffff8125824b>] ? radix_tree_lookup+0xb/0x10
434 [<ffffffff810d7447>] ? irq_to_desc+0x17/0x20
435 [<ffffffff810d9ffe>] ? irq_get_irq_data+0xe/0x10
436 [<ffffffff81508625>] schedule+0x45/0x60
437 [<ffffffff81508a7c>] schedule_timeout+0x19c/0x2e0
438 [<ffffffff81011f50>] ? xen_smp_send_reschedule+0x10/0x20
439 [<ffffffff81056eab>] ? ttwu_queue_remote+0x4b/0x60
440 [<ffffffff8150a43e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
441 [<ffffffff81068dc7>] ? try_to_wake_up+0x137/0x180
442 [<ffffffff8150842d>] wait_for_common+0x10d/0x160
443 [<ffffffff81068e10>] ? try_to_wake_up+0x180/0x180
444 [<ffffffff8150855d>] wait_for_completion+0x1d/0x20
445 [<ffffffffa068c05a>] ocfs2_wait_for_mask+0x1a/0x30 [ocfs2]
...
757 [<ffffffff81177c76>] ? putname+0x36/0x50
758 [<ffffffff81512b62>] system_call_fastpath+0x16/0x1b
759 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 has been idle for 30.3 secs.
760 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 has been idle for 30.80 secs.
761 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 has been idle for 30.80 secs.
762 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 has been idle for 30.80 secs.
763 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 has been idle for 30.80 secs.
764 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 shutdown, state 7
765 o2net: No longer connected to node nodea.xx.com (num 0) at 172.24.12.10:7777
766 o2net: Connected to node nodea.xx.com (num 0) at 172.24.12.10:7777
767 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 shutdown, state 8
768 o2net: No longer connected to node nodea.xx.com (num 0) at 172.24.12.10:7777
769 (kworker/5:1,120,5):o2quo_make_decision:170 not fencing this node, heartbeating: 2, connected: 2, lowest: 1 (reachable)
770 o2cb: o2dlm has evicted node 0 from domain 1A0A33E02167489EB6312549DABB3CF1
771 o2cb: o2dlm has evicted node 0 from domain B4E689FBBFA24533B415C8241B6E12EF
772 o2dlm: Begin recovery on domain B4E689FBBFA24533B415C8241B6E12EF for node 0
773 o2dlm: Node 1 (me) is the Recovery Master for the dead node 0 in domain B4E689FBBFA24533B415C8241B6E12EF
774 o2net: Connection to node nodea.xx.com (num 0) at 172.24.12.10:7777 shutdown, state 7
775 o2dlm: Begin recovery on domain 1A0A33E02167489EB6312549DABB3CF1 for node 0
776 o2dlm: Node 1 (me) is the Recovery Master for the dead node 0 in domain 1A0A33E02167489EB6312549DABB3CF1
777 o2net: Connected to node nodea.xx.com (num 0) at 172.24.12.10:7777
778 o2dlm: End recovery on domain 1A0A33E02167489EB6312549DABB3CF1
779 o2dlm: End recovery on domain B4E689FBBFA24533B415C8241B6E12EF
780 o2dlm: Node 0 joins domain B4E689FBBFA24533B415C8241B6E12EF ( 0 1 2 ) 3 nodes
781 o2dlm: Node 0 joins domain 1A0A33E02167489EB6312549DABB3CF1 ( 0 1 2 ) 3 nodes
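
A log like the one above can be summarized programmatically to see how often a task was reported hung and which node the cluster stack lost contact with. The following is a minimal sketch, assuming the log has been saved as text lines. The regular expressions follow only the message formats visible in this excerpt and may need adjusting for other kernel versions. The function name summarize is illustrative.

```python
import re
from collections import Counter

# Patterns modeled on the messages in the excerpt above (assumed formats).
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
O2NET_IDLE = re.compile(
    r"o2net: Connection to node (?P<node>\S+) \(num (?P<num>\d+)\) .* "
    r"has been idle for (?P<secs>[\d.]+) secs"
)
O2DLM_EVICT = re.compile(
    r"o2dlm has evicted node (?P<num>\d+) from domain (?P<domain>[0-9A-F]+)"
)


def summarize(lines):
    """Count hung-task reports, o2net idle warnings, and o2dlm evictions."""
    hung = Counter()     # keyed by (command name, pid)
    idle = Counter()     # keyed by node name
    evicted = Counter()  # keyed by node number
    for line in lines:
        m = HUNG_TASK.search(line)
        if m:
            hung[(m.group("comm"), int(m.group("pid")))] += 1
            continue
        m = O2NET_IDLE.search(line)
        if m:
            idle[m.group("node")] += 1
            continue
        m = O2DLM_EVICT.search(line)
        if m:
            evicted[int(m.group("num"))] += 1
    return {"hung": hung, "idle": idle, "evicted": evicted}
```

Run against the excerpt, such a summary would show the repeated hung-task reports for logrotate (pid 10807) next to the idle warnings and evictions for node 0. This makes the timing relationship between the blocked task and the o2net disconnects easier to see.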


Cause

To view full details, sign in with your My Oracle Support account.
