Intermittent high load average in a cluster file system (Doc ID 2577953.1)

Last updated on AUGUST 26, 2019

Applies to:

Linux OS - Version Oracle Linux 6.10 with Unbreakable Enterprise Kernel [4.1.12] and later
Information in this document applies to any platform.

Symptoms


Intermittent load average spikes in a cluster file system. The following stack traces are found in the system log when the performance of the cluster file system is slow:

Jul 7 23:01:44 hostxxx kernel: [447230.244735] INFO: task
FNDLIBR:60262 blocked for more than 120 seconds.
Jul 7 23:01:44 hostxxx kernel: [447230.244740] Not tainted
4.1.12-124.26.10.el6uek.x86_64 #2
Jul 7 23:01:44 hostxxx kernel: [447230.244741] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 7 23:01:44 hostxxx kernel: [447230.244743] FNDLIBR D
ffff880e06858480 0 60262 14930 0x20020080
Jul 7 23:01:44 hostxxx kernel: [447230.244747] ffff880a492abb28
0000000000200082 ffff880d08f8c600 ffff880d08f8c600
Jul 7 23:01:44 hostxxx kernel: [447230.244750] ffff880a492abb08
ffff880a492ac000 ffff880dfbe2856c ffff880d08f8c600
Jul 7 23:01:44 hostxxx kernel: [447230.244752] 00000000ffffffff
0000000000000000 ffff880a492abb48 ffffffff816ee9c7
Jul 7 23:01:44 hostxxx kernel: [447230.244755] Call Trace:
Jul 7 23:01:44 hostxxx kernel: [447230.244764] [<ffffffff816ee9c7>]
schedule+0x37/0x90
Jul 7 23:01:44 hostxxx kernel: [447230.244767] [<ffffffff816eed4e>]
schedule_preempt_disabled+0xe/0x10
Jul 7 23:01:44 hostxxx kernel: [447230.244770] [<ffffffff816f0f85>]
__mutex_lock_slowpath+0xa5/0x200
Jul 7 23:01:44 hostxxx kernel: [447230.244772] [<ffffffff816f1103>]
mutex_lock+0x23/0x40
Jul 7 23:01:44 hostxxx kernel: [447230.244800] [<ffffffffc07b74d9>]
ocfs2_lookup_lock_orphan_dir.constprop.28+0x49/0x1c0 [ocfs2]
Jul 7 23:01:44 hostxxx kernel: [447230.244812] [<ffffffffc0798e35>]
? __ocfs2_cluster_unlock.isra.37+0x85/0xe0 [ocfs2]
Jul 7 23:01:44 hostxxx kernel: [447230.244825] [<ffffffffc07b7692>]
ocfs2_prepare_orphan_dir+0x42/0x320 [ocfs2]
Jul 7 23:01:44 hostxxx kernel: [447230.244837] [<ffffffffc07b9f4d>]
ocfs2_unlink+0x86d/0xce0 [ocfs2]
Jul 7 23:01:44 hostxxx kernel: [447230.244841] [<ffffffff812196cc>]
vfs_unlink+0x10c/0x1b0
Jul 7 23:01:44 hostxxx kernel: [447230.244843] [<ffffffff8121eca8>]
do_unlinkat+0x2a8/0x350
Jul 7 23:01:44 hostxxx kernel: [447230.244847] [<ffffffff816fa5ed>]
? ia32_sysenter_target+0x10d/0x1b9
Jul 7 23:01:44 hostxxx kernel: [447230.244849] [<ffffffff816fa5e6>]
? ia32_sysenter_target+0x106/0x1b9
Jul 7 23:01:44 hostxxx kernel: [447230.244851] [<ffffffff8121fda6>]
SyS_unlink+0x16/0x20
Jul 7 23:01:44 hostxxx kernel: [447230.244854] [<ffffffff816fa6d1>]
sysenter_dispatch+0x7/0x25
Jul 7 23:01:44 hostxxx kernel: [447230.244857] [<ffffffff816fa576>]
? ia32_sysenter_target+0x96/0x1b9
Jul 7 23:01:44 hostxxx kernel: [447230.244859] [<ffffffff816fa56f>]
? ia32_sysenter_target+0x8f/0x1b9
Jul 7 23:01:44 hostxxx kernel: [447230.244861] [<ffffffff816fa568>]
? ia32_sysenter_target+0x88/0x1b9
Jul 7 23:01:44 hostxxx kernel: [447230.244864] [<ffffffff816fa561>]
? ia32_sysenter_target+0x81/0x1b9
...

Jul 8 08:34:38 hostxxx kernel: [ 720.278835] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 8 08:34:38 hostxxx kernel: [ 720.278837] rm D
ffff880e06698480 0 12701 10216 0x00000080
Jul 8 08:34:38 hostxxx kernel: [ 720.278841] ffff880db4effb28
0000000000000086 ffff880db4f18e00 ffff880db4f18e00
Jul 8 08:34:38 hostxxx kernel: [ 720.278844] ffff880db4effb08
ffff880db4f00000 ffff880dfaa6bbac ffff880db4f18e00
Jul 8 08:34:38 hostxxx kernel: [ 720.278847] 00000000ffffffff
0000000000000000 ffff880db4effb48 ffffffff816ee9c7
Jul 8 08:34:38 hostxxx kernel: [ 720.278849] Call Trace:
Jul 8 08:34:38 hostxxx kernel: [ 720.278859] [<ffffffff816ee9c7>]
schedule+0x37/0x90
Jul 8 08:34:38 hostxxx kernel: [ 720.278863] [<ffffffff816eed4e>]
schedule_preempt_disabled+0xe/0x10
Jul 8 08:34:38 hostxxx kernel: [ 720.278865] [<ffffffff816f0f85>]
__mutex_lock_slowpath+0xa5/0x200
Jul 8 08:34:38 hostxxx kernel: [ 720.278867] [<ffffffff816f1103>]
mutex_lock+0x23/0x40
Jul 8 08:34:38 hostxxx kernel: [ 720.278897] [<ffffffffc07c44d9>]
ocfs2_lookup_lock_orphan_dir.constprop.28+0x49/0x1c0 [ocfs2]
Jul 8 08:34:38 hostxxx kernel: [ 720.278908] [<ffffffffc07a5e35>] ?
__ocfs2_cluster_unlock.isra.37+0x85/0xe0 [ocfs2]
Jul 8 08:34:38 hostxxx kernel: [ 720.278922] [<ffffffffc07c4692>]
ocfs2_prepare_orphan_dir+0x42/0x320 [ocfs2]
Jul 8 08:34:38 hostxxx kernel: [ 720.278933] [<ffffffffc07c6f4d>]
ocfs2_unlink+0x86d/0xce0 [ocfs2]
Jul 8 08:34:38 hostxxx kernel: [ 720.278938] [<ffffffff812196cc>]
vfs_unlink+0x10c/0x1b0
Jul 8 08:34:38 hostxxx kernel: [ 720.278940] [<ffffffff8121eca8>]
do_unlinkat+0x2a8/0x350
Jul 8 08:34:38 hostxxx kernel: [ 720.278943] [<ffffffff816f54da>] ?
apic_timer_interrupt+0xba/0x1a0
Jul 8 08:34:38 hostxxx kernel: [ 720.278946] [<ffffffff816f54be>] ?
apic_timer_interrupt+0x9e/0x1a0
Jul 8 08:34:38 hostxxx kernel: [ 720.278950] [<ffffffff81025a3c>] ?
do_audit_syscall_entry+0x6c/0x70
Jul 8 08:34:38 hostxxx kernel: [ 720.278952] [<ffffffff81026deb>] ?
syscall_trace_enter_phase1+0x11b/0x180
Jul 8 08:34:38 hostxxx kernel: [ 720.278954] [<ffffffff816f3fa6>] ?
system_call_after_swapgs+0xf0/0x190
Jul 8 08:34:38 hostxxx kernel: [ 720.278956] [<ffffffff816f3f9f>] ?
system_call_after_swapgs+0xe9/0x190
Jul 8 08:34:38 hostxxx kernel: [ 720.278959] [<ffffffff8121fd6b>]
SyS_unlinkat+0x1b/0x40
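
To check whether other hung-task reports in a system log match the traces above, the log can be scanned programmatically. The following is a minimal sketch in Python, not an Oracle-supplied tool. The function name find_hung_tasks and the field names in its result are hypothetical, and the regular expressions only assume the message layout visible in the excerpts above. Each report is taken to extend up to the next "INFO: task ... blocked" line, so unrelated kernel messages logged in between may contribute frames.

```python
import re

# Header of a hung-task report, e.g.
# "INFO: task FNDLIBR:60262 blocked for more than 120 seconds."
# \s+ also matches a newline, so wrapped log excerpts are handled too.
HUNG_RE = re.compile(
    r"INFO: task\s+(?P<comm>\S+):(?P<pid>\d+)\s+"
    r"blocked for more than\s+(?P<secs>\d+)\s+seconds"
)

# A call-trace frame such as "ocfs2_unlink+0x86d/0xce0".
FRAME_RE = re.compile(r"\b([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def find_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text."""
    reports = []
    headers = list(HUNG_RE.finditer(log_text))
    for i, header in enumerate(headers):
        # Assumption: a report runs until the next report header (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(log_text)
        frames = FRAME_RE.findall(log_text[header.end():end])
        reports.append({
            "comm": header.group("comm"),
            "pid": int(header.group("pid")),
            "seconds": int(header.group("secs")),
            "frames": frames,
            # True when the task was waiting in the orphan directory lock path
            # seen in the traces above.
            "in_orphan_dir_lock": any(
                f.startswith("ocfs2_lookup_lock_orphan_dir") for f in frames
            ),
        })
    return reports


if __name__ == "__main__":
    import sys

    # Usage (the path is an example): python hung_tasks.py /var/log/messages
    with open(sys.argv[1], errors="replace") as fh:
        for report in find_hung_tasks(fh.read()):
            print(report["comm"], report["pid"], report["seconds"],
                  report["in_orphan_dir_lock"])
```

Reports where the last column is True show the same wait in ocfs2_lookup_lock_orphan_dir as the traces in this document. The script only classifies reports and does not by itself establish a cause.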

Cause


In this Document
Symptoms
Cause
Solution
References

