Oracle Linux: OS Hung with High Number of XFS Processes in "UN" State. (Doc ID 2678422.1)

Last updated on June 10, 2020

Applies to:

Linux OS - Version Oracle Linux 7.4 and later
Linux x86-64

Symptoms

An Oracle Linux 7 server hangs with a high load average and a large number of XFS-related processes in the "UN" (uninterruptible) state.
The server is part of a Hadoop cluster, and data indexing is done on an XFS file system.

A vmcore was generated while the server was hung:

KERNEL: vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 64
DATE: Wed Feb 12 06:09:54 2020
UPTIME: 1 days, 05:48:13
LOAD AVERAGE: 146.13, 143.98, 139.58
TASKS: 4073
NODENAME: <HOSTNAME>
RELEASE: 4.14.35-1902.10.2.1.el7uek.x86_64
VERSION: #2 SMP Sun Jan 26 09:50:03 PST 2020
MACHINE: x86_64 (2100 Mhz)
MEMORY: 159.7 GB
PANIC: "Kernel panic - not syncing: An NMI occurred. Depending on your system the reason for the NMI is logged in any one of the following resources:"
PID: 0
COMMAND: "swapper/0"
TASK: ffffffff90412480 (1 of 64) [THREAD_INFO: ffffffff90412480]
CPU: 0
STATE: TASK_RUNNING (PANIC)
log:
[99779.387792] INFO: task kswapd1:0:446 blocked for more than 120 seconds.
[99779.387814] Not tainted 4.14.35-1902.10.2.1.el7uek.x86_64 #2
[99779.387833] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[99779.387855] kswapd1:0 D 0 446 2 0x80000000
[99779.387857] Call Trace:
[99779.387860] __schedule+0x2bc/0x89b
[99779.387862] ? __wake_up_common+0x8f/0x15c
[99779.387864] schedule+0x36/0x7c
[99779.387866] schedule_preempt_disabled+0xe/0x10
[99779.387867] __mutex_lock.isra.5+0x20c/0x634
[99779.387871] __mutex_lock_slowpath+0x13/0x15
[99779.387873] mutex_lock+0x2f/0x3a
[99779.387902] xfs_reclaim_inodes_ag+0x2c2/0x360 [xfs]
[99779.387905] ? radix_tree_next_chunk+0x10b/0x2d5
[99779.387907] ? radix_tree_gang_lookup_tag+0xd7/0x145
[99779.387910] ? __list_lru_walk_one.isra.4+0x37/0x16a
[99779.387912] ? iput+0x1f0/0x1e7
[99779.387932] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[99779.387953] xfs_fs_free_cached_objects+0x19/0x20 [xfs]
[99779.387955] super_cache_scan+0x187/0x18c
[99779.387957] shrink_slab+0x293/0x4b8
[99779.387961] shrink_node+0x108/0x301
[99779.387963] kswapd+0x2f3/0x75d
[99779.387965] ? __schedule+0x2c4/0x89b
[99779.387968] kthread+0x105/0x138
[99779.387970] ? mem_cgroup_shrink_node+0x1a0/0x191
[99779.387971] ? kthread_bind+0x20/0x15
[99779.387973] ret_from_fork+0x24/0x49
...
ps | grep 'UN' | wc
620 5580 46999
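
The count above comes from the crash utility run against the vmcore. On a system that is still responsive, a similar check can be sketched with standard tools. The "UN" state reported by crash is the uninterruptible sleep state, which ps and the hung-task log line shown above report as "D". The function names count_d_state and hung_task_names below are hypothetical helpers introduced for illustration and are not part of any Oracle tool; treat this as a minimal sketch rather than a supported diagnostic.

```shell
#!/bin/sh
# Hypothetical helpers (names are assumptions, not part of any Oracle tool).

# count_d_state: reads lines of "STATE PID COMMAND" on stdin and prints the
# number of tasks whose state starts with "D" (uninterruptible sleep).
count_d_state() {
    awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}

# hung_task_names: reads kernel log text on stdin and prints the task name
# from each "INFO: task NAME:PID blocked for more than N seconds." line.
hung_task_names() {
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than.*/\1/p'
}
```

Assuming a procps-style ps and read access to the kernel log, the helpers could be used as `ps -e -o state= -o pid= -o comm= | count_d_state` to count blocked tasks, and `dmesg | hung_task_names | sort | uniq -c | sort -rn` to see which task names trigger the hung-task warning most often.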

Cause




