Oracle Linux: OS Hung with High Number of XFS Processes in "UN" State.
(Doc ID 2678422.1)
Last updated on JUNE 10, 2020
Applies to:
Linux OS - Version Oracle Linux 7.4 and later
Linux x86-64
Symptoms
An Oracle Linux 7 server hangs with a high load average and a large number of XFS processes in the "UN" (uninterruptible) state.
The server is part of a Hadoop cluster, and data indexing is done on an XFS file system.
A vmcore was generated while the server was hung:
      KERNEL: vmlinux
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 64
        DATE: Wed Feb 12 06:09:54 2020
      UPTIME: 1 days, 05:48:13
LOAD AVERAGE: 146.13, 143.98, 139.58
       TASKS: 4073
    NODENAME: <HOSTNAME>
     RELEASE: 4.14.35-1902.10.2.1.el7uek.x86_64
     VERSION: #2 SMP Sun Jan 26 09:50:03 PST 2020
     MACHINE: x86_64  (2100 Mhz)
      MEMORY: 159.7 GB
       PANIC: "Kernel panic - not syncing: An NMI occurred. Depending on your system the reason for the NMI is logged in any one of the following resources:"
         PID: 0
     COMMAND: "swapper/0"
        TASK: ffffffff90412480  (1 of 64)  [THREAD_INFO: ffffffff90412480]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)
log:
[99779.387792] INFO: task kswapd1:0:446 blocked for more than 120 seconds.
[99779.387814] Not tainted 4.14.35-1902.10.2.1.el7uek.x86_64 #2
[99779.387833] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[99779.387855] kswapd1:0 D 0 446 2 0x80000000
[99779.387857] Call Trace:
[99779.387860] __schedule+0x2bc/0x89b
[99779.387862] ? __wake_up_common+0x8f/0x15c
[99779.387864] schedule+0x36/0x7c
[99779.387866] schedule_preempt_disabled+0xe/0x10
[99779.387867] __mutex_lock.isra.5+0x20c/0x634
[99779.387871] __mutex_lock_slowpath+0x13/0x15
[99779.387873] mutex_lock+0x2f/0x3a
[99779.387902] xfs_reclaim_inodes_ag+0x2c2/0x360 [xfs]
[99779.387905] ? radix_tree_next_chunk+0x10b/0x2d5
[99779.387907] ? radix_tree_gang_lookup_tag+0xd7/0x145
[99779.387910] ? __list_lru_walk_one.isra.4+0x37/0x16a
[99779.387912] ? iput+0x1f0/0x1e7
[99779.387932] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[99779.387953] xfs_fs_free_cached_objects+0x19/0x20 [xfs]
[99779.387955] super_cache_scan+0x187/0x18c
[99779.387957] shrink_slab+0x293/0x4b8
[99779.387961] shrink_node+0x108/0x301
[99779.387963] kswapd+0x2f3/0x75d
[99779.387965] ? __schedule+0x2c4/0x89b
[99779.387968] kthread+0x105/0x138
[99779.387970] ? mem_cgroup_shrink_node+0x1a0/0x191
[99779.387971] ? kthread_bind+0x20/0x15
[99779.387973] ret_from_fork+0x24/0x49
...
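When a log contains many reports like the one above, it can help to list which tasks the hung-task detector flagged. The following is a minimal sketch, not part of the original note: the function name `hung_tasks` and the regular expression are illustrative assumptions based only on the message format shown in this log.

```python
import re

# Matches the hung-task header line shown in the log above, e.g.
# "INFO: task kswapd1:0:446 blocked for more than 120 seconds."
# The greedy task-name group keeps names that contain ':' (kswapd1:0) intact.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(log_text):
    """Return a (command, pid, seconds) tuple for each hung-task report."""
    return [
        (m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


# Sample line copied from the log in this note.
sample = "[99779.387792] INFO: task kswapd1:0:446 blocked for more than 120 seconds.\n"
print(hung_tasks(sample))  # [('kswapd1:0', 446, 120)]
```

In practice the input would be the saved output of `dmesg` or of the crash `log` command rather than a single sample line.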
ps | grep 'UN' | wc
620 5580 46999
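Since `wc` prints line, word, and byte counts, the figures above mean that 620 task lines matched 'UN'. A plain `grep 'UN'` can also match the string elsewhere on a line, so counting on the state column is slightly more precise. The sketch below assumes the usual crash `ps` column order (PID, PPID, CPU, TASK, ST, %MEM, VSZ, RSS, COMM); the helper name `count_states` and the sample rows are hypothetical, not taken from this vmcore (only the `swapper/0` task address appears in the note).

```python
from collections import Counter


def count_states(ps_output):
    """Count tasks per state from crash 'ps' output.

    Assumed columns: PID PPID CPU TASK ST %MEM VSZ RSS COMM.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        # Tasks active on a CPU are prefixed with '>' in crash output.
        fields = line.lstrip("> ").split()
        # Skip the header and anything that is not a task row.
        if len(fields) >= 9 and fields[0].isdigit():
            counts[fields[4]] += 1
    return counts


# Hypothetical sample rows for illustration only.
sample_ps = """\
   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>     0      0   0  ffffffff90412480  RU   0.0       0      0  [swapper/0]
    446      2  17  ffff000000000001  UN   0.0       0      0  [kswapd1:0]
   5001      1   3  ffff000000000002  UN   0.1  100000   2000  java
"""
print(count_states(sample_ps)["UN"])  # 2
```

Feeding it the saved output of the crash `ps` command would give a per-state breakdown instead of a single count.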
Cause