Oracle Linux: UEK4 4.1.12-103.9.7 Crashed with RIP xfs_inode_operations

(Doc ID 2385919.1)

Last updated on APRIL 19, 2018

Applies to:

Linux OS - Version Oracle Linux 7.0 with Unbreakable Enterprise Kernel [3.8.13] to Oracle Linux 7.4 with Unbreakable Enterprise Kernel [4.1.12] [Release OL7 to OL7U4]
Linux x86-64

Symptoms

1) The Linux server crashes while running UEK4 kernel 4.1.12-103.9.7.el7uek.x86_64.

2) TSO has been disabled on all NICs, but the server/node still crashes randomly.
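
To confirm the TSO state mentioned in symptom 2, the feature listing printed by `ethtool -k <interface>` can be checked for its `tcp-segmentation-offload` line. The following is a minimal sketch only, assuming the usual `name: on|off` layout of that listing; the function name `tso_is_off` and the sample text are illustrative and do not come from this note.

```python
def tso_is_off(ethtool_k_output: str) -> bool:
    """Return True if the listing reports tcp-segmentation-offload as off.

    The input is expected to be the text printed by `ethtool -k <interface>`
    (assumed format: one "feature-name: on|off" entry per line).
    """
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name == "tcp-segmentation-offload":
            # The value may carry a suffix such as "[fixed]"; keep the first token.
            tokens = value.split()
            return bool(tokens) and tokens[0] == "off"
    raise ValueError("tcp-segmentation-offload line not found")


# Illustrative sample, not captured from the affected server.
SAMPLE = """Features for eth0:
rx-checksumming: on
tcp-segmentation-offload: off
"""

if __name__ == "__main__":
    print(tso_is_off(SAMPLE))
```

Running this check for every interface would show whether any NIC still has TSO enabled before concluding that the crash is unrelated to TSO.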

Call stack in vmcore analysis:

KERNEL: /share/linuxrpm/vmlinux_repo/64/4.1.12-103.9.7.el7uek.x86_64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 28
DATE: Wed Dec 6 20:11:04 2017
UPTIME: 2 days, 01:27:36
LOAD AVERAGE: 0.23, 0.43, 0.41
TASKS: 2558
NODENAME: xxx
RELEASE: 4.1.12-103.9.7.el7uek.x86_64
VERSION: #2 SMP Mon Nov 20 18:15:07 PST 2017
MACHINE: x86_64 (1995 Mhz)
MEMORY: 383.9 GB
PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000020"
PID: 124298
COMMAND: "oracle_124298_i"
TASK: ffff88442d123800 [THREAD_INFO: ffff8841cec20000]
CPU: 12
STATE: TASK_RUNNING (PANIC)

crash64> bt
PID: 19930 TASK: ffff884704fd1c00 CPU: 27 COMMAND: "oracle_xxxx_xx"
#0 [ffff8840bc5e7890] machine_kexec at ffffffff8105f27b
#1 [ffff8840bc5e7900] crash_kexec at ffffffff81115e62
#2 [ffff8840bc5e79d0] oops_end at ffffffff8101b868
#3 [ffff8840bc5e7a00] no_context at ffffffff81730b24
#4 [ffff8840bc5e7a60] __bad_area_nosemaphore at ffffffff81730c0c
#5 [ffff8840bc5e7ab0] bad_area_nosemaphore at ffffffff81730d78
#6 [ffff8840bc5e7ac0] __do_page_fault at ffffffff8106de76
#7 [ffff8840bc5e7b30] do_page_fault at ffffffff8106e280
#8 [ffff8840bc5e7b70] page_fault at ffffffff81740728
[exception RIP: xfs_inode_operations]
RIP: ffffffffa0278e40 RSP: ffff8840bc5e7c20 RFLAGS: 00010286
RAX: ffff885e51d355d0 RBX: ffff88413e469800 RCX: 000000004c95eef3
RDX: 0000000000000000 RSI: ffff8840bc5e7d00 RDI: ffff884930773300
RBP: ffff8840bc5e7c68 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: ffff884704fd1c00 R12: 00000000000005b4
R13: ffff8840bc5e7d00 R14: ffff8840bc5e7da8 R15: 0000000000000008
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff8840bc5e7c20] tcp_current_mss at ffffffff81676eac
#10 [ffff8840bc5e7c70] tcp_send_mss at ffffffff816670c0
#11 [ffff8840bc5e7ca0] tcp_sendmsg at ffffffff8166aae8
#12 [ffff8840bc5e7d50] inet_sendmsg at ffffffff81697034
#13 [ffff8840bc5e7d80] sock_sendmsg at ffffffff815fac4d
#14 [ffff8840bc5e7da0] sock_write_iter at ffffffff815face5
#15 [ffff8840bc5e7e20] __vfs_write at ffffffff81212e1e
#16 [ffff8840bc5e7eb0] vfs_write at ffffffff812134c9
#17 [ffff8840bc5e7f00] sys_write at ffffffff812143b5
#18 [ffff8840bc5e7f50] system_call_fastpath at ffffffff8173e3ae
RIP: 00007f5763633690 RSP: 00007ffdd7827b18 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000184 RCX: 00007f5763633690
RDX: 0000000000000016 RSI: 00007f576736d66e RDI: 0000000000000015
RBP: 00007ffdd7827b90 R8: 000000000c7026f0 R9: 0000000000000020
R10: 00007f57673ee280 R11: 0000000000000246 R12: 00000000d9d21c98
R13: 00000013bd991430 R14: 00007ffdd7827be0 R15: 00000013e382c2d8
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b

crash64> bt -l
PID: 67923 TASK: ffff884452408000 CPU: 10 COMMAND: "oracle_67923_in"
#0 [ffff8842358a7890] machine_kexec at ffffffff8105f27b
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/kernel/machine_kexec_64.c: 320
#1 [ffff8842358a7900] crash_kexec at ffffffff81115e62
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/kernel/kexec.c:1503
#2 [ffff8842358a79d0] oops_end at ffffffff8101b868
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/kernel/dumpstack.c: 232
#3 [ffff8842358a7a00] no_context at ffffffff81730b24
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/mm/fault.c:737
#4 [ffff8842358a7a60] __bad_area_nosemaphore at ffffffff81730c0c
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/mm/fault.c:817
#5 [ffff8842358a7ab0] bad_area_nosemaphore at ffffffff81730d78
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/mm/fault.c:825
#6 [ffff8842358a7ac0] __do_page_fault at ffffffff8106de76
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/mm/fault.c:1302
#7 [ffff8842358a7b30] do_page_fault at ffffffff8106e280
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/mm/fault.c:1320
#8 [ffff8842358a7b70] page_fault at ffffffff81740728
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/kernel/entry_64.S: 1410
[exception RIP: xfs_inode_operations]
RIP: ffffffffa02a0e40 RSP: ffff8842358a7c20 RFLAGS: 00010286
RAX: ffff885e967be5d0 RBX: ffff8843fa6fc800 RCX: 000000001a94d436
RDX: 0000000000000000 RSI: ffff8842358a7d00 RDI: ffff8844df31a400
RBP: ffff8842358a7c68 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: ffff884452408000 R12: 00000000000005b4
R13: ffff8842358a7d00 R14: ffff8842358a7da8 R15: 0000000000000008
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff8842358a7c20] tcp_current_mss at ffffffff81676eac
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/net/ipv4/tcp_output.c: 1457
#10 [ffff8842358a7c70] tcp_send_mss at ffffffff816670c0
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/net/ipv4/tcp.c:852
#11 [ffff8842358a7ca0] tcp_sendmsg at ffffffff8166aae8
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/net/ipv4/tcp.c:1129
#12 [ffff8842358a7d50] inet_sendmsg at ffffffff81697034
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/net/ipv4/af_inet.c:736
#13 [ffff8842358a7d80] sock_sendmsg at ffffffff815fac4d
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/net/socket.c:614
#14 [ffff8842358a7da0] sock_write_iter at ffffffff815face5
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/net/socket.c:823
#15 [ffff8842358a7e20] __vfs_write at ffffffff81212e1e
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/fs/read_write.c:479
#16 [ffff8842358a7eb0] vfs_write at ffffffff812134c9
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/fs/read_write.c:539
#17 [ffff8842358a7f00] sys_write at ffffffff812143b5
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/fs/read_write.c:586
#18 [ffff8842358a7f50] system_call_fastpath at ffffffff8173e3ae
/usr/src/debug/kernel-4.1.12/linux-4.1.12-103.9.7.el7uek/arch/x86/kernel/entry_64.S: 262
RIP: 00007f06bca96690 RSP: 00007ffdc9b92218 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000184 RCX: 00007f06bca96690
RDX: 0000000000000011 RSI: 00007f06c07d7a2e RDI: 0000000000000015
RBP: 00007ffdc9b92290 R8: 000000000c7026f0 R9: 0000000000000020
R10: 00007f06c0851280 R11: 0000000000000246 R12: 000000013ce7a2f0
R13: 00000013bd991408 R14: 00007ffdc9b922e0 R15: 00000013e305cc38
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
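
When several vmcores have to be compared, the `bt` output above can be reduced to the exception RIP symbol and the frame that was executing when the page fault was taken. The following is a rough sketch under the assumption that the text follows the `#N [stack] symbol at address` layout shown above; the names `parse_bt` and `faulting_caller` are hypothetical helpers, not part of the crash utility.

```python
import re

# "#9 [ffff8840bc5e7c20] tcp_current_mss at ffffffff81676eac"
FRAME_RE = re.compile(r"^#(\d+)\s+\[([0-9a-f]+)\]\s+(\S+)\s+at\s+([0-9a-f]+)")
# "[exception RIP: xfs_inode_operations]"
EXC_RE = re.compile(r"\[exception RIP:\s*([^\]]+)\]")


def parse_bt(text):
    """Return (frames, exception_rip) from crash `bt` text.

    frames is a list of (frame_number, symbol, return_address) tuples.
    exception_rip is the symbol text from the exception line, or None.
    """
    frames = []
    exception_rip = None
    for raw in text.splitlines():
        line = raw.strip()
        m = FRAME_RE.match(line)
        if m:
            frames.append((int(m.group(1)), m.group(3), int(m.group(4), 16)))
            continue
        m = EXC_RE.search(line)
        if m:
            exception_rip = m.group(1).strip()
    return frames, exception_rip


def faulting_caller(frames, fault_handler="page_fault"):
    """Return the frame listed right after the page-fault entry frame."""
    names = [name for _, name, _ in frames]
    idx = names.index(fault_handler)
    return frames[idx + 1] if idx + 1 < len(frames) else None


# Excerpt copied from the first backtrace in this note.
SAMPLE = """#7 [ffff8840bc5e7b30] do_page_fault at ffffffff8106e280
#8 [ffff8840bc5e7b70] page_fault at ffffffff81740728
[exception RIP: xfs_inode_operations]
RIP: ffffffffa0278e40 RSP: ffff8840bc5e7c20 RFLAGS: 00010286
#9 [ffff8840bc5e7c20] tcp_current_mss at ffffffff81676eac
#10 [ffff8840bc5e7c70] tcp_send_mss at ffffffff816670c0
"""

if __name__ == "__main__":
    frames, rip = parse_bt(SAMPLE)
    print("exception RIP:", rip)
    print("running frame:", faulting_caller(frames)[1])
```

In both backtraces above, the exception RIP is reported as `xfs_inode_operations` with no `+offset`, and the frame that follows is `tcp_current_mss`. In the upstream kernel sources `xfs_inode_operations` is an operations table (data) rather than a function, so one possible reading is that an indirect call on the TCP send path went through a bad pointer. This is an interpretation, not a statement taken from this note.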

Messages log:

[20834.968095] WARNING: CPU: 22 PID: 160748 at net/core/dst.c:288 dst_release+0x50/0x60()

[20834.968181] Call Trace:
[20834.968188] [<ffffffff81736708>] dump_stack+0x63/0x81
[20834.968192] [<ffffffff810862aa>] warn_slowpath_common+0x8a/0xc0
[20834.968194] [<ffffffff810863da>] warn_slowpath_null+0x1a/0x20
[20834.968196] [<ffffffff816214f0>] dst_release+0x50/0x60
[20834.968200] [<ffffffff815fd314>] __sk_dst_check+0x84/0x90
[20834.968203] [<ffffffff8165f357>] ip_queue_xmit+0x277/0x3d0
[20834.968205] [<ffffffff816778f6>] tcp_transmit_skb+0x4e6/0xa80
[20834.968210] [<ffffffff81341ed2>] ? radix_tree_lookup_slot+0x22/0x50
[20834.968212] [<ffffffff81677fa5>] tcp_write_xmit+0x115/0xe90
[20834.968213] [<ffffffff81678fa2>] __tcp_push_pending_frames+0x32/0xd0
[20834.968216] [<ffffffff8166706f>] tcp_push+0xef/0x120
[20834.968218] [<ffffffff8166aa9a>] tcp_sendmsg+0xca/0xb70
[20834.968222] [<ffffffff81697034>] inet_sendmsg+0x64/0xa0
[20834.968227] [<ffffffff812c4b83>] ? selinux_socket_sendmsg+0x23/0x30
[20834.968229] [<ffffffff815fac4d>] sock_sendmsg+0x3d/0x50
[20834.968231] [<ffffffff815face5>] sock_write_iter+0x85/0xf0
[20834.968236] [<ffffffff81212e1e>] __vfs_write+0xce/0x120
[20834.968238] [<ffffffff812134c9>] vfs_write+0xa9/0x1b0
[20834.968241] [<ffffffff81026c6c>] ? do_audit_syscall_entry+0x6c/0x70
[20834.968244] [<ffffffff812143b5>] SyS_write+0x55/0xd0
[20834.968249] [<ffffffff8106e280>] ? do_page_fault+0x30/0x90
[20834.968253] [<ffffffff8173e3ae>] system_call_fastpath+0x12/0x71
[20834.968254] ---[ end trace 6314bf7c288d345c ]---

Cause
