Linux: Cluster Nodes Reboot After the Operating System Kernel Update to 4.1.12-124.22.2.el6uek.x86_64 (Doc ID 2489510.1)

Last updated on MARCH 08, 2019

Applies to:

Oracle Database - Enterprise Edition - Version 12.1.0.2 to 18.3.0.0.0 [Release 12.1 to 18]
Information in this document applies to any platform.

Symptoms

Cluster nodes are evicted and reboot after the operating system (OS) kernel is updated to 4.1.12-124.22.2.el6uek.x86_64.

alert.log from <oracle base>/diag/crs/node2/crs/trace

2018-11-28 14:31:40.444 [OHASD(5670)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 5670
2018-11-28 14:31:40.447 [OHASD(5670)]CRS-0714: Oracle Clusterware Release 12.1.0.2.0.
.
2018-11-28 14:32:07.992 [OCSSD(10058)]CRS-1601: CSSD Reconfiguration complete. Active nodes are <node1> <node2> .
.
2018-11-28 14:32:38.802 [CRSD(11551)]CRS-1201: CRSD started on node <node2>.
2018-11-28 14:32:39.017 [ORAAGENT(11696)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 11696
2018-11-28 14:32:39.043 [ORAROOTAGENT(11700)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 11700

2018-11-28 14:39:13.435 [OHASD(5674)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 5674 <<<<<<<Clusterware restart
2018-11-28 14:39:13.438 [OHASD(5674)]CRS-0714: Oracle Clusterware Release 12.1.0.2.0.
2018-11-28 14:39:13.446 [OHASD(5674)]CRS-2112: The OLR service started on node <node2>.
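The restart pattern in this excerpt (a second CRS-8500 OHASD startup message with a new process ID a few minutes after the first) can be located mechanically. The following is a minimal Python sketch, assuming the alert.log line layout shown above. The function name `find_ohasd_starts` and the regular expression are illustrative and are not part of any Oracle tool.

```python
import re
from datetime import datetime

# Matches startup lines such as:
# 2018-11-28 14:31:40.444 [OHASD(5670)]CRS-8500: Oracle Clusterware OHASD process is starting ...
OHASD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[OHASD\((\d+)\)\]CRS-8500:"
)


def find_ohasd_starts(lines):
    """Return a list of (timestamp, pid) for every OHASD startup message."""
    starts = []
    for line in lines:
        match = OHASD_START.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            starts.append((ts, int(match.group(2))))
    return starts


# Lines copied from the excerpt above (message text shortened after the PID).
sample = [
    "2018-11-28 14:31:40.444 [OHASD(5670)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 5670",
    "2018-11-28 14:32:38.802 [CRSD(11551)]CRS-1201: CRSD started on node <node2>.",
    "2018-11-28 14:39:13.435 [OHASD(5674)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 5674",
]

starts = find_ohasd_starts(sample)
# Each consecutive pair of startups indicates one Clusterware restart.
for (t1, pid1), (t2, pid2) in zip(starts, starts[1:]):
    gap = (t2 - t1).total_seconds()
    print(f"OHASD restarted: pid {pid1} -> {pid2}, {gap:.0f}s after the previous start")
```

More than one startup entry within a short window is only a hint that the node rebooted; the OS message log is still needed to confirm the reason.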

 

The reboot occurs after CRSD starts on the node, when ASM begins enabling ACFS volumes.

alert_+ASM2.log from <oracle base>/diag/asm/+ASM2/trace location:

Wed Nov 28 14:32:38 2018
NOTE: ASMB connected to ASM instance +ASM2 osid: 11637 (Flex mode; client id 0xffffffffffffffff)
Wed Nov 28 14:33:42 2018
SQL> ALTER DISKGROUP BCKP ENABLE VOLUME BCKPACFSVOL /* asm agenWed Nov 28 14:40:00 2018
MEMORY_TARGET defaulting to 1128267776.
* instance_number obtained from CSS = 2, checking for the existence of node 0...
* node 0 does not exist. instance_number = 2
Starting ORACLE instance (normal) (OS id: 11415)

vmstat output from the same period shows many blocked processes:

<node2>_vmstat_18.11.28.1400.dat

zzz ***Wed Nov 28 14:34:00 UTC 2018
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
10 19 0 391368096 153848 1451868 0 0 1892 66 643 1023 10 2 78 9 0
5 19 0 391364320 153848 1451964 0 0 3 0 2968 6291 1 1 0 98 0
5 19 0 391364352 153848 1451940 0 0 4 0 2919 6089 1 1 0 99 0 <<<<<<<<<<< Many blocked processes with CPU idle at 0
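The condition highlighted above (a high `b` column together with `id` at 0) can be picked out of a vmstat capture with a short script. This is a sketch only, assuming the 17-column layout shown in the header line above. The function name `blocked_samples` and the thresholds `min_blocked` and `max_idle` are illustrative assumptions, not values taken from this note.

```python
def blocked_samples(lines, min_blocked=10, max_idle=5):
    """Return (r, b, idle, iowait) for rows with many blocked processes and little idle CPU.

    Assumes the column order: r b swpd free buff cache si so bi bo in cs us sy id wa st
    """
    hits = []
    for line in lines:
        fields = line.split()
        # Skip headers, "zzz" timestamp lines, and anything that is not a 17-column data row.
        if len(fields) != 17 or not all(f.isdigit() for f in fields):
            continue
        r, b = int(fields[0]), int(fields[1])
        idle, iowait = int(fields[14]), int(fields[15])
        if b >= min_blocked and idle <= max_idle:
            hits.append((r, b, idle, iowait))
    return hits


# Rows copied from the vmstat excerpt above.
sample = [
    "zzz ***Wed Nov 28 14:34:00 UTC 2018",
    "r b swpd free buff cache si so bi bo in cs us sy id wa st",
    "10 19 0 391368096 153848 1451868 0 0 1892 66 643 1023 10 2 78 9 0",
    "5 19 0 391364320 153848 1451964 0 0 3 0 2968 6291 1 1 0 98 0",
    "5 19 0 391364352 153848 1451940 0 0 4 0 2919 6089 1 1 0 99 0",
]

hits = blocked_samples(sample)
for r, b, idle, iowait in hits:
    print(f"runnable={r} blocked={b} idle={idle}% iowait={iowait}%")
```

The first data row of a vmstat capture reports averages since boot, which is why it shows 78 in the `id` column and is not flagged.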

The OS message log shows the following kernel warning, with the oracleacfs, oracleadvm, and oracleoks modules linked in:

Nov 28 14:33:43 <node2> kernel: [ 175.145168] ADVMK-0014: Cluster reconfiguration completed.
Nov 28 14:33:44 <node2> kernel: [ 175.882870] ------------[ cut here ]------------
Nov 28 14:33:44 <node2> kernel: [ 175.900072] WARNING: CPU: 3 PID: 13576 at lib/list_debug.c:62 __list_del_entry+0xdd/0xe0()
Nov 28 14:33:44 <node2> kernel: [ 175.917692] list_del corruption. next->prev should be ffff885de0689bc8, but was (null)
Nov 28 14:33:44 <node2> kernel: [ 175.935593] Modules linked in: oracleacfs(PO) oracleadvm(PO) oracleoks(PO) nfsd lockd grace nfs_acl auth_rpcgss oracleasm sunrpc cpufreq_powersave bonding ipv6 dm_round_robin dm_multipath iTCO_wdt iTCO_vendor_support ipmi_devintf dcdbas pcspkr sb_edac edac_core shpchp lpc_ich mfd_core tg3 ptp pps_core osst st ch sg ipmi_ssif i2c_core ipmi_si ipmi_msghandler ext4 jbd2 mbcache2 sr_mod cdrom sd_mod ahci libahci mpt3sas scsi_transport_sas raid_class megaraid_sas mxm_wmi wmi dm_mirror dm_region_hash dm_log dm_mod
Nov 28 14:33:44 <node2> kernel: [ 176.033722] CPU: 3 PID: 13576 Comm: asm_vbg1_+asm2 Tainted: P O 4.1.12-124.22.2.el6uek.x86_64 #2
.
Nov 28 14:33:44 <node2> kernel: [ 176.075355] 0000000000000000 ffff885de105fa08 ffffffff816e9260 ffff885de105fa58
Nov 28 14:33:44 <node2> kernel: [ 176.096332] ffffffff81a0db99 ffff885de105fa48 ffffffff8108601a ffffffff816ebeea
Nov 28 14:33:44 <node2> kernel: [ 176.117259] ffff885de0689bc0 0000000000000010 ffff885de0689bc8 ffffffff81bc7600
Nov 28 14:33:44 <node2> kernel: [ 176.138053] Call Trace:
Nov 28 14:33:44 <node2> kernel: [ 176.158425] [<ffffffff816e9260>] dump_stack+0x63/0x81
Nov 28 14:33:44 <node2> kernel: [ 176.178921] [<ffffffff8108601a>] warn_slowpath_common+0x8a/0xc0
Nov 28 14:33:44 <node2> kernel: [ 176.199285] [<ffffffff816ebeea>] ? __schedule+0x24a/0x810
Nov 28 14:33:44 <node2> kernel: [ 176.219266] [<ffffffff81086096>] warn_slowpath_fmt+0x46/0x50
Nov 28 14:33:44 <node2> kernel: [ 176.238904] [<ffffffff8134516d>] __list_del_entry+0xdd/0xe0
Nov 28 14:33:44 <node2> kernel: [ 176.258266] [<ffffffff813172d6>] blkg_destroy+0x86/0x1c0
Nov 28 14:33:44 <node2> kernel: [ 176.277333] [<ffffffff8131745e>] blkg_destroy_all+0x4e/0x80
Nov 28 14:33:44 <node2> kernel: [ 176.295880] [<ffffffff813188b3>] blkcg_exit_queue+0x23/0x80
Nov 28 14:33:44 <node2> kernel: [ 176.313789] [<ffffffff812fdc34>] blk_release_queue+0x24/0x120
Nov 28 14:33:44 <node2> kernel: [ 176.331181] [<ffffffff8132ac5d>] kobject_release+0x7d/0x1c0
Nov 28 14:33:44 <node2> kernel: [ 176.348014] [<ffffffff8132ab05>] kobject_put+0x35/0x70
Nov 28 14:33:44 <node2> kernel: [ 176.364789] [<ffffffff812f9bf7>] blk_cleanup_queue+0x147/0x200
Nov 28 14:33:44 <node2> kernel: [ 176.381696] [<ffffffffa05391fb>] asmSetVolumeLimits+0x254/0x2d7 [oracleadvm]
Nov 28 14:33:44 <node2> kernel: [ 176.398291] [<ffffffffa0512110>] Asm_updateVolumeLimits+0xab/0x13b [oracleadvm]
Nov 28 14:33:44 <node2> kernel: [ 176.414457] [<ffffffffa0513403>] Asm_loadDisks+0x42e/0x9eb [oracleadvm]
Nov 28 14:33:44 <node2> kernel: [ 176.430656] [<ffffffffa04f8fbe>] Asm_ioctlVBG+0x105/0x37a [oracleadvm]
Nov 28 14:33:44 <node2> kernel: [ 176.446732] [<ffffffffa04fb950>] AsmIoctl+0x7fd/0xd31 [oracleadvm]
Nov 28 14:33:44 <node2> kernel: [ 176.463054] [<ffffffffa0539d3c>] ? asmIoctl_int.isra.17+0x66/0x247 [oracleadvm]
Nov 28 14:33:44 <node2> kernel: [ 176.479806] [<ffffffffa0539ef4>] asmIoctl_int.isra.17+0x21e/0x247 [oracleadvm]
Nov 28 14:33:44 <node2> kernel: [ 176.496882] [Nov 28 14:39:02 <node2> kernel: imklog 5.8.10, log source = /proc/kmsg started.
Nov 28 14:39:02 <node2> rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="4006" x-info="http://www.rsyslog.com"] start
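To check another node's message log for the same signature, one option is to look for the `list_del corruption` warning and then list which kernel modules appear in the stack frames. The sketch below assumes the frame format shown in the call trace above. The function name `summarize_trace` and the regular expression are illustrative, not part of any Oracle or kernel tooling.

```python
import re

# Matches stack frames such as:
#   [<ffffffffa05391fb>] asmSetVolumeLimits+0x254/0x2d7 [oracleadvm]
#   [<ffffffff816ebeea>] ? __schedule+0x24a/0x810
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def summarize_trace(lines):
    """Return (corruption_seen, sorted list of modules named in stack frames)."""
    corruption_seen = any("list_del corruption" in line for line in lines)
    modules = set()
    for line in lines:
        match = FRAME.search(line)
        if match and match.group(2):
            modules.add(match.group(2))
    return corruption_seen, sorted(modules)


# Lines copied from the message log excerpt above.
sample = [
    "Nov 28 14:33:44 <node2> kernel: [ 175.917692] list_del corruption. next->prev should be ffff885de0689bc8, but was (null)",
    "Nov 28 14:33:44 <node2> kernel: [ 176.277333] [<ffffffff8131745e>] blkg_destroy_all+0x4e/0x80",
    "Nov 28 14:33:44 <node2> kernel: [ 176.381696] [<ffffffffa05391fb>] asmSetVolumeLimits+0x254/0x2d7 [oracleadvm]",
    "Nov 28 14:33:44 <node2> kernel: [ 176.463054] [<ffffffffa0539d3c>] ? asmIoctl_int.isra.17+0x66/0x247 [oracleadvm]",
]

corruption_seen, modules = summarize_trace(sample)
print(corruption_seen, modules)
```

Frames without a bracketed module name belong to the core kernel, so only loadable modules such as oracleadvm are reported.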

 

Changes

The OS kernel was updated to 4.1.12-124.22.2.el6uek.x86_64.
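A quick way to tell whether a node is running the kernel named in this note is to compare the running kernel release string against it. The sketch below uses Python's standard `platform.release()`. The constant `AFFECTED_KERNEL` and the function `is_affected_kernel` are illustrative names, and the check covers only the single release named here, since this note does not state whether other releases are affected.

```python
import platform

# The only kernel release named in this note; other releases are not covered here.
AFFECTED_KERNEL = "4.1.12-124.22.2.el6uek.x86_64"


def is_affected_kernel(release):
    """Return True if the given kernel release string matches the one in this note."""
    return release.strip().lower() == AFFECTED_KERNEL


running = platform.release()  # same string that `uname -r` reports on Linux
print(f"{running}: affected={is_affected_kernel(running)}")
```

The comparison is case-insensitive because the release string sometimes appears in upper case in reports.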

Cause
