Oracle VM: VM Crash Post OVM Upgrade From 3.3.3 to 3.4.6 With Message NMI Watchdog: BUG: Soft Lockup - CPU#0
(Doc ID 2634638.1)
Last updated on JANUARY 03, 2023
Applies to:
Oracle VM - Version 3.4.6 and later
Linux OS - Version Oracle Linux 6.10 and later
Linux x86-64
Symptoms
After upgrading Oracle VM (OVM) from 3.3.3 to 3.4.6, Oracle Linux 6.10 VMs crash under high CPU usage. The high CPU usage occurs during database backup.
The following error messages appear in /var/log/messages:
Dec 20 17:01:53 hostname kernel: [79488.478736] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [kworker/0:1H:1040]
Dec 20 17:01:53 hostname kernel: [79488.482240] Modules linked in: rds oracleacfs(PO) oracleadvm(PO) oracleoks(PO) oracleasm nfs_layout_nfsv41_files rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace sunrpc ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ovmapi ppdev parport_pc parport acpi_cpufreq xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea pcspkr i2c_piix4 i2c_core ext4 jbd2 mbcache2 xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod
Dec 20 17:01:53 hostname kernel: [79488.482240] CPU: 0 PID: 1040 Comm: kworker/0:1H Tainted: P O 4.1.12-124.32.1.el6uek.x86_64 #2
Dec 20 17:01:53 hostname kernel: [79488.482240] Hardware name: Xen HVM domU, BIOS 4.4.4OVM 11/16/2018
Dec 20 17:01:53 hostname kernel: [79488.482240] Workqueue: kblockd blk_mq_requeue_work
Dec 20 17:01:53 hostname kernel: [79488.482240] task: ffff881d88d4c600 ti: ffff881d86ed8000 task.ti: ffff881d86ed8000
Dec 20 17:01:53 hostname kernel: [79488.482240] RIP: 0010:[<ffffffff815fe6da>] [<ffffffff815fe6da>] skb_release_head_state+0x5a/0xe0
Dec 20 17:01:53 hostname kernel: [79488.482240] RSP: 0018:ffff881d97203e68 EFLAGS: 00000202
Dec 20 17:01:53 hostname kernel: [79488.482240] RAX: 0000000000000001 RBX: ffffffff810cd0d4 RCX: 0000000000000304
Dec 20 17:01:53 hostname kernel: [79488.482240] RDX: 0000000000000f02 RSI: 0000000000000292 RDI: 0000000000000292
Dec 20 17:01:53 hostname kernel: [79488.482240] RBP: ffff881d97203e78 R08: 0000000000000304 R09: 000000000001c0d8
Dec 20 17:01:53 hostname kernel: [79488.482240] R10: 0000000000000009 R11: 000000000000000a R12: ffff881d97203dd8
Dec 20 17:01:53 hostname kernel: [79488.482240] R13: ffffffff816fa626 R14: ffff881d97203e78 R15: ffff881196d6ff00
Dec 20 17:01:53 hostname kernel: [79488.482240] FS: 0000000000000000(0000) GS:ffff881d97200000(0000) knlGS:0000000000000000
Dec 20 17:01:53 hostname kernel: [79488.482240] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 20 17:01:53 hostname kernel: [79488.482240] CR2: ffffffffff600400 CR3: 00000017d7d60000 CR4: 0000000000160670
Dec 20 17:01:53 hostname kernel: [79488.482240] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 20 17:01:53 hostname kernel: [79488.482240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Dec 20 17:01:53 hostname kernel: [79488.482240] Stack:
Dec 20 17:01:53 hostname kernel: [79488.482240] ffff881d97219680 ffff881196d6ff00 ffff881d97203e98 ffffffff815fe776
Dec 20 17:01:53 hostname kernel: [79488.482240] ffff881d97203e98 ffff881196d6ff00 ffff881d97203eb8 ffffffff815fe7a6
Dec 20 17:01:53 hostname kernel: [79488.482240] 0000000000000001 ffff881196df0400 ffff881d97203ef8 ffffffff8161106e
Dec 20 17:01:53 hostname kernel: [79488.482240] Call Trace:
Dec 20 17:01:53 hostname kernel: [79488.482240] <IRQ>
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff815fe776>] skb_release_all+0x16/0x30
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff815fe7a6>] __kfree_skb+0x16/0x30
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff8161106e>] net_tx_action+0x8e/0x290
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff8108bb30>] __do_softirq+0x120/0x300
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff8108bf35>] irq_exit+0x125/0x130
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff813fbcf9>] xen_evtchn_do_upcall+0x39/0x50
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff816fa626>] xen_hvm_callback_vector+0x1b6/0x1c0
Dec 20 17:01:53 hostname kernel: [79488.482240] <EOI>
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff810e1f07>] ? irq_to_desc+0x17/0x20
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff816f5098>] ? _raw_spin_unlock_irqrestore+0x28/0x60
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffffc0071c00>] blkif_queue_rq+0x560/0x670 [xen_blkfront]
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff813093f0>] __blk_mq_run_hw_queue+0x1e0/0x390
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff816ef93a>] ? __schedule+0x24a/0x810
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff816ef93a>] ? __schedule+0x24a/0x810
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff813091fd>] blk_mq_run_hw_queue+0x8d/0xa0
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff816ef93a>] ? __schedule+0x24a/0x810
Dec 20 17:01:53 hostname kernel: [79488.482240] [<ffffffff8130a305>] blk_mq_start_hw_queue+0x15/0x20
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff8130a347>] blk_mq_start_hw_queues+0x37/0x50
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff8130af0e>] blk_mq_requeue_work+0x10e/0x130
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff810a1359>] process_one_work+0x169/0x4a0
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff810a1b8b>] worker_thread+0x5b/0x560
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff810a1b30>] ? flush_delayed_work+0x50/0x50
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff810a795b>] kthread+0xcb/0xf0
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff816ef93a>] ? __schedule+0x24a/0x810
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff816ef93a>] ? __schedule+0x24a/0x810
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff810a7890>] ? kthread_create_on_node+0x180/0x180
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff816f5ae1>] ret_from_fork+0x61/0x90
Dec 20 17:01:53 hostname kernel: [79488.703362] [<ffffffff810a7890>] ? kthread_create_on_node+0x180/0x180
Dec 20 17:01:53 hostname kernel: [79488.703362] Code: 68 48 85 ff 74 05 f0 ff 0f 74 66 48 8b 43 60 48 85 c0 74 17 65 8b 15 96 fe a0 7e 81 e2 00 00 0f 00 75 6e 48 89 df e8 d6 0e 10 00 <48> 8b 7b 70 48 85 ff 74 05 f0 ff 0f 74 28 48 8b 7b 78 48 85 ff
Dec 20 17:02:29 hostname kernel: [79524.478748] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [oracle_6133_kkv:6133]
Dec 20 17:02:29 hostname kernel: [79524.478748] Modules linked in: rds oracleacfs(PO) oracleadvm(PO) oracleoks(PO) oracleasm nfs_layout_nfsv41_files rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace sunrpc ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ovmapi ppdev parport_pc parport acpi_cpufreq xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea pcspkr i2c_piix4 i2c_core ext4 jbd2 mbcache2 xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod
Dec 20 17:02:29 hostname kernel: [79524.478748] CPU: 0 PID: 6133 Comm: oracle_6133_kkv Tainted: P O L 4.1.12-124.32.1.el6uek.x86_64 #2
Dec 20 17:02:29 hostname kernel: [79524.478748] Hardware name: Xen HVM domU, BIOS 4.4.4OVM 11/16/2018
Dec 20 17:02:29 hostname kernel: [79524.478748] task: ffff881299daaa00 ti: ffff88129b65c000 task.ti: ffff88129b65c000
Dec 20 17:02:29 hostname kernel: [79524.478748] RIP: 0010:[<ffffffff810f3569>] [<ffffffff810f3569>] run_timer_softirq+0x269/0x350
Dec 20 17:02:29 hostname kernel: [79524.478748] RSP: 0000:ffff881d97203e98 EFLAGS: 00000246
Dec 20 17:02:29 hostname kernel: [79524.478748] RAX: 00000000000072b0 RBX: ffff881d97215820 RCX: ffff88179bdb9410
Dec 20 17:02:29 hostname kernel: [79524.478748] RDX: ffff881d97203eb8 RSI: ffff881d97203eb8 RDI: ffffffff8202a5c0
Dec 20 17:02:29 hostname kernel: [79524.478748] RBP: ffff881d97203ef8 R08: ffffffff8202b3f8 R09: ffffffff8202a5c0
Dec 20 17:02:29 hostname kernel: [79524.478748] R10: 0000000000000000 R11: 0000000000800000 R12: ffff881d97203e08
Dec 20 17:02:29 hostname kernel: [79524.478748] R13: ffffffff816fa626 R14: ffff881d97203ef8 R15: ffff88179bdb9410
Dec 20 17:02:29 hostname kernel: [79524.478748] FS: 00007f2493b21f60(0000) GS:ffff881d97200000(0000) knlGS:0000000000000000
Dec 20 17:02:29 hostname kernel: [79524.478748] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 20 17:02:29 hostname kernel: [79524.478748] CR2: 000000009bff0010 CR3: 0000001cc3e00000 CR4: 0000000000160670
Dec 20 17:02:29 hostname kernel: [79524.478748] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 20 17:02:29 hostname kernel: [79524.478748] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Dec 20 17:02:29 hostname kernel: [79524.478748] Stack:
Dec 20 17:02:29 hostname kernel: [79524.478748] ffffffff8202bdf8 ffffffff8202b9f8 ffffffff8202b5f8 ffff88179bdb9000
Dec 20 17:02:29 hostname kernel: [79524.478748] ffff881d97203eb8 ffff881d97203eb8 ffff881d97203f48 0000000000000001
Dec 20 17:02:29 hostname kernel: [79524.478748] ffffffff81ad6108 0000000000000001 0000000000000001 0000000000000000
Dec 20 17:02:29 hostname kernel: [79524.478748] Call Trace:
Dec 20 17:02:29 hostname kernel: [79524.478748] <IRQ>
Dec 20 17:02:29 hostname kernel: [79524.478748] [<ffffffff8108bb30>] __do_softirq+0x120/0x300
Dec 20 17:02:29 hostname kernel: [79524.478748] [<ffffffff8108bf35>] irq_exit+0x125/0x130
Dec 20 17:02:29 hostname kernel: [79524.478748] [<ffffffff813fbcf9>] xen_evtchn_do_upcall+0x39/0x50
Dec 20 17:02:29 hostname kernel: [79524.478748] [<ffffffff816fa626>] xen_hvm_callback_vector+0x1b6/0x1c0
Dec 20 17:02:29 hostname kernel: [79524.478748] <EOI>
Dec 20 17:02:29 hostname kernel: [79524.478748] Code: 60 00 49 8b 55 00 48 85 d2 75 e7 e9 d8 fe ff ff 66 90 e9 33 00 00 00 66 41 83 06 02 4c 89 f7 90 0f 1f 40 00 fb 66 0f 1f 44 00 00 <48> 8b 55 b8 48 89 df 4c 89 e6 e8 18 e0 ff ff 4c 89 f7 e8 00 1a
Dec 20 17:03:37 hostname kernel: [79592.478310] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [ora_m000_hpqaut:19338]
Dec 20 17:03:37 hostname kernel: [79592.478310] Modules linked in: rds oracleacfs(PO) oracleadvm(PO) oracleoks(PO) oracleasm nfs_layout_nfsv41_files rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace sunrpc ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ovmapi ppdev parport_pc parport acpi_cpufreq xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea pcspkr i2c_piix4 i2c_core ext4 jbd2 mbcache2 xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod
Dec 20 17:03:37 hostname kernel: [79592.478310] CPU: 0 PID: 19338 Comm: ora_m000_hpqaut Tainted: P O L 4.1.12-124.32.1.el6uek.x86_64 #2
Dec 20 17:03:37 hostname kernel: [79592.478310] Hardware name: Xen HVM domU, BIOS 4.4.4OVM 11/16/2018
Dec 20 17:03:37 hostname kernel: [79592.478310] task: ffff8812df7f7000 ti: ffff88119082c000 task.ti: ffff88119082c000
Dec 20 17:03:37 hostname kernel: [79592.478310] RIP: 0010:[<ffffffff816f5098>] [<ffffffff816f5098>] _raw_spin_unlock_irqrestore+0x28/0x60
Dec 20 17:03:37 hostname kernel: [79592.478310] RSP: 0000:ffff881d97203da8 EFLAGS: 00000292
Dec 20 17:03:37 hostname kernel: [79592.478310] RAX: ffff881d97218480 RBX: ffffffff8169efc9 RCX: ffff881d84d7ab48
Dec 20 17:03:37 hostname kernel: [79592.478310] RDX: 0000000000009876 RSI: 0000000000000292 RDI: 0000000000000292
Dec 20 17:03:37 hostname kernel: [79592.478310] RBP: ffff881d97203db8 R08: ffffffff8202bc88 R09: ffff881d89aebd98
Dec 20 17:03:37 hostname kernel: [79592.478310] R10: 0000000000000000 R11: 0000000001196fcf R12: ffff881d97203d18
Dec 20 17:03:37 hostname kernel: [79592.478310] R13: ffffffff816fa626 R14: ffff881d97203db8 R15: ffffffff8202a5c0
Dec 20 17:03:37 hostname kernel: [79592.478310] FS: 00007fc1e6b61f60(0000) GS:ffff881d97200000(0000) knlGS:0000000000000000
Dec 20 17:03:37 hostname kernel: [79592.478310] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 20 17:03:37 hostname kernel: [79592.478310] CR2: 00000000c04298c8 CR3: 00000011ad30e000 CR4: 0000000000160670
Dec 20 17:03:37 hostname kernel: [79592.478310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 20 17:03:37 hostname kernel: [79592.478310] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Dec 20 17:03:37 hostname kernel: [79592.478310] Stack:
Dec 20 17:03:37 hostname kernel: [79592.478310] ffff881d84d7ab48 ffffffff8202a5c0 ffff881d97203e08 ffffffff810f2e52
Dec 20 17:03:37 hostname kernel: [79592.478310] ffff881d97203e18 0000000000000292 bfadfea900000000 0000000000000005
Dec 20 17:03:37 hostname kernel: [79592.478310] ffff881d84d7aab0 ffff88179bdbe000 ffff8812d50b5000 0000000000000004
Dec 20 17:03:37 hostname kernel: [79592.478310] Call Trace:
Dec 20 17:03:37 hostname kernel: [79592.478310] <IRQ>
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff810f2e52>] mod_timer_pinned+0xe2/0x1c0
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff816610c0>] reqsk_timer_handler+0x270/0x300
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff81660e50>] ? inet_csk_reqsk_queue_drop+0x220/0x220
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff810f15d0>] call_timer_fn+0x40/0x160
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff81660e50>] ? inet_csk_reqsk_queue_drop+0x220/0x220
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff810f3578>] run_timer_softirq+0x278/0x350
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff8108bb30>] __do_softirq+0x120/0x300
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff8108bf35>] irq_exit+0x125/0x130
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff813fbcf9>] xen_evtchn_do_upcall+0x39/0x50
Dec 20 17:03:37 hostname kernel: [79592.478310] [<ffffffff816fa626>] xen_hvm_callback_vector+0x1b6/0x1c0
Dec 20 17:03:37 hostname kernel: [79592.478310] <EOI>
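To check whether a guest is hitting the same symptom, the soft lockup headers can be pulled out of /var/log/messages. The following is a minimal Python sketch, not an Oracle-supplied tool. The function names (find_soft_lockups, scan_file) and the regular expression are illustrative assumptions based only on the message format shown in the excerpt above, and the exact wording may differ on other kernel versions.

```python
import re

# Assumed pattern, derived from the watchdog lines in the excerpt above:
#   NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [kworker/0:1H:1040]
# The task name may itself contain ':' (e.g. kworker/0:1H), so the PID is
# taken as the digits after the last ':' inside the brackets.
SOFT_LOCKUP = re.compile(
    r"NMI watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft lockup header found in an iterable of log lines."""
    events = []
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            events.append({
                "cpu": int(match.group("cpu")),
                "stuck_seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            })
    return events


def scan_file(path="/var/log/messages"):
    """Hypothetical helper: scan a log file (reading it usually needs root)."""
    with open(path, errors="replace") as handle:
        return find_soft_lockups(handle)
```

Calling scan_file() on an affected guest would return one record per lockup, which makes it easy to see whether the events cluster on a single CPU or coincide with the backup window. The stack traces that follow each header are deliberately ignored here.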
Cause