Oracle VM Domain 0 Crashes at "memcpy+0xb/0x120" And Aborts All VM Guests (Doc ID 1643489.1)

Last updated on SEPTEMBER 14, 2017

Applies to:

Oracle VM - Version 3.1.1 and later
Linux x86-64

Symptoms

While the cluster was operating normally, one node (Domain 0) crashed without warning, which also aborted every VM guest running on that node. The remaining cluster nodes responded as expected by evicting the failed node from the OCFS2 cluster.

The /var/log/messages system log, or the console output, shows a stack signature similar to:

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff8126044b>] memcpy+0xb/0x120
PGD 1ef08d067 PUD 1ef09d067 PMD 0
Oops: 0002 [#1] SMP
CPU 9
Modules linked in: xen_pciback ksplice_gx7w5nyk_vmlinux_new(U) ksplice_gx7w5nyk(U) xen_blkback xen_netback dm_nfs nfs fscache auth_rpcgss nfs_acl xen_gntdev xen_evtchn ipmi_devintf ipmi_si ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm lockd sunrpc bridge stp llc bonding be2iscsi iscsi_boot_sysfs iscsi_tcp bnx2i cnic uio cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ipv6 ib_uverbs(U) ib_umad(U) mlx4_vnic(U) mlx4_vnic_helper(U) mlx4_ib(U) ib_sa(U) ib_mad(U) ib_core(U) mlx4_core(U) xenfs xen_privcmd ocfs2 jbd2 ocfs2_nodemanager configfs ocfs2_stackglue dm_mirror video sbs sbshc acpi_memhotplug acpi_ipmi ipmi_msghandler parport_pc lp parport ixgbe dca snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd i2c_i801 i2c_core iTCO_wdt soundcore iTCO_vendor_support snd_page_alloc ghes pcspkr hed dm_region_hash dm_log dm_mod usb_storage ahci libahci sg shpchp megaraid_sas sd_mod crc_t10dif ext3 jbd mbcache [last unloaded: ksplice_gx7w5nyk_vmlinux_old]

Pid: 0, comm: swapper Not tainted 2.6.39-300.22.2.el5uek #1 Oracle Corporation SUN FIRE X4170 M3     /ASSY,MOTHERBOARD,1U  
RIP: e030:[<ffffffff8126044b>]  [<ffffffff8126044b>] memcpy+0xb/0x120
RSP: e02b:ffff880238d23c10  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000009 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffff819a7f40 RDI: 0000000000000000
RBP: ffff880238d23c68 R08: 000000000001daf7 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: 000000023dbfc884 R14: 000000000000d560 R15: 0000000000080000
FS:  00007fae527306e0(0000) GS:ffff880238d20000(0000) knlGS:0000000000000000
CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000001ed9c3000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff880223fd4000, task ffff880223fd22c0)
Stack:
ffffffff81039a23 ffff880238d23e00 ffff880238d12201 0000000000000002
ffff880238d23de0 0000000000000002 0000000000000000 ffffffff817d0b00
000000023dbfc884 ffff880238d2e440 ffff880238d23e38 ffff880238d23c78
Call Trace:
<IRQ>
[<ffffffff81039a23>] ? __x2apic_send_IPI_mask+0x73/0x160
[<ffffffff81039b2e>] x2apic_send_IPI_all+0x1e/0x20
[<ffffffff81034d6c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0
[<ffffffff810dd2c7>] print_cpu_stall+0x47/0xa0
[<ffffffff810dd4cc>] check_cpu_stall+0x4c/0x80
[<ffffffff810dd530>] __rcu_pending+0x30/0x140
[<ffffffff810dd6ab>] rcu_pending+0x6b/0x90
[<ffffffff810dd855>] rcu_check_callbacks+0x85/0xa0
[<ffffffff8107e4b6>] update_process_times+0x46/0x90
[<ffffffff810a2dd6>] tick_sched_timer+0x66/0xd0
[<ffffffff810a2d70>] ? tick_clock_notify+0x60/0x60
[<ffffffff81095b33>] __run_hrtimer+0x83/0x1e0
[<ffffffff81095e46>] hrtimer_interrupt+0xe6/0x240
[<ffffffff8100a5b7>] xen_timer_interrupt+0x27/0x40
[<ffffffff810d6ddd>] handle_irq_event_percpu+0x5d/0x1b0
[<ffffffff810d9548>] handle_percpu_irq+0x48/0x70
[<ffffffff812f8c6f>] __xen_evtchn_do_upcall+0x16f/0x270
[<ffffffff812f965f>] xen_evtchn_do_upcall+0x2f/0x50
[<ffffffff815119ce>] xen_do_hypervisor_callback+0x1e/0x30
<EOI>
[<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[<ffffffff8100a0b0>] ? xen_safe_halt+0x10/0x20
[<ffffffff8101dfeb>] ? default_idle+0x5b/0x170
[<ffffffff81014ac6>] ? cpu_idle+0xc6/0xf0
[<ffffffff8100a8c9>] ? xen_irq_enable_direct_reloc+0x4/0x4
[<ffffffff814f780e>] ? cpu_bringup_and_idle+0xe/0x10
Code: 01 c6 43 4c 04 19 c0 4c 8b 65 f0 4c 8b 6d f8 83 e0 fc 83 c0 08 88 43 4d 48 8b 5d e8 c9 c3 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c
RIP  [<ffffffff8126044b>] memcpy+0xb/0x120
RSP <ffff880238d23c10>
CR2: 0000000000000000
---[ end trace db15b122f04f452b ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 0, comm: swapper Tainted: G      D     2.6.39-300.22.2.el5uek #1
Call Trace:
<IRQ>  [<ffffffff8106f24f>] panic+0xbf/0x1f0
[<ffffffff81507fa4>] ? _raw_spin_lock_irqsave+0x34/0x50
[<ffffffff81507ffe>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
[<ffffffff81070b60>] ? kmsg_dump+0x50/0x100
[<ffffffff815091eb>] oops_end+0xeb/0x100
[<ffffffff81048d9f>] no_context+0xff/0x110
[<ffffffff81048f20>] __bad_area_nosemaphore+0x80/0x1d0
[<ffffffff81323cac>] ? vt_console_print+0xbc/0x360
[<ffffffff81049133>] bad_area_nosemaphore+0x13/0x20
[<ffffffff8150bd2e>] do_page_fault+0x25e/0x4b0
[<ffffffff81507ffe>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
[<ffffffff8106fc28>] ? console_unlock+0xd8/0x110
[<ffffffff81070290>] ? vprintk+0x1b0/0x3a0
[<ffffffff81508755>] page_fault+0x25/0x30
[<ffffffff8126044b>] ? memcpy+0xb/0x120
[<ffffffff81039a23>] ? __x2apic_send_IPI_mask+0x73/0x160
[<ffffffff81039b2e>] x2apic_send_IPI_all+0x1e/0x20
[<ffffffff81034d6c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0
[<ffffffff810dd2c7>] print_cpu_stall+0x47/0xa0
[<ffffffff810dd4cc>] check_cpu_stall+0x4c/0x80
[<ffffffff810dd530>] __rcu_pending+0x30/0x140
[<ffffffff810dd6ab>] rcu_pending+0x6b/0x90
[<ffffffff810dd855>] rcu_check_callbacks+0x85/0xa0
[<ffffffff8107e4b6>] update_process_times+0x46/0x90
[<ffffffff810a2dd6>] tick_sched_timer+0x66/0xd0
[<ffffffff810a2d70>] ? tick_clock_notify+0x60/0x60
[<ffffffff81095b33>] __run_hrtimer+0x83/0x1e0
[<ffffffff81095e46>] hrtimer_interrupt+0xe6/0x240
[<ffffffff8100a5b7>] xen_timer_interrupt+0x27/0x40
[<ffffffff810d6ddd>] handle_irq_event_percpu+0x5d/0x1b0
[<ffffffff810d9548>] handle_percpu_irq+0x48/0x70
[<ffffffff812f8c6f>] __xen_evtchn_do_upcall+0x16f/0x270
[<ffffffff812f965f>] xen_evtchn_do_upcall+0x2f/0x50
[<ffffffff815119ce>] xen_do_hypervisor_callback+0x1e/0x30
<EOI>
[<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[<ffffffff8100a0b0>] ? xen_safe_halt+0x10/0x20
[<ffffffff8101dfeb>] ? default_idle+0x5b/0x170
[<ffffffff81014ac6>] ? cpu_idle+0xc6/0xf0
[<ffffffff8100a8c9>] ? xen_irq_enable_direct_reloc+0x4/0x4
[<ffffffff814f780e>] ? cpu_bringup_and_idle+0xe/0x10
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
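
To check whether a saved log or console capture shows the same signature, a short script along the following lines may help. This is only a minimal sketch, not an Oracle-provided tool: the function names are hypothetical, and it is an assumption of the sketch that the three call-trace symbols it looks for are enough to identify this crash. The memcpy offset is copied from the trace above and may differ on other kernel builds. The sketch also decodes the "Oops:" value using the standard x86 page-fault error-code bits (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Patterns copied from the trace above. Treating them together as a
# "signature" is an assumption of this sketch.
NULL_DEREF = re.compile(r"BUG: unable to handle kernel NULL pointer dereference")
FAULT_IP = re.compile(r"IP:\s+\[<[0-9a-f]+>\]\s+memcpy\+0xb/0x120")
OOPS_CODE = re.compile(r"Oops:\s+([0-9a-f]{4})")

# Functions from the call trace: the RCU stall report path that sends the
# backtrace IPI to all CPUs.
STALL_PATH = (
    "print_cpu_stall",
    "arch_trigger_all_cpu_backtrace",
    "x2apic_send_IPI_all",
)


def matches_signature(log_text):
    """Return True if log_text contains the NULL dereference in memcpy
    together with the RCU stall backtrace path (hypothetical helper)."""
    if not NULL_DEREF.search(log_text):
        return False
    if not FAULT_IP.search(log_text):
        return False
    return all(name in log_text for name in STALL_PATH)


def decode_oops_code(log_text):
    """Decode the first 'Oops: xxxx' value using the x86 page-fault
    error-code bits. Returns None if no Oops line is found."""
    match = OOPS_CODE.search(log_text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return {
        "page_present": bool(code & 0x1),  # 0 means the page was not present
        "write_access": bool(code & 0x2),  # 1 means the fault was a write
        "user_mode": bool(code & 0x4),     # 0 means the fault hit in kernel mode
    }


# Example use (the path is the log named in this note):
#     with open("/var/log/messages", errors="replace") as handle:
#         text = handle.read()
#     print(matches_signature(text), decode_oops_code(text))
```

For the trace in this note, the value 0002 decodes to a kernel-mode write to a page that is not present, which agrees with RDI (the memcpy destination) being 0.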

Cause
