Virtual Machine Constantly Freezes with Various Call Stacks (Doc ID 2193060.1)

Last updated on AUGUST 04, 2018

Applies to:

Oracle VM - Version 3.2.9 and later
Linux x86-64

Symptoms

A guest VM freezes or runs sluggishly after live migration from one Oracle VM server to another.


When you check the VM's syslog, or the log output from a vmcore generated from the hung VM, you see many soft lockup messages with varying call stacks.

Here is an example:

7738 BUG: soft lockup - CPU#18 stuck for 22s! [java:4370]
7739 Modules linked in: des_generic ecb md4 nls_iso8859_1 cifs ovmapi ib_sdp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 mlx4_vnic mlx4_vnic_helper mlx4_ib ib_sa ib_mad ib_core mlx4_xen_fmr_slave mlx4_core ext3 jbd ppdev parport_pc parport microcode xen_netfront pcspkr i2c_piix4 i2c_core ext4 mbcache jbd2 xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
7740 CPU 18
7741 Modules linked in: des_generic ecb md4 nls_iso8859_1 cifs ovmapi ib_sdp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 mlx4_vnic mlx4_vnic_helper mlx4_ib ib_sa ib_mad ib_core mlx4_xen_fmr_slave mlx4_core ext3 jbd ppdev parport_pc parport microcode xen_netfront pcspkr i2c_piix4 i2c_core ext4 mbcache jbd2 xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
7742
7743 Pid: 4370, comm: java Not tainted 2.6.39-400.248.3.el6uek.x86_64 #1 Xen HVM domU
7744 RIP: 0010:[<ffffffff8115c38c>] [<ffffffff8115c38c>] kmem_cache_alloc_node_trace+0xc/0x1d0
7745 RSP: 0018:ffff88400e843c40 EFLAGS: 00000282
7746 RAX: ffff88400e0b02c0 RBX: ffff88400e843be0 RCX: 00000000ffffffff
7747 RDX: 0000000000000020 RSI: ffff88400e0b02c0 RDI: 00000000000001ac
7748 RBP: ffff88400e843ca0 R08: ffff883e90823040 R09: ffff883e8f8ceef0
7749 R10: 0000000000001400 R11: 0000000000000010 R12: ffffffff81516823
7750 R13: ffff88400e843bb8 R14: ffff883e904b5b80 R15: ffff88400e843ba8
7751 FS: 00007f93104b5700(0000) GS:ffff88400e840000(0000) knlGS:0000000000000000
7752 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7753 CR2: 00000000d7b69004 CR3: 0000003e9011e000 CR4: 00000000001006e0
7754 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
7755 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
7756 Process java (pid: 4370, threadinfo ffff883e91e80000, task ffff883e9099a140)
7757 Stack:
7758 ffff88400e843cb0 0000000000000246 0000000000000000 000000000000009a
7759 0000000000000001 0000000000000000 0000000000000246 ffff883e904b5b80
7760 ffffffff810d70ca ffff883e8ea38000 0000000000000020 00000000ffffffff
7761 Call Trace:
7762 <IRQ>
7763 [<ffffffff810d70ca>] ? handle_irq_event_percpu+0xba/0x1f0
7764 [<ffffffff8115c59d>] __kmalloc_node_track_caller+0x4d/0x60
7765 [<ffffffff814327a4>] __alloc_skb+0x74/0x170
7766 [<ffffffff814a37b0>] arp_create+0x70/0x220
7767 [<ffffffff814a4406>] arp_send+0x36/0x50
7768 [<ffffffff814a45dc>] arp_solicit+0x1bc/0x210
7769 [<ffffffff8144c335>] neigh_timer_handler+0xe5/0x390
7770 [<ffffffff8108a060>] ? __queue_work+0x3f0/0x3f0
7771 [<ffffffff8144c250>] ? neigh_delete+0x1a0/0x1a0
7772 [<ffffffff8107e359>] call_timer_fn+0x49/0x130
7773 [<ffffffff8144c250>] ? neigh_delete+0x1a0/0x1a0
7774 [<ffffffff8107e769>] run_timer_softirq+0x149/0x280
7775 [<ffffffff810d70ca>] ? handle_irq_event_percpu+0xba/0x1f0
7776 [<ffffffff81075147>] __do_softirq+0xb7/0x210
7777 [<ffffffff815166fc>] call_softirq+0x1c/0x30
7778 [<ffffffff81017425>] do_softirq+0x65/0xa0
7779 [<ffffffff81074f4d>] irq_exit+0xbd/0xe0
7780 [<ffffffff812fa7d5>] xen_evtchn_do_upcall+0x35/0x50
7781 [<ffffffff81516823>] xen_hvm_callback_vector+0x13/0x20
7782 <EOI>
7783 [<ffffffff8115a9cc>] ? ksize+0x3c/0x80
7784 [<ffffffff8115a9a9>] ? ksize+0x19/0x80
7785 [<ffffffff814327b8>] __alloc_skb+0x88/0x170
7786 [<ffffffff814951cf>] tcp_send_fin+0xaf/0x1d0
7787 [<ffffffff8148467a>] tcp_close+0x10a/0x430
7788 [<ffffffff814aa8be>] inet_release+0x5e/0x80
7789 [<ffffffffa0399c3f>] inet6_release+0x3f/0x50 [ipv6]
7790 [<ffffffff814296b9>] sock_release+0x29/0x90
7791 [<ffffffff81429737>] sock_close+0x17/0x30
7792 [<ffffffff8116f0ce>] __fput+0xbe/0x240
7793 [<ffffffff8116f275>] fput+0x25/0x30
7794 [<ffffffff8116b1b3>] filp_close+0x63/0x90
7795 [<ffffffff8117e693>] sys_dup3+0x133/0x190
7796 [<ffffffff8117e704>] sys_dup2+0x14/0x50
7797 [<ffffffff815154c2>] system_call_fastpath+0x16/0x1b
7798 Code: 8b 93 0c 80 00 00 31 f6 4c 89 ef e8 6f 59 10 00 e9 fe fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 60 48 89 5d d8 <4c> 89 65 e0 4c 89 6d e8 4c 89 75 f0 4c 89 7d f8 0f 1f 44 00 00
7799 Call Trace:
7800 <IRQ> [<ffffffff810d70ca>] ? handle_irq_event_percpu+0xba/0x1f0
7801 [<ffffffff8115c59d>] __kmalloc_node_track_caller+0x4d/0x60
7802 [<ffffffff814327a4>] __alloc_skb+0x74/0x170
7803 [<ffffffff814a37b0>] arp_create+0x70/0x220
7804 [<ffffffff814a4406>] arp_send+0x36/0x50
7805 [<ffffffff814a45dc>] arp_solicit+0x1bc/0x210
7806 [<ffffffff8144c335>] neigh_timer_handler+0xe5/0x390
7807 [<ffffffff8108a060>] ? __queue_work+0x3f0/0x3f0
7808 [<ffffffff8144c250>] ? neigh_delete+0x1a0/0x1a0
7809 [<ffffffff8107e359>] call_timer_fn+0x49/0x130
7810 [<ffffffff8144c250>] ? neigh_delete+0x1a0/0x1a0
7811 [<ffffffff8107e769>] run_timer_softirq+0x149/0x280
7812 [<ffffffff810d70ca>] ? handle_irq_event_percpu+0xba/0x1f0
7813 [<ffffffff81075147>] __do_softirq+0xb7/0x210
7814 [<ffffffff815166fc>] call_softirq+0x1c/0x30
7815 [<ffffffff81017425>] do_softirq+0x65/0xa0
7816 [<ffffffff81074f4d>] irq_exit+0xbd/0xe0
7817 [<ffffffff812fa7d5>] xen_evtchn_do_upcall+0x35/0x50
7818 [<ffffffff81516823>] xen_hvm_callback_vector+0x13/0x20
7819 <EOI> [<ffffffff8115a9cc>] ? ksize+0x3c/0x80
7820 [<ffffffff8115a9a9>] ? ksize+0x19/0x80
7821 [<ffffffff814327b8>] __alloc_skb+0x88/0x170
7822 [<ffffffff814951cf>] tcp_send_fin+0xaf/0x1d0
7823 [<ffffffff8148467a>] tcp_close+0x10a/0x430
7824 [<ffffffff814aa8be>] inet_release+0x5e/0x80
7825 [<ffffffffa0399c3f>] inet6_release+0x3f/0x50 [ipv6]
7826 [<ffffffff814296b9>] sock_release+0x29/0x90
7827 [<ffffffff81429737>] sock_close+0x17/0x30
7828 [<ffffffff8116f0ce>] __fput+0xbe/0x240
7829 [<ffffffff8116f275>] fput+0x25/0x30
7830 [<ffffffff8116b1b3>] filp_close+0x63/0x90
7831 [<ffffffff8117e693>] sys_dup3+0x133/0x190
7832 [<ffffffff8117e704>] sys_dup2+0x14/0x50
7833 [<ffffffff815154c2>] system_call_fastpath+0x16/0x1b

7834 BUG: soft lockup - CPU#1 stuck for 23s! [irqbalance:1763]
7835 Modules linked in: des_generic ecb md4 nls_iso8859_1 cifs ovmapi ib_sdp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 mlx4_vnic mlx4_vnic_helper mlx4_ib ib_sa ib_mad ib_core mlx4_xen_fmr_slave mlx4_core ext3 jbd ppdev parport_pc parport microcode xen_netfront pcspkr i2c_piix4 i2c_core ext4 mbcache jbd2 xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
7836 CPU 1
7837 Modules linked in: des_generic ecb md4 nls_iso8859_1 cifs ovmapi ib_sdp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 mlx4_vnic mlx4_vnic_helper mlx4_ib ib_sa ib_mad ib_core mlx4_xen_fmr_slave mlx4_core ext3 jbd ppdev parport_pc parport microcode xen_netfront pcspkr i2c_piix4 i2c_core ext4 mbcache jbd2 xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
7838
7839 Pid: 1763, comm: irqbalance Not tainted 2.6.39-400.248.3.el6uek.x86_64 #1 Xen HVM domU
7840 RIP: 0010:[<ffffffff8150d099>] [<ffffffff8150d099>] _raw_spin_unlock_irqrestore+0x19/0x30
7841 RSP: 0018:ffff88400e623e58 EFLAGS: 00000297
7842 RAX: 0000000000004e4c RBX: ffff88400e623df0 RCX: ffffffff817c2b88
7843 RDX: 0000000000004e4c RSI: 0000000000000297 RDI: 0000000000000297
7844 RBP: ffff88400e623e60 R08: 0000000000008000 R09: 0000000000000001
7845 R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff81516823
7846 R13: ffff88400e623dc8 R14: 0000000000000297 R15: ffffffff8106529a
7847 FS: 00007f9c221fc720(0000) GS:ffff88400e620000(0000) knlGS:0000000000000000
7848 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7849 CR2: 00007f9c22217000 CR3: 0000003e9139a000 CR4: 00000000001006e0
7850 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
7851 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
7852 Process irqbalance (pid: 1763, threadinfo ffff883e914c2000, task ffff883e933d6400)
7853 Stack:
7854 ffffffff817b2600 ffff88400e623e90 ffffffff810dca55 ffff88400e62d720
7855 ffffffff817b2600 ffffffff817860c8 0000000000000048 ffff88400e623ec0
7856 ffffffff810dd23d 0000000000000001 0000000000000009 ffffffff817860c8
7857 Call Trace:
7858 <IRQ>
7859 [<ffffffff810dca55>] force_quiescent_state+0x145/0x1e0
7860 [<ffffffff810dd23d>] __rcu_process_callbacks+0x13d/0x1e0
7861 [<ffffffff810dd305>] rcu_process_callbacks+0x25/0x50
7862 [<ffffffff81075147>] __do_softirq+0xb7/0x210
7863 [<ffffffff815166fc>] call_softirq+0x1c/0x30
7864 [<ffffffff81017425>] do_softirq+0x65/0xa0
7865 [<ffffffff81074f4d>] irq_exit+0xbd/0xe0
7866 [<ffffffff812fa7d5>] xen_evtchn_do_upcall+0x35/0x50
7867 [<ffffffff81516823>] xen_hvm_callback_vector+0x13/0x20
7868 <EOI>
7869 [<ffffffff81260b06>] ? copy_user_enhanced_fast_string+0x6/0x10
7870 [<ffffffff8118ef5e>] ? seq_read+0x2ee/0x420
7871 [<ffffffff8118ec70>] ? single_open+0xa0/0xa0
7872 [<ffffffff811ccef1>] proc_reg_read+0x81/0xc0
7873 [<ffffffff8116d915>] vfs_read+0xc5/0x190
7874 [<ffffffff8116dae1>] sys_read+0x51/0x90
7875 [<ffffffff810d00bb>] ? audit_syscall_exit+0x25b/0x290
7876 [<ffffffff815154c2>] system_call_fastpath+0x16/0x1b
7877 Code: 00 00 e8 5b 3f b3 ff 66 90 c9 c3 0f 1f 80 00 00 00 00 55 48 89 e5 53 0f 1f 44 00 00 48 89 f3 e8 8e 3f b3 ff 66 90 48 89 df 57 9d <0f> 1f 44 00 00 5b c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00
7878 Call Trace:
7879 <IRQ> [<ffffffff810dca55>] force_quiescent_state+0x145/0x1e0
7880 [<ffffffff810dd23d>] __rcu_process_callbacks+0x13d/0x1e0
7881 [<ffffffff810dd305>] rcu_process_callbacks+0x25/0x50
7882 [<ffffffff81075147>] __do_softirq+0xb7/0x210
7883 [<ffffffff815166fc>] call_softirq+0x1c/0x30
7884 [<ffffffff81017425>] do_softirq+0x65/0xa0
7885 [<ffffffff81074f4d>] irq_exit+0xbd/0xe0
7886 [<ffffffff812fa7d5>] xen_evtchn_do_upcall+0x35/0x50
7887 [<ffffffff81516823>] xen_hvm_callback_vector+0x13/0x20
7888 <EOI> [<ffffffff81260b06>] ? copy_user_enhanced_fast_string+0x6/0x10
7889 [<ffffffff8118ef5e>] ? seq_read+0x2ee/0x420
7890 [<ffffffff8118ec70>] ? single_open+0xa0/0xa0
7891 [<ffffffff811ccef1>] proc_reg_read+0x81/0xc0
7892 [<ffffffff8116d915>] vfs_read+0xc5/0x190
7893 [<ffffffff8116dae1>] sys_read+0x51/0x90
7894 [<ffffffff810d00bb>] ? audit_syscall_exit+0x25b/0x290
7895 [<ffffffff815154c2>] system_call_fastpath+0x16/0x1b
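
When a log contains many such reports, it can help to tally them before reading individual traces. The following is a minimal sketch, not an Oracle-provided tool: it assumes the message format shown in the example above (a "BUG: soft lockup" header followed by a "RIP:" line), and the function names find_soft_lockups and summarize are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Header format assumed from the example above, e.g.
#   BUG: soft lockup - CPU#18 stuck for 22s! [java:4370]
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)

# RIP line format assumed from the example above, e.g.
#   RIP: 0010:[<ffffffff8115c38c>] [<ffffffff8115c38c>] kmem_cache_alloc_node_trace+0xc/0x1d0
RIP = re.compile(r"RIP: .*\]\s+(?P<func>[\w.]+)\+0x")


def find_soft_lockups(lines):
    """Return one dict per soft lockup header found in an iterable of log lines.

    The first RIP line seen after a header is attributed to that lockup;
    'rip' stays None if no RIP line follows before the next header.
    """
    events = []
    for line in lines:
        header = SOFT_LOCKUP.search(line)
        if header:
            events.append({
                "cpu": int(header.group("cpu")),
                "secs": int(header.group("secs")),
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "rip": None,
            })
            continue
        rip = RIP.search(line)
        if rip and events and events[-1]["rip"] is None:
            events[-1]["rip"] = rip.group("func")
    return events


def summarize(events):
    """Count lockups per (process name, function at RIP) pair."""
    return Counter((e["comm"], e["rip"]) for e in events)
```

To use it, pass an open file object for the guest's syslog (its location depends on the guest's configuration) to find_soft_lockups and print the result of summarize. Many distinct (process, function) pairs, rather than one repeated pair, would match the pattern of varying call stacks described in this note.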

 

Changes

The guest VM was live-migrated from one Oracle VM Server (OVS) to another.

Cause

