Oracle VM: An OVS Server Node Has Been Rebooted During Storage Change (Doc ID 2623218.1)

Last updated on JULY 14, 2022

Applies to:

Oracle VM - Version 3.4.5 to 3.4.6 [Release OVM34]
Linux x86-64

Symptoms

One OVS server node (localhost1) rebooted. No other host in the server pool rebooted.

1. The following kernel messages appeared on the host that rebooted.

[root@localhost1 ~]# grep -i tainted /var/log/messages*
/var/log/messages:May 19 10:46:15 localhost1 kernel: [8495814.874063] CPU: 0 PID: 846 Comm: qla2xxx_3_dpc Not tainted 4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 11:32:15 localhost1 kernel: [8498575.347522] CPU: 6 PID: 846 Comm: qla2xxx_3_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 11:58:29 localhost1 kernel: [8500149.460452] CPU: 8 PID: 837 Comm: qla2xxx_2_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 13:13:33 localhost1 kernel: [8504653.380840] CPU: 1 PID: 837 Comm: qla2xxx_2_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 13:33:10 localhost1 kernel: [8505829.713672] CPU: 1 PID: 837 Comm: qla2xxx_2_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 13:34:20 localhost1 kernel: [8505899.855920] CPU: 3 PID: 846 Comm: qla2xxx_3_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 13:45:56 localhost1 kernel: [8506596.232673] CPU: 4 PID: 846 Comm: qla2xxx_3_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 14:35:10 localhost1 kernel: [8509550.077807] CPU: 4 PID: 837 Comm: qla2xxx_2_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 15:34:54 localhost1 kernel: [8513134.199521] CPU: 4 PID: 837 Comm: qla2xxx_2_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 15:52:50 localhost1 kernel: [8514209.332958] CPU: 1 PID: 837 Comm: qla2xxx_2_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 16:52:57 localhost1 kernel: [8517816.474570] CPU: 1 PID: 837 Comm: qla2xxx_2_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
/var/log/messages:May 19 17:25:58 localhost1 kernel: [8519797.399635] CPU: 1 PID: 837 Comm: qla2xxx_2_dpc Tainted: G        W    4.1.12-124.23.2.el6uek.bug29158186.x86_64 #2
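To see which kernel threads are raising these warnings, the matching lines can be tallied by their Comm field. The following is a minimal sketch, not part of the original note: count_taint_by_comm is a hypothetical helper name, and it assumes syslog-format kernel lines like those shown above arrive on standard input.

```shell
#!/bin/sh
# Hypothetical helper (not from the original note): tally kernel
# "Tainted" / "Not tainted" report lines per kernel thread (Comm field).
# Reads syslog-format text on stdin, prints "<thread> <count>" sorted by name.
count_taint_by_comm() {
    awk '
        /Comm: / && /[Tt]ainted/ {
            # The thread name is the field that follows the literal "Comm:".
            for (i = 1; i < NF; i++) {
                if ($i == "Comm:") {
                    count[$(i + 1)]++
                }
            }
        }
        END {
            for (name in count) {
                printf "%s %d\n", name, count[name]
            }
        }
    ' | sort
}
```

A possible invocation on the affected host is: cat /var/log/messages* | count_taint_by_comm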


2. The system encountered a firmware error during fabric login that required an HBA chip reset.

May 21 12:23:56 localhost1 kernel: [8681736.298633] Call Trace:
May 21 12:23:56 localhost1 kernel: [8681736.298643]  [<ffffffff816e9350>] dump_stack+0x63/0x81
May 21 12:23:56 localhost1 kernel: [8681736.298649]  [<ffffffff8108601a>] warn_slowpath_common+0x8a/0xc0
May 21 12:23:56 localhost1 kernel: [8681736.298652]  [<ffffffff81086096>] warn_slowpath_fmt+0x46/0x50
May 21 12:23:56 localhost1 kernel: [8681736.298654]  [<ffffffff8134506b>] __list_add+0xcb/0xd0
May 21 12:23:56 localhost1 kernel: [8681736.298668]  [<ffffffffa0374d66>] qla24xx_async_gnl+0xc6/0x420 [qla2xxx]
May 21 12:23:56 localhost1 kernel: [8681736.298672]  [<ffffffff810cbe28>] ? __wake_up+0x48/0x60
May 21 12:23:56 localhost1 kernel: [8681736.298677]  [<ffffffffa036cc5c>] qla2x00_do_work+0xfc/0x760 [qla2xxx]
May 21 12:23:56 localhost1 kernel: [8681736.298680]  [<ffffffff816ebfda>] ? __schedule+0x24a/0x810
May 21 12:23:56 localhost1 kernel: [8681736.298682]  [<ffffffff816ebfce>] ? __schedule+0x23e/0x810
May 21 12:23:56 localhost1 kernel: [8681736.298684]  [<ffffffff816ebfda>] ? __schedule+0x24a/0x810
May 21 12:23:56 localhost1 kernel: [8681736.298689]  [<ffffffff810afae1>] ? finish_task_switch+0x81/0x1d0
May 21 12:23:56 localhost1 kernel: [8681736.298691]  [<ffffffff816ebfda>] ? __schedule+0x24a/0x810
May 21 12:23:56 localhost1 kernel: [8681736.298693]  [<ffffffff816ec00f>] ? __schedule+0x27f/0x810
May 21 12:23:56 localhost1 kernel: [8681736.298699]  [<ffffffffa036d67d>] qla2x00_do_dpc+0x14d/0x930 [qla2xxx]
May 21 12:23:56 localhost1 kernel: [8681736.298705]  [<ffffffffa036d530>] ? qla2x00_relogin+0x230/0x230 [qla2xxx]
May 21 12:23:56 localhost1 kernel: [8681736.298708]  [<ffffffff810a68fb>] kthread+0xcb/0xf0
May 21 12:23:56 localhost1 kernel: [8681736.298710]  [<ffffffff816ebfda>] ? __schedule+0x24a/0x810
May 21 12:23:56 localhost1 kernel: [8681736.298712]  [<ffffffff816ebfda>] ? __schedule+0x24a/0x810
May 21 12:23:56 localhost1 kernel: [8681736.298714]  [<ffffffff810a6830>] ? kthread_create_on_node+0x180/0x180
May 21 12:23:56 localhost1 kernel: [8681736.298717]  [<ffffffff816f2101>] ret_from_fork+0x61/0x90
May 21 13:39:19 localhost1 kernel: [8686258.335469] qla2xxx [0000:95:00.0]-5060:3: 83XX: F/W Error Reported: Check if reset required.
May 21 13:39:19 localhost1 kernel: [8686258.335749] qla2xxx [0000:95:00.0]-5069:3: Heartbeat Failure encountered, chip reset required.
May 21 13:39:19 localhost1 kernel: [8686258.336356] qla2xxx [0000:95:00.0]-506f:3: Failed to capture mctp dump
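The firmware and heartbeat errors above can be pulled out of a larger log and reduced to the PCI address and driver message ID. This is a minimal sketch under stated assumptions: qla_fw_errors is a hypothetical helper name, the input is syslog text on standard input, and the message layout matches the qla2xxx lines shown above ("qla2xxx [<pci-address>]-<id>:<host>: <text>").

```shell
#!/bin/sh
# Hypothetical helper (not from the original note): list qla2xxx firmware
# and heartbeat error lines as "<pci-address> <message-id>".
# Reads syslog-format text on stdin.
qla_fw_errors() {
    # Keep only lines that mention a firmware error, heartbeat failure,
    # or chip reset, then extract the bracketed PCI address and the ID.
    grep -E 'F/W Error|Heartbeat Failure|chip reset' |
        sed -n 's/.*qla2xxx \[\([0-9a-fA-F:.]*\)\]-\([0-9a-fA-F]*\):[0-9]*: .*/\1 \2/p'
}
```

A possible invocation is: cat /var/log/messages* | qla_fw_errors. Lines such as "Failed to capture mctp dump" are deliberately not matched by this filter; widen the grep pattern if they are of interest.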

 

3. Other Details:

[root@localhost1 ~]# systool -c fc_host -v | grep -E '_stat|port_name|sym'
    port_name           = "0x2100000e1ed1ed74"
    port_state          = "Online"
    symbolic_name       = "QLE2662 FW:v8.08.05 DVR:v9.00.00.00.41.0-k1"
    port_name           = "0x2100000e1ed1ed75"
    port_state          = "Online"
    symbolic_name       = "QLE2662 FW:v8.08.05 DVR:v9.00.00.00.41.0-k1"
    port_name           = "0x2100000e1ee8a2b0"
    port_state          = "Online"
    symbolic_name       = "QLE2662 FW:v8.08.05 DVR:v9.00.00.00.41.0-k1"
    port_name           = "0x2100000e1ee8a2b1"
    port_state          = "Online"
    symbolic_name       = "QLE2662 FW:v8.08.05 DVR:v9.00.00.00.41.0-k1"
[root@localhost1 ~]#
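On hosts with many HBA ports, it can help to print only the ports that are not Online. The following is a minimal sketch, not part of the original note: fc_ports_not_online is a hypothetical helper name, and it assumes systool output in the layout shown above, with each port_name line preceding its port_state line.

```shell
#!/bin/sh
# Hypothetical helper (not from the original note): print
# "<port_name> <port_state>" for every FC host port whose state is not "Online".
# Reads `systool -c fc_host -v` output on stdin.
fc_ports_not_online() {
    # Split on double quotes so that $2 is the quoted attribute value.
    awk -F'"' '
        $1 ~ /^[[:space:]]*port_name[[:space:]]*=/  { name = $2 }
        $1 ~ /^[[:space:]]*port_state[[:space:]]*=/ {
            if ($2 != "Online") print name, $2
        }
    '
}
```

A possible invocation is: systool -c fc_host -v | fc_ports_not_online. For the output shown above it prints nothing, because all four ports report Online.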

[root@localhost1 ~]# lspci -vvvv -s 51:00.0  |grep -i -E 'Product|Part'
Capabilities: [88] Vital Product Data
Product Name: QLogic 16Gb FC Dual-port HBA for System XXXXXX
[PN] Part number: 00Y3343
[root@localhost1 ~]#

Changes

Storage Change

Cause

To view full details, sign in with your My Oracle Support account.


