
Exalogic Compute Node IB Network Going Down Frequently With Mlx4_ib_event MLX4_DEV_EVENT_PORT_DOWN Error Messages (Doc ID 2204140.1)

Last updated on APRIL 23, 2018

Applies to:

Oracle Exalogic Elastic Cloud Software - Version 2.0.6.2.4 and later
Linux x86-64

Symptoms

On Exalogic Linux physical racks, the compute node InfiniBand (IB) network goes down frequently.

The Exalogic compute node reports a failed bond1 (IPoIB) interface, and several mlx4_ib_event MLX4_DEV_EVENT_PORT_DOWN messages are seen in /var/log/messages.

The following error messages appear in the /var/log/messages system log on the compute node:

Aug 21 06:35:14 el01cn06 kernel: ib0: ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=61 vend_err 81)
Aug 21 06:35:14 el01cn06 kernel: ib0: ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=49 vend_err 81)
Aug 21 08:20:05 el01cn06 kernel: mlx4_ib_event MLX4_DEV_EVENT_PORT_DOWN (port:1)
Aug 21 08:20:05 el01cn06 kernel: mlx4_ib_event MLX4_DEV_EVENT_PORT_DOWN (port:2)
Aug 21 08:20:05 el01cn06 kernel: bonding: bond0: link status down for active interface ib0, disabling it in 5000 ms.
Aug 21 08:20:10 el01cn06 kernel: bonding: bond0: link status definitely down for interface ib0, disabling it
Aug 21 08:20:23 el01cn06 kernel: bonding: bond0: link status up for interface ib0, enabling it in 0 ms.
Aug 21 08:20:23 el01cn06 kernel: bonding: bond0: link status definitely up for interface ib0, 4294967295 Mbps full duplex.
Aug 21 08:20:23 el01cn06 kernel: bonding: bond0: making interface ib0 the new active one.
Aug 21 09:02:07 el01cn06 kernel: ib0: ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=122 vend_err 81)
Aug 21 11:05:40 el01cn06 kernel: ib0: ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=46 vend_err 81)
Aug 21 11:49:10 el01cn06 kernel: mlx4_ib_event MLX4_DEV_EVENT_PORT_DOWN (port:1)
Aug 21 11:49:10 el01cn06 kernel: bonding: bond0: link status down for active interface ib0, disabling it in 5000 ms.
Aug 21 11:49:16 el01cn06 kernel: bonding: bond0: link status definitely down for interface ib0, disabling it
Aug 21 11:49:31 el01cn06 kernel: bonding: bond0: link status up for interface ib0, enabling it in 5000 ms.
Aug 21 11:49:36 el01cn06 kernel: bonding: bond0: link status definitely up for interface ib0, 4294967295 Mbps full duplex.
Aug 21 12:14:21 el01cn06 kernel: mlx4_ib_event MLX4_DEV_EVENT_PORT_DOWN (port:2)
Aug 21 12:14:26 el01cn06 kernel: bonding: bond0: making interface ib0 the new active one.
Aug 21 12:48:29 el01cn06 kernel: bonding: bond0: Removing slave ib0.
Aug 21 12:48:29 el01cn06 kernel: bonding: bond0: releasing active interface ib0
Aug 21 12:48:29 el01cn06 kernel: ib0: mtu > 2044 will cause multicast packet drops.
Aug 21 12:48:31 el01cn06 kernel: ib0: enabling connected mode will cause multicast packet drops
Aug 21 12:48:31 el01cn06 kernel: ib0: mtu > 2044 will cause multicast packet drops.
Aug 21 12:48:31 el01cn06 kernel: bonding: bond0: Adding slave ib0.
Aug 21 12:48:31 el01cn06 kernel: bonding: bond0: Warning: enslaved VLAN challenged slave ib0. Adding VLANs will be blocked as long as ib0 is part of bond bond0
Aug 21 12:48:31 el01cn06 kernel: ib0: mtu > 2044 will cause multicast packet drops.
Aug 21 12:48:31 el01cn06 kernel: bonding: bond0: enslaving ib0 as a backup interface with a down link.
Aug 21 12:48:31 el01cn06 kernel: bonding: bond0: link status up for interface ib0, enabling it in 0 ms.
Aug 21 12:48:31 el01cn06 kernel: bonding: bond0: link status definitely up for interface ib0, 4294967295 Mbps full duplex.
Aug 21 12:48:31 el01cn06 kernel: bonding: bond0: making interface ib0 the new active one.
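
For illustration only (this sketch is not part of the original note), the following Python snippet counts the MLX4_DEV_EVENT_PORT_DOWN, bonding link-down, and IPoIB CM send-failure messages in /var/log/messages, which can help confirm how often the IB ports are flapping on the affected compute node. The log path and message strings are taken from the excerpt above; the script itself, its name, and its patterns are assumptions, not an Oracle-supplied tool.

#!/usr/bin/env python3
# Illustrative sketch: count IB port-down and bonding link-flap events
# in /var/log/messages. Message strings match the excerpt above; the
# script is an assumption, not part of the Exalogic software.
import re
from collections import Counter

LOG_FILE = "/var/log/messages"   # reading this file typically requires root

PATTERNS = {
    "port_down":  re.compile(r"MLX4_DEV_EVENT_PORT_DOWN \(port:(\d)\)"),
    "bond_down":  re.compile(r"bonding: \S+: link status definitely down for interface \S+"),
    "cm_tx_fail": re.compile(r"ipoib_cm_handle_tx_wc: failed cm send event"),
}

counts = Counter()
with open(LOG_FILE, errors="replace") as f:
    for line in f:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1

for name, total in counts.items():
    print(f"{name}: {total} occurrence(s)")

Running the sketch on the affected compute node prints a count per pattern; repeated port_down and bond_down hits within a short time window correspond to the link flapping shown in the log excerpt above.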

Cause




My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.