Exalogic: Unexpected reboot of Infiniband Switches due to a BUG present in firmware versions 2.2.14 and older
(Doc ID 2720462.1)
Last updated on OCTOBER 19, 2020
Applies to: Oracle Exalogic Elastic Cloud Software - Version 22.214.171.124.0 to 126.96.36.199.191015
Oracle Exalogic Elastic Cloud Software - Version 188.8.131.52.0 to 184.108.40.206.200114
Information in this document applies to any platform.
A brief loss of network connectivity occurs in the Exalogic environment. In an Exalogic virtual environment, the loss of connectivity is visible on all guest vServers running in the environment. In an Exalogic physical environment, it is visible on all compute nodes in the environment.
The loss of network connectivity is caused by an unexpected reboot of both Infiniband switches at the time the network loss occurs. The reboot correlates with management Ethernet network wiring or topology changes.
On the guest vServers (Exalogic virtual environment) or the compute nodes (Exalogic physical environment), the following log entries are found for the period when network connectivity was lost:
In /var/log/messages, the MLX ports are brought down abruptly:
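The log excerpt itself is not reproduced in this abstract, so the exact message text is unknown here. As a rough aid, the sketch below filters a log for lines that mention both a driver keyword and a link-state keyword. The keywords `mlx` and `down`, and the function names, are assumptions for illustration only; adjust them to match what the affected hosts actually log.

```python
import re


def find_link_down_lines(lines, driver_keyword="mlx", state_keyword="down"):
    """Return lines that mention both the driver keyword and the state keyword.

    Both keywords are assumptions: the source note does not show the real
    log message format, so this is a loose, case-insensitive filter.
    """
    driver = re.compile(re.escape(driver_keyword), re.IGNORECASE)
    # Word boundaries keep "down" from matching inside words like "shutdown".
    state = re.compile(r"\b" + re.escape(state_keyword) + r"\b", re.IGNORECASE)
    return [
        line.rstrip("\n")
        for line in lines
        if driver.search(line) and state.search(line)
    ]


def scan_file(path="/var/log/messages"):
    """Read a log file and return the matching lines (needs read permission)."""
    with open(path, errors="replace") as handle:
        return find_link_down_lines(handle)
```

Calling `scan_file()` on a guest vServer or compute node returns the candidate lines, whose timestamps can then be compared with the time of the outage.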
The reboot of the Infiniband switches correlates with management Ethernet network problems, wiring, or topology changes.
Management Ethernet network problems, wiring, or topology changes in the Infiniband GW switches.
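The title places the bug in switch firmware versions 2.2.14 and older. The sketch below checks whether a given version string falls in that range. It assumes the version is a dotted numeric string and that any trailing build suffix can be ignored; how the version string is obtained from the switch is outside the sketch, and the function names are hypothetical.

```python
import re

# Upper bound of the affected range, taken from the document title.
AFFECTED_MAX = (2, 2, 14)


def parse_version(text):
    """Return the leading dotted numeric part of a version string as ints.

    Assumption: anything after the dotted numbers (for example a build
    suffix) is not significant for this comparison and is ignored.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected(version_text, affected_max=AFFECTED_MAX):
    """Return True if the version is at or below the affected maximum."""
    parts = parse_version(version_text)
    # Pad the shorter tuple with zeros so "2.2" compares as 2.2.0.
    width = max(len(parts), len(affected_max))
    padded = parts + (0,) * (width - len(parts))
    limit = affected_max + (0,) * (width - len(affected_max))
    return padded <= limit
```

Comparing integer tuples rather than raw strings avoids the lexical trap where "2.2.9" would sort after "2.2.14".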
In this Document
Recovering from loss of network connectivity
Infiniband gateway switch reboots