
Infiniband Switch rebooted as a part of Patching can cause multiple node evictions in the Cluster (Doc ID 2703311.1)

Last updated on AUGUST 07, 2023

Applies to:

Exadata X5-2 Hardware - Version All Versions to All Versions [Release All Releases]
Exalogic Elastic Cloud X3-2 Hardware - Version X6 to X6 [Release X6]
Exadata X6-2 Hardware - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster M7 Hardware - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster M8 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

The clock on the first Infiniband switch jumped backwards or forwards in time when the switch was rebooted during the patching process.
If the clock jumps forward, the Infiniband subnet can split, with a short period of time during which there are two Subnet Masters on the fabric.
If the clock jumps backwards, the Subnet Manager on the switch that was patched remains in DISCOVER state until the clock catches up to the time it showed before the jump backwards:

[root@xtu16sw-iba01]# uptime;date;sminfo
17:11:18 up 23 min, 1 user, load average: 0.15, 0.17, 0.15
Jul 19 10:42:57 UTC 2020
sminfo: sm lid 0 sm guid 0x0, activity count 184 priority 5 state 1 SMINFO_DISCOVER

OR

[root@xtu16sw-iba01]#  getmaster
Local SM enabled and running, state DISCOVER
BOOT IN PROGRESS
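The DISCOVER state shown in the output above can also be detected from a script. The following is a minimal sketch, assuming the `sminfo` output line always ends with the state name as it does in the sample above; the helper names `sm_state_name` and `sm_is_discovering` are hypothetical and are not part of the switch software.

```shell
#!/bin/sh
# Hypothetical helper (not part of the switch software): print the SM state
# name from one line of `sminfo` output. Assumes the state name is the last
# whitespace-separated field, as in the sample output above.
sm_state_name() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

# Hypothetical helper: exit status 0 if the line reports DISCOVER state.
sm_is_discovering() {
    [ "$(sm_state_name "$1")" = "SMINFO_DISCOVER" ]
}

# Sample line copied from the sminfo output shown above.
line='sminfo: sm lid 0 sm guid 0x0, activity count 184 priority 5 state 1 SMINFO_DISCOVER'

if sm_is_discovering "$line"; then
    echo "Subnet Manager is still in DISCOVER state"
fi
```

On a live switch the sample line would be replaced by the captured output of `sminfo`, for example `line=$(sminfo)`.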

 

If the second switch is patched and rebooted before the first switch's clock catches up to the time it jumped backwards from (10:55:23 as seen in the example below), there can be NO Subnet Master on the fabric, or possibly a split fabric with two Subnet Masters for a short period of time, which causes the node evictions.
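To illustrate the catch-up window, the sketch below computes how many seconds remain before a switch clock reaches the time it jumped backwards from. It assumes that both times fall on the same day and are given as HH:MM:SS. Pairing the `date` reading from the sample output (10:42:57) with the pre-jump time of 10:55:23 is an assumption made only for illustration. The helper names `to_seconds` and `seconds_until_caught_up` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert HH:MM:SS to seconds since midnight.
# awk is used so that fields such as "08" are read as decimal, not octal.
to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Hypothetical helper: seconds remaining until the current switch time ($1)
# reaches the time the clock jumped backwards from ($2). Prints 0 if the
# clock has already caught up. Assumes both times are on the same day.
seconds_until_caught_up() {
    now=$(to_seconds "$1")
    target=$(to_seconds "$2")
    remaining=$((target - now))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo "$remaining"
}

# Assumed pairing: switch clock reads 10:42:57, pre-jump time was 10:55:23.
seconds_until_caught_up 10:42:57 10:55:23   # prints 746 (12 min 26 s)
```

Under these assumptions, rebooting the second switch within those 746 seconds would fall inside the window described above.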

 

Changes

 

Cause




