
Troubleshooting Infiniband links when a node reports a link down (Doc ID 1989373.1)

Last updated on MARCH 10, 2021

Applies to:

Exadata Database Machine X2-2 Hardware - Version All Versions to All Versions [Release All Releases]
Exalogic Elastic Cloud X4-2 Hardware - Version X4 to X5 [Release X4 to X5]
Exadata X4-2 Hardware - Version All Versions to All Versions [Release All Releases]
Exadata X3-2 Hardware - Version All Versions to All Versions [Release All Releases]
Exadata X4-2 Quarter Rack - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

On the host, ibstat reports port 1 of the IB HCA with State Down and Physical state Polling:

[root@eddb02 ~]# ibstat
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 2
        Firmware version: 2.7.8130
        Hardware version: b0
        Node GUID: 0xXXXXXXXXXXXXXX90
        System image GUID: 0xXXXXXXXXXXXXXX93
        Port 1:
                State: Down    <<< the link is down
                Physical state: Polling    <<< polling: the port has not detected a link partner
                Rate: 70
                Base lid: 142
                LMC: 0
                SM lid: 8
                Capability mask: 0x02510868
                Port GUID: 0xXXXXXXXXXXXXXX91
        Port 2:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 143
                LMC: 0
                SM lid: 8
                Capability mask: 0x02510868
                Port GUID: 0xXXXXXXXXXXXXXX92 
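
When many hosts have to be checked, the same reading of the ibstat output can be automated. The following is a minimal sketch, not an Oracle-supplied tool: it assumes the ibstat text has already been captured (for example, redirected to a file), and the function name find_down_ports and the sample text are hypothetical, trimmed from the output above. It flags any port whose State is not Active or whose Physical state is not LinkUp.

```python
import re

# Trimmed sample modelled on the ibstat output shown above (illustrative only).
SAMPLE_IBSTAT = """\
CA 'mlx4_0'
        Number of ports: 2
        Port 1:
                State: Down
                Physical state: Polling
        Port 2:
                State: Active
                Physical state: LinkUp
"""


def find_down_ports(ibstat_text):
    """Return (ca, port, state, physical_state) for every port that is
    not both Active and LinkUp. Hypothetical helper for illustration."""
    problems = []
    ca = port = state = None
    for raw in ibstat_text.splitlines():
        line = raw.strip()
        if m := re.match(r"CA '([^']+)'", line):
            ca, port, state = m.group(1), None, None
        elif m := re.match(r"Port (\d+):", line):
            port, state = int(m.group(1)), None
        elif m := re.match(r"State:\s*(\w+)", line):
            state = m.group(1)
        elif m := re.match(r"Physical state:\s*(\w+)", line):
            physical = m.group(1)
            # "Physical state" follows "State" within each port block.
            if port is not None and (state != "Active" or physical != "LinkUp"):
                problems.append((ca, port, state, physical))
    return problems


for ca, port, state, physical in find_down_ports(SAMPLE_IBSTAT):
    print(f"{ca} port {port}: State={state}, Physical state={physical}")
```

The regular expressions read only the first word after each label, so trailing annotations on a line do not affect the result.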

On the InfiniBand switch, listlinkup shows the following:
 
[root@edsw-ib3]# listlinkup
Connector  0A Present <-> Switch Port 20 is up (Enabled)
Connector  1A Present <-> Switch Port 22 is up (Enabled)
Connector  2A Present <-> Switch Port 24 is up (Enabled)
Connector  3A Present <-> Switch Port 26 is down (Enabled)
Connector  4A Present <-> Switch Port 28 is up (Enabled)
Connector  5A Present <-> Switch Port 30 is down (Enabled) <<< cable plugged in to switch port 30; physical connector labeled 5A
Connector  6A Present <-> Switch Port 35 is down (Enabled)
Connector  7A Present <-> Switch Port 33 is down (Enabled)
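
To pick out the suspect connectors from a longer listlinkup listing, the output can be filtered for entries that are Present and Enabled but down. This is a sketch under assumptions: the line layout is taken from the sample above, the helper name present_but_down is hypothetical, and a connector it reports is only a candidate, since the far end of the cable may simply be unplugged or powered off.

```python
import re

# Sample lines copied from the listlinkup output shown above.
SAMPLE_LISTLINKUP = """\
Connector  0A Present <-> Switch Port 20 is up (Enabled)
Connector  1A Present <-> Switch Port 22 is up (Enabled)
Connector  2A Present <-> Switch Port 24 is up (Enabled)
Connector  3A Present <-> Switch Port 26 is down (Enabled)
Connector  4A Present <-> Switch Port 28 is up (Enabled)
Connector  5A Present <-> Switch Port 30 is down (Enabled)
Connector  6A Present <-> Switch Port 35 is down (Enabled)
Connector  7A Present <-> Switch Port 33 is down (Enabled)
"""

# Assumed layout: connector label, presence text, switch port, link status, admin state.
LINK_RE = re.compile(
    r"Connector\s+(\w+)\s+(.*?)\s*<->\s*Switch Port\s+(\d+)\s+is\s+(up|down)\s+\((\w+)\)"
)


def present_but_down(listlinkup_text):
    """Return (connector, switch_port) pairs that are Present and Enabled
    but whose link is down. Hypothetical helper for illustration."""
    hits = []
    for line in listlinkup_text.splitlines():
        m = LINK_RE.search(line)
        if not m:
            continue  # skip prompts, blank lines, and anything unrecognised
        connector, presence, port, status, admin = m.groups()
        if presence == "Present" and status == "down" and admin == "Enabled":
            hits.append((connector, int(port)))
    return hits


for connector, port in present_but_down(SAMPLE_LISTLINKUP):
    print(f"Connector {connector} (switch port {port}) is present but down")
```

For the sample above this reports connectors 3A, 5A, 6A and 7A; 5A on switch port 30 is the one cabled to the host port under investigation.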

Changes

 

Cause




