Troubleshooting InfiniBand links when a node reports a link down (Doc ID 1989373.1)

Last updated on OCTOBER 26, 2016

Applies to:

Exadata Database Machine X2-2 Hardware - Version All Versions to All Versions [Release All Releases]
Exalogic Elastic Cloud X4-2 Hardware - Version X4 to X5 [Release X4 to X5]
Exadata X4-2 Hardware - Version All Versions to All Versions [Release All Releases]
Exadata X3-2 Hardware - Version All Versions to All Versions [Release All Releases]
Exadata X4-2 Quarter Rack - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

On the host, ibstat shows port 1 of the InfiniBand HCA with State Down and Physical state Polling:

[root@edx28bur09db02 ~]# ibstat
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 2
        Firmware version: 2.7.8130
        Hardware version: b0
        Node GUID: 0x0021280001a10790
        System image GUID: 0x0021280001a10793
        Port 1:
                State: Down   <<< The link is down
                Physical state: Polling   <<< Polling state: the port is not detecting a link partner
                Rate: 70
                Base lid: 142
                LMC: 0
                SM lid: 8
                Capability mask: 0x02510868
                Port GUID: 0x0021280001a10791
        Port 2:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 143
                LMC: 0
                SM lid: 8
                Capability mask: 0x02510868
                Port GUID: 0x0021280001a10792 
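
When many database or storage nodes have to be checked, the same reading of ibstat can be scripted. The following is a minimal sketch, not an Oracle-supplied tool: it assumes the ibstat layout shown above, and the function name find_problem_ports and the SAMPLE text are illustrative only. It reports every port that is not both Active and LinkUp.

```python
import re

def find_problem_ports(ibstat_text):
    """Return (ca, port, state, physical_state) for each port that is
    not both State: Active and Physical state: LinkUp.

    Assumes the ibstat text layout shown in this note (hypothetical helper).
    """
    problems = []
    ca = port = state = None
    for raw in ibstat_text.splitlines():
        line = raw.strip()
        m = re.match(r"CA '([^']+)'", line)
        if m:
            ca = m.group(1)
            continue
        m = re.match(r"Port (\d+):", line)
        if m:
            port = int(m.group(1))
            state = None
            continue
        if line.startswith("State:"):
            # Drop any trailing "<<<" annotation pasted into the output.
            state = line.split(":", 1)[1].split("<")[0].strip()
        elif line.startswith("Physical state:"):
            phys = line.split(":", 1)[1].split("<")[0].strip()
            if state != "Active" or phys != "LinkUp":
                problems.append((ca, port, state, phys))
    return problems

# Trimmed copy of the output above, used only to exercise the parser.
SAMPLE = """\
CA 'mlx4_0'
        Number of ports: 2
        Port 1:
                State: Down
                Physical state: Polling
        Port 2:
                State: Active
                Physical state: LinkUp
"""

print(find_problem_ports(SAMPLE))
# prints [('mlx4_0', 1, 'Down', 'Polling')]
```

On a live host the text would come from running ibstat itself, for example through subprocess.run(["ibstat"], capture_output=True, text=True).stdout.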

On the InfiniBand switch side, listlinkup reports the following:

[root@edx28bur09sw-ib3 ibdiag]# listlinkup
Connector  0A Present <-> Switch Port 20 is up (Enabled)
Connector  1A Present <-> Switch Port 22 is up (Enabled)
Connector  2A Present <-> Switch Port 24 is up (Enabled)
Connector  3A Present <-> Switch Port 26 is down (Enabled)
Connector  4A Present <-> Switch Port 28 is up (Enabled)
Connector  5A Present <-> Switch Port 30 is down (Enabled)   <<< Cable plugged in to switch port 30; physical connector labeled 5A
Connector  6A Present <-> Switch Port 35 is down (Enabled)
Connector  7A Present <-> Switch Port 33 is down (Enabled)
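
The switch-side listing can be filtered the same way. This is a rough sketch that assumes the line layout shown above; the names LINE_RE, down_connectors and LISTLINKUP_SAMPLE are illustrative and not part of the switch software. It returns the connector label and switch port of every present connector whose port is down, which still has to be compared against the rack cabling to decide which of them should actually carry a link.

```python
import re

# Pattern for lines of the form shown in this note, e.g.
# "Connector  5A Present <-> Switch Port 30 is down (Enabled)"
LINE_RE = re.compile(
    r"Connector\s+(?P<connector>\S+)\s+Present\s+<->\s+"
    r"Switch Port\s+(?P<port>\d+)\s+is\s+(?P<status>up|down)"
)

def down_connectors(listlinkup_text):
    """Return (connector, switch_port) for each present connector
    whose switch port is reported down (hypothetical helper)."""
    result = []
    for line in listlinkup_text.splitlines():
        m = LINE_RE.search(line)
        if m and m.group("status") == "down":
            result.append((m.group("connector"), int(m.group("port"))))
    return result

# Lines copied from the listlinkup output above.
LISTLINKUP_SAMPLE = """\
Connector  0A Present <-> Switch Port 20 is up (Enabled)
Connector  3A Present <-> Switch Port 26 is down (Enabled)
Connector  4A Present <-> Switch Port 28 is up (Enabled)
Connector  5A Present <-> Switch Port 30 is down (Enabled)
"""

print(down_connectors(LISTLINKUP_SAMPLE))
# prints [('3A', 26), ('5A', 30)]
```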

Cause
