Infiniband SRIOV: Virtual Function created over Infiniband HCA stays down in Solaris 11.1 (Doc ID 1951257.1)

Last updated on DECEMBER 04, 2014

Applies to:

Sun Infiniband HCA - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

The Infiniband (IB) interfaces created over virtual functions (VFs) on an Infiniband card (HCA) show their HCA port, or IB link, states as DOWN (see below).

The output below was collected in a domain (or VM) to which a virtual function on the IB HCA has been added. It shows the IB link state as Down on both ports, even though the physical state is LinkUp.

# ibstat
CA 'mlx4_0'
         CA type: MT26428
         Number of ports: 2
         Firmware version: 2.11.2010
         Hardware version: 176
         Node GUID: 0x0010e1000128be3c
         System image GUID: 0x0010e0000128be3f
         Port 1:
                 State: Down             <<--
                 Physical state: LinkUp  <<--
                 Rate: 40
                 Base lid: 0
                 LMC: 0
                 SM lid: 20
                 Capability mask: 0x02100000
                 Port GUID: 0x0000000000000000
                 Link layer: IB
         Port 2:
                 State: Down             <<--
                 Physical state: LinkUp  <<--
                 Rate: 40
                 Base lid: 0
                 LMC: 0
                 SM lid: 20
                 Capability mask: 0x02100000
                 Port GUID: 0x0000000000000000
                 Link layer: IB

Yet the ibstat output collected at the same time in the primary domain (or physical server) shows that the IB link states are actually UP and ACTIVE.

So the problem lies not with the physical IB links but with the IB link states over the virtual functions. The ibstat output in the primary domain shows the IB link states as UP, with LIDs correctly assigned by the subnet manager. The IB interfaces over the virtual functions, however, are DOWN and have no LIDs assigned by the subnet manager (Base lid is 0). This implies a failure in communication with the subnet manager. See the Cause section below for an explanation.
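
The symptom described above (State is Down while Physical state is LinkUp and Base lid is 0) can be checked mechanically when many domains need to be inspected. The following is a minimal sketch, not an Oracle-provided tool: it assumes the ibstat text has already been captured to a string in the layout shown in this document, and the function names parse_ports and stuck_ports are hypothetical.

```python
import re

def parse_ports(ibstat_text):
    """Parse ibstat-style text into one dict per port (illustrative only)."""
    ports = []
    ca = None        # name of the CA currently being read
    current = None   # dict for the port currently being read
    for raw in ibstat_text.splitlines():
        # Drop the "<<--" annotations used in this document, then trim.
        line = raw.split("<<--")[0].strip()
        ca_match = re.match(r"CA '(.+)'$", line)
        port_match = re.match(r"Port (\d+):$", line)
        if ca_match:
            ca, current = ca_match.group(1), None
        elif port_match:
            current = {"CA": ca, "Port": int(port_match.group(1))}
            ports.append(current)
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return ports

def stuck_ports(ports):
    """Return (CA, port) pairs whose physical link is up but whose state is Down."""
    return [
        (p["CA"], p["Port"])
        for p in ports
        if p.get("State") == "Down" and p.get("Physical state") == "LinkUp"
    ]

# Abbreviated sample taken from the output shown in this document.
SAMPLE = """\
CA 'mlx4_0'
         Number of ports: 2
         Port 1:
                 State: Down             <<--
                 Physical state: LinkUp  <<--
                 Base lid: 0
         Port 2:
                 State: Down             <<--
                 Physical state: LinkUp  <<--
                 Base lid: 0
"""

if __name__ == "__main__":
    for ca, port in stuck_ports(parse_ports(SAMPLE)):
        print(f"{ca} port {port}: physical link up, but state is Down")
```

Running the same check against ibstat output from the primary domain should return an empty list if the physical links are healthy, which matches the comparison made above.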


Cause
