FMA fault - NIC-8000-0Q reported on 'ixgbe' interfaces

(Doc ID 2377615.1)

Last updated on MAY 14, 2018

Applies to:

SPARC T5-8 - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster T5-8 Full Rack - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

FMA reports NIC-8000-0Q faults on 'ixgbe' interfaces. The fault may be reported on a single interface, or on multiple interfaces and NIC cards. There are two possible fault reports that can be logged:

 

1) This fault reports an 'invalid_state' error detected by the driver:

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Mar 19 14:08:08 f3c169c3-b164-49fb-8330-bf1c40f576fe  NIC-8000-0Q    Critical

Problem Status    : open
Diag Engine       : eft / 1.16
System
    Manufacturer  : Oracle Corporation
    Name          : SuperCluster T5-8
    Part_Number   : SuperCluster T5-8
    Serial_Number : AK00XXXXX


System Component
    Manufacturer  : Oracle Corporation
    Name          : SPARC T5-8
    Part_Number   : 7308731
    Serial_Number : AK00XXXXXX
    Host_ID       : XXXXXXXX

----------------------------------------
Suspect 1 of 2 :
   Problem class : defect.io.nic.correctable
   Certainty   : 95%
   Affects     : mod:///mod-name=ixgbe/
   Status      : faulted but still in service

   FRU
     Status           : faulty
     FMRI             : "pkg://solaris/driver/network/ethernet/ixgbe@0.5.11,5.11-0.175.3.22.0.3.0:20170629T155906Z
"
----------------------------------------
Suspect 2 of 2 :
   Problem class : fault.io.nic.correctable
   Certainty   : 5%
   Affects     : dev:////pci@380/pci@1/pci@0/pci@a/network@0,1
   Status      : faulted but still in service

   FRU
     Status           : faulty
     Location         : "/SYS/PCIE9"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : unknown
     Revision         : unknown
     Serial_Number    : unknown
     Chassis
        Manufacturer  : Oracle Corporation
        Name          : SPARC T5-8
        Part_Number   : 7308731
        Serial_Number : AK00XXXXXX

Description : The number of correctable errors detected in the driver has
              crossed the allowed threshold. A(n) invalid_state error has been
              detected during driver's runtime context causing a(n) correctable
              service impact.

              Firmware: pba number: E70856-012; short1: 00.03 61ab0001 0.0.0

Response    : One or more device instances may be disabled.

Impact      : Loss of services provided by the device instances associated with
              this fault.

Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              Please refer to the associated reference document at
              http://support.oracle.com/msg/NIC-8000-0Q for the latest service
              procedures and policies regarding this diagnosis.

--

The stack for this fault is recorded in the associated ereport in 'fmdump-eV.out':

 

Mar 19 2018 14:07:43.608305389 ereport.io.nic.correctable
nvlist version: 0
        class = ereport.io.nic.correctable
        ena = 0xea2799eb1360401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@400/pci@1/pci@0/pci@c/network@0,1
        (end detector)

        core = (embedded nvlist)
        nvlist version: 0
                reason = A(n) invalid_state error has been detected during driver's runtime context causing a(n) correctable service impact
                driver_error = 0x7
                driver_context = 0x1
                service_impact = 0x1
                device_subsystem = 0x0
                driver_error_label = invalid_state
                driver_context_label = runtime
                service_impact_label = correctable
                device_subsystem_label = unknown
                driver_health = 0x0
                reaped = 1
                overridden = 0
                throttle_threshold = 0x0
                throttle_interval = 0x0
                stack = [ mac`mac_fm_error_log+20 () | ixgbe`ixgbe_fm_shared_code_error+44 () | ixgbe`ixgbe_fc_autoneg+68 () | ixgbe`ixgbe_fc_enable_generic+94 () | ixgbe`ixgbe_driver_link_check+1d8 () | ixgbe`ixgbe_link_check_task+34 () | genunix`taskq_thread+3e0 () ]
        (end core)

        framework = (embedded nvlist)
        nvlist version: 0
                nvm_version = pba number: E70856-012; short1: 00.03 61ab0001 0.0.0
                dcb_flags = 0x0
        (end framework)

        driver = (embedded nvlist)
        nvlist version: 0
                driver_error_message = The link is down
        (end driver)

        __ttl = 0x1
        __tod = 0x5ab00a7f 0x244200ed
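
The '__tod' pair that closes each ereport holds the event time as two hexadecimal values: seconds since the Unix epoch, then nanoseconds. Decoding it can help when matching an ereport against the TIME column of the fault report. The following is a minimal sketch, not an Oracle-provided tool; the function name decode_tod is hypothetical, and the result is printed in UTC, whereas the ereport header shows the system's local time.

```python
from datetime import datetime, timezone


def decode_tod(tod_line):
    """Decode a '__tod = 0xSEC 0xNSEC' line into (UTC datetime, nanoseconds).

    Hypothetical helper for illustration; it assumes the two-value
    layout shown in the ereport above.
    """
    # Keep only the text after '=' and split it into the two hex fields.
    _, _, value = tod_line.partition("=")
    sec_hex, nsec_hex = value.split()
    seconds = int(sec_hex, 16)
    nanoseconds = int(nsec_hex, 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc), nanoseconds


# Value copied from the ereport above.
when, nsec = decode_tod("__tod = 0x5ab00a7f 0x244200ed")
print(when.isoformat(), nsec)
# prints: 2018-03-19T19:07:43+00:00 608305389
```

The nanosecond value matches the fractional part of the ereport header (14:07:43.608305389); the hour differs only by the time zone offset of the reporting system.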

============================

 

2) This fault reports a 'software' error detected during the driver attach:

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Mar 19 14:19:55 f4e0d8ba-62d1-4f8c-a5eb-83e276af6fe3  NIC-8000-0Q    Critical

Problem Status    : open
Diag Engine       : eft / 1.16
System
    Manufacturer  : Oracle Corporation
    Name          : SPARC S7-2L
    Part_Number   : 35442060+1+1
    Serial_Number : 1803XXXXXX
    Host_ID       : XXXXXXXX

----------------------------------------
Suspect 1 of 2 :
   Problem class : defect.io.nic.correctable
   Certainty   : 95%
   Affects     : mod:///mod-name=ixgbe/
   Status      : faulted but still in service

   FRU
     Status           : faulty
     FMRI             : "pkg://solaris/driver/network/ethernet/ixgbe@0.5.11,5.11-0.175.3.29.0.4.0:20180206T160622Z
"
----------------------------------------
Suspect 2 of 2 :
   Problem class : fault.io.nic.correctable
   Certainty   : 5%
   Affects     : dev:////pci@302/pci@2/pci@0/pci@16/network@0
   Status      : faulted but still in service

   FRU
     Status           : faulty
     Location         : "/SYS/MB/PCIE6"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : unknown
     Revision         : unknown
     Serial_Number    : unknown
     Chassis
        Manufacturer  : Oracle Corporation
        Name          : SPARC S7-2L
        Part_Number   : 35442060+1+1
        Serial_Number : 1803XXXXXX

Description : The number of correctable errors detected in the driver has
              crossed the allowed threshold. A(n) software error has been
              detected during driver's attach context causing a(n) correctable
              service impact.

              Firmware: pba number: G45101-010; short2: 04.04.0 4.03.0 80000600
              0.0.0

Response    : One or more device instances may be disabled.

Impact      : Loss of services provided by the device instances associated with
              this fault.

Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              Please refer to the associated reference document at
              http://support.oracle.com/msg/NIC-8000-0Q for the latest service
              procedures and policies regarding this diagnosis.

--

The stack for this fault is recorded in the associated ereport in 'fmdump-eV.out':

 

Mar 19 2018 14:19:25.640053998 ereport.io.nic.correctable
nvlist version: 0
        class = ereport.io.nic.correctable
        ena = 0x3fe7b0bb8a1b801
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@302/pci@2/pci@0/pci@16/network@0,1
        (end detector)

        core = (embedded nvlist)
        nvlist version: 0
                reason = A(n) software error has been detected during driver's attach context causing a(n) correctable service impact
                driver_error = 0x8
                driver_context = 0x2
                service_impact = 0x1
                device_subsystem = 0x0
                driver_error_label = software
                driver_context_label = attach
                service_impact_label = correctable
                device_subsystem_label = unknown
                driver_health = 0x0
                reaped = 1
                overridden = 0
                throttle_threshold = 0x0
                throttle_interval = 0x0
                psargs = /usr/lib/ldoms/ldmad
                stack = [ mac`mac_fm_error_log+20 () | ixgbe`ixgbe_fm_shared_code_error+44 () | ixgbe`ixgbe_set_rar_generic+30 () | ixgbe`ixgbe_reset_hw_X540+1f4 () | ixgbe`ixgbe_solaris_reset_hw+18 () | ixgbe`ixgbe_init+8 () | ixgbe`ixgbe_attach+7d8 () | genunix`devi_attach+ec () | genunix`attach_node+b8 () | genunix`i_ndi_config_node+178 () | genunix`i_ddi_attachchild+34 () | genunix`devi_attach_node+110 () | genunix`devi_config_one+318 () | genunix`ndi_devi_config_one+d0 () | devfs`dv_find+1e4 () | devfs`devfs_lookup+1c () | genunix`fop_lookup+15c () | genunix`lookuppnvp+430 () | genunix`lookuppnatcred+118 () | genunix`lookupnameatcred+4c () | genunix`lookupnameat+20 () | genunix`vn_openat+338 () | genunix`copen+43c () ]
        (end core)

        framework = (embedded nvlist)
        nvlist version: 0
                nvm_version = pba number: G45101-010; short2: 04.04.0 4.03.0 80000600 0.0.0
        (end framework)

        driver = (embedded nvlist)
        nvlist version: 0
                driver_error_message = RAR index 128 is out of range.

        (end driver)

        __ttl = 0x1
        __tod = 0x5aafc6ed 0x262672ee
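
When many ereports are present, it can be useful to check which of the two signatures above a given ereport matches. The following is a minimal sketch under stated assumptions: it expects the text of a single ereport in the 'key = value' layout shown above, and the helper names (parse_ereport_fields, stack_frames, classify) are hypothetical, not part of any Oracle tool. The sample text is an abbreviated excerpt of the second ereport, with the stack shortened to its first three frames.

```python
import re


def parse_ereport_fields(text):
    """Collect 'key = value' pairs from one ereport in the layout above."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([\w-]+) = (.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def stack_frames(fields):
    """Split the 'stack' value into module`function names, dropping offsets."""
    raw = fields.get("stack", "").strip("[] ")
    frames = []
    for frame in raw.split("|"):
        frame = frame.strip()
        if frame:
            frames.append(frame.split("+")[0])
    return frames


def classify(fields):
    """Map the error and context labels onto the two variants in this document."""
    key = (fields.get("driver_error_label"), fields.get("driver_context_label"))
    if key == ("invalid_state", "runtime"):
        return "variant 1: invalid_state error at runtime"
    if key == ("software", "attach"):
        return "variant 2: software error at attach"
    return "unrecognized"


# Abbreviated excerpt of the second ereport (stack truncated for brevity).
SAMPLE = """
        driver_error_label = software
        driver_context_label = attach
        stack = [ mac`mac_fm_error_log+20 () | ixgbe`ixgbe_fm_shared_code_error+44 () | ixgbe`ixgbe_set_rar_generic+30 () ]
        driver_error_message = RAR index 128 is out of range.
"""

fields = parse_ereport_fields(SAMPLE)
print(classify(fields))
print(stack_frames(fields))
print(fields["driver_error_message"])
```

Classifying on the two label fields rather than on the message text keeps the check independent of the exact wording of 'driver_error_message'.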

 

=================================================

Changes

The system was updated to Solaris 11.3 SRU 15 or later, or may have been upgraded from Solaris 10 to Solaris 11.3 SRU 15 or later.

Cause
