Emulex HBA faulted following ereport.io.pciex.pl.re events when operating at PCIe Gen2 (Doc ID 1624749.1)

Last updated on OCTOBER 05, 2017

Applies to:

SPARC T4-4 - Version All Versions to All Versions [Release All Releases]
SPARC T3-4 - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

Systems report repeated FMA faults against Emulex Dual FC/Gigabit Ethernet host adapters operating at PCIe Gen2 link speed.

Impacted component details:

371-4666-02 / SG-XPCIEFCGBE-E8-Z
Emulex LPem12002E-S

Fault events reported by 'fmadm faulty' will look similar to the following:

--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Feb 08 22:51:10 d57ba133-abf6-c03c-8419-a4663472040b PCIEX-8000-J5  Major

Host : t4test
Platform : ORCL,SPARC-T4-4 Chassis_id :
Product_sn :

Fault class : fault.io.pciex.device-interr-corr
Affects : dev:////pci@600/pci@1/pci@0/pci@4/pci@0
faulted but still in service
FRU : "PCI-EM4" (hc://:product-id=ORCL,SPARC-T4-4:product-sn=1204BDY993:server-id=t4cdrl01:chassis-id=1204BDY993/chassis=0/motherboard=0/hostbridge=2/pciexrc=4/pciexbus=1/pciexdev=0/pciexfn=0/pciexbus=2/pciexdev=4/pciexfn=0/pciexbus=4/pciexdev=0)
faulty

Description : Too many recovered internal errors have been detected within the
specified PCIEX device. This may degrade into a non-recoverable
fault.
Refer to http://sun.com/msg/PCIEX-8000-J5 for more information.

Response : One or more device instances may be disabled

Impact : Loss of services provided by the device instances associated with
this fault

Action : Schedule a repair procedure to replace the affected device. Use
fmadm faulty to identify the device or contact Sun for support.
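When many hosts need to be checked, the fields that matter in a record like the one above can be pulled out with a short script. The following is a minimal Python sketch, not an Oracle-supplied tool: the function name `summarize_fault` and the regular expressions are illustrative assumptions based only on the output layout shown in this document. A matching MSG-ID and fault class only suggests that this note may apply; the link speed still has to be confirmed separately.

```python
import re

# Values taken from the example record in this document.
EXPECTED_MSG_ID = "PCIEX-8000-J5"
EXPECTED_CLASS = "fault.io.pciex.device-interr-corr"


def summarize_fault(fmadm_text):
    """Extract MSG-ID, fault class and affected device from one fault record.

    Assumes the text follows the 'fmadm faulty' layout shown above;
    other Solaris releases may format the record differently.
    """
    msg = re.search(r"\b(PCIEX-\d{4}-[0-9A-Z]{2})\b", fmadm_text)
    cls = re.search(r"Fault class\s*:\s*(\S+)", fmadm_text)
    dev = re.search(r"Affects\s*:\s*(\S+)", fmadm_text)
    summary = {
        "msg_id": msg.group(1) if msg else None,
        "fault_class": cls.group(1) if cls else None,
        "affects": dev.group(1) if dev else None,
    }
    # True only when both identifiers match the example in this note.
    summary["candidate"] = (
        summary["msg_id"] == EXPECTED_MSG_ID
        and summary["fault_class"] == EXPECTED_CLASS
    )
    return summary


sample = """\
Feb 08 22:51:10 d57ba133-abf6-c03c-8419-a4663472040b PCIEX-8000-J5 Major
Fault class : fault.io.pciex.device-interr-corr
Affects : dev:////pci@600/pci@1/pci@0/pci@4/pci@0
"""

result = summarize_fault(sample)
print(result["affects"], result["candidate"])
```

In practice the input text could come from a saved copy of the 'fmadm faulty' output; how it is collected is left open here.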

 

All 371-4666-02 Emulex HBAs should operate at PCIe Gen1 (2.5GT/s). If they do not, there is a risk that they will suffer from excessive PCIe correctable errors. To confirm the card's configured operating speed, first check the prtdiag output:

/SYS/PCI-EM4 PCIE SUNW,assigned-device-pciex10df,fc40 5.0GTx4
/pci@600/pci@1/pci@0/pci@4/pci@0/pci@3/SUNW,assigned-device@0
/SYS/PCI-EM4 PCIE SUNW,assigned-device-pciex10df,fc40 5.0GTx4
/pci@600/pci@1/pci@0/pci@4/pci@0/pci@3/SUNW,assigned-device@0,1

 

In this example the impacted device is operating at the incorrect PCIe Gen2 speed (5.0GT/s) and should be replaced.
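The prtdiag check can be automated along the following lines. This is a hedged Python sketch under stated assumptions: it assumes each device appears as a summary line ending in a speed/width token such as `5.0GTx4`, as in the excerpt above, and it identifies the card by the `pciex10df,fc40` string seen in that excerpt. The function name `find_gen2_emulex` is hypothetical.

```python
import re

# Identifier string as it appears in the prtdiag excerpt in this document.
EMULEX_ID = "pciex10df,fc40"
# Trailing speed/width token, e.g. "5.0GTx4" or "2.5GTx4".
SPEED_RE = re.compile(r"(\d+\.\d+)GTx(\d+)\s*$")


def find_gen2_emulex(prtdiag_text):
    """Return (slot, speed, width) for each matching function not at 2.5GT/s."""
    hits = []
    for line in prtdiag_text.splitlines():
        fields = line.split()
        # Skip device-path lines and unrelated devices.
        if len(fields) < 4 or EMULEX_ID not in line:
            continue
        match = SPEED_RE.search(line)
        if match and match.group(1) != "2.5":
            hits.append((fields[0], match.group(1), int(match.group(2))))
    return hits


sample = """\
/SYS/PCI-EM4 PCIE SUNW,assigned-device-pciex10df,fc40 5.0GTx4
/pci@600/pci@1/pci@0/pci@4/pci@0/pci@3/SUNW,assigned-device@0
/SYS/PCI-EM4 PCIE SUNW,assigned-device-pciex10df,fc40 5.0GTx4
/pci@600/pci@1/pci@0/pci@4/pci@0/pci@3/SUNW,assigned-device@0,1
"""

for slot, speed, width in find_gen2_emulex(sample):
    print(f"{slot}: {speed}GT/s x{width}")
```

Each function of the dual-port card is listed separately, so one affected card produces two entries for the same slot.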

Alternatively, run POST in operations mode. The PCIe switch configuration, including link speed and width, will be reported towards the end of POST.

 

On the SP:

set /HOST/diag mode=ops0

set /HOST/diag level=max

set /HOST/diag verbosity=max

Cause
