I/O SERD threshold values are set too low and may result in PCIEX-8000-J5, PCIEX-8000-YJ and PCIEX-8000-KP faults. (Doc ID 1617956.1)

Last updated on JULY 29, 2016

Applies to:

Oracle SuperCluster T5-8 Full Rack - Version All Versions and later
SPARC M6-32 - Version All Versions and later
Oracle SuperCluster M6-32 Hardware - Version All Versions and later
Oracle SuperCluster T5-8 Half Rack - Version All Versions and later
SPARC T5-4 - Version All Versions and later
Information in this document applies to any platform.

Symptoms

PCIEX-8000-J5 and/or PCIEX-8000-KP FMA faults similar to those below will be reported. Systems that use an InfiniBand fabric, such as the T5-8 SSC, appear to be more susceptible to these faults.

The fmdump -e output will contain ereport.io.pciex.dl.btlp and/or ereport.io.pciex.dl.bdllp events. Examples are below the FMA examples.
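
As a rough way to check for the two ereport classes named above, the sketch below counts how many lines of fmdump -e output mention each class. The function name count_dl_ereports is hypothetical, and the sketch assumes that fmdump -e prints one line per event with the class name on that line.

```shell
# Hypothetical helper: reads `fmdump -e` output on stdin and prints
# "<class> <count>" for each of the two data link layer ereport classes.
count_dl_ereports() {
    input=$(cat)
    for class in ereport.io.pciex.dl.btlp ereport.io.pciex.dl.bdllp; do
        # The "." in the pattern matches any character, which is harmless here.
        # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
        count=$(printf '%s\n' "$input" | grep -c "$class" || true)
        printf '%s %s\n' "$class" "$count"
    done
}

# Usage on an affected host (assumes sufficient privileges to read the error log):
#   fmdump -e | count_dl_ereports
```

A non-zero count for either class is consistent with the symptoms described here; it does not by itself confirm that the SERD thresholds are the cause.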

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Jan 26 23:29:14 0db03f13-3b8e-6c2c-e4d7-99da84c97f63  PCIEX-8000-J5  Major

Problem Status    : isolated
Diag Engine       : eft / 1.16
System
   Manufacturer  : unknown
   Name          : unknown
   Part_Number   : unknown
   Serial_Number : unknown
   Host_ID       : 84fbc856

----------------------------------------
Suspect 1 of 1 :
  Fault class : fault.io.pciex.device-interr-corr
  Certainty   : 100%
  Affects     : dev:////pci@540/pci@1
  Status      : faulted but still in service

  FRU
    Location         : "/SYS/PM2"
    Manufacturer     : unknown
    Name             : unknown
    Part_Number      : 7056873
    Revision         : 08
    Serial_Number    : 465769T+1321880308
    Chassis
       Manufacturer  : Oracle Corporation
       Name          : SPARC T5-8
       Part_Number   : 7068108
       Serial_Number : AK00119551
       Status        : faulty

Description : Too many recovered internal errors have been detected within the
             specified PCIEX device. This may degrade into a non-recoverable
             fault.

 


--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Jan 26 23:56:38 dd72c9ae-890f-e479-8860-c850fffef4e9  PCIEX-8000-KP  Major

Problem Status    : solved
Diag Engine       : eft / 1.16
System
   Manufacturer  : unknown
   Name          : unknown
   Part_Number   : unknown
   Serial_Number : unknown
   Host_ID       : 84fbc856

----------------------------------------
Suspect 1 of 2 :
  Fault class : fault.io.pciex.device-interr-corr
  Certainty   : 100%
  Affects     : dev:////pci@5c0/pci@1/pci@0
  Status      : faulted but still in service

  FRU
    Location         : "/SYS/MB"
    Manufacturer     : unknown
    Name             : unknown
    Part_Number      : 7070931
    Revision         : 03
    Serial_Number    : 465769T+13245203VU
    Chassis
       Manufacturer  : Oracle Corporation
       Name          : SPARC T5-8
       Part_Number   : 7068108
       Serial_Number : AK00119551
       Status        : faulty
----------------------------------------
Suspect 2 of 2 :
  Fault class : fault.io.pciex.bus-linkerr-corr
  Certainty   : 100%
  Affects     : dev:////pci@5c0/pci@1/pci@0
  Status      : faulted but still in service

  FRU
    Location         : "/SYS/MB"
    Manufacturer     : unknown
    Name             : unknown
    Part_Number      : 7070931
    Revision         : 03
    Serial_Number    : 465769T+13245203VU
    Chassis
       Manufacturer  : Oracle Corporation
       Name          : SPARC T5-8
       Part_Number   : 7068108
       Serial_Number : AK00119551
       Status        : faulty

Description : Too many recovered bus errors have been detected, which indicates
             a problem with the specified bus or with the specified
             transmitting device. This may degrade into an unrecoverable
             fault.


Verify the type and number of PCIe errors that triggered the fault event:

 fmdump -eVu {UUID}
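
To turn the verbose output into a per-class tally, something like the following sketch may help. The function name tally_ereport_classes is hypothetical, and the sketch assumes that each event in the verbose output carries a line of the form "class = ereport...."; adjust the match if the output on your system differs.

```shell
# Hypothetical helper: reads `fmdump -eVu {UUID}` output on stdin and prints
# "<count> <class>" for every ereport class found, sorted by class name.
tally_ereport_classes() {
    awk '$1 == "class" && $2 == "=" { n[$3]++ }
         END { for (c in n) print n[c], c }' | sort -k2
}

# Usage, with the EVENT-ID of the PCIEX-8000-J5 example above as the UUID:
#   fmdump -eVu 0db03f13-3b8e-6c2c-e4d7-99da84c97f63 | tally_ereport_classes
```

The tally shows which error types contributed to the diagnosis and how many of each were recorded.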

Cause
