I/O SERD Threshold Values Are Set Too Low And May Result In PCIEX-8000-J5, PCIEX-8000-YJ And PCIEX-8000-KP Faults.
(Doc ID 1617956.1)
Last updated on JANUARY 26, 2024
Applies to:
Sun SPARC Enterprise M9000-32 Server - Version All Versions and later
Oracle SuperCluster T5-8 Hardware - Version All Versions and later
SPARC M5-32 - Version All Versions and later
Oracle SuperCluster T5-8 Full Rack - Version All Versions and later
SPARC M6-32 - Version All Versions and later
Information in this document applies to any platform.
Symptoms
PCIEX-8000-J5 and/or PCIEX-8000-KP FMA faults similar to those shown below are reported. Systems that use an InfiniBand fabric, such as the T5-8 SSC, appear to be more susceptible to these faults.
The fmdump -e output will contain ereport.io.pciex.dl.btlp and/or ereport.io.pciex.dl.bdllp events. Examples are below the FMA examples.
-------------- ------------------------------------ -------------- ---------
TIME EVENT-ID MSG-ID SEVERITY
--------------- ------------------------------------ -------------- ---------
Jan 26 23:29:14 <UUID> PCIEX-8000-J5 Major
Problem Status : isolated
Diag Engine : eft / 1.16
System
Manufacturer : unknown
Name : unknown
Part_Number : unknown
Serial_Number : unknown
Host_ID : 84fbxxx
----------------------------------------
Suspect 1 of 1 :
Fault class : fault.io.pciex.device-interr-corr
Certainty : 100%
Affects : dev:////pci@540/pci@1
Status : faulted but still in service
FRU
Location : "/SYS/PM2"
Manufacturer : unknown
Name : unknown
Part_Number : 7056873
Revision : 08
Serial_Number : xxxxx
Chassis
Manufacturer : Oracle Corporation
Name : SPARC T5-8
Part_Number : 7068108
Serial_Number : xxxxx
Status : faulty
Description : Too many recovered internal errors have been detected within the
specified PCIEX device. This may degrade into a non-recoverable
fault.
--------------- ------------------------------------ -------------- ---------
TIME EVENT-ID MSG-ID SEVERITY
--------------- ------------------------------------ -------------- ---------
Jan 26 23:56:38 <UUID> PCIEX-8000-KP Major
Problem Status : solved
Diag Engine : eft / 1.16
System
Manufacturer : unknown
Name : unknown
Part_Number : unknown
Serial_Number : unknown
Host_ID : 84fbcxxx
----------------------------------------
Suspect 1 of 2 :
Fault class : fault.io.pciex.device-interr-corr
Certainty : 100%
Affects : dev:////pci@5c0/pci@1/pci@0
Status : faulted but still in service
FRU
Location : "/SYS/MB"
Manufacturer : unknown
Name : unknown
Part_Number : 7070931
Revision : 03
Serial_Number : xxxxx
Chassis
Manufacturer : Oracle Corporation
Name : SPARC T5-8
Part_Number : 7068108
Serial_Number : xxxxx
Status : faulty
----------------------------------------
Suspect 2 of 2 :
Fault class : fault.io.pciex.bus-linkerr-corr
Certainty : 100%
Affects : dev:////pci@5c0/pci@1/pci@0
Status : faulted but still in service
FRU
Location : "/SYS/MB"
Manufacturer : unknown
Name : unknown
Part_Number : 7070931
Revision : 03
Serial_Number : xxxxx
Chassis
Manufacturer : Oracle Corporation
Name : SPARC T5-8
Part_Number : 7068108
Serial_Number : xxxxx
Status : faulty
Description : Too many recovered bus errors have been detected, which indicates
a problem with the specified bus or with the specified
transmitting device. This may degrade into an unrecoverable
fault.
Verify the type and number of PCIe errors that triggered the fault event:
fmdump -eVu {UUID}
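The verbose output can be long, so a per-class tally is sometimes easier to read. The following is a minimal sketch, not part of the original procedure. It assumes the non-verbose `fmdump -e` listing ends each event line with the ereport class as its last field. The helper name `count_ereport_classes` is hypothetical.

```shell
# Tally ereport classes from `fmdump -e` style output read on stdin.
# Assumption: each event line ends with the ereport class as its last field;
# the header line and any other line whose last field is not an ereport
# class are ignored.
count_ereport_classes() {
    awk '$NF ~ /^ereport\./ { n[$NF]++ }
         END { for (c in n) printf "%d %s\n", n[c], c }' |
        sort -k1,1nr -k2,2
}

# Usage on a live system, where UUID holds the EVENT-ID from the fault report:
#   fmdump -eu "$UUID" | count_ereport_classes
```

Each output line is a count followed by an ereport class, so the number of ereport.io.pciex.dl.btlp and ereport.io.pciex.dl.bdllp events linked to the fault can be read directly.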
Changes
Cause