X4800 / X2-8 Server Might Post CPU/PCI Faults and Fail to Boot after Firmware Upgrade on Sun InfiniBand Dual Port 4x QDR PCIe EM (Doc ID 2116695.1)

Last updated on MAY 05, 2017

Applies to:

Sun Fire X4800 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Server X2-8 - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Symptoms

The system posts CPU/PCI faults and fails to boot.

For example:

FMA fault on SP:
fault.cpu.intel.internal --> SPX86-8000-F4


HOST Console output:
[  162.215634] mlx4_core 0000:08:00.0: command 0x23 timed out (go bit not cleared).
[  162.292781] mlx4_core 0000:08:00.0: device is going to be reset.
[  163.411212] pcieport 0000:00:05.0: AER: Uncorrected (Non-Fatal) error received: id=0028.
[  163.498970] pcieport 0000:00:05.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0028(Requester ID).
[  163.644466] pcieport 0000:00:05.0:   device [8086:340c] error status/mask=00004000/00000000.
[  163.728809] pcieport 0000:00:05.0:    [14] Completion Timeout     (First).
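
To check a saved host console log for the signatures shown above, a minimal sketch along the following lines might help. It is an illustration only, not an Oracle-provided tool: the function name scan_console_log and the result keys are hypothetical, and the patterns are derived solely from the sample output in this note, so other kernel versions may word these messages differently.

```python
import re

# Patterns derived from the sample console output in this note (assumption:
# the message wording matches on the affected host's kernel version).
PATTERNS = {
    # mlx4_core <pci-addr>: command 0x23 timed out (go bit not cleared).
    "mlx4_timeout": re.compile(r"mlx4_core (\S+): command 0x[0-9a-fA-F]+ timed out"),
    # pcieport <pci-addr>: AER: Uncorrected (Non-Fatal) error received
    "aer_uncorrected": re.compile(
        r"pcieport (\S+): AER: Uncorrected \(Non-Fatal\) error received"
    ),
    # pcieport <pci-addr>:    [14] Completion Timeout     (First)
    "completion_timeout": re.compile(
        r"pcieport (\S+):\s+\[\s*\d+\] Completion Timeout"
    ),
}


def scan_console_log(text):
    """Return, per signature, the PCI addresses of the devices that logged it.

    Hypothetical helper: 'text' is the full console log as one string.
    """
    hits = {name: [] for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits[name].append(match.group(1))
    return hits
```

A log in which all three lists are non-empty resembles the symptom described here; matching lines alone do not confirm the root cause.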

Changes

The issue affects systems in which both PCIe EM slots of a CMOD/IOH are populated with a Sun InfiniBand Dual Port 4x QDR PCIe ExpressModule Host Channel Adapter M2.

The issue might be triggered by:

 

Cause
