Systems With Estes RAID HBA Seeing PCIE Panics

(Doc ID 2163842.1)

Last updated on AUGUST 03, 2016

Applies to:

Oracle Server X5-2 - Version All Versions and later
Oracle Server X5-2L - Version All Versions and later
Information in this document applies to any platform.

Symptoms

Solaris panics, and the following panic string and stack trace appear in the system log.

genunix: [ID 843051 kern.info] NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
unix: [ID 836849 kern.notice]
panic[cpu12]/thread=fffffffc81580b40:
genunix: [ID 647700 kern.notice] pcieb-7: PCI(-X) Express Fatal Error. (0x40)
unix: [ID 100000 kern.notice]
genunix: [ID 655072 kern.notice] fffffffc81580aa0 pcieb:pcieb_intr_handler+233 ()
genunix: [ID 655072 kern.notice] fffffffc81580ae0 apix:apix_dispatch_by_vector+89 ()
genunix: [ID 655072 kern.notice] fffffffc81580b20 apix:apix_dispatch_lowlevel+32 ()
genunix: [ID 655072 kern.notice] fffffffc815388f0 unix:switch_sp_and_call+13 ()
genunix: [ID 655072 kern.notice] fffffffc81538960 apix:apix_do_interrupt+353 ()
genunix: [ID 655072 kern.notice] fffffffc81538970 unix:cmnint+c1 ()
genunix: [ID 655072 kern.notice] fffffffc81538a60 unix:i86_mwait+d ()
genunix: [ID 655072 kern.notice] fffffffc81538a90 unix:cpu_idle_mwait+122 ()
genunix: [ID 655072 kern.notice] fffffffc81538ae0 unix:cpu_acpi_idle+d7 ()
genunix: [ID 655072 kern.notice] fffffffc81538af0 unix:cpu_idle_adaptive+19 ()
genunix: [ID 655072 kern.notice] fffffffc81538b20 unix:idle+11a ()
genunix: [ID 655072 kern.notice] fffffffc81538b30 unix:thread_start+8 ()
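To check a saved system log for the same signature, a minimal Python sketch such as the following may help. It assumes the log has already been read into a list of lines; the function name `find_pcie_panics` is hypothetical, and the pattern is derived only from the panic string shown above, so other Solaris releases may word it differently.

```python
import re

# Pattern for the pcieb panic string shown above. The bridge instance
# number and the error code are captured because they vary per system.
PCIE_PANIC = re.compile(
    r"(?P<driver>pcieb-\d+): PCI\(-X\) Express Fatal Error\. "
    r"\((?P<code>0x[0-9a-fA-F]+)\)"
)


def find_pcie_panics(lines):
    """Return (bridge_instance, error_code) for each matching log line."""
    hits = []
    for line in lines:
        match = PCIE_PANIC.search(line)
        if match:
            hits.append((match.group("driver"), match.group("code")))
    return hits


sample = [
    "genunix: [ID 647700 kern.notice] pcieb-7: PCI(-X) Express Fatal Error. (0x40)",
    "genunix: [ID 655072 kern.notice] fffffffc81580aa0 pcieb:pcieb_intr_handler+233 ()",
]
print(find_pcie_panics(sample))
```

Only the panic line matches; the stack frame that mentions `pcieb:pcieb_intr_handler` is ignored because it has no bridge instance number.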

The FMA log shows faults indicating that the PCIe device is not responding to a valid request.

2016-05-12/12:55:35 1cd51aa0-9f8f-4fe2-a62d-9871d96f7468 PCIEX-8000-DJ

fault = fault.io.pciex.device-noresp@hc:///chassis=0/motherboard=0/hostbridge=1/pciexrc=6/pciexbus=132/pciexdev=0/pciexfn=0
certainty = 40.0 %
FRU = /SYS/MB/RISER1/PCIE1
ASRU = dev:////pci@7a,0/pci8086,2f08@3/pci1000,30a0@0
resource = hc:///chassis=0/motherboard=0/hostbridge=1/pciexrc=6/pciexbus=132/pciexdev=0/pciexfn=0

fault = fault.io.pciex.device-interr@hc:///chassis=0/motherboard=0/hostbridge=1/pciexrc=6/pciexbus=132/pciexdev=0/pciexfn=0
certainty = 40.0 %
FRU = /SYS/MB/RISER1/PCIE1
ASRU = dev:////pci@7a,0/pci8086,2f08@3/pci1000,30a0@0
resource = hc:///chassis=0/motherboard=0/hostbridge=1/pciexrc=6/pciexbus=132/pciexdev=0/pciexfn=0

fault = fault.io.pciex.bus-noresp@hc:///chassis=0/motherboard=0/hostbridge=1/pciexrc=6/pciexbus=132/pciexdev=0/pciexfn=0
certainty = 20.0 %
FRU = /SYS/MB/RISER1/PCIE1
ASRU = dev:////pci@7a,0/pci8086,2f08@3/pci1000,30a0@0
resource = hc:///chassis=0/motherboard=0/hostbridge=1/pciexrc=6/pciexbus=132/pciexdev=0/pciexfn=0
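When several fault entries point at the same component, it can be convenient to group them by FRU. The sketch below is one possible way to do that in Python, assuming the FMA output has been saved as text in the `key = value` layout shown above; `parse_fma_faults` is a hypothetical helper, not part of any Oracle tool.

```python
def parse_fma_faults(text):
    """Return one dict per 'fault = ...' entry found in the text."""
    faults = []
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if not sep:
            continue  # blank lines and the timestamp/UUID header line
        key, value = key.strip(), value.strip()
        if key == "fault":
            # The fault class is the part before '@'; the rest is the resource.
            faults.append({"class": value.split("@", 1)[0]})
        elif faults:
            faults[-1][key] = value
    return faults


sample = """\
2016-05-12/12:55:35 1cd51aa0-9f8f-4fe2-a62d-9871d96f7468 PCIEX-8000-DJ

fault = fault.io.pciex.device-noresp@hc:///chassis=0/motherboard=0/hostbridge=1/pciexrc=6/pciexbus=132/pciexdev=0/pciexfn=0
certainty = 40.0 %
FRU = /SYS/MB/RISER1/PCIE1

fault = fault.io.pciex.bus-noresp@hc:///chassis=0/motherboard=0/hostbridge=1/pciexrc=6/pciexbus=132/pciexdev=0/pciexfn=0
certainty = 20.0 %
FRU = /SYS/MB/RISER1/PCIE1
"""

faults = parse_fma_faults(sample)
suspect_frus = sorted({f["FRU"] for f in faults})
print(suspect_frus)
```

In this sample every entry names the same FRU, which matches the log above: all three suspects resolve to the card in `/SYS/MB/RISER1/PCIE1`.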

The ILOM snapshot shows that the affected PCIe card in slot 1 is an Estes-EXT RAID card.

/System/PCI_Devices/Add-on/Device_1
Properties:
part_number = 7110119
description = Oracle Storage 12 Gb SAS PCIe HBA, External
location = PCIE1 (PCIe Slot 1)
pci_vendor_id = 0x1000
pci_device_id = 0x0097
pci_subvendor_id = 0x1000
pci_subdevice_id = 0x30a0

The firmware listing for this Estes-EXT RAID card shows that it is not running the latest firmware release.

PCI   ENCL  LUN  VENDOR    PRODUCT           PRODUCT      SIZE \
SLOT  SLOT  NUM  NAME      IDENTIFIER        REVISION     NVDATA
----  ----  ---  --------  ----------------  ------------ ---------
1                LSI Corp  SAS3008-IT        4.00.00.00   04:05:00:08
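A revision check like this can be scripted. The following Python sketch compares a dotted revision string against a required minimum. It assumes each field is a decimal number, and the `required` value is a placeholder for illustration only: this document does not state the latest firmware release, so substitute the revision from the current firmware release notes.

```python
def parse_revision(revision):
    """Turn a dotted revision such as '4.00.00.00' into a tuple of ints."""
    return tuple(int(field) for field in revision.split("."))


def is_older(installed, required):
    """True if the installed revision sorts before the required one."""
    return parse_revision(installed) < parse_revision(required)


installed = "4.00.00.00"   # PRODUCT REVISION from the listing above
required = "99.00.00.00"   # placeholder, NOT a real firmware release

print(is_older(installed, required))
```

Comparing tuples of integers avoids the pitfalls of plain string comparison, where "10.00.00.00" would sort before "4.00.00.00".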

 

Changes

 

Cause
