FMA and ASR incorrectly reporting L1/L0 CPU cache or internal CPU faults on Exadata CELLs during patching (Doc ID 1582782.1)

Last updated on OCTOBER 17, 2015

Applies to:

Oracle Exalogic Elastic Cloud X2-2 One-Eighth Rack - Version X2 and later
Oracle Exalogic Elastic Cloud X2-2 Hardware - Version X2 and later
Exadata Database Machine X2-2 Qtr Rack - Version All Versions and later
Exadata Database Machine X2-2 Full Rack - Version All Versions and later
Exadata Database Machine X2-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Symptoms

When applying Exadata patch updates on CELLs, an L0/L1 CPU cache fault or an internal CPU fault is reported on either CPU0 or CPU1 during the first reboot of the host after the patch is staged. Note: X4800 and X2-8 DB nodes are not affected by this issue; only the X2 CELLs are. You might see any of the following FMA faults, or others:

SPX86-8000-P5

SPX86-8000-LH

SPX86-8000-QS

SPX86-8000-F4

 

The following FMA error is an example of a fault that may be seen on either CPU0 or CPU1:

 

fma/@usr@local@bin@fmadm_faulty.out

------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2013-03-26/19:08:08 fd8b3d5d-ce70-e4a3-83cf-8e97b84c4ba2 SPX86-8000-P5  Critical

Fault class : fault.cpu.intel.l1cache

FRU         : /SYS/MB/P1
             (Part Number: 060C)
             (Serial Number: unknown)

Description : A level 1 cache fault on a processor has occurred.

Response    : If running Solaris, the processor will be off-lined upon
             system reboot to prevent future interruptions.

Impact      : System will panic and reset and system performance may be
             impacted.

Action      : The administrator should review the ILOM event log for
             additional information pertaining to this diagnosis.  Please
             refer to the Details section of the Knowledge Article for
             additional information.
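When several cells are patched in one window, it can help to scan saved fault listings for the message IDs named in this note. The following is a minimal sketch, assuming the listing has been saved as text in the same layout as the example above. The function names (parse_fmadm_faulty, matches_note) are hypothetical helpers and not part of any Oracle tool, and a match only means the message ID is one of those listed here; it does not by itself confirm a false positive.

```python
import re

# Message IDs listed in this note; the note says others may also occur.
KNOWN_MSGIDS = {"SPX86-8000-P5", "SPX86-8000-LH", "SPX86-8000-QS", "SPX86-8000-F4"}

# Matches a summary row such as:
# 2013-03-26/19:08:08 fd8b3d5d-ce70-e4a3-83cf-8e97b84c4ba2 SPX86-8000-P5  Critical
ROW = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msgid>\S+)\s+(?P<severity>\S+)$"
)

# Matches the "Fault class : ..." and "FRU : ..." detail lines.
FIELD = re.compile(r"^(Fault class|FRU)\s*:\s*(\S+)")


def parse_fmadm_faulty(text):
    """Return one dict per fault summary row found in the saved listing."""
    faults = []
    for raw in text.splitlines():
        line = raw.strip()
        row = ROW.match(line)
        if row:
            faults.append(row.groupdict())
            continue
        field = FIELD.match(line)
        if field and faults:
            # Attach detail lines to the most recent summary row.
            key = "fault_class" if field.group(1) == "Fault class" else "fru"
            faults[-1].setdefault(key, field.group(2))
    return faults


def matches_note(fault):
    """True if the fault's message ID is one of those listed in this note."""
    return fault["msgid"] in KNOWN_MSGIDS
```

A typical use would be to read the saved listing into a string, call parse_fmadm_faulty on it, and review the entries for which matches_note returns True alongside the patching timeline.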

ilom/@usr@local@bin@spshexec_show_-script_@X@logs@event@list.out

2302   Tue Mar 26 19:08:08 2013  Fault     Fault     critical
      Fault detected at time = Tue Mar 26 20:08:08 2013. The suspect component:
       /SYS/MB/P1 has fault.cpu.intel.l1cache with probability=100. Refer to
      http://www.sun.com/msg/SPX86-8000-P5 for details.

 

The following ASR errors may be reported on CPU0 or CPU1:

Hostname: xda001cel01-ilom
Product Type: SUN FIRE X4270 M2 SERVER
Summary:ASR:Level 1 Cache Fault on Processor

Note: This is a component of an Exadata Device.
Note: This Service Request is for an Exadata product [http://www.oracle.com/exadata].
Refer to - http://www.sun.com/msg/SPX86-8000-P5
sunHwTrapSystemIdentifier = Exadata Database Machine X2-2 AK00019313
sunHwTrapChassisId = 1119FMM06A
sunHwTrapProductName = SUN FIRE X4270 M2 SERVER
sunHwTrapSuspectComponentName = /SYS/MB/P0
sunHwTrapFaultClass = fault.cpu.intel.l1cache
sunHwTrapFaultCertainty = 100
sunHwTrapFaultMessageID = http://www.sun.com/msg/SPX86-8000-P5
sunHwTrapFaultUUID = 363d10d7-e0ac-eafd-9596-be78fc105f78
sunHwTrapAssocObjectId = .1.3.6.1.2.1.47.1.1.1.1.2.9
sunHwTrapAdditionalInfo =

Alerts received in last 30 days (limit 10)
Date Summary SR           

ASR:Internal Processor Fault
Hostname: serverx4270m2
Product Type: SUN FIRE X4270 M2 SERVER

SunHwTrapFaultDiagnosed
Note: This is a component of an Exadata Device.
Note: This Service Request is for an Exadata product [http://www.oracle.com/exadata].
Details: Refer to https://support.oracle.com/msg/SPX86-8000-F4

Event Time = Thu Oct 31 00:58:31 2013
Fault Message ID = SPX86-8000-F4
Fault UUID = b754e4e5-ff44-49cc-f8b1-8e1cede3216a
Knowledge Article URL = https://support.oracle.com/msg/SPX86-8000-F4
Fault Description =
Fault Severity = 0
Product Manufacturer = Oracle Corporation
Product Name = SUN FIRE X4270 M2 SERVER
Product Serial Number = 1234FMM0RX
Product Part Number = 602-4982-02
Chassis Manufacturer = Oracle Corporation
Chassis Name = SUN FIRE X4270 M2 SERVER
Chassis Serial Number = 1234FMM0RX
Chassis Part Number = 602-4982-02
DiagEntity = fdd(1)
SystemIdentifier = Exadata Database Machine X2-2 AK00000465

SuspectCount = 1
Event Suspect 1 Information
SuspectFruFaultCertainty = 100
SuspectFruFaultClass = fault.cpu.intel.internal
SuspectFruName = Intel(R) Xeon(R) CPU L5640 @ 2.27GHz
SuspectFruLocation = /SYS/MB/P0
SuspectFruChassisId = 1234FMM0RX
SuspectFruManufacturer =
SuspectFruPn = 060C
SuspectFruSn =
SuspectFruRevision =
SuspectFruStatus = faulted(3)


Alerts received in last 30 days (limit 10)
Date Summary SR
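The two ASR examples above use different field names for the same information (sunHwTrapFaultClass and sunHwTrapSuspectComponentName in the first, SuspectFruFaultClass and SuspectFruLocation in the second). The sketch below, assuming the ASR text has been saved with one "name = value" pair per line as shown, pulls out the fault class and suspect location from either layout. The helper names are hypothetical and for illustration only; they are not part of ASR.

```python
# Fault classes that appear in the examples in this note.
CPU_FAULT_CLASSES = {"fault.cpu.intel.l1cache", "fault.cpu.intel.internal"}

# Field names as they appear in the two ASR examples in this note.
CLASS_KEYS = ("sunHwTrapFaultClass", "SuspectFruFaultClass")
LOCATION_KEYS = ("sunHwTrapSuspectComponentName", "SuspectFruLocation")


def parse_asr_fields(text):
    """Collect 'name = value' lines into a dict; other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def cpu_fault_summary(fields):
    """Return the fault class, suspect location, and whether the class is one shown in this note."""
    fault_class = next((fields[k] for k in CLASS_KEYS if fields.get(k)), None)
    location = next((fields[k] for k in LOCATION_KEYS if fields.get(k)), None)
    return {
        "fault_class": fault_class,
        "location": location,
        "class_listed_in_note": fault_class in CPU_FAULT_CLASSES,
    }
```

Calling cpu_fault_summary(parse_asr_fields(text)) on a saved ASR message gives a small dict that can be compared against the time of the first post-patch reboot.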


Changes

The fault is reported during the initial reboot after applying Exadata patch updates.

This event is believed to be a false positive; the evidence suggests that there is nothing wrong with the CPU.

Cause
