FMA and ASR incorrectly reporting L1/L0 CPU cache or internal CPU faults on Exadata CELLs during patching
(Doc ID 1582782.1)
Last updated on FEBRUARY 01, 2024
Applies to:
Oracle Exalogic Elastic Cloud X2-2 One-Eighth Rack - Version X2 and later
Oracle Exalogic Elastic Cloud X2-2 Hardware - Version X2 and later
Exadata Database Machine X2-2 Qtr Rack - Version All Versions and later
Exadata Database Machine X2-2 Full Rack - Version All Versions and later
Exadata Database Machine X2-2 Hardware - Version All Versions and later
Information in this document applies to any platform.
Symptoms
When applying Exadata patch updates on CELLs, an L0/L1 CPU cache fault or an internal CPU fault occurs on either CPU0 or CPU1 during the first reboot of the host after the patch is staged.
Note: X4800 and X2-8 DB nodes are not affected by this issue; only the X2 CELLs are.
You might see any of the following FMA faults, or others:
SPX86-8000-P5
SPX86-8000-LH
SPX86-8000-QS
SPX86-8000-F4
The following FMA error is an example of a fault that may be seen on either CPU0 or CPU1:
fma/@usr@local@bin@fmadm_faulty.out
------------------- ------------------------------------ -------------- --------
Time UUID msgid Severity
------------------- ------------------------------------ -------------- --------
2013-03-26/19:08:08 fd8b3d5d-ce70-e4a3-83cf-8e97b84c4ba2 SPX86-8000-P5 Critical
Fault class : fault.cpu.intel.l1cache
FRU : /SYS/MB/P1
(Part Number: 060C)
(Serial Number: unknown)
Description : A level 1 cache fault on a processor has occurred.
Response : If running Solaris, the processor will be off-lined upon
system reboot to prevent future interruptions.
Impact : System will panic and reset and system performance may be
impacted.
Action : The administrator should review the ILOM event log for
additional information pertaining to this diagnosis. Please
refer to the Details section of the Knowledge Article for
additional information.
ilom/@usr@local@bin@spshexec_show_-script_@X@logs@event@list.out
2302 Tue Mar 26 19:08:08 2013 Fault Fault critical
Fault detected at time = Tue Mar 26 20:08:08 2013. The suspect component:
/SYS/MB/P1 has fault.cpu.intel.l1cache with probability=100. Refer to ht
tp://www.sun.com/msg/SPX86-8000-P5 for details.
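When many cells are patched at once, it can help to check captured `fmadm faulty` output for the message IDs listed above. The following is a minimal sketch, not an Oracle tool: it assumes the output has been saved as text in the layout shown in the example, and the names `KNOWN_MSGIDS` and `parse_fmadm_faulty` are hypothetical. Because other message IDs may also appear, an ID missing from the set does not rule out this issue.

```python
import re

# Message IDs listed in the Symptoms section. The note says others may
# also be seen, so this set is illustrative, not exhaustive.
KNOWN_MSGIDS = {"SPX86-8000-P5", "SPX86-8000-LH", "SPX86-8000-QS", "SPX86-8000-F4"}

# Summary row: time, UUID, msgid, severity (layout assumed from the example).
ROW = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f-]{36})\s+"
    r"(?P<msgid>\S+)\s+"
    r"(?P<severity>\S+)\s*$"
)
# Detail lines we care about for this issue.
FIELD = re.compile(r"^(?P<key>Fault class|FRU)\s*:\s*(?P<value>.+?)\s*$")


def parse_fmadm_faulty(text):
    """Return one dict per fault found in saved `fmadm faulty` output."""
    faults = []
    for raw in text.splitlines():
        line = raw.strip()
        row = ROW.match(line)
        if row:
            fault = row.groupdict()
            fault["listed"] = fault["msgid"] in KNOWN_MSGIDS
            faults.append(fault)
            continue
        field = FIELD.match(line)
        if field and faults:
            key = "fault_class" if field.group("key") == "Fault class" else "fru"
            faults[-1][key] = field.group("value")
    return faults


# Sample taken from the example output in this note.
SAMPLE = """\
Time UUID msgid Severity
2013-03-26/19:08:08 fd8b3d5d-ce70-e4a3-83cf-8e97b84c4ba2 SPX86-8000-P5 Critical
Fault class : fault.cpu.intel.l1cache
FRU : /SYS/MB/P1
"""

if __name__ == "__main__":
    for fault in parse_fmadm_faulty(SAMPLE):
        print(fault["msgid"], fault.get("fault_class"), fault.get("fru"), fault["listed"])
```

A fault flagged as `listed` only matches the symptom pattern; whether it is the false positive described in this note still depends on it appearing during the first reboot after patch staging.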
The following ASR errors may be reported on CPU0 or CPU1:
Hostname: xxxxxxxxx-ilom
Product Type: SUN FIRE X4270 M2 SERVER
Summary:ASR:Level 1 Cache Fault on Processor
Note: This is a component of an Exadata Device.
Note: This Service Request is for an Exadata product [http://www.oracle.com/exadata].
Refer to - http://www.sun.com/msg/SPX86-8000-P5
sunHwTrapSystemIdentifier = Exadata Database Machine X2-2 xxxxxxxxxx
sunHwTrapChassisId = xxxxxxxxxx
sunHwTrapProductName = SUN FIRE X4270 M2 SERVER
sunHwTrapSuspectComponentName = /SYS/MB/P0
sunHwTrapFaultClass = fault.cpu.intel.l1cache
sunHwTrapFaultCertainty = 100
sunHwTrapFaultMessageID = http://www.sun.com/msg/SPX86-8000-P5
sunHwTrapFaultUUID = 363d10d7-e0ac-eafd-9596-be78fc105f78
sunHwTrapAssocObjectId = .1.3.6.1.2.1.47.1.1.1.1.2.9
sunHwTrapAdditionalInfo =
Alerts received in last 30 days (limit 10)
Date Summary SR
ASR:Internal Processor Fault
Hostname: xxxxxxxxxx
Product Type: SUN FIRE X4270 M2 SERVER
SunHwTrapFaultDiagnosed
Note: This is a component of an Exadata Device.
Note: This Service Request is for an Exadata product [http://www.oracle.com/exadata].
Details: Refer to https://support.oracle.com/msg/SPX86-8000-F4
Event Time = Thu Oct 31 00:58:31 2013
Fault Message ID = SPX86-8000-F4
Fault UUID = b754e4e5-ff44-49cc-f8b1-8e1cede3216a
Knowledge Article URL = https://support.oracle.com/msg/SPX86-8000-F4
Fault Description =
Fault Severity = 0
Product Manufacturer = Oracle Corporation
Product Name = SUN FIRE X4270 M2 SERVER
Product Serial Number = xxxxxxxxxx
Product Part Number = 602-4982-02
Chassis Manufacturer = Oracle Corporation
Chassis Name = SUN FIRE X4270 M2 SERVER
Chassis Serial Number = xxxxxxxxxx
Chassis Part Number = 602-4982-02
DiagEntity = fdd(1)
SystemIdentifier = Exadata Database Machine X2-2 xxxxxxxxxx
SuspectCount = 1
Event Suspect 1 Information
SuspectFruFaultCertainty = 100
SuspectFruFaultClass = fault.cpu.intel.internal
SuspectFruName = Intel(R) Xeon(R) CPU L5640 @ 2.27GHz
SuspectFruLocation = /SYS/MB/P0
SuspectFruChassisId = xxxxxxxxxx
SuspectFruManufacturer =
SuspectFruPn = 060C
SuspectFruSn =
SuspectFruRevision =
SuspectFruStatus = faulted(3)
Alerts received in last 30 days (limit 10)
Date Summary SR
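The two ASR samples above use different field names for the same information (`sunHwTrapFaultClass` and `sunHwTrapSuspectComponentName` in the first, `SuspectFruFaultClass` and `SuspectFruLocation` in the second). The sketch below assumes the alert body is available as plain `name = value` lines and shows one way to pull out the suspect CPU and fault class from either form. The function names are hypothetical, and only the two fault classes that appear in this note are checked.

```python
# Fault classes that appear in the examples in this note.
CPU_FAULT_CLASSES = {"fault.cpu.intel.l1cache", "fault.cpu.intel.internal"}


def parse_asr_fields(text):
    """Collect `name = value` lines from an ASR alert body into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no '=' at all
            fields[key.strip()] = value.strip()
    return fields


def cpu_fault_summary(fields):
    """Return (location, fault_class) for a CPU fault, else None."""
    fault_class = fields.get("SuspectFruFaultClass") or fields.get("sunHwTrapFaultClass")
    location = fields.get("SuspectFruLocation") or fields.get("sunHwTrapSuspectComponentName")
    if fault_class in CPU_FAULT_CLASSES:
        return (location, fault_class)
    return None


# Fragments taken from the two ASR examples in this note.
TRAP_SAMPLE = """\
sunHwTrapSuspectComponentName = /SYS/MB/P0
sunHwTrapFaultClass = fault.cpu.intel.l1cache
sunHwTrapFaultCertainty = 100
"""

EVENT_SAMPLE = """\
SuspectFruFaultCertainty = 100
SuspectFruFaultClass = fault.cpu.intel.internal
SuspectFruLocation = /SYS/MB/P0
SuspectFruManufacturer =
"""

if __name__ == "__main__":
    for sample in (TRAP_SAMPLE, EVENT_SAMPLE):
        print(cpu_fault_summary(parse_asr_fields(sample)))
```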
Changes
Initial reboot after applying Exadata patch updates.
This event is believed to be a false positive; the evidence suggests there is nothing wrong with the CPU.
Cause