X4-2/X4-2L Host or ILOM Reboot With SPX86-8000-JY
(Doc ID 2744222.1)
Last updated on JUNE 03, 2022
Applies to:
Sun Server X4-2 - Version All Versions to All Versions [Release All Releases]
Sun Server X4-2L - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Symptoms
Either the host crashes or the ILOM reboots, and the ILOM reports fault SPX86-8000-JY:
------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2021-01-14/14:25:39 <UUID>                               SPX86-8000-JY  Critical

Problem Status    : open
Diag Engine       : fdd 1.0
System
   Manufacturer   : Oracle Corporation
   Name           : SUN SERVER X4-2
   Part_Number    : 32403799+1+1
   Serial_Number  : <serial number>

System Component
   Firmware_Manufacturer : Oracle Corporation
   Firmware_Version      : (ILOM)4.0.4.41
   Firmware_Release      : (ILOM)2019.04.29

----------------------------------------
Suspect 1 of 1
   Problem class  : fault.cpu.intel.l0dcache
   Certainty      : 100%
   Affects        : /SYS/MB/P1
   Status         : faulted

   FRU
      Status            : faulty
      Location          : /SYS/MB/P1
      Name              : Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
      Part_Number       : 060E
      Chassis
         Manufacturer   : Oracle Corporation
         Name           : SUN SERVER X4-2
         Part_Number    : 32403799+1+1
         Serial_Number  : <serial number>

Description : A level 0 data cache fault on a processor has occurred.

Response    : If running Solaris, the processor will be off-lined upon system reboot to prevent future interruptions.

Impact      : System will panic and reset and system performance may be impacted.

Action      : Please refer to the associated reference document at http://support.oracle.com/msg/SPX86-8000-JY for the latest service procedures and policies regarding this diagnosis.
Ereports should show the fault:
2021-01-14/14:25:39 ereport.cpu.intel.l0dcache_uc@/SYS/MB/P1
   bank name       = DCU
   bank number     = 0x1
   core number     = 0x2
   IA32_MCi_STATUS = 0xbb80000000000174
   IA32_MCi_MISC   = 0x86
   IA32_MCi_ADDR   = 0x41003d1a40
   IA32_MCi_CTL    = 0x0
   IA32_MCi_CTL2   = 0x0

2021-01-14/14:26:00 ereport.chassis.boot.power-off-requested@/SYS   <---- System may crash or the ILOM may reboot right after
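The IA32_MCi_STATUS value in the ereport can be split into its individual flags by hand. The following Python sketch is an illustration only: the bit positions follow the architectural machine-check register layout in the Intel SDM, and the names STATUS_FLAGS and decode_mci_status are hypothetical and not part of ILOM or any Oracle tool. The S and AR bits are meaningful only on processors that support software error recovery.

```python
# Assumed bit positions of the flags in IA32_MCi_STATUS (Intel SDM layout).
STATUS_FLAGS = {
    "VAL": 63, "OVER": 62, "UC": 61, "EN": 60, "MISCV": 59,
    "ADDRV": 58, "PCC": 57, "S": 56, "AR": 55,
}


def decode_mci_status(status):
    """Return the single-bit flags plus the MSCOD and MCACOD error codes."""
    fields = {name: (status >> bit) & 1 for name, bit in STATUS_FLAGS.items()}
    fields["MSCOD"] = (status >> 16) & 0xFFFF   # model-specific error code
    fields["MCACOD"] = status & 0xFFFF          # architectural error code
    return fields


# Value taken from the ereport above.
fields = decode_mci_status(0xBB80000000000174)
for name, value in fields.items():
    print(f"{name:<6} = {value:#x}")

# A valid entry with UC set is an uncorrectable error.
uncorrectable = bool(fields["VAL"] and fields["UC"])
print("uncorrectable:", uncorrectable)
```

For this ereport the sketch reports UC, EN and PCC set with MCACOD 0x174, which agrees with the uncorrectable classification in the fault.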
The debug.host.err file in the ILOM snapshot shows what kind of error occurred (uncorrectable in this case):
## Boot time: MSR core scrape
## Thu Jan 14 14:25:39 2021 ID 0186

******** Data Cache Unit(DCU) Errors ********
CPU(Valid) Core Bank: IA32_MC_STATUS, IA32_MC_MISC IA32_MC_ADDR, IA32_MC_CTL, IA32_MC_CTL2
1( V), 2, DCU: 0xbb80000000000174,0x0000000000000086,0x00000041003d1a40,0x0000000000000000,0x0000000000000000

DCU:IA32_MC_STATUS = 0xbb80000000000174
   Field Name   Field Value
   VAL        | 0x1
   OVER       | 0x0
   UC         | 0x1
   EN         | 0x1
   MISCV      | 0x1
DCU:IA32_MC_MISC = 0x0000000000000086
   ADDRV      | 0x0
   PCC        | 0x1
   S          | 0x1
   AR         | 0x1
   THRESHOLD  | 0x0 Threshold-based error tracking is not provided
   CE_COUNT   | 0x0
   OTHER_INFO | 0x0
   MSCOD      | 0x0 Non-APIC Error
   MCACOD     | 0x174
   Error Type =~: Uncorrectable-> UC = 1, EN = 1, PCC = 1 (First Error)
   Generic MCACOD Error: Cache Hierarchy Error
   Sent: ereport.cpu.intel.l0dcache_uc@/SYS/MB/P1

DCU:IA32_MC_MISC = 0x0000000000000086
   Field Name            Field Value
   Model_specific_info | 0x0
   Address_mode        | 0x2 Physical Address
   Recoverable_address | 0x6

DCU:IA32_MC_ADDR = 0x00000041003d1a40
   Field Name   Field Value
   Reserved   | 0x0
   HiPhyAddr  | 0x41
   LoPhyAddr  | 0x3d1a40

DCU:IA32_MC_CTL = 0x0000000000000000
   Field Name   Field Value
   reserved0  | 0x0
   EN_DTLB    | 0x0
   EN         | 0x0

DCU:IA32_MC_CTL2 = 0x0000000000000000
   Field Name   Field Value
   reserved1  | 0x0
   CMCI_EN    | 0x0
   reserved0  | 0x0
   CE_COUNT   | 0x0
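The MCACOD and IA32_MC_MISC values in this output can be broken down further to show why the error is classed as a level 0 data cache fault. The Python sketch below assumes the compound cache hierarchy error encoding (000F 0001 RRRR TTLL) and the architectural IA32_MCi_MISC address fields as described in the Intel SDM. The function names decode_cache_mcacod and decode_mci_misc are hypothetical and do not correspond to any ILOM utility.

```python
# Assumed Intel SDM encodings for the cache hierarchy compound error code.
REQUESTS = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
            5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TYPES = {0: "instruction", 1: "data", 2: "generic"}
LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}


def decode_cache_mcacod(mcacod):
    """Decode a cache hierarchy MCACOD, or return None for other error classes."""
    # Bits 15:13 must be 0 and bits 11:8 must be 0001; bit 12 (F) is ignored.
    if (mcacod & 0xEF00) != 0x0100:
        return None
    return {
        "request": REQUESTS.get((mcacod >> 4) & 0xF, "unknown"),
        "type": TYPES.get((mcacod >> 2) & 0x3, "unknown"),
        "level": LEVELS[mcacod & 0x3],
    }


def decode_mci_misc(misc):
    """Split the low bits of IA32_MCi_MISC into the two address fields."""
    return {
        "recoverable_address_lsb": misc & 0x3F,  # bits 5:0
        "address_mode": (misc >> 6) & 0x7,       # bits 8:6, 2 = physical address
    }


# Values taken from the debug.host.err output above.
cache_error = decode_cache_mcacod(0x174)
misc_fields = decode_mci_misc(0x86)
print(cache_error)
print(misc_fields)
```

With the values from this case, the sketch decodes MCACOD 0x174 as a data-type error at cache level L0, which corresponds to the l0dcache problem class, and decodes MISC 0x86 as address mode 0x2 with a recoverable address value of 0x6, matching the fields listed in debug.host.err.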
Changes
Cause