
X4-2/X4-2L Host or ILOM Reboot With SPX86-8000-JY (Doc ID 2744222.1)

Last updated on JANUARY 14, 2021

Applies to:

Sun Server X4-2 - Version All Versions to All Versions [Release All Releases]
Sun Server X4-2L - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

The host crashes or the ILOM reboots, and the ILOM reports fault SPX86-8000-JY:

------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2021-01-14/14:25:39 <UUID>                               SPX86-8000-JY  Critical

Problem Status           : open
Diag Engine              : fdd 1.0
System                  
   Manufacturer          : Oracle Corporation
   Name                  : SUN SERVER X4-2
   Part_Number           : 32403799+1+1
   Serial_Number         : <serial number>

System Component        
   Firmware_Manufacturer : Oracle Corporation
   Firmware_Version      : (ILOM)4.0.4.41
   Firmware_Release      : (ILOM)2019.04.29

----------------------------------------
Suspect 1 of 1
   Problem class  : fault.cpu.intel.l0dcache
   Certainty      : 100%
   Affects        : /SYS/MB/P1
   Status         : faulted

   FRU                 
      Status            : faulty
      Location          : /SYS/MB/P1
      Name              : Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
      Part_Number       : 060E
      Chassis          
         Manufacturer   : Oracle Corporation
         Name           : SUN SERVER X4-2
         Part_Number    : 32403799+1+1
         Serial_Number  : <serial number>

Description : A level 0 data cache fault on a processor has occurred.

Response    : If running Solaris, the processor will be off-lined upon
              system reboot to prevent future interruptions.

Impact      : System will panic and reset and system performance may be
              impacted.

Action      : Please refer to the associated reference document at
              http://support.oracle.com/msg/SPX86-8000-JY for the latest
              service procedures and policies regarding this diagnosis.

Ereports should show the fault:

2021-01-14/14:25:39  ereport.cpu.intel.l0dcache_uc@/SYS/MB/P1
                         bank name       = DCU
                         bank number     = 0x1
                         core number     = 0x2
                         IA32_MCi_STATUS = 0xbb80000000000174
                         IA32_MCi_MISC   = 0x86
                         IA32_MCi_ADDR   = 0x41003d1a40
                         IA32_MCi_CTL    = 0x0
                         IA32_MCi_CTL2   = 0x0

2021-01-14/14:26:00  ereport.chassis.boot.power-off-requested@/SYS  <---- System may crash or the ILOM may reboot right after
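
The raw IA32_MCi_STATUS value in the ereport can be decoded by hand to confirm whether the error was uncorrectable. The following is a minimal Python sketch, not an Oracle tool: the function names are hypothetical, and the bit positions are assumed to follow the architectural machine-check status layout (the S and AR bits are only meaningful on processors that support software error recovery).

```python
# Hypothetical helper: decode the flag bits of an IA32_MCi_STATUS value.
# Bit positions are assumed from the architectural machine-check layout.
STATUS_FLAGS = {
    "VAL": 63,    # register contains valid error information
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # IA32_MCi_MISC is valid
    "ADDRV": 58,  # IA32_MCi_ADDR is valid
    "PCC": 57,    # processor context corrupt
    "S": 56,      # signaled as a machine-check exception
    "AR": 55,     # software recovery action required
}


def decode_status(status: int) -> dict:
    """Return the flag bits plus the MSCOD and MCACOD error codes."""
    fields = {name: (status >> bit) & 1 for name, bit in STATUS_FLAGS.items()}
    fields["MSCOD"] = (status >> 16) & 0xFFFF   # model-specific error code
    fields["MCACOD"] = status & 0xFFFF          # architectural error code
    return fields


def is_uncorrectable(fields: dict) -> bool:
    """An error is uncorrectable when the entry is valid and UC is set."""
    return fields["VAL"] == 1 and fields["UC"] == 1


if __name__ == "__main__":
    decoded = decode_status(0xBB80000000000174)  # value from the ereport above
    for name, value in decoded.items():
        print(f"{name:8s} {value:#x}")
    print("uncorrectable:", is_uncorrectable(decoded))
```

For the value 0xbb80000000000174 this reports UC = 1, EN = 1 and PCC = 1, with MCACOD = 0x174.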

The debug.host.err file in the snapshot shows the type of error (uncorrectable in this case):

## Boot time: MSR core scrape ##

Thu Jan 14 14:25:39 2021 ID 0186  ******** Data Cache Unit(DCU) Errors ********
CPU(Valid) Core       Bank:    IA32_MC_STATUS,      IA32_MC_MISC      IA32_MC_ADDR,       IA32_MC_CTL,      IA32_MC_CTL2
 1(  V),    2,         DCU: 0xbb80000000000174,0x0000000000000086,0x00000041003d1a40,0x0000000000000000,0x0000000000000000
 DCU:IA32_MC_STATUS = 0xbb80000000000174
    Field Name               Field Value       
        VAL                  | 0x1
        OVER                 | 0x0
        UC                   | 0x1
        EN                   | 0x1
        MISCV                | 0x1
               DCU:IA32_MC_MISC = 0x0000000000000086
        ADDRV                | 0x0
        PCC                  | 0x1
        S                    | 0x1
        AR                   | 0x1
        THRESHOLD            | 0x0
                            Threshold-based error tracking is not provided
        CE_COUNT             | 0x0
        OTHER_INFO           | 0x0
        MSCOD                | 0x0
                            Non-APIC Error
        MCACOD               | 0x174
Error Type =~: Uncorrectable-> UC = 1, EN = 1, PCC = 1 (First Error)

Generic MCACOD Error: Cache Hierarchy Error

Sent: ereport.cpu.intel.l0dcache_uc@/SYS/MB/P1

 DCU:IA32_MC_MISC = 0x0000000000000086
    Field Name               Field Value       
        Model_specific_info  | 0x0
        Address_mode         | 0x2
                            Physical Address    
        Recoverable_address  | 0x6
 DCU:IA32_MC_ADDR = 0x00000041003d1a40
    Field Name               Field Value       
        Reserved             | 0x0
        HiPhyAddr            | 0x41
        LoPhyAddr            | 0x3d1a40
 DCU:IA32_MC_CTL = 0x0000000000000000
    Field Name               Field Value       
        reserved0            | 0x0
        EN_DTLB              | 0x0
        EN                   | 0x0
 DCU:IA32_MC_CTL2 = 0x0000000000000000
    Field Name               Field Value       
        reserved1            | 0x0
        CMCI_EN              | 0x0
        reserved0            | 0x0
        CE_COUNT             | 0x0
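
The remaining decoded fields can be cross-checked in the same way. The sketch below is illustrative only and assumes the architectural compound error code format for cache hierarchy errors (000F 0001 RRRR TTLL) and the architectural IA32_MCi_MISC layout (bits 5:0 recoverable address LSB, bits 8:6 address mode). The function names are hypothetical, and the high/low address split simply mirrors the HiPhyAddr and LoPhyAddr fields printed in the log.

```python
# Hypothetical helpers: decode MCACOD, IA32_MCi_MISC and IA32_MCi_ADDR.
# Encodings are assumed from the architectural machine-check definitions.
REQUEST = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TRANSACTION = {0: "instruction", 1: "data", 2: "generic"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}
ADDRESS_MODE = {0: "segment offset", 1: "linear address",
                2: "physical address", 3: "memory address", 7: "generic"}


def decode_cache_mcacod(mcacod: int):
    """Decode a cache hierarchy error code, or return None for other types."""
    # Ignore bit 12 (the filter bit) when matching the 0000 0001 prefix.
    if (mcacod & 0xEF00) != 0x0100:
        return None
    return {
        "request": REQUEST.get((mcacod >> 4) & 0xF, "unknown"),
        "transaction": TRANSACTION.get((mcacod >> 2) & 0x3, "unknown"),
        "level": LEVEL[mcacod & 0x3],
    }


def decode_misc(misc: int) -> dict:
    """Split IA32_MCi_MISC into recoverable address LSB and address mode."""
    return {
        "recoverable_lsb": misc & 0x3F,
        "address_mode": ADDRESS_MODE.get((misc >> 6) & 0x7, "reserved"),
    }


def split_addr(addr: int) -> tuple:
    """Return (high 32 bits, low 32 bits) of the reported address."""
    return (addr >> 32) & 0xFFFFFFFF, addr & 0xFFFFFFFF


if __name__ == "__main__":
    print(decode_cache_mcacod(0x174))
    print(decode_misc(0x86))
    print([hex(part) for part in split_addr(0x41003D1A40)])
```

With the values from this log, MCACOD 0x174 decodes to an eviction of data at cache level L0, which is consistent with the fault.cpu.intel.l0dcache problem class, and IA32_MC_MISC 0x86 decodes to a physical address with a recoverable address LSB of 6.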

 

Changes

 

Cause


