Understanding and Decoding Machine Check Errors on Opteron systems running Solaris[TM] Operating System for x86 Platforms

(Doc ID 1012598.1)

Last updated on NOVEMBER 16, 2016

Applies to:

Sun Java Workstation W1100z - Version Not Applicable to Not Applicable [Release N/A]
Sun Fire X2100 M2 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Ultra 20 M2 Workstation - Version Not Applicable to Not Applicable [Release N/A]
Sun Fire X4200 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Fire X2200 M2 Server - Version Not Applicable to Not Applicable [Release N/A]
All Platforms

Goal

The machine check mechanism on the AMD64 processor allows it to detect and report hardware errors.

When an unrecoverable machine check error is detected, a Machine Check Exception (MCE) is generated.

When this condition occurs, the Solaris[TM] Operating System for x86 Platforms can panic with a trap of type 0x12 (Machine check exception). When the Solaris[TM] Operating System (OS) receives a machine check exception, it prints MCE warning messages just before the panic.

Examples of the warning:

WARNING: MCE: Bank 0: error code 15:addr = cf53c000, model errcode = 0
WARNING: MCE: Bank 2: error code 152:addr = e748, model errcode = 2
WARNING: MCE: Bank 4: error code 0xf0f, mserrcode = 0x7
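
As a first step toward reading such messages, the fields can be pulled out of each line and the error code sorted into a broad class. The following is a minimal sketch, not an official tool. It assumes that the error code, address, and model-specific code are printed in hexadecimal (with or without a 0x prefix), and that the error code follows the general x86 machine check compound error code layouts (TLB, memory hierarchy, bus/interconnect). The names parse_mce_line and classify_error_code are hypothetical helpers introduced for illustration only.

```python
import re

# Matches both message layouts shown in the examples above.
# Assumption: all numeric fields except the bank number are hexadecimal.
MCE_LINE = re.compile(
    r"MCE: Bank (?P<bank>\d+): error code (?:0x)?(?P<code>[0-9a-fA-F]+)"
    r"(?::addr = (?:0x)?(?P<addr>[0-9a-fA-F]+))?"
    r", (?:model errcode|mserrcode) = (?:0x)?(?P<model>[0-9a-fA-F]+)"
)


def parse_mce_line(line):
    """Return the fields of one MCE warning line, or None if it does not match."""
    m = MCE_LINE.search(line)
    if m is None:
        return None
    addr = m.group("addr")
    return {
        "bank": int(m.group("bank")),
        "error_code": int(m.group("code"), 16),
        "addr": int(addr, 16) if addr is not None else None,
        "model_errcode": int(m.group("model"), 16),
    }


def classify_error_code(code):
    """Classify a 16-bit error code by its compound error code layout.

    Assumed layouts (bit patterns, most significant bit first):
      TLB error:    0000 0000 0001 TTLL
      Memory error: 0000 0001 RRRR TTLL
      Bus error:    0000 1PPT RRRR IILL
    """
    if (code & 0xFFF0) == 0x0010:
        return "TLB"
    if (code & 0xFF00) == 0x0100:
        return "memory"
    if (code & 0xF800) == 0x0800:
        return "bus"
    return "other"


if __name__ == "__main__":
    samples = [
        "WARNING: MCE: Bank 0: error code 15:addr = cf53c000, model errcode = 0",
        "WARNING: MCE: Bank 2: error code 152:addr = e748, model errcode = 2",
        "WARNING: MCE: Bank 4: error code 0xf0f, mserrcode = 0x7",
    ]
    for line in samples:
        fields = parse_mce_line(line)
        print(fields, classify_error_code(fields["error_code"]))
```

The classification only names the broad error class. Which hardware unit a bank number refers to, and what the model-specific code means, are processor dependent and are not decoded here.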

The meaning of MCE messages depends on the processor. This document explains how to interpret MCE messages on AMD Opteron-based systems.

Solution
