Sun SPARC[TM] Enterprise M4000/M5000 - MBU_B and MEMB being faulted with SCF-8004-8X, SCF-8000-1D, and SCF-8005-MJ errors.

(Doc ID 1296435.1)

Last updated on SEPTEMBER 07, 2017

Applies to:

Sun SPARC Enterprise M5000 Server - Version Not Applicable and later
Sun SPARC Enterprise M4000 Server - Version Not Applicable and later
Information in this document applies to any platform.

Symptoms

The showlogs monitor output contains the following errors:

Jan 20 14:59:35 XSCF Warning: /UNSPECIFIED:SCF:spurious unit interrupt
Jan 20 15:00:00 XSCF last message repeated 11 times
Jan 20 15:01:53 XSCF Alarm: /MBU_B/MEMB#5:ANALYZE:MAC detected clock fatal failure
Jan 20 15:02:41 XSCF Warning: /MBU_B:SCF:SC test error
Jan 20 15:02:57 XSCF Warning: /MBU_B:SCF:SC test error
Jan 20 15:03:16 XSCF Warning: /MBU_B:SCF:SC test error
Jan 20 15:03:17 XSCF Warning: /MBU_B:SCF:SC test error

FMA will record these same events as:

Jan 20 20:59:48.7163 051fbd14-0465-492b-aa42-6a705e31f05f SCF-8004-8X
Jan 20 21:01:46.3456 10355763-1a55-424c-ba69-bd2f3789e3d4 SCF-8000-1D
Jan 20 21:02:26.2880 12bdb2c7-6b8c-49ff-b37f-0ec60d958d40 SCF-8005-MJ

or
Jan 20 20:59:48.7163 ereport.chassis.SPARC-Enterprise.asic.cpu.power.intr-fail
Jan 20 21:01:46.3456 ereport.chassis.SPARC-Enterprise.if.fe-asic-clk
Jan 20 21:02:26.2880 ereport.chassis.SPARC-Enterprise.asic.sc.test

The error to note is SCF-8000-1D, which is a clock distribution error.
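
As a rough illustration of checking for this signature, the Python sketch below scans text lines for FMA MSG-IDs and reports whether SCF-8000-1D is among them. The function names and the assumption that the input is plain text in the format shown above are hypothetical; this is not an Oracle tool.

```python
import re

# MSG-ID of the clock distribution error called out in this document.
KEY_ID = "SCF-8000-1D"

# Assumed MSG-ID shape, inferred only from the IDs shown in this document.
MSG_ID_RE = re.compile(r"\bSCF-[0-9A-Z]{4}-[0-9A-Z]{2}\b")


def find_msg_ids(lines):
    """Return the set of SCF MSG-IDs found in the given text lines."""
    found = set()
    for line in lines:
        found.update(MSG_ID_RE.findall(line))
    return found


def has_clock_distribution_error(lines):
    """True if SCF-8000-1D appears anywhere in the given lines."""
    return KEY_ID in find_msg_ids(lines)
```

For example, passing the three FMA lines listed above to `has_clock_distribution_error` would return `True`.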

Additional failure scenario:

On rare occasions, the 'clock fatal failure' message is not logged and the failure signature below is seen instead. The same solution still applies.

Showlogs monitor:

Nov  5 20:56:01 lonshspltp10a-m Alarm: /MBU_B/MEMB#0,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:56:11 lonshspltp10a-m Alarm: /MBU_B/MEMB#1,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:56:32 lonshspltp10a-m Alarm: /MBU_B/MEMB#2,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:56:45 lonshspltp10a-m Alarm: /MBU_B/MEMB#3,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:56:50 lonshspltp10a-m Warning: /MBU_B:SCF:Abnormal reaction of LSI (compare)
Nov  5 20:57:08 lonshspltp10a-m Alarm: /MBU_B/MEMB#4,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:57:13 lonshspltp10a-m Warning: /MBU_B:SCF:Abnormal reaction of LSI (compare)
Nov  5 20:57:16 lonshspltp10a-m Warning: /MBU_B:SCF:Abnormal reaction of LSI (compare)

The following FMA MSG-IDs are seen:
   UUID: 2841095a-2956-4f63-b3e4-ec38124de691 MSG-ID: SCF-8004-3Y
   UUID: 68506c5e-c87b-4d36-93bc-a5100fb7faf3 MSG-ID: SCF-8002-K2

Showstatus:
*   MBU_B Status:Faulted;
*       MEMB#0 Status:Faulted;
*       MEMB#1 Status:Faulted;
*       MEMB#2 Status:Faulted;
*       MEMB#3 Status:Faulted;
*       MEMB#4 Status:Faulted;

Losing power from IOU#0 may cause other errors that are not listed here; these should be treated as victims of the power loss.
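
The two monitor log signatures described above can be told apart with a small script. The following Python sketch is a minimal example, assuming the log lines follow the layout of the excerpts in this document (timestamp, host, level, FRU path, source, message). The regular expression, the signature labels, and the function name are hypothetical choices made for illustration.

```python
import re

# Assumed line layout, based only on the excerpts in this document:
#   <Mon> <day> <hh:mm:ss> <host> <Alarm|Warning>: <FRU path>:<source>:<message>
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+"
    r"(Alarm|Warning):\s+(\S+?):(\w+):(.*)$"
)

# Message text that identifies each failure scenario (labels are illustrative).
SIGNATURES = {
    "clock": "MAC detected clock fatal failure",
    "low_voltage": "Critical low voltage error",
}


def classify(lines):
    """Map each signature label to the FRU paths that reported it."""
    hits = {name: [] for name in SIGNATURES}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. "last message repeated 11 times"
        _level, fru, _source, message = m.groups()
        for name, text in SIGNATURES.items():
            if text in message:
                hits[name].append(fru)
    return hits
```

Lines that do not match the assumed layout are skipped rather than treated as errors, so unrelated or victim messages do not affect the result.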

Changes

 

Cause
