Random "BBU Overheated" Alarm for Battery Backup Unit on Sun Storage 2500-M2 and 6180 Arrays (Doc ID 1392313.1)

Last updated on FEBRUARY 24, 2017

Applies to:

Sun Storage 2530-M2 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 2540-M2 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 6180 Array - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Symptoms

Because of the unpublished <Bug 15762521> "SUNBT7123598 Battery Temp value out-of-range on 2500-M2/6180 Battery Backup Unit", a Sun Storage 2500-M2 or 6180 array can randomly report the event "BBU Overheated" (BBU stands for Battery Backup Unit).
To check for this event, collect support data from the array by following the instructions in <Document 1002514.1> for Sun Storage Common Array Manager (CAM) or <Document 1014074.1> for SANtricity. Then open the file majorEventLog.txt and look for an event like the following:

Date/Time: Tue Dec 20 17:48:38 CET 2011
Sequence number: 4064
Event type: 7300
Event category: Notification
Priority: Critical
Description: BBU Overheated
Event specific codes: 0/0/0
Component type: Battery
Component location: Tray.85.Controller.A.Battery.A
Logged by: Controller in slot A
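
The check above can be scripted. The following Python snippet is a minimal sketch, not an Oracle-supplied tool: the function name find_bbu_overheated is hypothetical, and it assumes that every record in majorEventLog.txt starts with a "Date/Time:" line and consists of "Key: value" lines as in the example above.

```python
def find_bbu_overheated(log_text):
    """Return the event records whose Description is "BBU Overheated".

    Assumption: each record begins with a "Date/Time:" line and every
    field is written as "Key: value", as in the sample event.
    """
    events = []
    current = None
    for raw_line in log_text.splitlines():
        # Split on the first colon only, so time values such as
        # "17:48:38" stay intact in the value part.
        key, sep, value = raw_line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        key = key.strip()
        if key == "Date/Time":
            # A new record starts here.
            current = {}
            events.append(current)
        if current is not None:
            current[key] = value.strip()
    return [e for e in events if e.get("Description") == "BBU Overheated"]
```

For example, pass the function the contents of majorEventLog.txt read as text, then print the "Date/Time" and "Component location" fields of each returned record to see when the alarm was raised and for which battery.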


You may also see the following alarm in CAM:

Event Code         : xx.66.1261
Event Type         : ProblemEvent.REC_BATTERY_OVERTEMP
Severity           : 0
Sample Description : Battery X is over temperature.
Probable Cause     :
                     Notes:
                     If it was previously enabled, write caching will be
                     suspended for all volumes while this condition is present.
                    
                     The possible causes for this battery being over it's
                     critical temperature threshold are:
                     A fan has failed
                     An obstruction is blocking the air flow to or from the
                     tray, or
                     The ambient room temperature is too high.
Recommended Action :
                     Check the current alarms for a Fan fault.
                     If one or more Fan alarms exist, follow the recovery steps
                     for that alarm.
                     If there are no Fan alarms, check for obstructions or for
                     any room cooling problems.


To confirm that you are affected by this issue, follow these steps:

  1. Collect support data from the array by following the instructions in <Document 1002514.1> for Sun Storage Common Array Manager (CAM) or <Document 1014074.1> for SANtricity.
  2. Unzip the support data and open the file stateCaptureData.dmp in a text editor.
  3. Determine which battery (A or B) the over-temperature alarm is reported for.
  4. Look for the line "bidShow(255,0,0,0,0,0,0,0,0,0) on controller A" or "bidShow(255,0,0,0,0,0,0,0,0,0) on controller B", depending on which battery is reported in the alarm.
  5. From this line, scroll down until you see the line "BID Log".
  6. In the "BID Log" output, check the "Temp" column for a line that reports a temperature of the form "65xxx". Example:

    Battery Charging Log        Gas Gauge Attributes                                                                     Extended Attributes
    Date       Time       State MfgSts B_Mode Temp A_Volt A_Crnt MaxErr RelSOC RemCap FulCap C_Crnt C_Volt BatSts Op_Sts AbsSOC CellV4 CellV3 CellV2 CellV1 FetSts Safety PF_Sts ChgSts
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    12/20/2011 11:52:49    Idle 0x010A 0x8081   26   6700      0      1     97    627    648      0   7200 0x40C0 0xE441     87      0      0   3348   3353 0x0006 0x0000 0x0000 0x1000
    12/20/2011 15:20:36    Idle 0x010A 0x8081   26   6695      0      1     97    624    648      0   7200 0x40C0 0xE441     86      0      0   3346   3349 0x0006 0x0000 0x0000 0x1000
    12/20/2011 16:48:38    Idle 0x010A 0x8081 65289   6694      0      1     97    623    648      0   7200 0x40C0 0xE441     86      0      0   3346   3348 0x0006 0x0000 0x0000 0x1000
    12/20/2011 16:48:49    Idle 0xC10A 0x8081   26   6694      0      1     97    623    648      0   7200 0x40C0 0xE441     86      0      0   3346   3348 0x0006 0x0000 0x0000 0x1000
    12/20/2011 16:49:00    Idle 0x010A 0x8081   26   6694      0      1     97    623    648      0   7200 0x40C0 0xE441     86      0      0   3346   3348 0x0006 0x0000 0x0000 0x1000
    12/20/2011 17:17:59    Idle 0x010A 0x8081   26   6694      0      1     96    622    648      0   7200 0x40C0 0xE441     86      0      0   3345   3348 0x0006 0x0000 0x0000 0x1000
  7. If you find such a line for the date and time when the alarm was reported, you are affected by the bug covered in this document and should proceed to the "Solution" section. In the example above, the BID Log entry at 16:48:38 lines up with the event logged at 17:48:38 CET, so allow for a possible time zone difference between the two logs. If you do not find such a line, you most likely have a genuine over-temperature issue.
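
Steps 5 to 7 can be sketched in Python as follows. This is an illustrative helper, not part of CAM or SANtricity: the names ROW, SUSPECT_TEMP and find_bogus_temps are hypothetical, the threshold of 65000 is an assumption derived from the "65xxx" pattern described in step 6, and the row layout (date, time, a one-word State, two hexadecimal columns, then Temp) is assumed from the example above.

```python
import re

# A BID Log data row, as in the example: date, time, State, MfgSts (hex),
# B_Mode (hex), then Temp. Assumption: State is a single word.
ROW = re.compile(
    r"^\s*(\d{2}/\d{2}/\d{4})\s+(\d{2}:\d{2}:\d{2})\s+\S+"
    r"\s+0x[0-9A-Fa-f]+\s+0x[0-9A-Fa-f]+\s+(\d+)\s"
)

# Assumption: any Temp at or above this value matches the "65xxx" pattern.
SUSPECT_TEMP = 65000


def find_bogus_temps(dump_text):
    """Return (date, time, temp) for rows whose Temp looks out of range."""
    hits = []
    for line in dump_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, separator, or unrelated line
        temp = int(match.group(3))
        if temp >= SUSPECT_TEMP:
            hits.append((match.group(1), match.group(2), temp))
    return hits
```

Run against the example above, the function reports only the 12/20/2011 16:48:38 row with Temp 65289. It scans all of the text it is given and does not distinguish between controllers, so pass it only the part of stateCaptureData.dmp that follows the bidShow line for the controller named in the alarm.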

 

Cause
