Server running Oracle Linux repeatedly crashes with Critical Temperature Threshold Exceeded error in server management logs
(Doc ID 2442532.1)
Last updated on OCTOBER 27, 2023
Applies to:
Linux OS - Version Oracle Linux 4.4 and later
Linux x86-64
Linux x86
Symptoms
A physical server running Oracle Linux rebooted multiple times, as shown in the reboot history below:
reboot system boot 3.8.13-98.7.1.el Sun Aug 26 08:33 - 08:44 (1+00:10)
reboot system boot 3.8.13-98.7.1.el Wed Aug 22 17:13 - 08:44 (4+15:31)
reboot system boot 3.8.13-98.7.1.el Wed Aug 22 02:31 - 08:44 (5+06:13)
No errors were logged to the system log, /var/log/messages.
However, the server's management module logs show the following hardware errors:
54 Critical Environment 08/26/2018 07:09 08/26/2018 07:08 4 Critical Temperature Threshold Exceeded
53 Critical Environment 08/26/2018 09:06 08/26/2018 09:05 3 Critical Temperature Threshold Exceeded
52 Critical OS 08/26/2018 09:05 08/26/2018 09:05 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
51 Caution Environment 08/26/2018 09:05 08/26/2018 09:05 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 42C)
50 Informational OS 08/26/2018 09:04 08/26/2018 09:04 1 Automatic Operating System Shutdown Due to Overheat Aborted
49 Repaired OS 08/26/2018 09:04 08/26/2018 09:04 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
48 Repaired Environment 08/26/2018 09:04 08/26/2018 09:04 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 42C)
47 Critical Environment 08/26/2018 08:54 08/26/2018 08:54 1 Critical Temperature Threshold Exceeded (Temperature Sensor 1, Location Ambient, Temperature 51C)
46 Critical OS 08/26/2018 06:34 08/26/2018 06:34 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
45 Caution Environment 08/26/2018 06:34 08/26/2018 06:34 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 43C)
44 Repaired OS 08/26/2018 06:34 08/26/2018 06:33 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
43 Repaired Environment 08/26/2018 06:34 08/26/2018 06:33 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 42C)
42 Informational OS 08/26/2018 06:34 08/26/2018 06:33 2 Automatic Operating System Shutdown Due to Overheat Aborted
41 Repaired OS 08/26/2018 06:33 08/26/2018 06:33 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
40 Repaired Environment 08/26/2018 06:33 08/26/2018 06:33 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 42C)
39 Caution POST Message 08/26/2018 06:43 08/26/2018 06:32 2 Option ROM POST Error: 1792-Slot 0 Drive Array - Valid Data Found in Write-Back Cache. Data will automatically be written to drive array. Action: No action required.
38 Critical Environment 08/26/2018 08:28 08/26/2018 08:28 1 Critical Temperature Threshold Exceeded (Temperature Sensor 1, Location Ambient, Temperature 56C)
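When reviewing a long management log like the one above, it can help to pull out only the Critical entries that carry a temperature reading. The following is a minimal Python sketch, not an Oracle or vendor tool. The field layout (ID, severity, class, two timestamps, count, description) is inferred from the sample entries only, the source does not say what the two timestamps represent, and the names parse_entry and critical_temperatures are hypothetical.

```python
import re
from datetime import datetime

# Layout inferred from the sample entries above (an assumption, not a documented format):
# <id> <severity> <class> <timestamp> <timestamp> <count> <description>
ENTRY_RE = re.compile(
    r"^(?P<id>\d+)\s+"
    r"(?P<severity>Critical|Caution|Informational|Repaired)\s+"
    r"(?P<event_class>.+?)\s+"
    r"(?P<first_ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2})\s+"
    r"(?P<second_ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2})\s+"
    r"(?P<count>\d+)\s+"
    r"(?P<description>.*)$"
)
# Matches the reading itself, e.g. "Temperature 56C"; "Temperature Sensor 1" does not match.
TEMP_RE = re.compile(r"Temperature (\d+)C")


def parse_entry(line):
    """Parse one log line into a dict, or return None if it does not fit the assumed layout."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["id"] = int(entry["id"])
    entry["count"] = int(entry["count"])
    for key in ("first_ts", "second_ts"):
        # Dates in the sample are month/day/year (for example 08/26/2018).
        entry[key] = datetime.strptime(entry[key], "%m/%d/%Y %H:%M")
    reading = TEMP_RE.search(entry["description"])
    entry["temp_c"] = int(reading.group(1)) if reading else None
    return entry


def critical_temperatures(lines):
    """Return (first timestamp, temperature in C) for Critical entries with a reading, oldest first."""
    readings = []
    for line in lines:
        entry = parse_entry(line)
        if entry and entry["severity"] == "Critical" and entry["temp_c"] is not None:
            readings.append((entry["first_ts"], entry["temp_c"]))
    return sorted(readings)


if __name__ == "__main__":
    # Two entries copied from the log above.
    sample = [
        "47 Critical Environment 08/26/2018 08:54 08/26/2018 08:54 1 Critical Temperature "
        "Threshold Exceeded (Temperature Sensor 1, Location Ambient, Temperature 51C)",
        "51 Caution Environment 08/26/2018 09:05 08/26/2018 09:05 1 System Overheating "
        "(Temperature Sensor 1, Location Ambient, Temperature 42C)",
    ]
    for timestamp, temp_c in critical_temperatures(sample):
        print(timestamp, temp_c)
```

The resulting timestamps can then be compared by hand against the reboot times listed earlier to see whether each reboot follows a critical ambient reading.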
Cause