
Server running Oracle Linux repeatedly crashes with Critical Temperature Threshold Exceeded error in server management logs (Doc ID 2442532.1)

Last updated on APRIL 24, 2020

Applies to:

Linux OS - Version Oracle Linux 4.4 and later
Linux x86-64
Linux x86

Symptoms

A physical server running Oracle Linux rebooted multiple times on the following dates:

reboot system boot 3.8.13-98.7.1.el Sun Aug 26 08:33 - 08:44 (1+00:10)
reboot system boot 3.8.13-98.7.1.el Wed Aug 22 17:13 - 08:44 (4+15:31)
reboot system boot 3.8.13-98.7.1.el Wed Aug 22 02:31 - 08:44 (5+06:13)
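When many reboot records have to be compared against hardware events, it can help to turn them into structured data. The following is a minimal sketch, not part of the original note, that parses records shaped like the three above. It assumes the parenthesised field is a duration in `days+hours:minutes` form; the function name `parse_reboot` and the dictionary keys are hypothetical.

```python
import re
from datetime import timedelta

# Pattern for "reboot system boot <kernel> <Day Mon DD HH:MM> - <end> (<days>+<HH:MM>)".
# The "days+" part of the duration is treated as optional.
REBOOT_LINE = re.compile(
    r"^reboot\s+system\s+boot\s+(?P<kernel>\S+)\s+"
    r"(?P<booted>\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2})\s+-\s+(?P<end>\S+)\s+"
    r"\((?:(?P<days>\d+)\+)?(?P<hours>\d{2}):(?P<minutes>\d{2})\)$"
)


def parse_reboot(line):
    """Return a dict describing one reboot record, or None if the line does not match."""
    m = REBOOT_LINE.match(line.strip())
    if m is None:
        return None
    duration = timedelta(
        days=int(m["days"] or 0),
        hours=int(m["hours"]),
        minutes=int(m["minutes"]),
    )
    return {"kernel": m["kernel"], "booted": m["booted"], "duration": duration}


# Sample input: the three records quoted in this note.
sample = """\
reboot system boot 3.8.13-98.7.1.el Sun Aug 26 08:33 - 08:44 (1+00:10)
reboot system boot 3.8.13-98.7.1.el Wed Aug 22 17:13 - 08:44 (4+15:31)
reboot system boot 3.8.13-98.7.1.el Wed Aug 22 02:31 - 08:44 (5+06:13)
"""

reboots = [r for r in map(parse_reboot, sample.splitlines()) if r is not None]
for r in reboots:
    print(r["booted"], r["kernel"], r["duration"])
```

Lines that do not match the pattern are skipped rather than raising an error, so the same function can be run over longer output that contains other record types.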

No errors were logged in the system log, /var/log/messages.
However, the server's management module logs show the following hardware errors:

54 Critical Environment 08/26/2018 07:09 08/26/2018 07:08 4 Critical Temperature Threshold Exceeded
53 Critical Environment 08/26/2018 09:06 08/26/2018 09:05 3 Critical Temperature Threshold Exceeded
52 Critical OS 08/26/2018 09:05 08/26/2018 09:05 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
51 Caution Environment 08/26/2018 09:05 08/26/2018 09:05 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 42C)
50 Informational OS 08/26/2018 09:04 08/26/2018 09:04 1 Automatic Operating System Shutdown Due to Overheat Aborted
49 Repaired OS 08/26/2018 09:04 08/26/2018 09:04 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
48 Repaired Environment 08/26/2018 09:04 08/26/2018 09:04 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 42C)
47 Critical Environment 08/26/2018 08:54 08/26/2018 08:54 1 Critical Temperature Threshold Exceeded (Temperature Sensor 1, Location Ambient, Temperature 51C)
46 Critical OS 08/26/2018 06:34 08/26/2018 06:34 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
45 Caution Environment 08/26/2018 06:34 08/26/2018 06:34 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 43C)
44 Repaired OS 08/26/2018 06:34 08/26/2018 06:33 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
43 Repaired Environment 08/26/2018 06:34 08/26/2018 06:33 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 42C)
42 Informational OS 08/26/2018 06:34 08/26/2018 06:33 2 Automatic Operating System Shutdown Due to Overheat Aborted
41 Repaired OS 08/26/2018 06:33 08/26/2018 06:33 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
40 Repaired Environment 08/26/2018 06:33 08/26/2018 06:33 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 42C)
39 Caution POST Message 08/26/2018 06:43 08/26/2018 06:32 2 Option ROM POST Error: 1792-Slot 0 Drive Array - Valid Data Found in Write-Back Cache. Data will automatically be written to drive array. Action: No action required.
38 Critical Environment 08/26/2018 08:28 08/26/2018 08:28 1 Critical Temperature Threshold Exceeded (Temperature Sensor 1, Location Ambient, Temperature 56C)
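
To see how hot the ambient sensor actually got, the temperature readings can be pulled out of entries like these. The sketch below is an illustration only, not a vendor tool. It assumes each entry has the layout seen above: a numeric ID, a one-word severity, a class of one or more words, two date/time stamps, a count, and a free-text description. The article does not label these columns, so the field names (`stamp1`, `stamp2`, `event_class`, and so on) are hypothetical.

```python
import re

# Assumed layout: ID, severity, class, two timestamps, count, description.
ENTRY = re.compile(
    r"^(?P<id>\d+)\s+(?P<severity>\w+)\s+(?P<event_class>.+?)\s+"
    r"(?P<stamp1>\d{2}/\d{2}/\d{4} \d{2}:\d{2})\s+"
    r"(?P<stamp2>\d{2}/\d{2}/\d{4} \d{2}:\d{2})\s+"
    r"(?P<count>\d+)\s+(?P<description>.*)$"
)
# Matches a reading such as "Temperature 51C" but not "Temperature Sensor 1".
TEMPERATURE = re.compile(r"Temperature (\d+)C")


def parse_entry(line):
    """Split one management-log entry into fields; return None if it does not match."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["id"] = int(entry["id"])
    entry["count"] = int(entry["count"])
    reading = TEMPERATURE.search(entry["description"])
    entry["temp_c"] = int(reading.group(1)) if reading else None
    return entry


# Sample input: five of the entries quoted in this note.
sample = """\
52 Critical OS 08/26/2018 09:05 08/26/2018 09:05 1 Automatic Operating System Shutdown Initiated Due to Overheat Condition
47 Critical Environment 08/26/2018 08:54 08/26/2018 08:54 1 Critical Temperature Threshold Exceeded (Temperature Sensor 1, Location Ambient, Temperature 51C)
45 Caution Environment 08/26/2018 06:34 08/26/2018 06:34 1 System Overheating (Temperature Sensor 1, Location Ambient, Temperature 43C)
39 Caution POST Message 08/26/2018 06:43 08/26/2018 06:32 2 Option ROM POST Error: 1792-Slot 0 Drive Array - Valid Data Found in Write-Back Cache. Data will automatically be written to drive array. Action: No action required.
38 Critical Environment 08/26/2018 08:28 08/26/2018 08:28 1 Critical Temperature Threshold Exceeded (Temperature Sensor 1, Location Ambient, Temperature 56C)
"""

entries = [e for e in map(parse_entry, sample.splitlines()) if e is not None]
readings = [(e["id"], e["severity"], e["temp_c"]) for e in entries if e["temp_c"] is not None]
peak = max(temp for _, _, temp in readings)

for entry_id, severity, temp in readings:
    print(f"entry {entry_id}: {severity}, {temp} C")
print("highest reading in sample:", peak, "C")
```

Entries with no temperature in the description, such as the shutdown and POST messages, are still parsed but get `temp_c` set to `None`, so they can be filtered out or kept for a timeline as needed.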

