cssdagent or cssdmonitor initiated node reboots in HP-UX servers

(Doc ID 1633478.1)

Last updated on June 16, 2014

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.3 and later
HP-UX PA-RISC (64-bit)
HP-UX Itanium

Symptoms

A Grid Infrastructure node eviction occurs. The cluster alert log shows that cssdagent and/or cssdmonitor initiated the node reboot because they did not receive local heartbeats from ocssd.bin.

The cluster alert log contains messages such as the following:

[ohasd(29566)]CRS-8011:reboot advisory message from host: erpdbp1, component: cssagent, with time stamp: L-2014-01-28-11:16:43.513
[ohasd(29566)]CRS-8013:reboot advisory message text: Rebooting after limit 58173 exceeded; disk timeout 58173, network timeout 57822, last heartbeat from CSSD at epoch seconds 1390925745.311, 58204 milliseconds ago based on invariant clock value of 793608130

and/or

[ohasd(29566)]CRS-8011:reboot advisory message from host: erpdbp1, component: cssmonit, with time stamp: L-2014-02-06-10:34:48.745
[ohasd(29566)]CRS-8013:reboot advisory message text: Rebooting after limit 58421 exceeded; disk timeout 57991, network timeout 58421, last heartbeat from CSSD at epoch seconds 1391700830.248, 58497 milliseconds ago based on invariant clock value of 3942103

The other nodes do not report any missing network heartbeats from the node that crashed. Missing network heartbeat messages appear only after the problem node is rebooted, which is expected because the node is down at that point.
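When many of these advisories need to be reviewed, it can help to pull the numbers out of the CRS-8013 text and compare the time since the last heartbeat against the limit. The following Python sketch is illustrative only: the function name parse_reboot_advisory and the returned field names are hypothetical, and the regular expression is written against the two sample messages above, so it may need adjusting for other message variants.

```python
import re
from datetime import datetime, timezone

# Pattern derived from the two CRS-8013 samples in this note (an assumption,
# not an official message grammar).
ADVISORY_RE = re.compile(
    r"Rebooting after limit (?P<limit>\d+) exceeded; "
    r"disk timeout (?P<disk>\d+), "
    r"network timeout (?P<network>\d+), "
    r"last heartbeat from CSSD at epoch seconds (?P<epoch>\d+\.\d+), "
    r"(?P<ago>\d+) milliseconds ago"
)


def parse_reboot_advisory(line):
    """Return the numeric fields of a CRS-8013 advisory, or None if no match."""
    m = ADVISORY_RE.search(line)
    if m is None:
        return None
    limit = int(m["limit"])
    ago = int(m["ago"])
    return {
        "limit_ms": limit,
        "disk_timeout_ms": int(m["disk"]),
        "network_timeout_ms": int(m["network"]),
        # Epoch seconds converted to UTC for comparison with other logs.
        "last_heartbeat_utc": datetime.fromtimestamp(
            float(m["epoch"]), tz=timezone.utc
        ),
        "ms_since_heartbeat": ago,
        # How far past the limit the agent was when it issued the advisory.
        "overrun_ms": ago - limit,
    }


if __name__ == "__main__":
    sample = (
        "[ohasd(29566)]CRS-8013:reboot advisory message text: Rebooting after "
        "limit 58173 exceeded; disk timeout 58173, network timeout 57822, "
        "last heartbeat from CSSD at epoch seconds 1390925745.311, "
        "58204 milliseconds ago based on invariant clock value of 793608130"
    )
    for key, value in parse_reboot_advisory(sample).items():
        print(f"{key}: {value}")
```

For the first sample, the time since the last heartbeat (58204 ms) exceeds the limit (58173 ms) by 31 ms; for the second, 58497 ms exceeds 58421 ms by 76 ms. In both samples the limit equals the larger of the disk and network timeouts, though two messages are not enough to treat that as a general rule.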

Cause
