Attempting to gracefully reboot hung Exadata cell or database node
(Doc ID 1352805.1)
Last updated on JANUARY 31, 2020
Applies to:
Oracle Exadata Hardware - Version 11.2.0.1 and later
Information in this document applies to any platform.
Goal
When a system is unresponsive or locked up, exhibiting symptoms such as kernel oopses or panics, it is preferable to attempt a graceful reboot rather than a hard reset (power cycle) of the machine. A graceful reboot offers several advantages: it avoids filesystem corruption, gives programs a chance to clean up by sending them a termination signal (SIGTERM), and avoids a time-consuming fsck at reboot. This can be achieved by sending the appropriate magic SysRq keys to the kernel. The procedure below uses the system's ILOM console. Depending on the state of the kernel and the nature of the failure, the procedure may or may not succeed. It should be performed only as a last resort, when every attempt to restore the system short of a reboot has failed.
Note: The method relies on the SysRq feature being enabled in the kernel. It is enabled by default on both storage cells and database nodes. Disabling it, even on the database nodes, is not supported, because the Oracle software stack uses it to reboot the system following various failure scenarios. To confirm that it is enabled on a database node, run either cat /proc/sys/kernel/sysrq or sysctl -a | grep sysrq; both report a value of 1 when the feature is enabled.
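The check described in the note can be sketched as a small shell snippet. This is an illustrative sketch only, not part of the Oracle procedure: the helper name describe_sysrq is hypothetical. The treatment of values other than 0 and 1 as a partial bitmask follows general Linux kernel behaviour and is an assumption here; this note itself only states that 1 means enabled.

```shell
#!/bin/sh
# Hypothetical helper: classify a kernel.sysrq value.
#   0            -> SysRq disabled
#   1            -> all SysRq functions enabled (the value this note expects)
#   other number -> assumed to be a bitmask enabling only some functions
describe_sysrq() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        ''|*[!0-9]*) echo "unknown" ;;
        *) echo "partial" ;;
    esac
}

# Report the live setting when the proc file is readable (Linux only).
if [ -r /proc/sys/kernel/sysrq ]; then
    value=$(cat /proc/sys/kernel/sysrq)
    echo "kernel.sysrq=$value ($(describe_sysrq "$value"))"
fi
```

Any result other than "enabled" on an Exadata node would suggest the setting has been changed from its default and should be investigated before relying on the SysRq method.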
Solution