How to restart all compute nodes in an Exalogic Virtual Pool to recover operation of compute nodes where "D" state processes are running

(Doc ID 1950446.1)

Last updated on DECEMBER 09, 2014

Applies to:

Oracle Exalogic Elastic Cloud Software - Version 2.0.6.0.0 and later
Linux x86-64
Oracle Virtual Server (x86-64)

Goal

While running an Exalogic Virtual system you may, in exceptional circumstances, encounter a situation where the majority of the compute nodes in a server pool have become unhealthy because processes from key services such as "o2cb", "ocfs2" or "ovs-agent" have entered a deep "Disk Sleep" state (processes listed with a status of "D" in "ps -ef" output). When this occurs it can become necessary to shut down and reboot all nodes in the affected pool to restore the system to full operation. This note describes the process that should be followed to shut down and restart all nodes in a pool without introducing additional "HA" related complications that can affect the restart of vServers after the nodes have been rebooted.
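
As a quick way to check whether a node shows the symptom described above, the following is a minimal sketch (not part of the original note) that lists processes in the "D" state. It assumes Python 3.7 or later is available on the node and that the Linux procps "ps" command is used; it calls "ps -eo pid,stat,args" so that the state column is printed explicitly. The function names find_d_state and list_d_state are hypothetical and chosen only for illustration.

```python
import subprocess


def find_d_state(ps_output):
    """Return (pid, stat, command) tuples for processes whose state starts with 'D'.

    Expects text in the layout produced by `ps -eo pid,stat,args`:
    a header line followed by one line per process.
    """
    results = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)  # pid, stat, remainder of the command line
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        pid, stat, command = parts
        if stat.startswith("D"):  # "D" = uninterruptible (disk) sleep
            results.append((int(pid), stat, command.strip()))
    return results


def list_d_state():
    """Run ps on the local node and return its D-state processes."""
    completed = subprocess.run(
        ["ps", "-eo", "pid,stat,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_d_state(completed.stdout)


if __name__ == "__main__":
    for pid, stat, command in list_d_state():
        print(pid, stat, command)
```

A process can pass through the "D" state briefly during normal I/O, so a single hit is not conclusive; the condition described in this note concerns processes that stay in that state. Running the check several times and comparing the PIDs is one way to tell the two cases apart.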

Solution
