Exalogic Virtual 18.104.22.168.X, 22.214.171.124.X, 126.96.36.199.X Releases ONLY: Increase O2CB Cluster Heartbeat Timeout On Dom0 OVS Compute Nodes
(Doc ID 1995593.1)
Last updated on JUNE 23, 2020
Applies to:
Oracle Exalogic Elastic Cloud Software - Version 188.8.131.52.0 to 184.108.40.206.2
Oracle Exalogic Elastic Cloud Software - Version 220.127.116.11.0 to 18.104.22.168.2
Oracle Exalogic Elastic Cloud Software - Version 22.214.171.124.1 to 126.96.36.199.170117
Oracle Virtual Server x86-64
★★★★ The O2CB Timeout Settings in this Note DO NOT APPLY to 188.8.131.52.X and 184.108.40.206.X Versions ★★★★
IMPORTANT NOTE: The procedure in this document is intended to be performed as part of applying the April 2015 PSU (EECS 220.127.116.11.1) and higher. The documentation for the April 2015 PSU (and higher) for Virtual contains a link to this MOS note. Implementing this procedure in a standalone manner (not as part of April 2015 PSU) on earlier EECS releases requires explicit approval from the Exalogic Development team.
Please note that the procedure in this Note to increase the o2cb timeout value on Dom0 compute nodes applies only to the Exalogic Virtual releases listed below.
The procedure in this Note to increase the o2cb timeout value DOES NOT APPLY to 18.104.22.168.X & 22.214.171.124.X Virtual releases.
A detailed step-by-step procedure is provided in this document to increase the O2CB cluster heartbeat timeout from 5 min to a very large value of 24 hr. The increased timeout effectively prevents a catastrophic reboot of all compute nodes on an Exalogic rack in a virtual configuration due to fencing, in the event of ZFS taking a long time to complete a takeover. A ZFS takeover is performed during the upgrade of the ZFS software version, as well as during an HA failover event when the standby control head becomes active.
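To give a sense of the scale of this change, the following is a minimal sketch of the arithmetic, assuming the standard OCFS2/O2CB relationship in which the fencing timeout equals (heartbeat threshold - 1) x 2 seconds. The function names are hypothetical, and the procedure in this Note may set the timeout through a different mechanism, so treat the numbers as illustrative only.

```python
# Illustrative only: converts between an O2CB fencing timeout and a
# heartbeat threshold.
# Assumption (not stated in this Note): the standard OCFS2 relationship
#   timeout_seconds = (threshold - 1) * 2
# where the disk heartbeat runs once every 2 seconds.

HEARTBEAT_INTERVAL_S = 2  # seconds between O2CB disk heartbeats (assumed)


def timeout_to_threshold(timeout_s: int) -> int:
    """Return the heartbeat threshold that yields the given timeout."""
    if timeout_s <= 0 or timeout_s % HEARTBEAT_INTERVAL_S != 0:
        raise ValueError("timeout must be a positive multiple of 2 seconds")
    return timeout_s // HEARTBEAT_INTERVAL_S + 1


def threshold_to_timeout(threshold: int) -> int:
    """Return the timeout in seconds implied by a heartbeat threshold."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_S


if __name__ == "__main__":
    five_minutes = 5 * 60
    one_day = 24 * 60 * 60
    print("5 min timeout -> threshold", timeout_to_threshold(five_minutes))
    print("24 hr timeout -> threshold", timeout_to_threshold(one_day))
```

Under this assumption, the 24 hr value is 288 times the 5 min default, which is why fencing effectively never triggers during a long ZFS takeover.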
The reconfiguration will also prevent fencing under any prolonged ZFS or network unavailability condition, including:
- NFS server unreachable/non-responsive
- Hardware failure (ZFS or on-node HCA)
- Network outage conditions on individual compute nodes or in the fabric (failures unrelated to ZFS)
- Any other condition where the NFS service becomes unavailable - both planned and unplanned
Note that this configuration change may cause the behavior of applications and the control stack to be indeterminate during very long network or storage outages.
In this Document
- DCLI configuration and node list
- Stop the Exalogic Control Stack
- Reconfigure the O2CB cluster heartbeat timeout in the pool
- Stop and Start Guest vServers
- Start the Exalogic Control Stack
- Sample Console Output