My Oracle Support Banner

Reconfigure OC2B Cluster Heartbeat Timeout to Default Value on Exalogic Virtual (Doc ID 2435272.1)

Last updated on AUGUST 31, 2018

Applies to:

Oracle Exalogic Elastic Cloud Software - Version 2.0.6.2.170418 to 2.0.6.2.170418
Oracle Exalogic Elastic Cloud Software - Version 2.0.6.3.170718 and later
Linux x86-64
Oracle Virtual Server x86-64

★★★★ O2CB Timeout Settings in this Note DOES NOT APPLY to 2.0.6.4.X Versions ★★★★

NOTE: The procedure in this document may be performed as required if the rack is running Apr 2017 PSU or a higher 2.0.6.3.x version and is currently configured with an O2CB cluster heartbeat timeout of 24 hours, implemented as described in the following MOS note:
Increase O2CB Cluster Heartbeat Timeout on Exalogic Virtual (Doc ID 1995593.1)

Customers who have implemented custom timeout values with approval from the Exalogic engineering team may also perform this reconfiguration to decrease the cluster heartbeat timeout to the original default value of 5 minutes.

This reconfiguration is not integrated into an Exalogic PSU at this time.

Purpose

A detailed step-by-step procedure was provided in the following Document to increase the O2CB cluster heartbeat timeout from 5 min to a very large value of 24 hr:

Increase O2CB Cluster Heartbeat Timeout on Exalogic Virtual (Doc ID 1995593.1)

The increased timeout effectively prevented a catastrophic reboot of all compute nodes on an Exalogic rack in a virtual configuration due to fencing, in the event of storage loss for an extended period of time, such as during an ZFS SW upgrade or a ZFS HA failover event.

The UEK2 kernel bundled in April 2017 PSU (and higher) delivers an enhancement fix with the following bug:

(UNPUBLISHED) BUG:24305598
BUG 24305598 - Backport fix for bug 21862940 to UEK2 codeline

(UNPUBLISHED) BASE BUG:21862940
BUG 21862940 - OCFS2: ALL NODES WILL FENCE SELF WHEN STORAGE DOWN

The backport bug 24305598 is fixed in UEK2 v2.6.39-400.288.1. The Exalogic April 2017 PSU bundles UEK2 kernel v2.6.39-400.294.1 which includes this fix.

The fix implemented provides a key enhancement relative to compute node behavior upon storage loss:

IMPORTANT

This enhancement prevents such rack-wide compute node reboots due to fencing upon extended loss of storage. Consequently, we recommend that the O2CB cluster heartbeat timeout be reconfigured to its original default value of 5 minutes.

Scope

This document provides a step-by-step procedure to reconfigure the O2CB cluster heartbeat timeout to the default value of 5 minutes.

Details

To view full details, sign in with your My Oracle Support account.

Don't have a My Oracle Support account? Click to get started!


In this Document
Purpose
Scope
Details
 Procedure
 DCLI configuration and node list
 Stop the Exalogic Control Stack
 Reconfigure the O2CB cluster heartbeat timeout in the pool
 Stop and Start Guest vServers
 Start the Exalogic Control Stack
 Sample Console Output
References

My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.