Solaris Cluster 4.x Suffers Unexpected Restarts of Failover Zone Resources After Upgrade to Solaris 11.2
(Doc ID 1966105.1)
Last updated on AUGUST 14, 2020
Applies to:
Solaris Cluster - Version 4.0 to 4.2 [Release 4.0 to 4.2]
Solaris Operating System - Version 11.2 to 11.2 [Release 11.0]
Oracle Solaris on x86-64 (64-bit)
Oracle Solaris on SPARC (64-bit)
In this case, Solaris Cluster 4.2 is configured to provide a failover zone using the HA-Container agent.
The HA-Container resources are restarted at random because of a probe failure, with no apparent error or reason
justifying the probe failure in the messages files or any other log.
In such cases, it would be best to enable debugging following the instructions in section:
Enabling debug for Solaris Containers HA Data Service (aka zone, sczbt)
Document 1010497.1 Solaris Cluster Resource How to Enable Data Service Debug Mode
After enabling debug, you could see something like:
If the zone had been up and running for days before the problem occurred, the debug output above may be misleading.
The start_script is used only while the start method executes. If the start had completed long before and the start_sczbt script
were nevertheless still running, the probe would have triggered a restart much earlier.
To debug the problem, first verify whether the issue is caused by:
Bug 19930993 - ksh use exit code of the previous child during if branch test
As described below, you might want to do further troubleshooting (e.g., using truss on the probe scripts themselves). Because there are no explicit errors,
it is recommended that a Service Request be opened to get guidance on such troubleshooting.
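The behavior named in the title of Bug 19930993 can be illustrated with a small shell sanity check. This is only a sketch under stated assumptions: it is not Oracle's reproducer, the function name check_if_branch and the loop count of 100 are hypothetical, and the real bug may depend on timing or conditions that this script does not exercise. It runs a child that exits non-zero, then evaluates an `if` whose condition is a separate child that succeeds, and counts how often the wrong branch is taken.

```shell
#!/bin/sh
# Hypothetical sanity check, NOT Oracle's reproducer for Bug 19930993.
# Assumption: the bug shows up as an `if` test picking up the exit code
# of an earlier child instead of the exit code of its own condition.

check_if_branch() {
    # Child process that fails on purpose (exit status 1).
    sh -c 'exit 1'
    # The condition is an independent child that succeeds (exit status 0),
    # so a correct shell must take the "then" branch.
    if sh -c 'exit 0'; then
        echo "ok"
    else
        echo "stale-exit-code"
    fi
}

# Repeat the check, since the reported restarts were intermittent.
i=0
failures=0
while [ "$i" -lt 100 ]; do
    [ "$(check_if_branch)" = "ok" ] || failures=$((failures + 1))
    i=$((i + 1))
done
echo "failures: $failures"
```

Running the script under the shell to be checked (for example, by passing the file name to ksh) should report zero failures on an unaffected shell. A zero count does not rule out the bug, so the Service Request route described above remains the recommended path.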
Starting with Solaris 11, a new version of ksh based on the open-source ksh93 code is delivered (different from the older ksh used in Solaris 10).
Any system running Solaris 11 (in conjunction with Solaris Cluster and failover zones) could be affected, but the problem appears to occur more frequently with Solaris 11.2.