Ops Center: Steps to Clear a Failed EC HA Restart and Increase the Timeout for Clusterware
Last updated on DECEMBER 30, 2015
Applies to: Enterprise Manager Ops Center - Version 12c and later
Information in this document applies to any platform.
The EC services appear to have gone offline after an authentication event exceeded the allowed number of failed login attempts. The EC then tried to restart those services while, at the same time, the HA Clusterware software tried to start the secondary EC instance. The Clusterware start eventually timed out and the EC's services finally started, but this left the environment in a strange state.
The environment typically took 6-7 minutes to start with 2 Proxy Controllers and 8 CDOMs. The current environment has doubled to 4 Proxy Controllers and 16 CDOMs, so the estimated start time is now 12-14 minutes. The timeout for a failover start is 9 minutes, which is now too short for the current environment, so the timeout needs to be increased as well.
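The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming start time scales linearly with environment size, as the estimate in the text implies. The function names are hypothetical and are not part of Ops Center or Clusterware.

```python
def estimate_start_minutes(baseline_range, baseline_size, current_size):
    """Scale a (low, high) start-time range linearly with environment size.

    Linear scaling is an assumption taken from the estimate in the text,
    not a documented property of Ops Center.
    """
    factor = current_size / baseline_size
    low, high = baseline_range
    return (low * factor, high * factor)


def timeout_is_sufficient(timeout_minutes, estimated_range):
    """Return True if the timeout covers the slowest estimated start."""
    return timeout_minutes >= estimated_range[1]


# Baseline from the text: 6-7 minutes with 2 Proxy Controllers (and 8 CDOMs).
baseline = (6, 7)

# The environment has doubled to 4 Proxy Controllers (and 16 CDOMs).
estimate = estimate_start_minutes(baseline, baseline_size=2, current_size=4)

print(estimate)                            # (12.0, 14.0)
print(timeout_is_sufficient(9, estimate))  # False: 9 minutes is too short
```

Under this assumption, any new timeout should be at least the upper estimate of 14 minutes. How much headroom to add beyond that is a judgment call for the site.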
Ops Center: ecadm ha-start failed to start the EC after crash
The EC crashed and put the HA Cluster into an inconsistent state. After the servers were rebooted, the ./ecadm ha-start command failed to start the EC.