Ops Center: Steps to Clear a Failed EC HA Restart and Increase Timeout for Clusterware

(Doc ID 1515725.1)

Last updated on DECEMBER 30, 2015

Applies to:

Enterprise Manager Ops Center - Version 12C and later
Information in this document applies to any platform.

Goal

The EC services appear to have gone offline after an authentication event exceeded the allowed number of failed login attempts. The EC then tried to restart those services while, at the same time, the HA Clusterware software tried to start the secondary EC instance. The Clusterware start eventually timed out and the EC's services finally started, but this left the environment in an unexpected state.

The environment typically took 6-7 minutes to start with 2 Proxy Controllers and 8 CDOMs. The current environment has doubled to 4 Proxy Controllers and 16 CDOMs, so the estimated time to start is now 12-14 minutes. The timeout for a failover start is 9 minutes, which is now too short for the current environment, so the timeout needs to be increased as well.
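The sizing estimate above can be checked with a short calculation. This is a minimal sketch that assumes start time scales linearly with the size of the environment, as the estimate in this note does; the function and variable names are hypothetical and are not part of Ops Center or Clusterware.

```python
def estimate_start_minutes(baseline_range, baseline_units, current_units):
    """Scale a baseline start-time range linearly with environment size.

    Linear scaling is an assumption taken from the estimate in this note,
    not a measured property of the EC.
    """
    factor = current_units / baseline_units
    low, high = baseline_range
    return (low * factor, high * factor)


# Figures quoted in this note.
BASELINE_RANGE_MIN = (6, 7)   # observed start time with 8 CDOMs
BASELINE_CDOMS = 8
CURRENT_CDOMS = 16
FAILOVER_TIMEOUT_MIN = 9      # current timeout for a failover start

low, high = estimate_start_minutes(BASELINE_RANGE_MIN, BASELINE_CDOMS, CURRENT_CDOMS)

# The timeout is too short if the slowest expected start exceeds it.
timeout_too_short = high > FAILOVER_TIMEOUT_MIN

print(f"Estimated start time: {low:.0f}-{high:.0f} minutes")
print(f"Timeout of {FAILOVER_TIMEOUT_MIN} minutes too short: {timeout_too_short}")
```

Any new timeout value should sit comfortably above the upper end of the estimated range, since the estimate itself is only approximate.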


Ops Center: ecadm ha-start failed to start the EC after crash

The EC crashed and put the HA cluster into an inconsistent state. After the servers were rebooted, the ./ecadm ha-start command failed to start the EC.


 

Solution
