
Ops Center: Steps to Clear a Failed EC HA Restart and Increase the Timeout for Clusterware (Doc ID 1515725.1)

Last updated on AUGUST 23, 2022

Applies to:

Enterprise Manager Ops Center - Version 12C and later
Information in this document applies to any platform.

Goal

The Enterprise Controller (EC) services appear to have gone offline after an authentication event exceeded the allowed number of failed login attempts. The EC then tried to restart those services while, at the same time, the HA Clusterware software tried to start the secondary EC instance. The Clusterware start eventually timed out and the EC's services finally started, but this left the environment in an inconsistent state.

The environment typically took 6-7 minutes to start with 2 Proxy Controllers and 8 CDOMs. The current environment has doubled to 4 Proxy Controllers and 16 CDOMs, so the estimated start time is now 12-14 minutes. The timeout for a failover start is 9 minutes, which is too short for the current environment, so the timeout needs to be increased as well.
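The sizing argument above can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes, as the estimate above does, that start time scales linearly with the number of CDOMs. The function names and the optional safety margin are hypothetical and are not part of Ops Center or Clusterware.

```python
def estimate_start_minutes(baseline_range, baseline_cdoms, current_cdoms):
    """Scale a (low, high) start-time range linearly with the CDOM count.

    Linear scaling is an assumption taken from the estimate in this note.
    """
    factor = current_cdoms / baseline_cdoms
    low, high = baseline_range
    return (low * factor, high * factor)


def timeout_is_sufficient(timeout_minutes, estimated_range, margin=0.0):
    """Return True if the timeout covers the worst-case estimate.

    margin is a hypothetical safety fraction (0.25 means 25% headroom).
    """
    worst_case = estimated_range[1]
    return timeout_minutes >= worst_case * (1 + margin)


# Figures from this note: 6-7 minutes with 8 CDOMs, now 16 CDOMs,
# and a failover start timeout of 9 minutes.
estimate = estimate_start_minutes((6, 7), baseline_cdoms=8, current_cdoms=16)
print(estimate)                            # (12.0, 14.0)
print(timeout_is_sufficient(9, estimate))  # False
```

With these figures the 9-minute timeout falls short of even the low end of the estimate, so any new value should be at least the 14-minute worst case, with extra headroom being a local judgment call.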

Ops Center: ecadm ha-start failed to start the EC after crash

The EC crashed and put the HA Cluster into an inconsistent state. After the servers were rebooted, ./ecadm ha-start failed to start the EC. Note that <EC> could be the primary or the secondary EC in the example below.

Solution


