
ECC High Availability Setup Fails To Bring Available Cluster Nodes When One Server Goes Down And Data Load Fails With "There is an error while registering the collection" Error (Doc ID 3075610.1)

Last updated on MARCH 07, 2025

Applies to:

Oracle Enterprise Command Center Framework - Version 12.2.4 and later
Information in this document applies to any platform.

Symptoms

When running a data load after setting up an Enterprise Command Center (ECC) cluster and bringing down one of its three nodes, the data load fails and the following error appears in the data load logs in the $ECC_BASE/Oracle/quickInstall/logs/ecc directory:

There is an error while registering the collection Collection eam-wo-op Deletion -> Success, Configset eam-wo-op Deletion -> Success, eam-wo-op Collection -> Dataset eam-wo-op registration failed with error Error from server at <servername>:port/core_ecc: Cannot create collection eam-wo-op. Value of maxShardsPerNode is 1, and the number of nodes currently live or live and part of your createNodeSet is 2. This allows a maximum of 2 to be created. Value of numShards is 1, value of nrtReplicas is 3, value of tlogReplicas is 0 and value of pullReplicas is 0. This requires 3 shards to be created (higher than the allowed number).
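
The arithmetic in this message can be reproduced with a short Python sketch. It only mirrors the numbers quoted in the error text (maxShardsPerNode, live nodes, numShards, and the three replica counts); the class and function names are illustrative assumptions and are not taken from Solr or ECC.

```python
from dataclasses import dataclass


@dataclass
class CreateRequest:
    """Replica settings named in the error message (names are illustrative)."""
    num_shards: int
    nrt_replicas: int
    tlog_replicas: int = 0
    pull_replicas: int = 0


def required_cores(req: CreateRequest) -> int:
    # Every shard needs one core per replica of each type.
    return req.num_shards * (req.nrt_replicas + req.tlog_replicas + req.pull_replicas)


def allowed_cores(max_shards_per_node: int, live_nodes: int) -> int:
    # Each live node may host at most max_shards_per_node cores of the collection.
    return max_shards_per_node * live_nodes


def can_create(req: CreateRequest, max_shards_per_node: int, live_nodes: int) -> bool:
    return required_cores(req) <= allowed_cores(max_shards_per_node, live_nodes)


# Values quoted in the error: numShards=1, nrtReplicas=3, maxShardsPerNode=1.
request = CreateRequest(num_shards=1, nrt_replicas=3)
print(required_cores(request))                                  # 3
print(allowed_cores(max_shards_per_node=1, live_nodes=2))       # 2
print(can_create(request, max_shards_per_node=1, live_nodes=2))  # False
print(can_create(request, max_shards_per_node=1, live_nodes=3))  # True
```

With all three nodes live the same request fits (3 required, 3 allowed), which is consistent with the failure appearing only after one node is brought down.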

The solr.log file under the $ECC_BASE/Middleware/user_projects/domains/ecc_domain/solr_logs directory shows the following error:

ERROR (updateExecutor-11-thread-2-processing-n:servername:port_core_ecc x:inv-transaction-management_shard1_replica_n1 c:inv-transaction-management s:shard1 r:core_node3) [c:inv-transaction-management s:shard1 r:core_node3 x:inv-transaction-management_shard1_replica_n1] o.a.s.c.SyncStrategy servername:port/core_ecc/inv-transaction-management_shard1_replica_n1/: Could not tell a replica to recover:org.apache.solr.client.solrj.SolrServerException: IOException occurred when talking to server at: servername:port/core_ecc
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:695)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:323)
at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to servername:port [servername:port] failed: Read timed out
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:571)
... 9 more
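
To see which collections and replicas are affected, the log can be scanned for the "Could not tell a replica to recover" message. The following Python sketch assumes the bracketed `[c:... s:... r:... x:...]` context block appears on the same line as the message, as in the excerpt above; the function name and the returned field names are illustrative.

```python
import re
from typing import Iterable

# Matches the bracketed context block, e.g.
# [c:inv-transaction-management s:shard1 r:core_node3 x:..._replica_n1]
CONTEXT = re.compile(
    r"\[c:(?P<collection>\S+) s:(?P<shard>\S+) "
    r"r:(?P<replica>\S+) x:(?P<core>[^\]\s]+)\]"
)
MESSAGE = "Could not tell a replica to recover"


def find_recovery_failures(lines: Iterable[str]) -> list[dict[str, str]]:
    """Return one dict per log line that reports a failed recovery request."""
    failures = []
    for line in lines:
        if MESSAGE not in line:
            continue
        match = CONTEXT.search(line)
        if match:
            failures.append(match.groupdict())
    return failures
```

A typical use would be to open solr.log and pass the file object directly, for example `find_recovery_failures(open(path, encoding="utf-8", errors="replace"))`, where `path` points at the solr.log file in the directory named above.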

The issue can be reproduced at will with the following steps:

1. Set up EBS 12.2.x and integrate it with ECC as described in Doc ID 2495053.1.
2. Verify that ECC is working correctly.
3. Create a three-node ECC cluster as described in Enterprise Command Center Framework High Availability, Release 12.2 (Doc ID 2745581.1).
4. Bring down one of the nodes and check whether the remaining ECC nodes still work.
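
For the check in step 4, one option is to ask Solr which nodes it considers live. The Python sketch below uses the standard Solr Collections API action CLUSTERSTATUS and lists replicas whose node is no longer live. It assumes the API is reachable under the same `core_ecc` context path that appears in the error messages and that no authentication is required; both are assumptions about the ECC deployment, and the function names are illustrative.

```python
import json
from urllib.request import urlopen


def fetch_cluster_status(base_url: str, timeout: float = 10.0) -> dict:
    """Call CLUSTERSTATUS; base_url is assumed to look like http://<servername>:<port>/core_ecc."""
    url = f"{base_url.rstrip('/')}/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def replicas_on_dead_nodes(status: dict) -> list[tuple[str, str, str, str]]:
    """Return (collection, shard, replica, node) for replicas hosted on non-live nodes."""
    cluster = status.get("cluster", {})
    live_nodes = set(cluster.get("live_nodes", []))
    stranded = []
    for collection, details in cluster.get("collections", {}).items():
        for shard, shard_info in details.get("shards", {}).items():
            for replica, replica_info in shard_info.get("replicas", {}).items():
                node = replica_info.get("node_name", "")
                if node not in live_nodes:
                    stranded.append((collection, shard, replica, node))
    return sorted(stranded)
```

Calling `replicas_on_dead_nodes(fetch_cluster_status(base_url))` after stopping one node should list the replicas that were hosted on it, and the length of `live_nodes` in the response should match the node count reported in the data load error.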

Cause

To view full details, sign in with your My Oracle Support account.




