ECE Active-Active-Active Cluster Outage
(Doc ID 3015411.1)
Last updated on APRIL 08, 2024
Applies to:
Oracle Communications BRM - Elastic Charging Engine - Version 12.0.0.8.0 and later
Information in this document applies to any platform.
Symptoms
In a three-site Active-Active-Active deployment, if one cluster suffers partition data loss, the Elastic Charging Engine (ECE) state becomes "Initial" in all three clusters, which results in a complete outage across all three sites.
STEPS
-----------------------
The issue can be reproduced at will with the following steps:
1. Start with a three-site Active-Active-Active cluster setup.
2. One cluster (C2) goes into a partition data loss situation, and the lost partitions hold the ECE state information. Kubernetes therefore restarts the ECS pods, which leaves those ECE nodes with a lost state cache.
3. During the restart, the ECE state in the C2 cluster is re-initialized to "Initial". During this time, the "Initial" state is federated to the other two working clusters, resulting in a complete outage of all three clusters.
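
The sequence in the steps above can be illustrated with a small toy model. This is only a sketch of the failure pattern, not ECE code: the Cluster class, the federate_state and restart_after_partition_loss methods, and the "UsageProcessing" name used for the healthy state are all assumptions made for illustration, and the unconditional push to peers is a simplification of how federation behaves in the described scenario.

```python
from dataclasses import dataclass, field

INITIAL = "Initial"
PROCESSING = "UsageProcessing"  # assumed name for a healthy state (illustrative)


@dataclass
class Cluster:
    """Hypothetical stand-in for one site in the three-site deployment."""
    name: str
    state: str = PROCESSING
    peers: list = field(default_factory=list, repr=False, compare=False)

    def federate_state(self):
        # Simplified federation: push the local state to every peer,
        # with no check on whether the peer is already in a later state.
        for peer in self.peers:
            peer.state = self.state

    def restart_after_partition_loss(self):
        # The state cache is gone, so the restarted nodes start from "Initial"
        # and that value is then federated to the other clusters.
        self.state = INITIAL
        self.federate_state()


def build_sites():
    """Create C1, C2, C3 and wire each one to the other two as peers."""
    clusters = [Cluster("C1"), Cluster("C2"), Cluster("C3")]
    for cluster in clusters:
        cluster.peers = [p for p in clusters if p is not cluster]
    return clusters


def simulate_outage():
    """Replay steps 1-3: C2 loses its partitions and restarts."""
    c1, c2, c3 = build_sites()
    c2.restart_after_partition_loss()
    return {c.name: c.state for c in (c1, c2, c3)}


if __name__ == "__main__":
    print(simulate_outage())
```

In this model the two healthy clusters accept the federated value without question, which is why a failure at a single site takes all three out of service.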
Cause