OCSG Cluster Mode Failover Time Too Long (Doc ID 2002729.1)

Last updated on FEBRUARY 28, 2019

Applies to:

Oracle Communications Services Gatekeeper - Version 5.0.0.1 to 5.1.0 [Release 5.0 to 5.1]
Information in this document applies to any platform.

Symptoms

When one Oracle Communications Services Gatekeeper (OCSG) Network Tier (NT) node experiences a hardware or network failure, such as an unplugged cable or a machine reboot, the Application Tier (AT) is out of service for a period of time.
Even after setting the cluster heartbeat “period length” to its minimum value, the outage lasts around 20 seconds.
An out-of-service time of more than 1 second is not acceptable in an active-active deployment.
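
As a rough way to compare an observed outage against the 1-second limit mentioned above, the following minimal sketch computes the longest out-of-service window from a series of timestamped probe results. The function name `longest_outage` and the sample data are hypothetical illustrations; they are not part of OCSG and are not taken from the actual test runs.

```python
from typing import Iterable, Optional, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest outage in seconds.

    Each sample is (timestamp_in_seconds, request_succeeded), in time order.
    An outage runs from the first failed probe to the next successful probe.
    An outage still open at the end of the samples is not counted.
    """
    longest = 0.0
    outage_start: Optional[float] = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            # First failed probe: the outage starts here.
            outage_start = timestamp
        elif ok and outage_start is not None:
            # First successful probe after failures: the outage ends here.
            longest = max(longest, timestamp - outage_start)
            outage_start = None
    return longest


# Hypothetical probe results: failures begin at t=1.0 s, recovery at t=21.0 s.
samples = [
    (0.0, True), (0.5, True),
    (1.0, False), (1.5, False),
    (21.0, True), (21.5, True),
]
outage = longest_outage(samples)
print(f"longest outage: {outage:.1f} s, within 1 s limit: {outage <= 1.0}")
```

The resolution of the result is bounded by the probe interval, so probes would need to be sent well under 1 second apart to verify a 1-second requirement.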


High Availability (HA) tests in a 2x2 cluster environment:
1) REST / Parlay QoS-applyQos: Disable the Ethernet interface of one NT during a traffic run
2) REST / Parlay QoS-applyQos: Shut down one NT server during a traffic run

Both cases fail with the symptoms below:
Requests sent to AT1 and AT2 get no response, regardless of the communication service used.

These symptoms last for about 5 minutes, after which the system recovers.
Killing the NT server does not cause this service interruption.

The same issue is observed on OCSG 5.0.0.1 and 5.1.

Changes

 

Cause

