In PCP/RAC Environment CM Failover Takes Long Time
(Doc ID 464601.1)
Last updated on APRIL 04, 2025
Applies to:
Oracle Concurrent Processing - Version 11.5.10.2 to 11.5.10.2 [Release 11.5]
Information in this document applies to any platform.
This problem can occur on any platform.
Symptoms
In a PCP/RAC environment, when one concurrent node fails, failover to the surviving node takes 30-35 minutes. The database sessions associated with the managers on the failed node take 30 minutes to exit.
Even with sqlnet.expire_time set to 1 in sqlnet.ora, it takes 30-35 minutes for DCD (Dead Connection Detection) to clean up the sessions from the failed node.
The following steps were attempted but did not help:
On the database nodes, for the database listeners (not the apps listener), set sqlnet.expire_time = 1
On the Concurrent Manager nodes, set the following environment variable before starting the managers:
FDCPTRW=3
On Linux, tune the kernel as follows (on other platforms, find the analogous TCP/IP parameters and tune them)
On the database nodes AND the Concurrent nodes:
tcp_keepalive_intvl = 45
tcp_keepalive_probes = 4
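To put the kernel values above in context, the following is a minimal Python sketch of the usual TCP keepalive arithmetic on Linux: the kernel waits for an idle period, then sends a probe every tcp_keepalive_intvl seconds and drops the connection after tcp_keepalive_probes unanswered probes. The idle period (tcp_keepalive_time) is not mentioned in this note; the 7200-second figure used here is the common Linux default and is an assumption. The function names are hypothetical, and the calculation applies only to sockets that have keepalive enabled.

```python
from pathlib import Path

# Linux exposes the keepalive settings under this directory (assumes a Linux host).
PROC_DIR = Path("/proc/sys/net/ipv4")


def keepalive_detection_seconds(idle: int, interval: int, probes: int) -> int:
    """Approximate seconds until the kernel drops a dead, idle TCP connection.

    The kernel waits `idle` seconds of silence, then sends one probe every
    `interval` seconds and gives up after `probes` unanswered probes.
    """
    return idle + interval * probes


def read_keepalive_settings(proc_dir: Path = PROC_DIR) -> dict:
    """Read the three keepalive values currently in effect on this host."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    return {name: int((proc_dir / name).read_text().strip()) for name in names}


if __name__ == "__main__":
    # Interval and probe count come from this note. The idle time of 7200 s
    # is the usual Linux default and is an assumption, not a value from the note.
    total = keepalive_detection_seconds(idle=7200, interval=45, probes=4)
    print(f"estimated detection time: {total} s")
```

With the interval and probe count alone, the probing phase lasts 45 x 4 = 180 seconds; any idle period comes on top of that. Calling read_keepalive_settings() on each database and concurrent node is one way to confirm that the tuned values are actually in effect.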
Changes
Cause