Oracle Grid Infrastructure: How to Troubleshoot Missed Network Heartbeat Evictions
(Doc ID 1534949.1)
Last updated on MARCH 24, 2022
Applies to:
Oracle Database - Enterprise Edition - Version 11.2.0.1 and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Information in this document applies to any platform.
Purpose
Missed Network Heartbeat (NHB) evictions happen when the Oracle Cluster Synchronization Services Daemon (ocssd) on the surviving node loses contact with the evicted node over the private network (interconnect). Cluster nodes must be able to communicate over the interconnect to avoid a "split brain" situation. In some cases, a node may abort itself to avoid "split brain" when communication over the interconnect is compromised.
The most common (but not the only) cause of missed NHBs is a network problem on the private interconnect.
The purpose of this document is to provide steps for checking the network after a missed NHB eviction.
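To make the eviction condition described above concrete, the following is a minimal sketch, not Oracle's implementation. It assumes the surviving node records the receive time of each heartbeat from its peer and treats the peer as lost once no heartbeat has arrived for a timeout period. The function name, the input format, and the 30-second value are assumptions made for illustration; on a real cluster the timeout is the CSS misscount setting and should be read from the cluster itself.

```python
from typing import List, Optional

# Assumed timeout in seconds, for illustration only. The real value is the
# CSS misscount setting of the cluster and is not taken from this document.
ASSUMED_MISSCOUNT_SECONDS = 30.0


def first_eviction_time(
    heartbeat_times: List[float],
    now: float,
    misscount: float = ASSUMED_MISSCOUNT_SECONDS,
) -> Optional[float]:
    """Return the earliest time at which the peer would be considered lost.

    heartbeat_times: sorted receive times (in seconds) of peer heartbeats.
    now: the time up to which the peer has been observed.
    Returns None if no gap between heartbeats reaches the timeout.
    """
    if not heartbeat_times:
        return None
    # Compare each heartbeat with the next one; the last is compared with now.
    checkpoints = heartbeat_times[1:] + [now]
    for last_seen, next_seen in zip(heartbeat_times, checkpoints):
        if next_seen - last_seen >= misscount:
            return last_seen + misscount
    return None
```

For example, with heartbeats received at seconds 0, 1, 2, and 3 and nothing further by second 40, the sketch reports that the timeout was reached at second 33. When reviewing a real eviction, the same reasoning helps line up the last heartbeat seen in the logs with OS and network statistics from the preceding interval.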
Troubleshooting Steps
In this Document
Purpose
Troubleshooting Steps
1. Check OS statistics from the evicted node from the time of the eviction.
2. Validate the interconnect network setup.
3. Check that the OS network settings are correct by running the ORAchk tool.
4. Check communication over the private network.
5. Platform specific checks.
6. Known issues which can cause NHB node eviction.
7. For more information.
References