
HAIP Can't Failover To Available NIC When The Cable To Current NIC Is Broken (Doc ID 2383905.1)

Last updated on NOVEMBER 11, 2020

Applies to:

Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Information in this document applies to any platform.


This issue was found during testing. In a cluster with two private NICs (<nicname1> and <nicname2>), the network cable to one of the NICs was unplugged to simulate a cable failure. The HAIP on the unplugged NIC did not fail over to the other available NIC. The symptoms are as follows:

1. The NIC link down message can be found in the OS log:

Feb 13 15:53:26 TEST1 kernel: ixgbe 0000:81:00.0: p6<nicname1>: NIC Link is Down <<<<<< Unplug the cable

2. No abnormal error or warning appears in ohasd_orarootagent_root.trc during the issue. For comparison, in another test where the NIC was shut down instead (ifconfig <nicname1> down), the HAIP failover messages can be found in the same ohasd_orarootagent_root.trc:

2018-02-13 15:46:08.017 : USRTHRD:2613040896: HAIP: event GIPCD_IF_UPDATE <<<<<< Event GIPCD_IF_UPDATE received after issuing the command "ifconfig <nicname1> down"
2018-02-13 15:46:08.018 : USRTHRD:2615142144: {0:0:4601} dequeue change event 0x7f67940639f0, GIPCD_IF_UPDATE
2018-02-13 15:46:08.018 : USRTHRD:2615142144: {0:0:4601} HAIP: IF state gipcdadapterstateDown 
2018-02-13 15:46:08.018 : USRTHRD:2615142144: {0:0:4601} HAIP: remove inf 1,  0x7f678c06ce60



2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} HAIP: Moving ip '169.XXX.XXX.181' from inf '<nicname1>' to inf '<nicname2>'  <<<<<< Failed over the HAIP (address) to another available NIC(<nicname2>)
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} pausing thread
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} posting thread
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} Thread::start { acquire thndMX:8c1e0ba0
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} Thread::start spawn pThnd:0x7f678c39f010 thndType:1
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} Thread::start thread spawned tid:2579191552
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} Thread::start spawned release thndMX:8c1e0ba0 }
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} to verify routes
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} to verify start completion 2
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} HAIP: check ip '169.XXX.XXX.181'
2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} HAIP: CleanDeadThreads entry 
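
When comparing the two tests, it can help to check a trace excerpt for the "HAIP: Moving ip" message mechanically. The following is a minimal sketch, not an Oracle tool: the function name find_haip_failovers and the regular expression are assumptions modelled only on the trace line quoted above, and the exact trace format may differ between versions.

```python
import re

# Pattern modelled on the "HAIP: Moving ip ..." line quoted in this note.
# This is an assumption about the trace layout, not a documented format.
MOVE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s*:.*?"
    r"HAIP: Moving ip '(?P<ip>[^']+)' "
    r"from inf '(?P<src>[^']+)' to inf '(?P<dst>[^']+)'"
)


def find_haip_failovers(lines):
    """Return (timestamp, ip, source NIC, target NIC) for each failover line."""
    events = []
    for line in lines:
        match = MOVE_RE.search(line)
        if match:
            events.append(
                (match.group("ts"), match.group("ip"),
                 match.group("src"), match.group("dst"))
            )
    return events


if __name__ == "__main__":
    sample = [
        "2018-02-13 15:46:08.018 : USRTHRD:2615142144: {0:0:4601} "
        "HAIP: IF state gipcdadapterstateDown",
        "2018-02-13 15:46:08.943 : USRTHRD:2615142144: {0:0:4601} "
        "HAIP: Moving ip '169.XXX.XXX.181' from inf '<nicname1>' "
        "to inf '<nicname2>'",
    ]
    for event in find_haip_failovers(sample):
        print(event)
```

An empty result for the time window of the cable pull, compared with a non-empty result for the "ifconfig <nicname1> down" test, matches the difference described in symptom 2.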

3. After the cable is unplugged, the NIC still shows the "UP" flag but not "RUNNING":

p6p1 Link encap:Ethernet HWaddr XX:XX:XX:XX:XX:XX
inet addr:XX.XX.XX.XX Bcast: Mask:
inet6 addr: XX::XX:XX:XX:6188/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1 <<<<<< No RUNNING flag 
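
The flag check in symptom 3 can be expressed as a small classifier. This is a minimal sketch under stated assumptions: the function name link_state and its return labels are hypothetical, and it only handles the classic space-separated ifconfig flag line shown above, not the newer "flags=...<...>" layout.

```python
def link_state(flags_line):
    """Classify a classic ifconfig flag line such as
    'UP BROADCAST MULTICAST MTU:1500 Metric:1'.

    Return labels are illustrative only:
      'admin-down' : the UP flag is missing (interface shut down)
      'no-carrier' : UP is set but RUNNING is missing (as after a cable pull)
      'ok'         : both UP and RUNNING are present
    """
    flags = set(flags_line.split())
    if "UP" not in flags:
        return "admin-down"
    if "RUNNING" not in flags:
        return "no-carrier"
    return "ok"


if __name__ == "__main__":
    print(link_state("UP BROADCAST MULTICAST MTU:1500 Metric:1"))
    print(link_state("UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1"))
```

The two states differ in the same way as the two tests in this note: "ifconfig <nicname1> down" clears the UP flag, while unplugging the cable leaves UP set and only drops RUNNING.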


 Simulating Private NIC failure.

