Exadata Compute Node Shows All Network Interfaces Down After Normal Reboot or After HW Maintenance Such As RAID batteries And ILOM Shows " - exec of program '/sbin/ifup' failed"
(Doc ID 2319375.1)
Last updated on AUGUST 22, 2023
Applies to:
Exadata Database Machine X2-2 Hardware - Version All Versions to All Versions [Release All Releases]
Linux x86-64
Symptoms
One of the Exadata compute nodes does not come up properly after the RAID batteries are replaced, or after any other hardware replacement. All configured network interfaces, such as ETH0, BONDETH0, IB0, IB1, or BONDIB0, are unreachable.
The compute node can be accessed only through the ILOM console, from which you can log in at the OS level. From the OS, the "ip address" command shows the following output:
1: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000 <<<<<<<<<<<<<<<<<<
    link/ether 00:21:28:b2:21:6a brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:21:28:b2:21:6b brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:21:28:b2:21:6c brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:21:28:b2:21:6d brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000 <<<<<<<<<<<<<<<<<< In this case, ETH4 is the slave for BONDETH0
    link/ether 00:1b:21:92:0a:a4 brd ff:ff:ff:ff:ff:ff
7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000 <<<<<<<<<<<<<<<<<< In this case, ETH5 is the slave for BONDETH0
    link/ether 00:1b:21:92:0a:a5 brd ff:ff:ff:ff:ff:ff
8: ib0: <BROADCAST,MULTICAST> mtu 65520 qdisc noop state DOWN qlen 4096 <<<<<<<<<<<<<<<<<< In this case, IB0 is the slave for BONDIB0
    link/infiniband 80:00:00:48:fe:80:00:00:00:00:00:00:00:21:28:00:01:a1:3b:d1 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
9: ib1: <BROADCAST,MULTICAST> mtu 65520 qdisc noop state DOWN qlen 4096 <<<<<<<<<<<<<<<<<< In this case, IB1 is the slave for BONDIB0
    link/infiniband 80:00:00:49:fe:80:00:00:00:00:00:00:00:21:28:00:01:a1:3b:d2 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
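When the output is long, the symptom (every interface reporting "state DOWN" with no UP flag) can be confirmed programmatically. The following is a minimal Python sketch, not part of the original note. It assumes the "ip address" output has been saved as text from the ILOM console session; the function names and the SAMPLE text are hypothetical and used only for illustration.

```python
import re

# Matches the header line of each interface in `ip address` output, e.g.
# "6: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000"
IFACE_RE = re.compile(
    r"^\s*(\d+):\s+([^:\s]+):\s+<([^>]*)>.*?\bstate\s+(\S+)",
    re.MULTILINE,
)

def interface_states(ip_output):
    """Return {interface_name: (flags, state)} parsed from `ip address` text."""
    result = {}
    for m in IFACE_RE.finditer(ip_output):
        flags = m.group(3).split(",") if m.group(3) else []
        result[m.group(2)] = (flags, m.group(4))
    return result

def all_interfaces_down(ip_output):
    """True only if at least one interface was parsed and none of them is up."""
    states = interface_states(ip_output)
    return bool(states) and all(
        state == "DOWN" and "UP" not in flags
        for flags, state in states.values()
    )

# Hypothetical two-interface excerpt in the same shape as the output above.
SAMPLE = """\
1: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:21:28:b2:21:6a brd ff:ff:ff:ff:ff:ff
"""

if __name__ == "__main__":
    print(all_interfaces_down(SAMPLE))
```

Note that even the loopback interface is reported as DOWN here, which suggests that interface bring-up did not run at all rather than that a single link failed.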
Check all the network configuration files under the /etc/sysconfig/network-scripts directory, such as ifcfg-*, rule-*, and route-*. In this case they exist and are valid. Further checking within the sosreport shows no apparent hardware failure on those network interfaces, as in the following output.
For example, for ETH4 and ETH5:
ixgbe 0000:13:00.0: (PCI Express:5.0GT/s:Width x8) 00:1b:21:92:0a:a4
ixgbe 0000:13:00.0: eth4: MAC: 2, PHY: 15, SFP+: 5, PBA No: E70856-007
ixgbe 0000:13:00.0: eth4: Enabled Features: RxQ: 24 TxQ: 24 FdirHash RSC <<<<<<<<<<<<
ixgbe 0000:13:00.0: eth4: Intel(R) 10 Gigabit Network Connection
ixgbe 0000:13:00.1: (PCI Express:5.0GT/s:Width x8) 00:1b:21:92:0a:a5
ixgbe 0000:13:00.1: eth5: MAC: 2, PHY: 15, SFP+: 6, PBA No: E70856-007
ixgbe 0000:13:00.1: eth5: Enabled Features: RxQ: 24 TxQ: 24 FdirHash RSC <<<<<<<<<<<<
ixgbe 0000:13:00.1: eth5: Intel(R) 10 Gigabit Network Connection
For the InfiniBand interfaces, the "ibstat" or "ibstatus" command also shows the links as active:
CA 'mlx4_0'
    CA type: MT26428
    Number of ports: 2
    Firmware version: 2.11.2012
    Hardware version: b0
    Node GUID: 0x0021280001a13bd0
    System image GUID: 0x0021280001a13bd3
    Port 1:
        State: Active <<<<<<<<<<<<<<<<<
        Physical state: LinkUp <<<<<<<<<<<<<<<<<
        Rate: 40
        Base lid: 16
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510868
        Port GUID: 0x0021280001a13bd1
        Link layer: IB
    Port 2:
        State: Active <<<<<<<<<<<<<<<<<
        Physical state: LinkUp <<<<<<<<<<<<<<<<<
        Rate: 40
        Base lid: 17
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510868
        Port GUID: 0x0021280001a13bd2
        Link layer: IB
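The InfiniBand check can be sketched in the same way: the ports are Active and LinkUp at the fabric level even though ib0 and ib1 are DOWN at the OS level. The Python below is an illustrative sketch only, assuming "ibstat" output captured as text in the usual one-field-per-line layout; the function names and IB_SAMPLE are hypothetical.

```python
import re

def ib_port_states(ibstat_output):
    """Return {(ca_name, port_number): {"State": ..., "Physical state": ...}}."""
    ports = {}
    ca = None
    port = None
    for raw in ibstat_output.splitlines():
        line = raw.strip()
        ca_match = re.fullmatch(r"CA '([^']+)'", line)
        if ca_match:
            ca, port = ca_match.group(1), None
            continue
        # "Port 1:" starts a port section; "Port GUID: ..." does not match.
        port_match = re.fullmatch(r"Port (\d+):", line)
        if port_match:
            port = int(port_match.group(1))
            ports[(ca, port)] = {}
            continue
        key, sep, value = line.partition(":")
        if port is not None and sep and key in ("State", "Physical state"):
            # Keep only the first token so trailing "<<<<" annotations are ignored.
            fields = value.split()
            ports[(ca, port)][key] = fields[0] if fields else ""
    return ports

def all_links_active(ports):
    """True only if at least one port was parsed and all are Active and LinkUp."""
    return bool(ports) and all(
        p.get("State") == "Active" and p.get("Physical state") == "LinkUp"
        for p in ports.values()
    )

# Hypothetical excerpt in the same shape as the ibstat output above.
IB_SAMPLE = """\
CA 'mlx4_0'
    Number of ports: 2
    Port 1:
        State: Active
        Physical state: LinkUp
        Port GUID: 0x0021280001a13bd1
    Port 2:
        State: Active
        Physical state: LinkUp
        Port GUID: 0x0021280001a13bd2
"""

if __name__ == "__main__":
    print(all_links_active(ib_port_states(IB_SAMPLE)))
```

A result of active links here, combined with every OS interface being DOWN, is consistent with the note's observation that the hardware itself shows no apparent failure.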
Changes
No major changes were made that might affect all network interfaces at the OS level. The only activity was routine RAID battery maintenance.
Cause