Exadata Compute Node Shows All Network Interfaces Down After Normal Reboot or After HW Maintenance Such As RAID batteries And ILOM Shows " - exec of program '/sbin/ifup' failed" (Doc ID 2319375.1)

Last updated on OCTOBER 12, 2018

Applies to:

Exadata Database Machine X2-2 Hardware - Version All Versions to All Versions [Release All Releases]
Linux x86-64

Symptoms

One of the Exadata compute nodes does not come up properly after the RAID batteries are replaced, or after any other hardware replacement: all configured network interfaces, such as ETH0, BONDETH0, IB0, IB1 or BONDIB0, are unreachable.

The compute node can only be accessed through the ILOM console, from which the OS can be reached. At the OS level, the "ip address" command shows the following output:

1: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN                                              
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000            <<<<<<<<<<<<<<<<<<
link/ether 00:21:28:b2:21:6a brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:21:28:b2:21:6b brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:21:28:b2:21:6c brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:21:28:b2:21:6d brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000            <<<<<<<<<<<<<<<<<< In this case, ETH4 is the slave for BONDETH0
link/ether 00:1b:21:92:0a:a4 brd ff:ff:ff:ff:ff:ff
7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000            <<<<<<<<<<<<<<<<<< In this case, ETH5 is the slave for BONDETH0
link/ether 00:1b:21:92:0a:a5 brd ff:ff:ff:ff:ff:ff
8: ib0: <BROADCAST,MULTICAST> mtu 65520 qdisc noop state DOWN qlen 4096            <<<<<<<<<<<<<<<<<< In this case, IB0 is the slave for BONDIB0
link/infiniband 80:00:00:48:fe:80:00:00:00:00:00:00:00:21:28:00:01:a1:3b:d1 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
9: ib1: <BROADCAST,MULTICAST> mtu 65520 qdisc noop state DOWN qlen 4096            <<<<<<<<<<<<<<<<<< In this case, IB1 is the slave for BONDIB0
link/infiniband 80:00:00:49:fe:80:00:00:00:00:00:00:00:21:28:00:01:a1:3b:d2 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
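
The check above can be scripted when the same symptom has to be confirmed on several nodes. The following is a minimal Python sketch, not part of the original note: the function name down_interfaces and the SAMPLE text are hypothetical, and the regular expression assumes header lines shaped like the "ip address" output shown above. It lists every interface whose state is DOWN and whose flags do not include UP.

```python
import re

# Header lines look like:
#   2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
# (assumed format, taken from the output quoted in this note)
HEADER = re.compile(r"^\d+:\s+([^:@\s]+)(?:@\S+)?:\s+<([^>]*)>.*?\bstate\s+(\S+)")


def down_interfaces(ip_output):
    """Return names of interfaces reported as state DOWN without the UP flag."""
    down = []
    for line in ip_output.splitlines():
        match = HEADER.match(line.strip())
        if not match:
            continue  # skip link/ether, link/infiniband and other detail lines
        name, flags, state = match.groups()
        if state == "DOWN" and "UP" not in flags.split(","):
            down.append(name)
    return down


# Abbreviated copy of the output above, used only for illustration.
SAMPLE = """\
1: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:21:28:b2:21:6a brd ff:ff:ff:ff:ff:ff
8: ib0: <BROADCAST,MULTICAST> mtu 65520 qdisc noop state DOWN qlen 4096
"""

if __name__ == "__main__":
    print(down_interfaces(SAMPLE))
```

In practice the text would come from the command itself, captured on the ILOM console session; the sketch deliberately works on a string so it can be tried without access to the node.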

 

All network configuration files under the /etc/sysconfig/network-scripts directory (ifcfg-*, rule-* and route-*) exist and are valid. Further checking in the sosreport shows no apparent hardware failure on those network interfaces, as in the examples below:

 

For ETH4 and ETH5, for example:

ixgbe 0000:13:00.0: (PCI Express:5.0GT/s:Width x8) 00:1b:21:92:0a:a4
ixgbe 0000:13:00.0: eth4: MAC: 2, PHY: 15, SFP+: 5, PBA No: E70856-007
ixgbe 0000:13:00.0: eth4: Enabled Features: RxQ: 24 TxQ: 24 FdirHash RSC          <<<<<<<<<<<<
ixgbe 0000:13:00.0: eth4: Intel(R) 10 Gigabit Network Connection
ixgbe 0000:13:00.1: (PCI Express:5.0GT/s:Width x8) 00:1b:21:92:0a:a5
ixgbe 0000:13:00.1: eth5: MAC: 2, PHY: 15, SFP+: 6, PBA No: E70856-007
ixgbe 0000:13:00.1: eth5: Enabled Features: RxQ: 24 TxQ: 24 FdirHash RSC          <<<<<<<<<<<<
ixgbe 0000:13:00.1: eth5: Intel(R) 10 Gigabit Network Connection
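
Alongside the driver messages, the configuration-file check described earlier (ifcfg-*, rule-* and route-* under /etc/sysconfig/network-scripts) can be approximated with a short script. This is a hedged sketch only: the function names summarize_network_scripts and missing_ifcfg are hypothetical, and "valid" is reduced here to "present, readable and not empty", which is an assumption and weaker than a full syntax review.

```python
import glob
import os

# File name patterns named in this note.
PATTERNS = ("ifcfg-*", "rule-*", "route-*")


def summarize_network_scripts(directory="/etc/sysconfig/network-scripts"):
    """Map each matching file name to 'ok', 'empty' or 'unreadable'."""
    report = {}
    for pattern in PATTERNS:
        for path in sorted(glob.glob(os.path.join(directory, pattern))):
            name = os.path.basename(path)
            try:
                with open(path, "r", errors="replace") as handle:
                    content = handle.read()
            except OSError:
                report[name] = "unreadable"
                continue
            # Assumption: a file containing only whitespace is treated as suspect.
            report[name] = "ok" if content.strip() else "empty"
    return report


def missing_ifcfg(directory, interfaces):
    """Return the interface names that have no ifcfg-<name> file in directory."""
    return [
        name
        for name in interfaces
        if not os.path.isfile(os.path.join(directory, "ifcfg-" + name))
    ]
```

For instance, missing_ifcfg("/etc/sysconfig/network-scripts", ["eth0", "bondeth0", "ib0", "ib1", "bondib0"]) would return an empty list on a node where every interface named in the symptoms has its configuration file.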

 

For the InfiniBand interfaces, the "ibstat" or "ibstatus" command also shows the links as active:

CA 'mlx4_0'
CA type: MT26428
Number of ports: 2
Firmware version: 2.11.2012
Hardware version: b0
Node GUID: 0x0021280001a13bd0
System image GUID: 0x0021280001a13bd3
Port 1:   
State: Active                     <<<<<<<<<<<<<<<<<
Physical state: LinkUp            <<<<<<<<<<<<<<<<<
Rate: 40
Base lid: 16
LMC: 0
SM lid: 1
Capability mask: 0x02510868
Port GUID: 0x0021280001a13bd1
Link layer: IB
Port 2:
State: Active                    <<<<<<<<<<<<<<<<<
Physical state: LinkUp           <<<<<<<<<<<<<<<<<
Rate: 40
Base lid: 17
LMC: 0
SM lid: 1
Capability mask: 0x02510868
Port GUID: 0x0021280001a13bd2
Link layer: IB
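
The InfiniBand check can be automated in the same spirit. The sketch below is an illustration under assumptions, not an Oracle-supplied tool: the function name ib_port_states is hypothetical, and it assumes "Port N:", "State:" and "Physical state:" lines laid out as in the ibstat output above. It returns the state and physical state of each port so they can be compared against the DOWN state reported by "ip address".

```python
import re


def ib_port_states(ibstat_output):
    """Return {port_number: (state, physical_state)} from ibstat-style text."""
    ports = {}
    current = None
    for raw in ibstat_output.splitlines():
        line = raw.strip()
        port = re.match(r"^Port (\d+):", line)  # does not match "Port GUID:"
        if port:
            current = int(port.group(1))
            ports[current] = [None, None]
        elif current is not None and line.startswith("State:"):
            ports[current][0] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Physical state:"):
            ports[current][1] = line.split(":", 1)[1].strip()
    return {number: tuple(values) for number, values in ports.items()}


# Abbreviated copy of the output above, used only for illustration.
SAMPLE_IBSTAT = """\
CA 'mlx4_0'
Number of ports: 2
Port 1:
State: Active
Physical state: LinkUp
Port GUID: 0x0021280001a13bd1
Port 2:
State: Active
Physical state: LinkUp
Port GUID: 0x0021280001a13bd2
"""

if __name__ == "__main__":
    for number, (state, physical) in sorted(ib_port_states(SAMPLE_IBSTAT).items()):
        print(number, state, physical)
```

A result of Active and LinkUp for both ports, combined with ib0 and ib1 in state DOWN at the OS level, matches the situation described in this note: the links are healthy, but the interfaces were not brought up.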

Changes

No major changes were made that might affect all network interfaces at the OS level. The only activity was routine RAID battery maintenance.

Cause

To view full details, sign in with your My Oracle Support account.
