Exalogic Virtual 2.0.6.4.X: Infiniband Private Network (IPoIB) Communications Between 2.0.6.4.X Guest vServers Not Working
(Doc ID 2603722.1)
Last updated on NOVEMBER 14, 2022
Applies to:
Oracle Exalogic Elastic Cloud Software - Version 2.0.6.4.0 to 2.0.6.4.190716 Linux x86-64
Oracle Virtual Server x86-64
Symptoms
In Exalogic 2.0.6.4.X virtual racks, IPoIB communication between 2.0.6.4.X Guest vServers can fail. If a private vNet is configured and assigned to Guest vServers, communication over that private vNet fails between some pairs of vServers, but not all.
The issue described in this Note affects Guest vServers built from any of the following Exalogic images:
2.0.6.4.0
2.0.6.4.190115 - January 2019 PSU
2.0.6.4.190716 - July 2019 PSU
To illustrate the issue, assume you have four vServers: vserver1, vserver2, vserver3, and vserver4. The following behavior is observed:
Communication between vserver1 and vserver2 or vserver3 works in both directions over the private vNet, but communication between vserver1 and vserver4 does not work.
However, communication between vserver4 and vserver2 or vserver3 works in both directions over the private vNet without any issues.
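The pattern above can be captured as a pairwise reachability check. The following is a minimal sketch, not an Oracle-provided tool: the function names `local_ping`, `broken_pairs`, and `example_reachability` are hypothetical, and the model assumes the vserver1/vserver4 failure applies in both directions, which this Note does not state explicitly.

```python
import subprocess
from itertools import permutations


def local_ping(dst_ip, timeout_s=2):
    """Return True if a single ICMP echo from this host to dst_ip succeeds.

    Uses the standard Linux ping options -c (count) and -W (reply timeout).
    dst_ip is assumed to be the peer's address on the private vNet.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), dst_ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def broken_pairs(hosts, can_reach):
    """Return the sorted (src, dst) pairs for which can_reach(src, dst) is False."""
    return sorted(
        (src, dst) for src, dst in permutations(hosts, 2) if not can_reach(src, dst)
    )


def example_reachability(src, dst):
    """Model of the four-vServer example: only vserver1 <-> vserver4 fails.

    Assumption: the failure is symmetric.
    """
    return {src, dst} != {"vserver1", "vserver4"}


if __name__ == "__main__":
    hosts = ["vserver1", "vserver2", "vserver3", "vserver4"]
    for src, dst in broken_pairs(hosts, example_reachability):
        print(f"{src} -> {dst}: no reply over private vNet")
```

On a real rack, `local_ping` would have to be run on each vServer in turn against the private vNet addresses of its peers, because a ping only tests reachability from the host it runs on.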
Neither the system logs nor the Infiniband switch logs show any errors that explain why Infiniband network communication between the Guest vServers fails. The network bonds on the vServers are not misconfigured (MTU settings, hardware address, and so on), the bonds and their slaves are UP on the vServers, and the Infiniband switches show no configuration or health check problems.
The issue described in this MOS Note is not limited to the private vNet. It affects all Infiniband private networks on the VMs, such as IPoIB-default, IPoIB vserver shared storage, and IPoIB virt admin. If Guest vServer membership in the IPoIB-default and IPoIB vserver shared storage networks was changed from the default limited membership to full (for example, because of an application requirement or a NIS setup), the issue described in this Note will occur on those Infiniband private networks as well. The EoIB networks of the Guest vServers are not affected by this issue.
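The bond and slave state mentioned above can be confirmed from the Linux bonding driver's status file, `/proc/net/bonding/<bond>`. The sketch below parses that file's text; the function name `parse_bond_status` and the bond name `bond1` in the usage comment are hypothetical, and the actual bond names on a given vServer may differ. Because this issue occurs with healthy bonds, a clean result here only rules out a bond misconfiguration; it does not identify the cause.

```python
def parse_bond_status(text):
    """Parse the text of /proc/net/bonding/<bond>.

    Returns (bond_mii_status, {slave_name: slave_mii_status}).
    The first "MII Status" line belongs to the bond itself; each later one
    belongs to the most recent "Slave Interface" line above it.
    """
    bond_status = None
    slaves = {}
    current_slave = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current_slave = value
            slaves[current_slave] = None
        elif key == "MII Status":
            if current_slave is None:
                bond_status = value
            else:
                slaves[current_slave] = value
    return bond_status, slaves


# Usage on a vServer (bond1 is an assumed name):
#     with open("/proc/net/bonding/bond1") as f:
#         status, slaves = parse_bond_status(f.read())
```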
Changes
Restarting some vServers on compute nodes other than the ones where they were originally running often leads to the problem described in this Note. In some cases, IPoIB communication between some vServers breaks suddenly after working for some time.
Cause