Exalogic Infiniband Switch Replacement - Follow-up Actions (Restoration) (Doc ID 2218689.1)

Last updated on JUNE 18, 2021

Applies to:

Exalogic Elastic Cloud X4-2 Eighth Rack - Version X6 to X6 [Release X6]
Exalogic Elastic Cloud X4-2 Eighth Rack - Version X4 to X4 [Release X4]
Oracle Exalogic Elastic Cloud Software - Version 2.0.0.0.0 and later
Exalogic Elastic Cloud X3-2 Hardware - Version X3 to X3 [Release X3]
Linux x86-64
Oracle Solaris on x86-64 (64-bit)
Oracle Virtual Server x86-64



Purpose

This Note provides follow-up reconfiguration steps for restoring the Exalogic Infiniband (IB) switch configuration after replacement.

For Exalogic racks running the January 2018 PSU 2.0.6.3.180116, follow the steps in the following MOS Note to restore the switch configuration:

<Note 2482924.1>: Exalogic: How To Restore Infiniband Switch Configuration After Switch Replacement In Exalogic Racks Running with January 2018 PSU 2.0.6.3.180116

Use this document immediately after the physical replacement steps have been completed.

Before a customer runs these procedures, an Oracle Field Engineer should have finished replacing the switch by following the two Canned Action Plan documents below.

<Note 1383773.1>: How to Replace a Failed Sun Network QDR InfiniBand Gateway Switch 
<Note 1341658.1>: How to Replace a Failed Sun Datacenter InfiniBand Switch 36 

Scope

Infiniband Switch replacements in Exalogic racks (Physical and Virtual).

Details


In this Document
Purpose
Scope
Details
 1. Validate the Firmware version on the Newly Replaced Switch
 2. Validate whether Subnet Manager is disabled on newly replaced Switch
 3. Validate SM controller_handover on current switch running the Master in the Fabric
 4. Validate that the physical installation of the new switch into the fabric was completed successfully. Run the "ibnetdiscover" and "ibswitches" commands.
 5. Change the passwords of the root and ilom-admin users on the replacement switch to their previous values.
 Steps for changing the password for "root" user
 Steps for changing the password for "ilom-admin" user
 6. Update the smnodes list on the replaced switch with the IP addresses of all the Exalogic switches running the subnet manager.
 7. Set SM Priority To Recommended Values on the replaced switch
 8. Validate if ocadmin SNMP community exists and create it if it does not exist
 9. Copy the /conf/partitions.current file from the Switch running the SM Master to the newly replaced Switch under /conf directory
 10. Restore the Switch configuration from Exabr backups.
 a. Check if you are able to log in to the newly replaced switch as the "root" user from Compute Node 1 and the EMOC Control vServer
 b. View a list of backups by running ExaBR list command on replacement Infiniband Switch
 c. Use ExaBR restore command on newly replaced InfiniBand switch to restore the configuration from exabr backups:
 d. Validate if the exabr restore command restored the Switch configuration successfully.
 e. Run "smpartition start && smpartition commit" command on the Switch running the SM Master to propagate the partitions from SM Master to newly replaced Switch
 11. Validate whether the VNICs and VLANs are seen after Exabr restore on the newly replaced Switch.
 12. Register the Port GUIDs of Newly Replaced Switch with EoIB Partitions using "exabr ib-register" command.
 13. Validate the status of VNICs and VLANs
 14. Additional restoration steps for Virtual & Hybrid racks with EMOC
 14(a) Remove old SSH keys for Replaced Infiniband Switch IP from known_hosts file of Proxy Controller VMs PC1 & PC2.
 14(b) Rediscover newly replaced Infiniband Switch from EMOC - Virtual and Hybrid Racks
 Final checkup and verification
 a. Check/set firewall rule settings on port 623
 b. Check the opensm status and smpriorities on all switches in the IB fabric
 c. Check network/fabric is operating normally
 d. Take a fresh Exalogic Control vServers backup using Exabr.
 e. Collect fresh full exalogs from the rack after Switch replacement.
 KNOWN ISSUES WHICH CAN BE ENCOUNTERED DURING SWITCH REPLACEMENT
 Exalogic Exabr ib-register Command To Register New Replaced Infiniband Switch Port GUIDs Fails With "Unable to get rpc version on some nodes in the fabric" Error
 Exalogic: "smpartition start" & "exabr ib-register" Commands Failing With "cli commit is in progress" Error
 Exalogic: SM Status Of New Replaced Standby NM2-GW IB Switches (With Firmware Versions 2.2.7 Or Older) Shown As "DISCOVER" Instead Of "STANDBY"
 Exalogic: Exabr ib-register Command On IB Switches Failing With "Error: partition key <Pkey> does not exist on the Infiniband fabric" Errors
References