Exalogic Infiniband Switch Replacement - Follow-up Actions (Restoration)
(Doc ID 2218689.1)
Last updated on JUNE 18, 2021
Applies to:
Exalogic Elastic Cloud X4-2 Eighth Rack - Version X6 to X6 [Release X6]
Exalogic Elastic Cloud X4-2 Eighth Rack - Version X4 to X4 [Release X4]
Oracle Exalogic Elastic Cloud Software - Version 22.214.171.124.0 and later
Exalogic Elastic Cloud X3-2 Hardware - Version X3 to X3 [Release X3]
Oracle Solaris on x86-64 (64-bit)
Oracle Virtual Server x86-64
This Note provides follow-up reconfiguration steps for restoring the Exalogic Infiniband (IB) switch configuration after replacement.
For Exalogic racks running the January 2018 PSU 126.96.36.199.180116, the steps in the following MOS Note must be followed to restore the switch configuration.
This document is intended to be used immediately after physical replacement steps have been completed.
Prior to a customer running these procedures, an Oracle Field Engineer should have completed replacing the switch by following the two Canned Action Plan documents below.
Infiniband Switch replacements in Exalogic racks (Physical and Virtual).
In this Document
|1. Validate the Firmware version on the Newly Replaced Switch|
|2. Validate whether Subnet Manager is disabled on newly replaced Switch|
|3. Validate SM controller_handover on current switch running the Master in the Fabric|
|4. Validate that the physical installation of the new switch into the fabric was completed successfully. Run the "ibnetdiscover" and "ibswitches" commands.|
|5. Change the passwords of the root and ilom-admin users on the replacement switch to their previous values.|
|Steps for changing the password for "root" user|
|Steps for changing the password for "ilom-admin" user|
|6. Update the smnodes list on the replaced switch with the IP addresses of all the Exalogic switches running the subnet manager.|
|7. Set SM Priority To Recommended Values on the replaced switch|
|8. Validate whether the ocadmin SNMP community exists and create it if it does not exist|
|9. Copy the /conf/partitions.current file from the Switch running the SM Master to the newly replaced Switch under /conf directory|
|10. Restore the Switch configuration from Exabr backups.|
|a. Check whether you can log in to the newly replaced switch as the "root" user from Compute Node 1 and the EMOC Control vServer|
|b. View a list of backups by running ExaBR list command on replacement Infiniband Switch|
|c. Use ExaBR restore command on newly replaced InfiniBand switch to restore the configuration from exabr backups:|
|d. Validate if the exabr restore command restored the Switch configuration successfully.|
|e. Run the "smpartition start && smpartition commit" commands on the Switch running the SM Master to propagate the partitions from the SM Master to the newly replaced Switch|
|11. Validate whether the VNICs and VLANs are seen after Exabr restore on the newly replaced Switch.|
|12. Register the Port GUIDs of Newly Replaced Switch with EoIB Partitions using "exabr ib-register" command.|
|13. Validate the status of VNICs and VLANs|
|14. Additional restoration steps for Virtual & Hybrid racks with EMOC|
|14(a) Remove old SSH keys for the Replaced Infiniband Switch IP from the known_hosts file of Proxy Controller VMs PC1 & PC2.|
|14(b) Rediscover newly replaced Infiniband Switch from EMOC - Virtual and Hybrid Racks|
|Final checkup and verification|
|a. Check/set firewall rule settings on port 623|
|b. Check the opensm status and smpriorities on all switches in the IB fabric|
|c. Check network/fabric is operating normally|
|d. Take a fresh Exalogic Control vServers backup using Exabr.|
|e. Collect fresh full exalogs from the rack after Switch replacement.|
|KNOWN ISSUES WHICH CAN BE ENCOUNTERED DURING SWITCH REPLACEMENT|
|Exalogic Exabr ib-register Command To Register New Replaced Infiniband Switch Port GUIDs Fails With "Unable to get rpc version on some nodes in the fabric" Error|
|Exalogic: "smpartition start" & "exabr ib-register" Commands Failing With "cli commit is in progress" Error|
|Exalogic: SM Status Of New Replaced Standby NM2-GW IB Switches (With Firmware Versions 2.2.7 Or Older) Shown As "DISCOVER" Instead Of "STANDBY"|
|Exalogic: Exabr ib-register Command On IB Switches Failing With "Error: partition key <Pkey> does not exist on the Infiniband fabric" Errors|