How to Replace an Infiniband HCA Card in Oracle SuperCluster Compute / DB Nodes (Doc ID 2044499.1)

Last updated on NOVEMBER 01, 2018

Applies to:

Oracle SuperCluster T5-8 Hardware - Version All Versions and later
SPARC SuperCluster T4-4 Half Rack - Version All Versions and later
Oracle SuperCluster T5-8 Full Rack - Version All Versions and later
Oracle SuperCluster T5-8 Half Rack - Version All Versions and later
SPARC SuperCluster T4-4 Full Rack - Version All Versions and later
Oracle Solaris on SPARC (64-bit)
SPARC

Goal

Replacing an InfiniBand HCA card in SuperCluster requires additional steps to ensure that the new component has the correct firmware installed and to reconfigure the HCA port GUIDs in the InfiniBand fabric. Skipping these steps leaves the system in a dysfunctional state and causes further downtime for the customer. This document describes how to replace a faulty InfiniBand card in SuperCluster Compute and Database Nodes and how to ensure that the firmware is updated and the port GUIDs are correct.

Solution

In this Document
Goal
Solution
 Process
 1. Identify failing card from fmadm or explorer analysis
 2. Collect firmware and port GUID information
 3. Determine which IB switch is master.
 4. Collect IB partition data
 5. Collect ibstat(1M) output
 6. Ordering parts.
 7. Preparing the server
 8. Record new component GUIDs and replace the card.
 9. Update the IB fabric master partition by changing the port GUIDs on the master IB switch.
 10. Power on and reboot the primary LDOM
 11a. Check HCA Firmware.
 11b. Flashing HCA Firmware
 11c. Checking the new HCA card port GUIDs and other sanity checks
 12. Restore operation.
 13. Verify Infiniband topology.
 14. DB Node Startup Verification
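
The outline above collects ibstat(1M) output before the replacement (step 5) and checks the new card's port GUIDs afterwards (step 11c). As an illustration only, the following Python sketch compares two saved ibstat captures and lists which port GUIDs disappeared and which appeared. It assumes each port is reported on a line of the form "Port GUID: 0x...", as the OFED ibstat utility prints; the function names and the sample GUIDs are hypothetical and do not come from this document.

```python
import re

# Assumption: every port stanza contains a line such as
# "Port GUID: 0x0021280001a0b1c1" (the layout used by OFED ibstat).
# Adjust the pattern if the captured output looks different.
PORT_GUID_RE = re.compile(r"Port GUID:\s*(0x[0-9a-fA-F]+)")


def port_guids(ibstat_text):
    """Return the port GUIDs found in captured ibstat output, lowercased, in order."""
    return [m.group(1).lower() for m in PORT_GUID_RE.finditer(ibstat_text)]


def guid_changes(before_text, after_text):
    """Return (removed, added): GUIDs seen only before and only after the swap."""
    before = set(port_guids(before_text))
    after = set(port_guids(after_text))
    return sorted(before - after), sorted(after - before)


if __name__ == "__main__":
    # Hypothetical captures: one HCA replaced, one untouched.
    before = """
    Port 1:
        Port GUID: 0x0021280001A0B1C1
    Port 2:
        Port GUID: 0x0021280001a0b1c2
    Port 1:
        Port GUID: 0x0021280001ffee01
    """
    after = """
    Port 1:
        Port GUID: 0x0010e00001d4e5f1
    Port 2:
        Port GUID: 0x0010e00001d4e5f2
    Port 1:
        Port GUID: 0x0021280001ffee01
    """
    removed, added = guid_changes(before, after)
    print("GUIDs no longer present:", removed)
    print("GUIDs newly present:   ", added)
```

The removed list corresponds to the old card's port GUIDs and the added list to the new card's, which are the values the outline replaces on the master IB switch in step 9. The comparison is purely textual and does not query the fabric.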