SuperCluster - Infiniband switch reboot may cause database evictions if there are a large number of RDS connections (Doc ID 2043654.1)

Last updated on FEBRUARY 07, 2017

Applies to:

SPARC SuperCluster T4-4 Full Rack - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 Half Rack - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster T5-8 Full Rack - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster T5-8 Half Rack - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster M6-32 Hardware - Version All Versions to All Versions [Release All Releases]
Oracle Solaris on SPARC (64-bit)

Symptoms

One or more database nodes may evict due to diskmon split brain for RAC, or diskmon may fence off cells for non-RAC.

If this has occurred, indications similar to the following can be found in the diskmon.trc file after the rebooted switch comes online:

To fully recover from a hung zone, the domain containing the database zones needs to be rebooted.

Changes

When a SuperCluster Infiniband switch is rebooted, database nodes may evict on domains with a large number of RDS connections. The numbers stated are on a per-LDom (global zone) basis, not cumulative across all LDoms (global zones).
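
As an illustration of the per-LDom point, here is a minimal Python sketch. The function name, the domain names, the connection counts and the threshold value are all hypothetical and chosen only for the example; substitute the counts and the threshold that apply to your system.

```python
from typing import Dict, List


def ldoms_over_threshold(rds_connections_by_ldom: Dict[str, int],
                         safe_threshold: int) -> List[str]:
    """Return the LDoms whose own RDS connection count exceeds the threshold.

    Each LDom (global zone) is judged on its own count; the counts are
    never summed across LDoms.
    """
    return sorted(
        name for name, count in rds_connections_by_ldom.items()
        if count > safe_threshold
    )


# Hypothetical numbers, for illustration only.
example_counts = {"ldom1": 900, "ldom2": 1500, "ldom3": 400}
example_threshold = 1000  # placeholder value, not an Oracle-published limit

print(ldoms_over_threshold(example_counts, example_threshold))
```

With these made-up values only ldom2 is flagged, even though the three counts added together would be far above the placeholder threshold.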

There are a number of factors which impact the number of RDS connections, in particular Network Resource Management (a.k.a. Exadata QOS / TOS), which is enabled in Database 12.1.0.2, Solaris 11.2 SRU1, and 12.1.2.x.x Exadata Storage Cells. These, or later, versions are included in the SuperCluster QFSDP from July 2015 onwards. Other factors include the number of Databases, the number of Exadata Storage Cells they use, etc.

This Doc provides information for users to determine their current number of RDS connections and prediction formulae to determine the likely number of RDS connections after applying the July 2015 QFSDP or later.

The July 2015 QFSDP or later will be suitable for most SuperCluster customers to install as is. In particular, enhancements contained in the July 2016 QFSDP, and further enhancements contained in the Oct 2016 QFSDP, make these QFSDPs suitable for the vast majority of customers.

If the prediction formulae indicate that applying this QFSDP would exceed the current safe threshold of RDS connections, then file an SR and await further advice.

The prediction formulae should also be used before deploying additional Databases on a system.

Cause
