Flash Disk Group and Cache Performance Problems in a Clustered Configuration Including ODA X5-2; Symptoms: Very High DBWR CPU; DB Flash Cache Waits (User I/O); Buffer Cache / Busy Waits (Concurrency); enq: TX - row lock contention (Concurrency) (Doc ID 2120400.1)

Last updated on JUNE 13, 2023

Applies to:

Oracle Database Appliance X5-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database - Enterprise Edition - Version 11.2.0.1 to 12.2.0.1 [Release 11.2 to 12.2]
Oracle Database Cloud Schema Service - Version N/A and later
Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
ODA, performance problems, CPU 100%, Hang, FLASH, ODI TRUNCATE, LOCK, RAC, DBWR, SPIN, CHKPT, Lock, outage

Symptoms

This note discusses two independent forms of FLASH usage whose symptoms have a common source:

  • Database Flash Cache enabled on two or more nodes.
  • Data stored in the FLASH disk group on the ODA X5-2, which can encounter similar symptoms even if Flash Cache is not explicitly enabled.

The Flash Cache and buffer cache symptoms discussed here share one factor: disabling Flash Cache works around the problem.

Using Database Flash Cache does not mean you will hit this problem with your usage.
There are potentially two or more bugs with similar symptoms but distinct fixes; all share these prerequisites:

  • Flash Cache enabled, or storage on shared FLASH disks
  • Two or more nodes ¹

This can occur on any platform that uses Flash Cache on more than the local node.
When the Database Smart Flash Cache feature is used on the ODA X5-2, the remote Flash Cache is automatically accessible to the opposing node.
Otherwise, this problem typically requires a RAC configuration with Database Flash Cache enabled and accessible on two or more nodes.

Formerly labeled: ODA X5-2: Lock, Hang for Checkpoint (CKPT) or DBwriter (DBWR), 'GC CURRENT REQUEST' or Intermittent Bad Performance, High CPU and / or Very Poor IO (Doc ID 2120400.1)

* The Oracle Database Appliance (ODA) X5-2 by default automatically enables remote access to the FLASH disk group on both nodes if the cache size <> 0, regardless of the instance type. This applies even when using Enterprise Edition (EE), which is a single-instance database.

Please upgrade to the newest version of the ODA if using FLASH on X5-2. Several one-off patches and merges are also available or in the process of being created.

Note: Oracle Database Appliance X5-2 introduces four additional 400 GB SSDs in slot numbers 16-19 that can be used to host database files or as a database flash cache in addition to the buffer cache.

An ASM disk group named +FLASH with Normal Redundancy is provisioned on these SSDs. All of the storage in the +FLASH disk group is allocated to an ASM Dynamic Volume (flashdata) and formatted as an ACFS file system.
Storage in this flashdata file system is used to create database flash cache files that accelerate read operations.

The file that contains the flash cache is automatically created for each database and is specified using the database init.ora parameter db_flash_cache_file.
By default, db_flash_cache_size is set to 3 times the size of the SGA, up to 196 GB, unless there is not enough space, in which case the size parameter is set to 0.
Changing the db_flash_cache_size parameter requires restarting the database in order to use the newly sized flash cache.
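The default sizing rule described above can be sketched as simple arithmetic. This is only an illustration, not the appliance's actual provisioning logic. The function name default_flash_cache_size, its arguments, the reading of "GB" as 1024^3 bytes, and the reading of "not enough space" as "the wanted size exceeds the free space in the flashdata file system" are all assumptions made for this example.

```python
GIB = 1024 ** 3  # assumption: the note's "GB" is treated as 2**30 bytes


def default_flash_cache_size(sga_bytes: int, free_flash_bytes: int,
                             multiplier: int = 3,
                             cap_bytes: int = 196 * GIB) -> int:
    """Hypothetical helper: default flash cache size in bytes.

    Applies the rule "3 times the SGA, up to 196 GB, else 0".
    """
    # 3 x SGA, limited by the 196 GB ceiling.
    wanted = min(multiplier * sga_bytes, cap_bytes)
    # Assumption: "not enough space" means the wanted size does not fit
    # in the free space available, in which case the size becomes 0.
    return wanted if wanted <= free_flash_bytes else 0


if __name__ == "__main__":
    for sga_gib, free_gib in [(16, 700), (100, 700), (16, 10)]:
        size = default_flash_cache_size(sga_gib * GIB, free_gib * GIB)
        print(f"SGA {sga_gib} GiB, free {free_gib} GiB -> {size // GIB} GiB")
```

Under these assumptions, a 16 GB SGA yields a 48 GB flash cache, a 100 GB SGA is capped at 196 GB, and a file system with too little free space yields 0, which leaves the flash cache disabled.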

Rediscovery

A RAC or clustered* configuration experiences very poor performance², up to and including complete hangs.


Pre-requisite

db_flash_cache_size <> 0
X5-2 ODA or RAC / Cluster configuration
Database flash cache enabled on two or more nodes,
or remote storage enabled in the Flash disk group
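One possible reading of these prerequisites, written as a small predicate, is shown here. It is a sketch for reasoning about exposure, not an Oracle-supplied check. The function name meets_prerequisites and its arguments are hypothetical. The grouping of the conditions (flash in use through either the cache or shared flash storage, combined with two or more nodes) is an interpretation of the list and not something the note states explicitly.

```python
def meets_prerequisites(db_flash_cache_size: int,
                        node_count: int,
                        data_on_shared_flash: bool = False) -> bool:
    """Hypothetical check: could this configuration be exposed?

    Assumed interpretation: flash must be in use (cache size <> 0, or
    data stored in the shared flash disk group) AND the configuration
    must span two or more nodes.
    """
    flash_in_use = db_flash_cache_size != 0 or data_on_shared_flash
    return flash_in_use and node_count >= 2


if __name__ == "__main__":
    # Flash cache disabled, no data on shared flash: not exposed.
    print(meets_prerequisites(0, node_count=2))
    # Flash cache enabled on a two-node cluster: prerequisites met.
    print(meets_prerequisites(48 * 1024 ** 3, node_count=2))
```

Meeting the prerequisites only means the configuration can be affected. It does not mean the problem is occurring.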



Avoids symptoms


 Disable db flash cache

SYMPTOMS

High CPU 

Flash metrics confirm active usage during slow performance time windows.
NOTE: The existence of these metrics does not indicate a problem in progress.
These AWR metrics will confirm Flash Cache usage: