ODA Bug 2706016 symptoms: VMs and REPO in UNKNOWN State ; OS, Xen or OAK Commands Hang ; High IO Looping ; TOP High I/O Wait, "INFO: task nfsd:"#### "blocked for more than 120 seconds"; lo_rw_aio_complete:454:ERROR:-116 received. Device loop10 MAJOR:7 MI
(Doc ID 2403746.1)
Last updated on JUNE 01, 2018
Applies to:
Oracle Database Appliance X5-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance Software - Version 188.8.131.52 to 184.108.40.206 [Release 12.1]
Oracle Database Appliance X6-2 HA Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
VM unknown, VM hang, VM offline, Repo UNKNOWN, ODA hang, OS hang, task nfsd, NFSD loop;
You may encounter two or more of the following confirmed symptoms:
VM and Repo Symptoms
- VMs become unresponsive and go into the UNKNOWN state.
- VMs that were running on the two nodes disappear.
- Attempts to shut down the repos fail.
- Eventually the oakcli command may show no REPO filesystems present, although they may still show as mounted.
High I/O or Load
- Servers experience high I/O on loop devices that belong to VMs.
- High load average (> 100) on the basenode with high CPU I/O wait utilization.
- top may show 50% I/O wait.
- "df -h" hangs where NFS mounts appear to be hung.
- "crsctl check crs" is unable to communicate with the Cluster Ready Services.
- The srvctl command from ODA_BASE can hang.
- The server(s) become unresponsive and are effectively locked up.
- A graceful power-off of the Dom0s at the ILOM level may fail.
- A power-down from ILOM and a manual startup may be required.
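The load and I/O wait symptoms above can be checked from the standard Linux /proc files rather than from an interactive tool. The following is a minimal sketch, not an Oracle-provided diagnostic: the function names are hypothetical, and the default thresholds (load above 100, I/O wait of 50%) are simply the figures quoted in this note, not official cutoffs.

```python
def parse_loadavg(text):
    """Return the 1-minute load average from the content of /proc/loadavg."""
    return float(text.split()[0])


def parse_cpu_times(stat_text):
    """Return the aggregate 'cpu' counters from the content of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat samples.

    Only the first eight counters (user .. steal) are summed, because guest
    time is already included in the user counter. iowait is the fifth counter.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


def matches_symptoms(load1, iowait_pct, load_threshold=100.0, iowait_threshold=50.0):
    """True when both figures reach the values described in this note (assumed thresholds)."""
    return load1 > load_threshold and iowait_pct >= iowait_threshold
```

To use it, read /proc/stat twice a few seconds apart, pass both samples through parse_cpu_times, and compare the result of iowait_percent together with parse_loadavg of /proc/loadavg. A match only suggests the symptom pattern; it does not confirm the bug.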
Messages file errors:
- sequential NFSD tasks incrementally fail as they are "blocked for more than 120 seconds"
- lo_rw_aio_complete:454:ERROR:-116 received. Device loop10 MAJOR:7 MINOR:10
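The two message patterns above can be searched for in a copy of the messages file. The following is a minimal sketch: the regular expressions are built only from the strings quoted in this note, the function name is hypothetical, and the exact line format on a given system may differ.

```python
import re
from collections import Counter

# Patterns derived from the messages quoted in this note (assumed format).
NFSD_BLOCKED = re.compile(r"INFO: task nfsd:\d+ blocked for more than 120 seconds")
LOOP_ERROR = re.compile(r"lo_rw_aio_complete:\d+:ERROR:-116 received\. Device (loop\d+)")


def scan_messages(lines):
    """Count blocked nfsd tasks and -116 errors per loop device in log lines."""
    nfsd_hits = 0
    loop_errors = Counter()
    for line in lines:
        if NFSD_BLOCKED.search(line):
            nfsd_hits += 1
        match = LOOP_ERROR.search(line)
        if match:
            loop_errors[match.group(1)] += 1
    return nfsd_hits, dict(loop_errors)
```

A typical call would pass an open file object, for example scan_messages(open("/var/log/messages", errors="replace")), where the path is an assumption about where the messages file is kept. Non-zero counts for both patterns are consistent with the symptoms listed here but are not proof of the bug on their own.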
From the messages files you may find a few key indicators
220.127.116.11 ( or ODA X7-2 18.104.22.168)