DB Instances Not Starting After OS Patching - runaway NFS READ activity on Oracle binary NFS mount
(Doc ID 2403709.1)
Last updated on AUGUST 27, 2018
Applies to: Solaris Operating System - Version 10 1/13 U11 to 11.3 [Release 10.0 to 11.0]
Information in this document applies to any platform.
DB instances with all DB data and binaries mounted over NFS failed to start after patching to or beyond Solaris 10 kernel patch 150400-52 (SPARC) or 150401-52 (x64).
The database instance never came up after the start attempt - sqlplus connections to the database received no response, and the database appeared to have hung at startup.
The DB binary mount point was observed issuing NFS READs very aggressively - mostly small reads, at a very high rate of up to 15,000/sec, enough to saturate the 1Gbps link. This continued for at least 30-60 minutes with no sign of stopping.
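As a rough sanity check on the figures above, the following sketch works out how large the average read would need to be for 15,000 reads/sec to fill a 1Gbps link. It ignores NFS, RPC and TCP/IP overhead, and the function name is purely illustrative; it is not taken from the original note.

```python
# Back-of-envelope estimate only: protocol overhead is ignored.
LINK_BITS_PER_SEC = 1_000_000_000   # 1Gbps link, as stated in the note
READS_PER_SEC = 15_000              # peak NFS READ rate, as stated in the note


def saturating_read_size(link_bps=LINK_BITS_PER_SEC, reads_per_sec=READS_PER_SEC):
    """Average bytes per read at which the given read rate fills the link."""
    link_bytes_per_sec = link_bps / 8
    return link_bytes_per_sec / reads_per_sec


if __name__ == "__main__":
    size = saturating_read_size()
    print(f"average read size to saturate the link: {size:.0f} bytes")
```

Under these assumptions the result is roughly 8 KB per read, which is in line with the description of small reads at a very high rate.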
Further research indicated that most DB processes - a few dozen at least - were constantly issuing NFS READs via page faults on mmap'ed files.
Threadlists indicated that just about every Oracle DB daemon was constantly issuing NFS READs due to page faults on mmap'ed files.
Here is one example thread:
PC: _resume_from_idle+0xfb CMD: ora_dbw1_PYH
stack pointer for thread ffffffffdabfd0a0: fffffe80020843b0
[ fffffe80020843b0 _resume_from_idle+0xfb() ]
Just about every DB daemon was in this state - constantly.
The customer had patched from a level prior to 150401-48 to 150401-59. Rolling back the kernel patch - to anything before 150401-52 (x64) or 150400-52 (SPARC) - restored the customer's databases to normal operation.
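The patch levels above can be turned into a simple check. The sketch below assumes the kernel patch level is available as a string of the form "Generic_150401-59", as typically reported by `uname -v` on Solaris 10; the function name and the parsing are illustrative assumptions and not part of the original note. It only recognises the two Solaris 10 patch IDs named in this document and returns None for anything else.

```python
import re

# Patch IDs and first affected revision, as stated in this note.
AFFECTED_PATCH_IDS = {"150400": "SPARC", "150401": "x64"}
FIRST_AFFECTED_REV = 52


def kernel_patch_affected(kernel_version):
    """Return True/False for a recognised patch ID, None if not recognised.

    kernel_version is assumed to look like "Generic_150401-59".
    """
    match = re.search(r"(\d{6})-(\d{2})", kernel_version)
    if match is None:
        return None
    patch_id, revision = match.group(1), int(match.group(2))
    if patch_id not in AFFECTED_PATCH_IDS:
        return None
    return revision >= FIRST_AFFECTED_REV


if __name__ == "__main__":
    for version in ("Generic_150401-48", "Generic_150401-59", "Generic_150400-52"):
        print(version, kernel_patch_affected(version))
```

A result of True only means the system is at or beyond the patch revision named in this note; it does not by itself confirm that the NFS READ behaviour is occurring.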