Servers running Solaris 11.4 may hang, or ps(1) and other /proc commands may hang, when a process calls the spawn() system call
(Doc ID 2589560.1)
Last updated on OCTOBER 12, 2023
Applies to:
Solaris Operating System - Version 11.4 to 11.4 [Release 11.0]
Oracle Solaris on x86-64 (64-bit)
Oracle Solaris on SPARC (64-bit)
Symptoms
In Solaris, we block the process against /proc while we manipulate p->p_tlist.
We want to do this early enough that we do not drop p->p_lock until the thread has been put on p->p_tlist.
To prevent another thread from accessing the /proc files while a thread holds the pr_p_lock, we call prbarrier(), which waits until the pr_p_lock is released and the process is unlocked.
Because of this condition, the system as a whole is not hung, but it reaches a point where commands that need to access procfs start to hang, because the files for the affected process in /proc are locked.
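The hold-and-barrier pattern described above can be sketched in user space with POSIX threads. This is only an analogy, not the Solaris kernel code: struct demo_proc, DEMO_PROC_INIT, demo_hold(), demo_release() and demo_barrier() are hypothetical names introduced for illustration, and the mutex merely plays the role that p->p_lock plays in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a process entry; NOT the kernel proc_t. */
struct demo_proc {
    pthread_mutex_t lock;     /* plays the role of p->p_lock */
    pthread_cond_t  unlocked; /* broadcast when the hold is dropped */
    bool            held;     /* true while some thread holds the entry */
};

#define DEMO_PROC_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Take the hold: wait for any current holder, then mark the entry as held. */
void demo_hold(struct demo_proc *p)
{
    pthread_mutex_lock(&p->lock);
    while (p->held)
        pthread_cond_wait(&p->unlocked, &p->lock);
    p->held = true;
    pthread_mutex_unlock(&p->lock);
}

/* Drop the hold and wake every thread waiting for it. */
void demo_release(struct demo_proc *p)
{
    pthread_mutex_lock(&p->lock);
    assert(p->held);          /* releasing an entry nobody holds is a bug */
    p->held = false;
    pthread_cond_broadcast(&p->unlocked);
    pthread_mutex_unlock(&p->lock);
}

/* Barrier: return only once nobody holds the entry; does not take the hold. */
void demo_barrier(struct demo_proc *p)
{
    pthread_mutex_lock(&p->lock);
    while (p->held)
        pthread_cond_wait(&p->unlocked, &p->lock);
    pthread_mutex_unlock(&p->lock);
}
```

In this sketch, a thread that calls demo_barRiver-style waiting code, that is demo_barrier() or demo_hold(), while another thread never calls demo_release() sleeps on the condition variable indefinitely. That is the user-space counterpart of ps(1) hanging on a locked /proc entry while the rest of the system keeps running.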
If this problem is affecting a server, a system crash dump collected while the hang is happening would show threads blocked on a mutex held by a thread that is sleeping on a condition variable from pr_p_lock, as follows:
Changes
spawn() is a new system call added to implement posix_spawn(3C). Existing code was changed and new code was added in Solaris 11.4 to support it. Java in the Grid software uses spawn().
The spawn() call acquires the following locks: fi_lock, uf_lock, f_tlock.
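For reference, applications normally reach this code path through the standard posix_spawn() interface rather than by calling spawn() directly. The following is a minimal sketch of such a caller, assuming default file actions and attributes; spawn_and_wait() is a hypothetical helper name and is not taken from this document or from any Oracle software.

```c
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper: start argv[0] with posix_spawn(), wait for the child,
 * and return its exit status. Returns -1 if the spawn or the wait fails, or
 * if the child did not exit normally.
 */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    /* NULL file actions and NULL attributes: the child gets the defaults. */
    if (posix_spawn(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass a NULL-terminated argument vector whose first element is the full path of the program, for example { "/bin/sh", "-c", "exit 0", NULL }. Based on the description above, a program that creates children this way on Solaris 11.4 would be expected to exercise the spawn() system call.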
Cause