User Processes with static Dtrace probes can cause Performance issues or System Hangs
(Doc ID 1355119.1)
Last updated on AUGUST 16, 2023
Applies to:
Solaris Operating System - Version 10 3/05 and laterJava Platform, Standard Edition - Version 6 to 6 [Release 6]
Information in this document applies to any platform.
Symptoms
The Symptoms described by this document all revolve around the performance of processes which are either starting up, creating child processes, or exiting. In extreme cases it may appear as if processes are hung during start up or failing to exit/die. This issue as described can only happen on Solaris 10 systems and is most commonly observed with Java 6 prior to update 25 ( also see: <Doc ID 1296670.1> ) , but can happen in other situations with other processes.
If lockstat data collected or a crash dump, there may be many threads observed waiting for locks related to dtrace, such as:
Count indv cuml rcnt nsec Lock Hottest Caller
2709 12% 12% 0.00 10140749 dtrace_provider_lock dtrace_register+0x2b4
nsec ------ Time Distribution ------ count Stack
32768 | 10 fasttrap_provider_lookup+0x26c
65536 |@@@@@@ 551 fasttrap_meta_provide+0x8c
131072 |@ 101 dtrace_helper_provide_one+0xe0
262144 | 23 dtrace_helper_provide+0x48
524288 |@@@@@@@ 639
524288 |@@@@@@@ 639
1048576 |@@ 264
...
...
16777216 | 2
33554432 |@@@@@@@@@ 850
67108864 |@ 141
...
...
or
Count indv cuml rcnt nsec Lock Hottest Caller
748 11% 94% 0.00 7379028 dtrace_meta_lock dtrace_helpers_destroy+0x90
nsec ------ Time Distribution ------ count Stack
8192 | 8 proc_exit+0x1b4
16384 | 4 exit+0x8
32768 |@@ 69 syscall_trap32+0xcc
65536 |@@@@@@ 155
...
...
16777216 |@@@@ 117
33554432 |@ 46
67108864 |@ 31
134217728 | 11
or
Count indv cuml rcnt nsec Lock Hottest Caller
748 11% 94% 0.00 7379028 dtrace_meta_lock dtrace_helpers_destroy+0x90
nsec ------ Time Distribution ------ count Stack
8192 | 8 proc_exit+0x1b4
16384 | 4 exit+0x8
32768 |@@ 69 syscall_trap32+0xcc
65536 |@@@@@@ 155
...
...
8388608 |@@ 55
16777216 |@@@@ 117
33554432 |@ 46
67108864 |@ 31
134217728 | 11
# lockstat -A -l dtrace_lock -h -s5 -D 5 -P -w sleep 15
# lockstat -A -l dtrace_meta_lock -h -s5 -D 5 -P -w sleep 15
# lockstat -A -l dtrace_provider_lock -h -s5 -D 5 -P -w sleep 15
The issues is not with the Dtrace probes themselves but with the probes being registered and de-registered with the system which by default happens when the process starts up, every time a process with probes forks, and those probes need to be de-registered every time a processes with probes exits either directly for because it's exec'ing a new process.
This last part brings up a another way to confirm if this is the issue that is causing problems. If you are familiar dtrace you will recall that probes are in the format of provider:module:function:name and for probes associated with running processes the provider includes the process id appended to the provider name.
# /usr/sbin/dtrace -l | head -5
ID PROVIDER MODULE FUNCTION NAME
1 dtrace BEGIN
2 dtrace END
3 dtrace ERROR
4 Xserver636 Xsun AddResource resource-alloc
##
## Note, process 636 has probes registered
##
# /bin/ps -fp 636
UID PID PPID C STIME TTY TIME CMD
root 636 627 0 Jul 29 ? 132:51 /usr/openwin/bin/Xsun :0 -defdepth 24 -nobanner -auth /var/dt/A:0-8Gayo
##
## That isn't likely going to cause a issue but what other processes have probes
##
##
## Note, that the list includes an arbitrary probe number when it lists
## probes, so the above checks for numbers at the provider
##
You can try something similar to:
# /usr/sbin/dtrace -l | awk '$2 ~ /[0-9]+$/ { print $2 }' | sed -e 's/^[^0-9]*//g' | sort | uniq
375
636
7721
# /bin/ps -fp 375,636,7721
UID PID PPID C STIME TTY TIME CMD
daemon 375 1 0 Jul 29 ? 4:40 /usr/lib/nfs/nfsmapid
root 636 627 0 Jul 29 ? 132:52 /usr/openwin/bin/Xsun :0 -defdepth 24 -nobanner -auth /var/dt/A:0-8Gayob
user 7721 22647 0 17:26:28 pts/2 0:00 ./jre1.6.0_23/bin/java -version
Changes
The most likely change (if any) will be either increase in number of processes running ( starting, forking, etc...) or an upgrade to an application or library version which didn't formerly include probes.
Cause
To view full details, sign in with your My Oracle Support account. |
|
Don't have a My Oracle Support account? Click to get started! |