User Processes with static Dtrace probes can cause Performance issues or System Hangs (Doc ID 1355119.1)

Last updated on AUGUST 16, 2023

Applies to:

Solaris Operating System - Version 10 3/05 and later
Java Platform, Standard Edition - Version 6 to 6 [Release 6]
Information in this document applies to any platform.

Symptoms

The Symptoms described by this document all revolve around the performance of processes which are either starting up, creating child processes, or exiting. In extreme cases it may appear as if processes are hung during start up or failing to exit/die. This issue as described can only happen on Solaris 10 systems and is most commonly observed with Java 6 prior to update 25 ( also see: <Doc ID 1296670.1> ) , but can happen in other situations with other processes.

If lockstat data collected or a crash dump, there may be many threads observed waiting for locks related to dtrace, such as:

Count indv cuml rcnt nsec     Lock    Hottest Caller
2709   12% 12% 0.00 10140749 dtrace_provider_lock dtrace_register+0x2b4

    nsec ------ Time Distribution ------ count Stack
   32768 |                            10 fasttrap_provider_lookup+0x26c
   65536 |@@@@@@       551 fasttrap_meta_provide+0x8c
131072 |@            101 dtrace_helper_provide_one+0xe0
262144 |                  23 dtrace_helper_provide+0x48
524288 |@@@@@@@          639
    524288 |@@@@@@@      639
1048576 |@@           264
...
...
16777216 |                 2
33554432 |@@@@@@@@@    850
67108864 |@          141
...
...

or

Count indv cuml rcnt nsec    Lock             Hottest Caller
748    11% 94% 0.00 7379028 dtrace_meta_lock dtrace_helpers_destroy+0x90

     nsec ------ Time Distribution ------ count Stack
     8192 | 8 proc_exit+0x1b4
    16384 |     4 exit+0x8
    32768 |@@                                69 syscall_trap32+0xcc
    65536 |@@@@@@       155
...
...
16777216 |@@@@                             117
33554432 |@                                 46
67108864 |@                                 31
134217728 |                                  11

or

Count indv cuml rcnt nsec     Lock             Hottest Caller
748    11% 94% 0.00 7379028 dtrace_meta_lock dtrace_helpers_destroy+0x90

     nsec ------ Time Distribution ------ count Stack
     8192 |                                   8 proc_exit+0x1b4
    16384 |                                   4 exit+0x8
    32768 |@@                                69 syscall_trap32+0xcc
    65536 |@@@@@@                           155
...
...
8388608 |@@                                55
16777216 |@@@@                             117
33554432 |@                                 46
67108864 |@                                 31
134217728 |    11

Lockstat data such as the above can be captured by using the lockstat command, see "$man lockstat" for details. If you only want to check for this particular issue, you may want to use one of the following to look at issues concerning the dtrace_lock , dtrace_meta_lock and dtrace_provider_lock:

# lockstat -A -l dtrace_lock -h -s5 -D 5 -P -w sleep 15
# lockstat -A -l dtrace_meta_lock -h -s5 -D 5 -P -w sleep 15
# lockstat -A -l dtrace_provider_lock -h -s5 -D 5 -P -w sleep 15

The issues is not with the Dtrace probes themselves but with the probes being registered and de-registered with the system which by default happens when the process starts up, every time a process with probes forks, and those probes need to be de-registered every time a processes with probes exits either directly for because it's exec'ing a new process.

This last part brings up a another way to confirm if this is the issue that is causing problems. If you are familiar dtrace you will recall that probes are in the format of provider:module:function:name and for probes associated with running processes the provider includes the process id appended to the provider name.

# /usr/sbin/dtrace -l | head -5
ID PROVIDER MODULE    FUNCTION NAME
1 dtrace BEGIN
2 dtrace       END
3 dtrace ERROR
4 Xserver636 Xsun AddResource resource-alloc

##
## Note, process 636 has probes registered
##

# /bin/ps -fp 636
UID PID PPID C STIME TTY TIME CMD
root 636 627 0 Jul 29 ? 132:51 /usr/openwin/bin/Xsun :0 -defdepth 24 -nobanner -auth /var/dt/A:0-8Gayo

##
## That isn't likely going to cause a issue but what other processes have probes
##

##
## Note, that the list includes an arbitrary probe number when it lists
## probes, so the above checks for numbers at the provider
##

You can try something similar to:

# /usr/sbin/dtrace -l | awk '$2 ~ /[0-9]+$/ { print $2 }' | sed -e 's/^[^0-9]*//g' | sort | uniq
375
636
7721

# /bin/ps -fp 375,636,7721
   UID PID PPID C STIME    TTY     TIME CMD
daemon 375 1 0 Jul 29   ?     4:40 /usr/lib/nfs/nfsmapid
root 636 627 0 Jul 29   ? 132:52 /usr/openwin/bin/Xsun :0 -defdepth 24 -nobanner -auth /var/dt/A:0-8Gayob
user 7721 22647 0 17:26:28 pts/2 0:00 ./jre1.6.0_23/bin/java -version

Recall that the probe registrations only need to change when a process starts, exits or forks, so the long running processes are unlikely to be an issue, and even a single Java process by itself may not be an issue.

Changes

The most likely change (if any) will be either increase in number of processes running ( starting, forking, etc...) or an upgrade to an application or library version which didn't formerly include probes.

Cause

	To view full details, sign in with your My Oracle Support account.
	Don't have a My Oracle Support account? Click to get started!

My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.

User Processes with static Dtrace probes can cause Performance issues or System Hangs (Doc ID 1355119.1)

Applies to:

Symptoms

Changes

Cause

To view full details, sign in with your My Oracle Support account.

Don't have a My Oracle Support account? Click to get started!