My Oracle Support Banner

User Processes with static Dtrace probes can cause Performance issues or System Hangs (Doc ID 1355119.1)

Last updated on JUNE 23, 2020

Applies to:

Solaris Operating System - Version 10 3/05 and later
Java Platform, Standard Edition - Version 6 to 6 [Release 6]
Information in this document applies to any platform.

Symptoms

The Symptoms described by this document all revolve around the performance of processes which are either starting up, creating child processes, or exiting. In extreme cases it may appear as if processes are hung during start up or failing to exit/die. This issue as described can only happen on Solaris 10 systems and is most commonly observed with Java 6 prior to update 25 ( also see: <Doc ID 1296670.1> )  , but can happen in other situations with other processes.

If lockstat data collected or a crash dump, there may be many threads observed waiting for locks related to dtrace, such as:


Count indv cuml rcnt nsec     Lock                  Hottest Caller
2709   12%  12% 0.00 10140749 dtrace_provider_lock  dtrace_register+0x2b4

    nsec ------ Time Distribution ------ count Stack
   32768 |                                  10 fasttrap_provider_lookup+0x26c
   65536 |@@@@@@                           551 fasttrap_meta_provide+0x8c
  131072 |@                                101 dtrace_helper_provide_one+0xe0
  262144 |                                  23 dtrace_helper_provide+0x48
  524288 |@@@@@@@                          639

    524288 |@@@@@@@                          639
 1048576 |@@                               264
...
...
16777216 |                                   2
33554432 |@@@@@@@@@                        850
67108864 |@                                141
...
...

or


Count indv cuml rcnt nsec    Lock             Hottest Caller
748    11%  94% 0.00 7379028 dtrace_meta_lock dtrace_helpers_destroy+0x90

     nsec ------ Time Distribution ------ count Stack
     8192 |                                   8 proc_exit+0x1b4
    16384 |                                   4 exit+0x8
    32768 |@@                                69 syscall_trap32+0xcc
    65536 |@@@@@@                           155

...
...
  16777216 |@@@@                             117
 33554432 |@                                 46
 67108864 |@                                 31
134217728 |                                  11



or


Count indv cuml rcnt nsec     Lock             Hottest Caller
748    11% 94%  0.00 7379028  dtrace_meta_lock dtrace_helpers_destroy+0x90

     nsec ------ Time Distribution ------ count Stack
     8192 |                                   8 proc_exit+0x1b4
    16384 |                                   4 exit+0x8
    32768 |@@                                69 syscall_trap32+0xcc
    65536 |@@@@@@                           155
...
...
  8388608 |@@                                55
 16777216 |@@@@                             117
 33554432 |@                                 46
 67108864 |@                                 31
134217728 |                                  11

Lockstat data such as the above can be captured by using the lockstat command, see "$man lockstat" for details.  If you only want to check for this particular issue, you may want to use one of the following to look at issues concerning the dtrace_lock , dtrace_meta_lock and dtrace_provider_lock:

# lockstat -A -l dtrace_lock -h -s5 -D 5 -P -w sleep 15
# lockstat -A -l dtrace_meta_lock -h -s5 -D 5 -P -w sleep 15
# lockstat -A -l dtrace_provider_lock -h -s5 -D 5 -P -w sleep 15




The issues is not with the Dtrace probes themselves but with the probes being registered and de-registered with the system which by default happens when the process starts up, every time a process with probes forks, and those probes need to be de-registered every time a processes with probes exits either directly for because it's exec'ing a new process.

This last part brings up a another way to confirm if this is the issue that is causing problems. If you are familiar dtrace you will recall that probes are in the format of provider:module:function:name and for probes associated with running processes the provider includes the process id appended to the provider name.


# /usr/sbin/dtrace -l | head -5
ID PROVIDER  MODULE    FUNCTION NAME
 1 dtrace                       BEGIN
 2 dtrace                       END
 3 dtrace                       ERROR
 4 Xserver636  Xsun AddResource resource-alloc

##
## Note, process 636 has probes registered
##


# /bin/ps -fp 636
UID  PID PPID   C STIME  TTY  TIME CMD
root 636   627  0 Jul 29 ?  132:51 /usr/openwin/bin/Xsun :0 -defdepth 24 -nobanner -auth /var/dt/A:0-8Gayo


##
## That isn't likely going to cause a issue but what other processes have probes
##


##
## Note, that the list includes an arbitrary probe number when it lists
## probes, so the above checks for numbers at the provider
##

You can try something similar to:


# /usr/sbin/dtrace -l | awk '$2 ~ /[0-9]+$/ { print $2 }' | sed -e 's/^[^0-9]*//g' | sort | uniq
375
636
7721

# /bin/ps -fp 375,636,7721
   UID  PID  PPID C STIME    TTY     TIME CMD
daemon  375     1 0 Jul 29   ?       4:40 /usr/lib/nfs/nfsmapid
  root  636   627 0 Jul 29   ?     132:52 /usr/openwin/bin/Xsun :0 -defdepth 24 -nobanner -auth /var/dt/A:0-8Gayob
  user 7721 22647 0 17:26:28 pts/2   0:00 ./jre1.6.0_23/bin/java -version



 

Recall that the probe registrations only need to change when a process starts, exits or forks, so the long running processes are unlikely to be an issue, and even a single Java process by itself may not be an issue.



Changes

The most likely change (if any) will be either increase in number of processes running ( starting, forking, etc...) or an upgrade to an application or library version which didn't formerly include probes.

Cause

To view full details, sign in with your My Oracle Support account.

Don't have a My Oracle Support account? Click to get started!


My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.