Solaris Clock Thread Performance On Systems with >= 256 VCPUs May Degrade Resulting In High %SYS Or System Hangs

(Doc ID 1377071.1)

Last updated on JULY 29, 2016

Applies to:

Solaris Operating System - Version 10 3/05 to 10 8/11 U10 [Release 10.0]
Solaris Operating System
Information in this document applies to any platform.

Symptoms

Systems with more than 256 Virtual CPUs (VCPUs) running Solaris 10 or Solaris 11 Express may hang or perform poorly because the clock thread does not scale well at that CPU count.

If this issue is encountered, one or more of the symptoms shown in the examples below may be observed:

Example of vmstat output:

  r b w      swap      free  re mf pi po fr de sr s0 s1 s2 s3   in   sy  cs us sy id
676 0 0 453401192 253089784   4 26  0  3  3  0  0  0  0  0  0 1883 1626 455 19 11 70
685 0 0 453397160 253088488   2  9  0  0  0  0  0  0  0  0  0 3123  619 407 17 13 70
690 0 0 453397184 253088520   0  0  0  0  0  0  0  0  0  0  0 2569  999 255 17 17 66
696 0 0 453392888 253088560  54 15  0  0  0  0  0  1  1  0  0 1950  780 215 17 15 68
699 0 0 453391384 253085536 114 66  0  1  1  0  0  0  0  0  0 2675 1445 235 17 13 70
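
The pattern in the sample above, a run queue (r) in the high hundreds while the CPUs remain roughly 70 percent idle (id), can be checked mechanically. The following is a minimal Python sketch, not an Oracle-supplied tool. The function names and the thresholds (256 runnable threads, 50 percent idle) are illustrative assumptions, and the code assumes the column layout shown above, with r first and us, sy, id last.

```python
def parse_vmstat_line(line):
    """Return (run_queue, sys_pct, idle_pct) from one vmstat data line.

    Fields are read by position because the header contains two
    columns named 'sy' (system calls and system CPU time).
    """
    fields = line.split()
    run_queue = int(fields[0])   # 'r': threads in the run queue
    sys_pct = int(fields[-2])    # last 'sy': percent system time
    idle_pct = int(fields[-1])   # 'id': percent idle time
    return run_queue, sys_pct, idle_pct


def looks_suspicious(line, min_run_queue=256, min_idle_pct=50):
    """Flag samples with a long run queue despite mostly idle CPUs.

    Both thresholds are hypothetical defaults, not documented values.
    """
    run_queue, _sys_pct, idle_pct = parse_vmstat_line(line)
    return run_queue >= min_run_queue and idle_pct >= min_idle_pct


sample = ("676 0 0 453401192 253089784 4 26 0 3 3 0 0 "
          "0 0 0 0 1883 1626 455 19 11 70")
print(parse_vmstat_line(sample))
print(looks_suspicious(sample))
```

A flagged sample is only a hint that runnable threads are not being dispatched; confirming the clock thread as the cause still requires the crash dump analysis shown in the following examples.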


Example from Solaris Crash Analysis Tool (SCAT) 'thread' command looking at the clock thread:

CAT> thread 0x2a100047ca0
==== interrupt thread: 0x2a100047ca0 PID: 0 on CPU: 0 affinity CPU: 0 (last_swtch: -22.56s) PIL: 10 ====

Example from Solaris Crash Analysis Tool (SCAT) 'coreinfo' command:

CAT> coreinfo
.....
sysent...clock...
WARNING: genunix:clock+0x0 CPU0 cyclic pend: 3229547 (8h58m15.470000000s)
.....
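
The pending count and the duration in the WARNING line above are consistent with a clock cyclic that fires 100 times per second: 3229547 pending firings at 10 ms each is 32295.47 seconds, or 8h58m15.47s. The short Python sketch below reproduces that arithmetic. The function name is hypothetical, and the rate of 100 per second is an assumption inferred from the figures in the output, not a value stated in this document.

```python
def pend_to_duration(pend_count, hz=100):
    """Convert a count of pending clock firings into h/m/s text.

    hz is the assumed number of clock firings per second.
    """
    hours, rest = divmod(pend_count, 3600 * hz)
    minutes, rest = divmod(rest, 60 * hz)
    seconds = rest / hz
    return f"{hours}h{minutes}m{seconds:.2f}s"


print(pend_to_duration(3229547))  # prints 8h58m15.47s
```

The same conversion applied to the "idle: 7869 ticks" field of the SCAT 'thread' output gives 1 minute 18.69 seconds, which matches the value SCAT prints.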


Example from Solaris Crash Analysis Tool (SCAT) 'coreinfo' and 'dispq [cpu]' commands:

CAT> coreinfo
.....
WARNING: CPU0 has 30 threads in its dispatch queue
.....

CAT> dispq 0
CPU thread pri PID cmd
0 @ 0x180c000 0x2a10007fca0 169 0 sched (PIL10 interrupt)
              pinned -=> 0x2a10001fca0 -1 (idle)
              pri 99 -=> 0x2a109d5dca0 99 0 sched
              0x2a10a44dca0 99 0 sched
              0x2a10a445ca0 99 0 sched
              pri 60 -=> 0x2a10f793ca0 60 0 sched
              0x2a100a89ca0 60 0 sched
              0x2a10b077ca0 60 0 sched
              .....


An example of the clock thread from a hung system may look similar to the following:

CAT> thread 0x2a100047ca0
==== interrupt thread: 0x2a100047ca0 PID: 0 on CPU: 0 affinity CPU: 0 (last_swtch: -22.56s) PIL: 10 ====
cmd: sched
t_procp: 0x18886c0(proc_sched)
p_as: 0x188a2e0(kas)
zone: global
t_stk: 0x2a100047a90 sp: 0x2a100046e21 t_stkbase: 0x2a100042000
t_pri: 169(SYS) pctcpu: 0.000000
t_lwp: 0x3015b1fedd8 machpcb: 0x2a10c6f1ae0
psrset: 0 last CPU: 0
idle: 7869 ticks (1 minutes 18.69 seconds)
start: Tue Jan 6 22:52:50 1970
age: 1319874598 seconds (15276 days 7 hours 49 minutes 58 seconds)
interrupted (pinned) thread: 0x3015aaf8a20
tstate: TS_ONPROC - thread is being run on a processor
tflg: T_INTR_THREAD - thread is an interrupt thread
T_TALLOCSTK - thread structure allocated from stk
tpflg: none set
tsched: TS_LOAD - thread is in memory
TS_DONT_SWAP - thread/LWP should not be swapped
pflag: SSYS - system resident process

pc: unix:panic_idle+0x14: call unix:setjmp
startpc: genunix:thread_create_intr+0x0: save %sp, -0xc0, %sp

unix:panic_idle+0x14(0x2a100047780, 0x0, 0x3c0, , , 0x0)
unix:ktl0+0x64()
-- ktl0 regs data rp: 0x2a100047780
pc: 0x1080b58 unix:set_freemem+0x8: subcc %o5, 0x0, %g0 ( cmp %o5, 0x0 )
npc: 0x1080b5c unix:set_freemem+0xc: bleu,pn %icc, unix:set_freemem+0x48
global: %g1 0x18b5600
%g2 0x18b6800 %g3 0
%g4 0x192f800 %g5 0x1
%g6 0 %g7 0x2a100047ca0
out: %o0 0x193c400 %o1 0x80
%o2 0x1932a90 %o3 0x5aeac2
%o4 0x1840400 %o5 0x100
%sp 0x2a100047021 %o7 0x10e219c
loc: %l0 0x193c800 %l1 0x18eb400
%l2 0x20000 %l3 0x7ffffc00
%l4 0x18eb400 %l5 0x18da000
%l6 0x1e01e2 %l7 0
in: %i0 0 %i1 0x10c5
%i2 0x10c5 %i3 0x18eb778
%i4 0x3 %i5 0x18b6800
%fp 0x2a100047121 %i7 0x10eff08
unix:set_freemem+0x8()
genunix:clock+0x20(0x0)
genunix:cyclic_softint+0xac(0x1, 0x1, , , 0x6005307c128)
unix:cbe_level10+0x8(0x0, 0x0, 0x180c000, 0x2a100047d78, 0x1, 0x1010198)
unix:intr_thread+0x198(0xdf2a1b40, 0xdb21c580, 0x20, 0x8, 0x82, 0x9a00)
unix:user_rtt+0x0()
-- switch to pinned user thread's user stack --



Cause
