Cellsrv consumes high CPU and no free swap

(Doc ID 1527100.1)

Last updated on AUGUST 13, 2014

Applies to:

Oracle Exadata Storage Server Software - Version 11.2.3.1.1 and later
Information in this document applies to any platform.

Symptoms

The cell alert log reports that there is no free swap space:

Tue Nov 06 06:57:10 2012
Errors in file
/opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/odal07cel02/trace/rstrc_28641_4.trc  (incident=10):
RS-7445 [No more free swap space] [Service CELLSRV will be restarted] [] []
[] [] [] [] [] [] [] []
Incident details in:
/opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/odal07cel02/incident/incdir_10/rstrc_28641_4_i10.trc
Sweep [inc][10]: completed
Tue Nov 06 06:57:17 2012
State dump signal delivered to Cellsrv<24094>
State dump signal delivered to Cellsrv<24094> by RS.

 

The cellsrv process consumes very high CPU, and no free swap space remains:

Top - 22:03:23 up 46 days, 21:28,  0 users,  load average: 35.13, 10.48, 4.41
Tasks: 388 total,   5 running, 383 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.6%us, 76.0%sy,  0.0%ni, 17.8%id,  4.4%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  24530660k total, 24395944k used,   134716k free,      836k buffers
Swap:  2096376k total,  2096372k used,        4k free,    71800k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                  
29582 root      18   0 16.5g 4.9g  74m S 8194.2 21.1   2237:13 /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/cellsrv/bin/cellsrv 100 5000 9 5042        
28230 root      18   0 1553m 524m  17m S 1578.8  2.2 707:38.07 /usr/java/jdk1.5.0_15/bin/java -Xms256m -Xmx512m -Djava.library.path=/opt/oracle/cell11.

31234 root      15   0 11.3g 9.3g 1228 S  0.4 39.6   9:10.73 /usr/sbin/lsi_mrdsnmpagent -c /etc/snmp/snmpd.conf       <<<<<<<<<<                                 

@ zzz ***Sat Jan 19 22:03:26 EST 2013
@ MemTotal:     24530660 kB
@ MemFree:        144332 kB
@ .
@ SwapTotal:     2096376 kB
@ SwapFree:            0 kB  ===========>
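
The swap exhaustion shown in the excerpt above can be checked programmatically. The following is a minimal sketch, assuming the input is /proc/meminfo-style text with an optional leading "@ " on each line as in the excerpt; the function name swap_used_fraction is hypothetical and not part of any Oracle tool.

```python
def swap_used_fraction(meminfo_text):
    """Return the fraction of swap in use (0.0 to 1.0) from meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        # Drop the leading "@ " that prefixes lines in the excerpt (assumption).
        line = line.lstrip("@ ").strip()
        if ":" not in line:
            continue
        key, _, rest = line.partition(":")
        fields = rest.split()
        # Keep only "Name: <number> kB" lines; timestamp lines are skipped.
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])  # value in kB
    total = values["SwapTotal"]
    free = values["SwapFree"]
    return (total - free) / total if total else 0.0
```

With the SwapTotal and SwapFree values from the excerpt, the function reports that all swap is in use.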

 

Reviewed the OSWatcher (osw) files, specifically the oswps directory. Taking the ps snapshot for the same timestamp and ordering it by SZ shows lsi_mrdsnmpagent and cellsrv among the processes using the most memory:
/usr/sbin/lsi_mrdsnmpagent 2785662 10.6 GB
cellsrv 1954933 7.4 GB

* Note the large difference.  Memory use of this size is expected for cellsrv, but not for lsi_mrdsnmpagent.
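
The GB figures next to the SZ values above follow from a simple conversion. This is a minimal sketch, assuming that the ps SZ column counts memory pages and that the page size is 4 KiB (typical for x86-64 Linux; neither is stated in this document). The helper names sz_pages_to_gib and rank_by_sz are hypothetical.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB pages


def sz_pages_to_gib(sz_pages, page_size=PAGE_SIZE_BYTES):
    """Convert a ps SZ value (in pages) to GiB."""
    return sz_pages * page_size / 2**30


def rank_by_sz(processes):
    """Take (command, sz_pages) pairs and return (command, GiB) pairs, largest first."""
    converted = [(cmd, sz_pages_to_gib(sz)) for cmd, sz in processes]
    return sorted(converted, key=lambda item: item[1], reverse=True)


# SZ values taken from the snapshot quoted above.
snapshot = [
    ("cellsrv", 1954933),
    ("/usr/sbin/lsi_mrdsnmpagent", 2785662),
]

for cmd, gib in rank_by_sz(snapshot):
    print(f"{cmd}: {gib:.2f} GiB")
```

Under these assumptions the results are about 10.63 GiB for lsi_mrdsnmpagent and about 7.46 GiB for cellsrv, in line with the 10.6 GB and 7.4 GB quoted above.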

Cause
