12.1.2.3.x Cellsrv Hang Due to Flash Cache Buffer Allocation Failure Raising RS-7445 [Serv CELLSRV Hang Detected] (Doc ID 2274030.1)

Last updated on AUGUST 15, 2017

Applies to:

Oracle Exadata Storage Server Software - Version 12.1.2.3.0 to 12.1.2.3.5 [Release 12.1]
Linux x86-64

Symptoms

The CELLSRV process hangs and is restarted with the error RS-7445 [Serv CELLSRV hang detected] [It will be restarted].

Hang analysis of the generated system state dump shows Flash Cache (FC) related threads waiting for an rwlock:

2017-05-13 17:08:01.585 RS-7445 [Serv CELLSRV hang detected] [It will be restarted] [] [] [] [] [] [] [] [] [] []

searching for RQ Tags: ['RQ_Tag_2964954634_54259']
17:05:55.034000 RS sent RQ_Tag_2964954634_54259 length 512
17:07:58.875475 RS Timeout, Outstanding small ioctl: 1, buffer size: 512 outstanding large ioctl: 1, buffer size: 1048576

RS attempted seq number 22598 - cellsrv got 22597
*** REMOTE_RECEIVE_PORT_512_0 missing SKGXP seq number from RS ***
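
As a rough aid for reading the excerpt above, the gap between the "RS sent" and "RS Timeout" lines can be computed from their timestamps. This is a minimal sketch, not an Oracle tool: the helper name `seconds_between` is hypothetical, and it assumes both timestamps fall on the same calendar day.

```python
from datetime import datetime


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two HH:MM:SS.ffffff timestamps.

    Assumption: both timestamps belong to the same day, as in the excerpt.
    """
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# Timestamps copied from the "RS sent" and "RS Timeout" lines of the excerpt.
elapsed = seconds_between("17:05:55.034000", "17:07:58.875475")
print(f"Time from RS request to RS timeout: {elapsed:.3f} s")
```

The result shows that roughly two minutes passed between the request and the timeout report in this excerpt.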

Thread wait state breakdown:
  2 polling_sendports
  17 waiting_for_SKGXP_receive
  4 waiting_for_cond_var
    4 FlashLogStoreCheckpoint (this is normal)
  1 waiting_for_connect
  1 waiting_for_ocl_condvar
  91 waiting_for_rwlock <-----------
    49 CachePut <-----------
    42 CacheGet <-----------
  1 waiting_for_semaphore
  17 waiting_for_system_work
  3 working
    2 CacheGet
    1 CachePut
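
To put the breakdown above in proportion, the counts can be tallied to see which wait state dominates. The sketch below is illustrative only: the split into top-level states and nested operations is an assumption inferred from the counts (49 + 42 = 91 for waiting_for_rwlock), and the variable names are hypothetical.

```python
# Top-level wait states copied from the breakdown above.
# Assumption: CachePut/CacheGet/FlashLogStoreCheckpoint lines are nested
# details of the state listed before them, so they are not counted again here.
top_level = {
    "polling_sendports": 2,
    "waiting_for_SKGXP_receive": 17,
    "waiting_for_cond_var": 4,
    "waiting_for_connect": 1,
    "waiting_for_ocl_condvar": 1,
    "waiting_for_rwlock": 91,
    "waiting_for_semaphore": 1,
    "waiting_for_system_work": 17,
    "working": 3,
}

# Operations reported under waiting_for_rwlock.
rwlock_ops = {"CachePut": 49, "CacheGet": 42}

total_threads = sum(top_level.values())
dominant_state = max(top_level, key=top_level.get)
share = top_level[dominant_state] / total_threads

print(f"{dominant_state}: {top_level[dominant_state]} of {total_threads} "
      f"threads ({share:.1%})")

# Sanity check that the nested operations account for every rwlock waiter.
print("nested counts consistent:",
      sum(rwlock_ops.values()) == top_level["waiting_for_rwlock"])
```

Under these assumptions, about two thirds of the listed threads are blocked on the rwlock, all of them in Flash Cache get or put operations.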

Current mutex holders (and how many threads are waiting):

91 FlashCacheCore.cpp LINE:7960 (mutex contention) <---------

83 threads await [Flash Cache LRU] <-------

The cellsrv trace file for the thread that is the final blocker, in this case svtrc_6494_117.trc, contains the following messages:

FCC: nobuf lruid: 50, numsync: 0, numUser: 0, numElem: 12172795, numKeep: 504256, numinAct: 0, numDKWR: 0, num_scan: 129, num_hot: 0, numAux: 281933, num_nolatch: 0, numdirty: 0, num_ts_failed: 0, num_tc_failed: 129 nextlinknull 0, cg:3903435078:2026033119232

*** 2017-05-13 17:07:57.643
FCC: LruList 50: num=12172795 keep=504256 main=5508110 aux=281933 dirtySize=209251127296 dkwr=0 inact=0 trimcnt=0 retrim=0 noRIOs=5 fid=2499931748 nclient_trimgets=0 nclient_nontrimgets=16376138 dwlist 5133716 dwauxlist 744780 borrow 0 clmn 0 replenishes=11 dirty=4258444 dirtyKeep=174417 oltp=6291742 oltpKeep=501782 invalid=1023110
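
When comparing several such trace lines, it can help to pull the key=value counters of the "FCC: LruList" message into a dictionary. The following is a minimal sketch under stated assumptions: the function name `parse_lru_fields` is hypothetical, only fields written as key=value are captured (space-separated pairs such as "dwlist 5133716" are ignored), and the article does not document what each counter means or its unit.

```python
import re


def parse_lru_fields(line: str) -> dict[str, int]:
    """Extract every key=value integer pair from an FCC LruList trace line."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}


# Line copied from the trace excerpt above.
trace_line = (
    "FCC: LruList 50: num=12172795 keep=504256 main=5508110 aux=281933 "
    "dirtySize=209251127296 dkwr=0 inact=0 trimcnt=0 retrim=0 noRIOs=5 "
    "fid=2499931748 nclient_trimgets=0 nclient_nontrimgets=16376138 "
    "dwlist 5133716 dwauxlist 744780 borrow 0 clmn 0 replenishes=11 "
    "dirty=4258444 dirtyKeep=174417 oltp=6291742 oltpKeep=501782 "
    "invalid=1023110"
)

fields = parse_lru_fields(trace_line)

# Assumption: dirtySize is a byte count; the article does not state the unit.
dirty_gib = fields["dirtySize"] / 2**30
print(f"num={fields['num']} dirty={fields['dirty']} "
      f"dirtySize={dirty_gib:.2f} GiB (if bytes)")
```

The parsed dictionary makes it easier to line up counters such as num, keep and aux against the matching values in the "FCC: nobuf" message.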

Cause
