Exadata: RS-7445 [Serv CELLSRV is absent] followed by cell restart (Doc ID 1460477.1)

Last updated on OCTOBER 23, 2013

Applies to:

Oracle Exadata Storage Server Software - Version 11.2.2.4.2 to 11.2.2.4.2 [Release 11.2]
Linux x86-64

Symptoms

 On a cell image lower than 11.2.3.1.0, an RS-7445 error occurs, followed by a restart of the cell.

You may see one or more of the following symptoms:
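The version condition above can be checked with a small comparison. The following is a minimal sketch, assuming the cell image version has already been obtained as a dotted string; the names `parse_version`, `THRESHOLD` and `is_affected` are hypothetical and not part of any Oracle tool.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))

# Threshold taken from the Symptoms statement: images lower than this are in scope.
THRESHOLD = parse_version("11.2.3.1.0")

def is_affected(image_version):
    """Return True when the image version is lower than 11.2.3.1.0."""
    # Assumption: only the first five components matter, so any trailing
    # build stamp in the version string is ignored.
    return parse_version(image_version)[:5] < THRESHOLD

print(is_affected("11.2.2.4.2"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "11.2.10" sorting before "11.2.3".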

 

1) The cell alert log reports the error RS-7445 [Serv CELLSRV is absent] [It will be restarted] [] [] [] [] [] [] [] [] [] [].


Errors similar to the following may be found in the database alert log leading up to the RS-7445 incident:

     IO Error on dev=/dev/<device> cdisk=<cell disk> [op=RD offset=678756352 (in sectors) sz=1048576 bytes] (errno: Input/output error [5])

2) The cell alert log may contain a message like "Uncaught error 1841 in user thread".

3) The cell alert log may also show an ORA-600 raised in an NLS function, for example:

ORA-00600: internal error code, arguments: [UserThread::run_1], [1861]
     
or

ORA-00600: internal error code, arguments: [UserThread::run_1], [1841]

4) Errors related to date formatting in the cell alert log:

ORA-01841: (full) year must be between -4713 and +9999, and not be 0

ORA-01861: literal does not match format string

5) The incident trace for these errors shows a SQL ID whose statement uses date-related functions.

6) In the state dump for the incident, the current thread is "PredicateDiskRead".

7) The cell alert log may show output similar to:

Uncaught error 1841 in user thread
Cellsrv encountered a fatal signal 11
ERROR: Unable to normalize symbol name for the following short stack (at offset 1):

Sat Mar 23 08:34:19 2013
Cellsrv encountered a fatal signal 11
Sat Mar 23 08:34:20 2013
Create Relation INCIDENT
Errors in file /opt/oracle/cell11.2.2.4.2_LINUX.X64_120504/log/diag/asm/cell/odal01cel09/trace/rstrc_9110_4.trc  (incident=1):
RS-7445 [Serv CELLSRV is absent] [It will be restarted] [] [] [] [] [] [] [] [] [] []
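The alert log symptoms listed above can be searched for mechanically. The following is a minimal sketch, assuming the cell alert log has already been read into a list of lines; the signature names and the `find_signatures` helper are hypothetical, and the patterns are derived only from the messages quoted in this note.

```python
import re

# Patterns built from the messages quoted in the Symptoms section.
# Assumption: "Uncaught error 1861" can appear as well as 1841, by analogy
# with the two ORA-00600 argument values shown in symptom 3.
SIGNATURES = {
    "rs_7445": re.compile(r"RS-7445 \[Serv CELLSRV is absent\]"),
    "uncaught_error": re.compile(r"Uncaught error (?:1841|1861) in user thread"),
    "ora_600": re.compile(r"ORA-00600: .*\[UserThread::run_1\], \[(?:1841|1861)\]"),
    "date_error": re.compile(r"ORA-018(?:41|61):"),
    "fatal_signal": re.compile(r"Cellsrv encountered a fatal signal 11"),
}

def find_signatures(lines):
    """Return the set of signature names that match at least one line."""
    found = set()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                found.add(name)
    return found

sample = [
    "Uncaught error 1841 in user thread",
    "Cellsrv encountered a fatal signal 11",
    "ORA-01861: literal does not match format string",
    "RS-7445 [Serv CELLSRV is absent] [It will be restarted] [] [] []",
]
print(sorted(find_signatures(sample)))
```

A match on several signatures close together in time is more suggestive of this issue than any single message on its own.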

 

Call stack should be similar to:

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ossex_dump_stack()+  call     kgdsdst()            000000000 ? 000000000 ?
1130                                               050DF8EA8 ? 000000001 ?
                                                   6B61742E00000001 ?
                                                   000000002 ?
ossex_dump_all()+67  call     ossex_dump_stack()   000000000 ? 000000006 ?
                                                   000000002 ? 000000001 ?
                                                   6B61742E00000001 ?
                                                   000000002 ?
ossp_state_dump()+1  call     ossex_dump_all()     000000006 ? 000000006 ?
60                                                 000000002 ? 000000001 ?
                                                   6B61742E00000001 ?
                                                   000000002 ?
dbgexPhaseII()+1960  call     ossp_state_dump()    01A05C198 ? 000000006 ?
                                                   000000005 ? 000000001 ?
                                                   6B61742E00000001 ?
                                                   000000002 ?
dbgexProcessError()  call     dbgexPhaseII()       0273BAA30 ? 2AADE80D1830 ?
+5535                                              050F04088 ? 000000001 ?
                                                   2AAD00000000 ? 000000002 ?
dbgeExecuteForError  call     dbgexProcessError()  0273BAA30 ? 2AADE80D1830 ?
()+138                                             000000001 ? 000000000 ?
                                                   100000000 ? 000000002 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  0273BAA30 ? 2AADE80D1830 ?
2683                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000002 ?
osspdde()+163        call     dbgePostErrorKGE()   000000000 ? 01A060898 ?
                                                   000000258 ? 2AADE80D1830 ?
                                                   100000000 ? 000000002 ?
kgerinv()+615        call     osspdde()            01A05C198 ? 01A060898 ?
                                                   000000258 ? 2AADE80D1830 ?
                                                   100000000 ? 000000002 ?
kgeasnmierr()+143    call     kgerinv()            01A05C198 ? 01A060898 ?
                                                   050DF56E0 ? 000000001 ?
                                                   01A05C198 ? 000000002 ?
_ZN10UserThread3run  call     kgeasnmierr()        01A05C198 ? 01A060898 ?
Ev()+4377                                          050DF56E0 ? 000000001 ?
                                                   000000000 ? 000000745 ?
_ZN9Scheduler8sched  call     _ZN10UserThread3run  01A0135D0 ? 01A060898 ?
uleEv()+318                   Ev()                 050DF56E0 ? 000000001 ?
                                                   000000000 ? 000000745 ?
_Z16kernelThreadMai  call     _ZN9Scheduler8sched  01A013C78 ? 01A060898 ?
nPv()+4                       uleEv()              050DF56E0 ? 000000001 ?
                                                   000000000 ? 000000745 ?
oracle_fp_thread_ma  call     _Z16kernelThreadMai  01A013C78 ? 01A060898 ?
in()+110                      nPv()                050DF56E0 ? 000000001 ?
                                                   000000000 ? 000000745 ?
start_thread()+221   call     oracle_fp_thread_ma  7FFFE27655B0 ? 01A060898 ?
                              in()                 050DF56E0 ? 000000001 ?
                                                   000000000 ? 000000745 ?
clone()+109          call     start_thread()       050F06940 ? 01A060898 ?
                                                   050DF56E0 ? 000000001 ?
                                                   000000000 ? 000000745 ?
0000000000000000     call     clone()              050F06940 ? 01A060898 ?
                                                   050DF56E0 ? 000000001 ?
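The trace above wraps long function names across lines, which makes frames such as the UserThread entry hard to read. The following is a minimal sketch for rejoining them, assuming only the frame lines after the dashed separator are passed in and that every argument is printed as a hexadecimal value followed by " ?"; the `join_frames` helper is hypothetical.

```python
import re

# Trailing argument columns: one or more "HEXVALUE ?" groups at end of line.
ARGS = re.compile(r"(?:[0-9A-F]+ \?\s*)+$")

def join_frames(trace_lines):
    """Return (location, entry point) pairs with wrapped names rejoined."""
    frames = []
    for line in trace_lines:
        head = ARGS.sub("", line.rstrip())
        tokens = head.split()
        if len(tokens) >= 3 and tokens[1] == "call":
            # First line of a frame: location, call type, entry point.
            frames.append([tokens[0], tokens[2]])
        elif tokens and frames:
            if head[0].isspace():
                # Indented remainder belongs to the entry point column.
                frames[-1][1] += tokens[0]
            else:
                # Remainder of the location, optionally of the entry point too.
                frames[-1][0] += tokens[0]
                if len(tokens) > 1:
                    frames[-1][1] += tokens[1]
    return [tuple(frame) for frame in frames]

sample = [
    "_ZN10UserThread3run  call     kgeasnmierr()        01A05C198 ? 01A060898 ?",
    "Ev()+4377                                          050DF56E0 ? 000000001 ?",
    "                                                   000000000 ? 000000745 ?",
    "_ZN9Scheduler8sched  call     _ZN10UserThread3run  01A0135D0 ? 01A060898 ?",
    "uleEv()+318                   Ev()                 050DF56E0 ? 000000001 ?",
]
for location, entry in join_frames(sample):
    print(location, "->", entry)
```

The rejoined names follow the Itanium C++ ABI mangling scheme, so `_ZN10UserThread3runEv` corresponds to `UserThread::run()` and `_ZN9Scheduler8scheduleEv` to `Scheduler::schedule()`.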

 

Cause
