Frequent Instance Evictions with ORA-29770 errors on SLES-11 (Doc ID 2163270.1)

Last updated on AUGUST 14, 2016

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.1 to 12.1.0.2 [Release 11.2 to 12.1]
Linux x86-64

Symptoms

Instances are evicted frequently with ORA-29770 errors on SLES-11, together with the following observations:

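1. Stack traces of the LMS processes show them stuck in IPC calls (one waiting in __poll() on the receive path, one in sendmsg() on the send path):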
__poll()+47<-ssskgxp_poll()+40<-sskgxp_select()+263<-skgxpiwait()+3680<-skgxpwaiti()+1544<-skgxpwait()+162<-ksxpwait()+2501<-ksliwat()+12852<-kslwaitctx()+163<-kslwait()+141<-ksxprcv_int()+6092<-ksxprcvimd()+36<-kjctr_rksxp()+313<-kjctrcv()+398<-kjcsrmg()+102<-kjmsm()+4953<-ksbrdp()+971<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+266<-ssthrdmain()+252<-main()+201<-__libc_start_main()+230<-_start()+36

sendmsg()+16<-sskgxp_sndmsg()+444<-skgxpsegsnda()+146<-skgxpimcpy()+5195<-skgxpmcpy()+238<-ksxpmcpy_with_bcb()+1115<-kclbcpy2()+324<-kcl_snd_cur()+2376<-kclgrantlk()+3167<-kclcrrf()+1092<-kjblcrcbk()+639<-kjblpcr()+1398<-kjbmpbast()+1920<-kjmxmpm()+966<-kjmpmsgi()+1634<-kjmsm()+8157<-ksbrdp()+971<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+266<-ssthrdmain()+252<-main()+201<-__libc_start_main()+230<-_start()+36
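
When comparing many LMS stacks, it can help to split each one into frames. The following is a minimal sketch, not an Oracle-provided tool: the function name parse_short_stack is hypothetical, and it assumes the "function()+offset<-function()+offset" layout shown above. It strips whitespace first so that stacks pasted with hard line wraps are still parsed correctly.

```python
import re

def parse_short_stack(stack: str) -> list[tuple[str, int]]:
    """Split a '<-'-separated stack into (function, offset) pairs, innermost first."""
    # Remove all whitespace so a stack wrapped across several lines is rejoined.
    joined = "".join(stack.split())
    frames = []
    for frame in joined.split("<-"):
        m = re.fullmatch(r"(\w+)\(\)\+(\d+)", frame)
        if m:
            frames.append((m.group(1), int(m.group(2))))
    return frames

# Abbreviated copy of the first stack above, wrapped mid-token on purpose.
sample = "__poll()+47<-ssskgxp_poll()+40<-sskgxp_select()+2\n63<-skgxpiwait()+3680"
frames = parse_short_stack(sample)
print(frames[0][0])   # innermost function: __poll
print(frames[2])      # ('sskgxp_select', 263)
```

The innermost frame (the first entry) shows the system call in which the process was sampled, which is the frame that differs between the two stacks above.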

 

2. The instance alert log shows frequent "LMS has not moved" messages before the instance eviction:
LMS4 (ospid: 81849) has not moved for 98 sec (1461829990.1461829892)
Incident 1624814 created, dump file:
xxx/incident/incdir_1624814/xxxx_3_lmhb_81901_i1624814.trc
ORA-29770: global enqueue process LMS4 (OSID 81849) is hung for more than 70 seconds
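
To collect these messages from an alert log, a small parser along the following lines may be useful. This is a sketch under assumptions: the regular expression is derived only from the single message shown above, the function name parse_not_moved is hypothetical, and reading the two numbers in parentheses as "current" and "last moved" timestamps in seconds is an inference from the fact that their difference equals the reported 98 seconds.

```python
import re

# Pattern derived from the one sample line above; other releases may differ.
NOT_MOVED = re.compile(
    r"(?P<proc>LMS\w+) \(ospid: (?P<ospid>\d+)\) has not moved for "
    r"(?P<secs>\d+) sec \((?P<now>\d+)\.(?P<last>\d+)\)"
)

def parse_not_moved(line):
    """Return the fields of a 'has not moved' message, or None if no match."""
    m = NOT_MOVED.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Everything except the process name is numeric.
    return {k: (v if k == "proc" else int(v)) for k, v in fields.items()}

line = "LMS4 (ospid: 81849) has not moved for 98 sec (1461829990.1461829892)"
rec = parse_not_moved(line)
print(rec["proc"], rec["ospid"], rec["secs"])
# Assumed interpretation: difference of the two timestamps equals the stall time.
print(rec["now"] - rec["last"] == rec["secs"])
```

For the sample line, 1461829990 - 1461829892 = 98, which matches the reported stall and already exceeds the 70-second limit quoted in the ORA-29770 message.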

 

3. mpstat output shows individual CPUs at 100% system (%sys) time, while the average across all CPUs stays low:
09:25:35 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
09:25:36 all 1.58 0.00 5.08 0.00 0.00 0.32 0.00 0.00 93.02
09:25:36 0 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 <<== Here
09:25:36 32 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 <<== Here
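
Because the "all" row averages over every CPU, saturated CPUs are easy to miss. The sketch below flags per-CPU rows whose %sys is at or above a threshold. It assumes the column layout shown above (24-hour timestamp first, then CPU); the function name saturated_cpus and the 99.0 default threshold are illustrative choices, not part of mpstat.

```python
def saturated_cpus(lines, threshold=99.0):
    """Return (cpu, %sys) for per-CPU mpstat rows at or above the threshold."""
    header = None
    hits = []
    for line in lines:
        fields = line.split()
        if "%sys" in fields:
            header = fields          # remember column names from the header row
            continue
        if header is None or len(fields) < len(header):
            continue                 # skip blank or short lines
        row = dict(zip(header, fields))  # trailing annotations are ignored by zip
        if row["CPU"] == "all":
            continue                 # the average row hides per-CPU saturation
        try:
            sys_pct = float(row["%sys"])
        except ValueError:
            continue                 # not a data row
        if sys_pct >= threshold:
            hits.append((row["CPU"], sys_pct))
    return hits

sample = """\
09:25:35 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
09:25:36 all 1.58 0.00 5.08 0.00 0.00 0.32 0.00 0.00 93.02
09:25:36 0 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 <<== Here
09:25:36 32 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 <<== Here
""".splitlines()

print(saturated_cpus(sample))
```

On the sample above this reports CPUs 0 and 32, even though the "all" row shows only 5.08 %sys.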

 

4. top output shows the LMS processes each consuming 100% CPU:
Tasks: 1128 total, 13 running, 1109 sleeping, 6 stopped, 0 zombie
Cpu(s): 4.0%us, 6.4%sy, 0.0%ni, 86.9%id, 1.8%wa, 0.0%hi, 1.0%si, 0.0%st
Mem: 2067510M total, 2057277M used, 10233M free, 114M buffers
Swap: 4095M total, 12M used, 4083M free, 648614M cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
105628 oracle -2 0 1300g 32m 12m R 100 0.0 122:37.43 oracle  <<== LMS process
105652 oracle -2 0 1300g 32m 12m R 100 0.0 118:16.43 oracle  <<== LMS process

 

5. The packet receive error counter increases steadily, as a side effect of the LMS processes spinning on CPU:
4389 packet receive errors <<== START HERE
11651 packet receive errors
19715 packet receive errors
27101 packet receive errors
27101 packet receive errors
32902 packet receive errors
33351 packet receive errors
34651 packet receive errors
42671 packet receive errors
51640 packet receive errors
60948 packet receive errors
70843 packet receive errors
81002 packet receive errors
90479 packet receive errors
95770 packet receive errors <<== STOP HERE
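
The counter is cumulative, so the growth between samples is more telling than the absolute values. The sketch below computes per-sample deltas. It assumes the lines come from repeated snapshots of the same counter (typically the Udp section of netstat -s, although the article does not name the tool) and that the sampling interval, which is not stated here, is constant. The function name receive_error_deltas is hypothetical.

```python
import re

def receive_error_deltas(lines):
    """Return the increase in 'packet receive errors' between successive samples."""
    counts = [
        int(m.group(1))
        for line in lines
        if (m := re.match(r"\s*(\d+) packet receive errors", line))
    ]
    return [later - earlier for earlier, later in zip(counts, counts[1:])]

# First five samples from the listing above.
samples = [
    "4389 packet receive errors <<== START HERE",
    "11651 packet receive errors",
    "19715 packet receive errors",
    "27101 packet receive errors",
    "27101 packet receive errors",
]
print(receive_error_deltas(samples))  # [7262, 8064, 7386, 0]
```

Over the whole listing the counter rises from 4389 to 95770, an increase of 91381 errors.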

 

Cause
