Fatal background process termination leads to RAC instance crash (Doc ID 2289334.1)

Last updated on JULY 22, 2017

Applies to:

Oracle Database - Enterprise Edition - Version 12.2.0.1 and later
Information in this document applies to any platform.

Symptoms

There was a database-level hang, but the impacted background process was waiting on an idle event.

2017-06-28T11:45:31.440178+04:00
Errors in file /u01/app/oracle/diag/rdbms/xldr/xldr1/trace/xldr1_dia0_68074.trc (incident=776416):
ORA-32701: Possible hangs up to hang ID=5 detected
Incident details in: /u01/app/oracle/diag/rdbms/xldr/xldr1/incident/incdir_776416/xldr1_dia0_68074_i776416.trc
2017-06-28T11:45:33.653579+04:00
DIA0 terminating instance 1 (local instance) due to hang with ID=5
which is a LOCAL, HIGH confidence hang.
Hang Resolution Reason: Although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
DIA0 (ospid: 68074): terminating the instance
2017-06-28T11:45:33.728536+04:00
System state dump requested by (instance=1, osid=68074 (DIA0)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/xldr/xldr1/trace/xldr1_diag_68033_20170628114533.trc
2017-06-28T11:45:34.229621+04:00
ORA-1092 : opitsk aborting process
2017-06-28T11:45:35.202749+04:00
License high water mark = 262
2017-06-28T11:45:40.688540+04:00
Instance terminated by DIA0, pid = 68074
2017-06-28T11:45:40.772028+04:00
Warning: 2 processes are still attach to shmid 5799949:
(size: 2097152 bytes, creator pid: 67596, last attach/detach pid: 68017)
2017-06-28T11:45:41.204122+04:00
USER (ospid: 31186): terminating the instance
2017-06-28T11:45:41.205392+04:00
Instance terminated by USER, pid = 31186
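
When checking whether other instances hit the same sequence, it can help to scan the alert log for the two messages shown above. The following is a minimal Python sketch, not an Oracle-supplied tool. The function name `find_hang_terminations` is hypothetical, and the message patterns are copied from this excerpt only, so other releases or patch levels may word them differently.

```python
import re

# Patterns copied from the alert log excerpt in this note (assumption:
# the wording is the same in the log being scanned).
HANG_DETECTED = re.compile(r"ORA-32701: Possible hangs up to hang ID=(\d+) detected")
TERMINATING = re.compile(
    r"DIA0 terminating instance (\d+) \(local instance\) due to hang with ID=(\d+)"
)
# Timestamp lines such as 2017-06-28T11:45:31.440178+04:00
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}$")


def find_hang_terminations(lines):
    """Return hang-related events, each tagged with the preceding timestamp line."""
    events = []
    current_ts = None
    for raw in lines:
        line = raw.strip()
        if TIMESTAMP.match(line):
            current_ts = line
            continue
        match = HANG_DETECTED.search(line)
        if match:
            events.append(
                {"timestamp": current_ts, "kind": "detected", "hang_id": int(match.group(1))}
            )
            continue
        match = TERMINATING.search(line)
        if match:
            events.append(
                {
                    "timestamp": current_ts,
                    "kind": "terminating",
                    "instance": int(match.group(1)),
                    "hang_id": int(match.group(2)),
                }
            )
    return events
```

Passing the lines of an alert log (for example, from `open(path)`) returns a list of dictionaries. A "detected" event followed a few seconds later by a "terminating" event with the same hang ID matches the pattern described in this note.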

Idle background wait events such as 'rdbms ipc message' and 'pmon timer' were set up as terminal wait events but not as trusted waits.
With recent changes, this causes Hang Manager to treat a hang whose root is still progressing as one that needs to be resolved.
In this particular case, the hang was resolved by instance termination because the root was a fatal background process (SMON).

Resolvable Hangs in the System
                    Root       Chain Total               Hang
 Hang Hang          Inst  Root #hung #hung   Hang   Hang Resolution
   ID Type Status    Num  Sess  Sess  Sess   Conf   Span Action
----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
    5 HANG RSLNPEND    1  3796     2     2   HIGH  LOCAL Terminate Instance
Hang Resolution Reason: Although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.

 Inst   Sess   Ser            Proc    Wait Wait
  Num     ID   Num     OSPID  Name Time(s) Event
----- ------ ----- --------- ----- ------- -----
    1    282 48197    399572    FG     483 enq: TS - contention
    1   3796 23918     68154  SMON      51 smon timer
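
The session rows above show the signature of this problem: the root session (3796) is a background process waiting on an idle event. As a rough illustration, the Python sketch below parses such rows and flags that condition. It does not reproduce Hang Manager's internal logic. The names `HungSession`, `parse_session_row`, `root_is_idle_background`, and the `IDLE_BG_EVENTS` set are hypothetical, and the event list is an assumption built only from the events named in this note.

```python
from dataclasses import dataclass

# Assumption: illustrative list only, limited to idle events named in this note.
IDLE_BG_EVENTS = {"rdbms ipc message", "pmon timer", "smon timer"}


@dataclass
class HungSession:
    inst: int
    sid: int
    serial: int
    ospid: int
    proc_name: str
    wait_secs: int
    event: str


def parse_session_row(row):
    """Parse one row of the hung-session table from the DIA0 trace."""
    # The event name may contain spaces, so split only the first six fields.
    fields = row.split(None, 6)
    if len(fields) != 7:
        raise ValueError(f"unexpected row format: {row!r}")
    return HungSession(
        inst=int(fields[0]),
        sid=int(fields[1]),
        serial=int(fields[2]),
        ospid=int(fields[3]),
        proc_name=fields[4],
        wait_secs=int(fields[5]),
        event=fields[6].strip(),
    )


def root_is_idle_background(sessions, root_sid):
    """True if the root session is a non-foreground process on a listed idle event."""
    for session in sessions:
        if session.sid == root_sid:
            return session.proc_name != "FG" and session.event in IDLE_BG_EVENTS
    return False
```

For the two rows in this trace, with root session 3796 taken from the "Root Sess" column of the hang table, the check returns true, which is consistent with the behavior described in this note.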

 

 

Cause
