- Customer killed a large DELETE statement of a large table in instance 1 in high workload period in RAC environment of 2 instances.
- Transaction recovery did not yet complete for more than 1 hour by querying V$FAST_START_TRANSACTIONS.
- Update on the same table on instance 2 hung more than 1 hour.
- Transaction recovery complete quickly after shutdown instance 2.
- ASH shows high waiting on "enq: TX - row lock contention" and "transaction" without blocking session until instance 2 shutdown after DELETE session killed in instance 1.
SQL> select instance_number,program,event,BLOCKING_INST_ID b_inst,count(session_id) session_cnt,count(BLOCKING_SESSION) b_session_cnt,min(sample_time) start_time,max(sample_time) end_time,count(*) cnt
2 from DBA_HIST_ACTIVE_SESS_HISTORY where sql_id='d1xg4v7rqx3kt' and instance_number=2
group by instance_number,program,event,BLOCKING_INST_ID
order by instance_number,start_time
3 4 5 ;
INSTANCE_NUMBER PROGRAM EVENT B_INST SESSION_CNT B_SESSION_CNT START_TIME END_TIME CNT
--------------- ------------------------------------ ---------------------------------------- ------ ----------- ------------- ---------- ---------- ------------
2 JDBC Thin Client enq: TX - row lock contention 728939 0 19:20:34 21:04:25 728939
2 JDBC Thin Client transaction 489837 0 20:12:44 21:04:25 489837
- ASH shows that all sessions of the "enq: TX - row lock contention" without blocking session on instance 2 were waiting for the transaction of the Killed DELETE on instance 1.
SQL> select instance_number,event,trunc(p2/65536) XIDUSN, trunc(mod(p2,65536)) XIDSLOT, p3 XIDSQN,count(session_id) session_cnt,count(BLOCKING_SESSION) b_session_cnt,min(sample_time) start_time,max(sample_time) end_time,count(*) cnt
from DBA_HIST_ACTIVE_SESS_HISTORY where sql_id='d1xg4v7rqx3kt' and instance_number=2 and to_char(sample_time,'HH24:MI') >= '20:12' AND BLOCKING_SESSION IS NULL AND EVENT='enq: TX - row lock contention'
group by instance_number,event,trunc(p2/65536) , trunc(mod(p2,65536)) , p3
order by start_time
;
2 3 4 5
INSTANCE_NUMBER EVENT XIDUSN XIDSLOT XIDSQN SESSION_CNT B_SESSION_CNT START_TIME END_TIME CNT
--------------- ---------------------------------------- -------- -------- -------- ----------- ------------- ---------- ---------- ------------
2 enq: TX - row lock contention 932 15 623055 717850 0 20:12:55 21:04:25 717850
SQL>
- ASH shows that SMON of instance 1 was waiting for "DFS lock handle" and "enq: TX - contention" among DELETE session killed in instance 1 and instance 2 shutdown.
SQL> select instance_number,session_id,session_serial#,program,event,BLOCKING_SESSION b_sid,BLOCKING_SESSION_SERIAL# b_serial,BLOCKING_INST_ID b_inst,min(sample_time) start_time,max(sample_time) end_time,count(*) cnt
2 from DBA_HIST_ACTIVE_SESS_HISTORY where upper(program) like '%SMON%'
group by instance_number,session_id,session_serial#,program,event,BLOCKING_SESSION ,BLOCKING_SESSION_SERIAL# ,BLOCKING_INST_ID
order by instance_number,start_time
;
3 4 5
INSTANCE_NUMBER SESSION_ID SESSION_SERIAL# PROGRAM EVENT B_SID B_SERIAL B_INST START_TIME END_TIME CNT
--------------- ---------- --------------- ------------------------------------ ---------------------------------------- --------- --------- ------ ---------- ---------- --------
1 3761 1 oracle@<instance_name> (SMON) DFS lock handle 19:05:24 21:05:07 128
1 3761 1 oracle@<instance_name> (SMON) enq: TX - contention 20:14:31 21:04:47 50
- The P1 of "DFS lock handle" shows that SMON was requesting TA enqueue through cross instance calls which is used for serializing operations on undo segments and undo tablespaces.
SQL> col event for a20
select instance_number,sample_time,session_id,session_serial#,program,event,p1,p2,p3
from DBA_HIST_ACTIVE_SESS_HISTORY where upper(program) like '%SMON%' AND EVENT='DFS lock handle' ;
SQL> 2
INSTANCE_NUMBER SAMPLE_TIME SESSION_ID SESSION_SERIAL# PROGRAM EVENT P1 P2 P3
--------------- --------------- ---------- --------------- ------------------------------------ -------------------- ------------ ------ ------
1 19:05:24 3761 1 oracle@<instance_name> (SMON) DFS lock handle 1413545989 3 4722
1 20:12:27 3761 1 oracle@<instance_name> (SMON) DFS lock handle 1413545989 3 6668
1 20:13:40 3761 1 oracle@<instance_name> (SMON) DFS lock handle 1413545989 3 5787
1 20:13:50 3761 1 oracle@<instance_name> (SMON) DFS lock handle 1413545989 3 5580
1 20:14:00 3761 1 oracle@<instance_name> (SMON) DFS lock handle 1413545989 3 5826
... ...
1 21:04:57 3761 1 oracle@<instance_name> (SMON) DFS lock handle 1413545989 3 1347
1 21:05:07 3761 1 oracle@<instance_name> (SMON) DFS lock handle 1413545989 3 4582
144 rows selected.
* All of P1 is 1413545989, converted to hexadecimal: 54410005, 54 ASCII code: T, 41 ASCII code: A => TA enqueue
- The P2,P3 of "enq: TX - row lock contention" shows that SMON process was requesting TX enqueue of the UNDO block of the transaction of the Killed DELETE on instance 1.
SQL> select instance_number,session_id,session_serial#,program,event,trunc(p2/65536) XIDUSN, trunc(mod(p2,65536)) XIDSLOT, p3 XIDSQN,COUNT(1) CNT,min(sample_time) start_time,max(sample_time) end_time
from DBA_HIST_ACTIVE_SESS_HISTORY WHERE session_id =3761 and SESSION_SERIAL# = 1 and instance_number=1 AND event like 'enq: TX - contention%'
group by instance_number,session_id,session_serial#,program,event,trunc(p2/65536) , trunc(mod(p2,65536)) , p3
;
2 3 4
INSTANCE_NUMBER SESSION_ID SESSION_SERIAL# PROGRAM EVENT XIDUSN XIDSLOT XIDSQN CNT START_TIME END_TIME
--------------- ---------- --------------- ------------------------------------ -------------------- -------- -------- -------- -------- ---------- ----------
1 3761 1 oracle@<instance_name> (SMON) enq: TX - contention 932 15 623055 50 20:14:31 21:04:47
SQL>
SQL> select instance_number,session_id,session_serial#,program,event,CURRENT_OBJ#, CURRENT_FILE#, CURRENT_BLOCK#,CURRENT_ROW#,COUNT(1) CNT,min(sample_time) start_time,max(sample_time) end_time
2 from DBA_HIST_ACTIVE_SESS_HISTORY WHERE session_id =3761 and SESSION_SERIAL# = 1 and instance_number=1 AND event like 'enq: TX - contention%'
group by instance_number,session_id,session_serial#,program,event,CURRENT_OBJ#, CURRENT_FILE#, CURRENT_BLOCK#,CURRENT_ROW#
order by start_time
;
3 4 5
INSTANCE_NUMBER SESSION_ID SESSION_SERIAL# PROGRAM EVENT CURRENT_OBJ# CURRENT_FILE# CURRENT_BLOCK# CURRENT_ROW# CNT START_TIME END_TIME
--------------- ---------- --------------- ------------------------------------ -------------------- ------------ ------------- -------------- ------------ -------- ---------- ----------
1 3761 1 oracle@<instance_name> (SMON) enq: TX - contention 1 42 4 0 1 20:14:31 20:14:31
1 3761 1 oracle@<instance_name> (SMON) enq: TX - contention 1 3 3 0 1 20:16:22 20:16:22 <<<
1 3761 1 oracle@<instance_name> (SMON) enq: TX - contention 1 3 5 0 4 20:22:17 20:25:49 <<<
1 3761 1 oracle@<instance_name> (SMON) enq: TX - contention 0 3 1920 0 3 20:30:13 20:30:33 <<<
1 3761 1 oracle@<instance_name> (SMON) enq: TX - contention 0 3 15728 0 23 20:37:58 20:52:39 <<<
1 3761 1 oracle@<instance_name> (SMON) enq: TX - contention 1 1002 2 0 18 20:53:09 21:04:47
6 rows selected.
SQL>
Killed a large transaction.