
Services Are Not Failing Over To Node2 Automatically When First Node1 Is Down And Vice Versa (Doc ID 2177143.1)

Last updated on APRIL 16, 2023

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.4 to 12.1.0.2 [Release 11.2 to 12.1]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.

Symptoms

Services were created with the 'srvctl' utility with the expectation that they would fail over to the surviving node when an instance crashes or is shut down. In testing, service failover does not happen: for example, if instance 1 goes down after a shutdown abort, the services associated with that instance go offline and are not started automatically on the other node. Because the services are offline, the application receives connection errors. Once the services are started manually, everything works again.
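The manual workaround described above (starting the services by hand) can be sketched as follows. This is a minimal illustration, not Oracle's recommended fix, and it does not address the underlying cause. It assumes the database unique name `orcl` and the service names `orclwebsvc` and `admsvc` that appear in the trace excerpt in this note, and it uses the single-letter `-d`/`-s` options of `srvctl`. The helper `build_srvctl_commands` is a hypothetical name; it only builds the command lines so they can be reviewed before anyone runs them.

```python
import shlex


def build_srvctl_commands(db_unique_name, service_names):
    """Build srvctl command lines for checking and manually starting services.

    Hypothetical helper: nothing is executed here; the caller decides
    whether and where to run the returned commands.
    """
    # Status of all services registered for the database.
    status_cmd = ["srvctl", "status", "service", "-d", db_unique_name]
    # One start command per service that went offline.
    start_cmds = [
        ["srvctl", "start", "service", "-d", db_unique_name, "-s", name]
        for name in service_names
    ]
    return status_cmd, start_cmds


if __name__ == "__main__":
    # Assumed names, taken from the resource names in the trace excerpt.
    status_cmd, start_cmds = build_srvctl_commands("orcl", ["orclwebsvc", "admsvc"])
    print(shlex.join(status_cmd))
    for cmd in start_cmds:
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; on a cluster node they would be run as the Oracle software owner.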

The command "crsctl status resource -t" shows the service resource with status OFFLINE.

The cluster logs ("crsd/oraagent") show the service state changing from ONLINE to PLANNED_OFFLINE.

[

alert_node1.log
===========

Thu Jun 16 11:02:46 2016
Shutting down instance (abort)
License high water mark = 36
Thu Jun 16 11:02:46 2016
USER (ospid: 29753564): terminating the instance
Thu Jun 16 11:02:47 2016
Instance terminated by USER, pid = 29753564

crsd_25.trc
========

2016-06-16 11:02:46.560154 : AGFW:9767: {0:21:12} Received state change for ora.orcl.orclwebsvc.svc 1 1 [old state = ONLINE, new state = PLANNED_OFFLINE]
.

2016-06-16 11:02:46.574053 : CRSD:11052: {0:21:8} {0:21:8} Created alert : (:CRSPE00191:) : Failover cannot be completed for [ora.orcl.admsvc.svc 1 1]. Stopping it and the resource tree
2016-06-16 11:02:46.574654 : CRSCOMM:9767: {1:25809:47312} IpcL: removeConnection: Member 21 does not exist in pending connections.
2016-06-16 11:02:46.575564 : CRSOCR:10795: {0:21:8} Multi Write Batch processing...
2016-06-16 11:02:46.578387 : CRSPE:11052: {0:21:8} Operation 116e2f530 has 14 WOs
2016-06-16 11:02:46.579561 : CRSPE:11052: {0:21:8} ICE has queued an operation. Details: Operation [STOP of [ora.orcl.admsvc.svc 1 1] on [testnode1] : Op:116e2f530, Cmd:116df9610, SeqId:126] cannot run cause it needs R lock for: WO for Placement Path RI:[ora.orcl.admsvc.svc 1 1] server [] target states [OFFLINE ], locked by op [STOP of [ora.orcl.db 1 1] on [<hostname>] : Op:116ff23f0, Cmd:116d17910, SeqId:125]. Owner: CRS-2683: It is locked by 'SYSTEM' for command 'Unplanned Resource State Change : ora.orcl.db'

]
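To check whether a trace file shows the same signature as the excerpt above, the two relevant message shapes can be picked out with regular expressions. The sketch below is an illustration only: the patterns are derived solely from the two sample lines in this note (the "Received state change" message and the CRSPE00191 alert), so other Clusterware versions may word these messages differently. `scan_trace` and `SAMPLE_LINES` are hypothetical names.

```python
import re

# Patterns derived from the sample trace lines in this note (assumption:
# other releases may format these messages differently).
STATE_CHANGE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*"
    r"Received state change for (?P<res>\S+) \d+ \d+ "
    r"\[old state = (?P<old>\w+), new state = (?P<new>\w+)\]"
)
FAILOVER_ALERT = re.compile(
    r"Failover cannot be completed for \[(?P<res>\S+) \d+ \d+\]"
)

# Two lines copied from the crsd trace excerpt above.
SAMPLE_LINES = [
    "2016-06-16 11:02:46.560154 : AGFW:9767: {0:21:12} Received state change "
    "for ora.orcl.orclwebsvc.svc 1 1 [old state = ONLINE, new state = PLANNED_OFFLINE]",
    "2016-06-16 11:02:46.574053 : CRSD:11052: {0:21:8} {0:21:8} Created alert : "
    "(:CRSPE00191:) : Failover cannot be completed for [ora.orcl.admsvc.svc 1 1]. "
    "Stopping it and the resource tree",
]


def scan_trace(lines):
    """Return (state_changes, failed_failovers) found in the given lines.

    state_changes holds (timestamp, resource, old_state, new_state) tuples
    for service resources (names ending in .svc); failed_failovers holds
    the resource names reported in CRSPE00191-style alerts.
    """
    changes, failed = [], []
    for line in lines:
        m = STATE_CHANGE.search(line)
        if m and m.group("res").endswith(".svc"):
            changes.append(
                (m.group("ts"), m.group("res"), m.group("old"), m.group("new"))
            )
        m = FAILOVER_ALERT.search(line)
        if m:
            failed.append(m.group("res"))
    return changes, failed


if __name__ == "__main__":
    changes, failed = scan_trace(SAMPLE_LINES)
    for ts, res, old, new in changes:
        print(f"{ts} {res}: {old} -> {new}")
    for res in failed:
        print(f"failover not completed: {res}")
```

In practice the lines would come from the trace file itself, for example by passing an open file object to `scan_trace` instead of `SAMPLE_LINES`.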

 



Cause


