Services, instances, or other resources are not restarted or relocated after a state change
Last updated on JANUARY 30, 2018
Applies to: Oracle Database - Enterprise Edition - Version 22.214.171.124 to 126.96.36.199 [Release 11.2]
Information in this document applies to any platform.
Symptom 1. After a resource which has dependent resources undergoes a state change, some of the dependent resources are not relocated or restarted when they should be.
Example 1. The ora.net1.network resource briefly goes to UNKNOWN state. As a result, the ora.ons resource is stopped. Then ora.net1.network comes back ONLINE, but ora.ons does not come back online.
Example 2. The network resource state changes to UNKNOWN on node1, then immediately changes back to ONLINE. However, the dependent SCAN VIP resource goes OFFLINE instead of being restarted or relocated. The SCAN VIP is not restarted until 15 minutes later, when a routine check finds it offline.
Example 3. The database instance check command times out on node 2. As a consequence, the database service is stopped on node 2. It should be relocated to node 3, but it is not. The service is neither started on any other node nor restarted on the original node.
Example 4. The ora.net1.network resource status changes to UNKNOWN. As a result, the dependent node VIP resource is stopped. However, the VIP is never started again after ora.net1.network comes back ONLINE.
Example 5. The VIP resource goes to INTERMEDIATE state for a short time. Service resources go offline when the VIP becomes INTERMEDIATE, because the service has a hard stop dependency on the VIP. When the VIP resource becomes ONLINE again, the service is not started.
Example 6. The network resource on node4 goes to INTERMEDIATE state for a short time, then comes back ONLINE. Both the ONS and Node Listener resources are stopped and are not restarted.
Symptom 2. The CRSD log contains a line that says "Operation [START of <non-started resource name> on <node>... has been replaced with [START of <other resource name> on <node>...".
IMPORTANT NOTES -
1. This line can appear in the CRSD log of a DIFFERENT node from the affected one.
2. If more than one resource fails to start, the "Operation [START... has been replaced with ..." line may still appear only once. A single occurrence does not necessarily mean that only one resource was affected.
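Because the line can be in the CRSD log of any node, it may help to scan the logs collected from every node in one pass. The following is a minimal Python sketch, not an Oracle-supplied tool. It matches only the fixed words quoted in Symptom 2 and makes no assumption about the elided parts of the message. The function name and the command-line usage are hypothetical, and the log file paths must be supplied by the caller.

```python
import re
import sys
from typing import Iterable, List, Tuple

# Loose pattern: only the fixed words quoted in Symptom 2 are matched.
# The elided ("...") parts of the real message are deliberately not assumed.
REPLACED_RE = re.compile(
    r"Operation \[START of .*has been replaced with \[START of "
)


def find_replaced_starts(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line matching the pattern."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if REPLACED_RE.search(line):
            hits.append((lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Pass the CRSD log files copied from ALL nodes as arguments.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            for lineno, text in find_replaced_starts(fh):
                print(f"{path}:{lineno}: {text}")
```

As note 2 says, a single hit does not bound the number of affected resources, so the output is best treated as a pointer to the time window to examine rather than a list of everything that failed to start.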
This issue is known to affect Grid Infrastructure 188.8.131.52 and 184.108.40.206.
It can affect any version of database or service resource that is used with these versions of Grid Infrastructure.
The problem can be triggered when there is a state change of a resource that has MORE THAN ONE dependent resource. Examples of this kind of parent resource include a database instance with many services, ora.net1.network, or a VIP resource.
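To see which resources in a cluster match this trigger condition, the dependency relationships can be inverted so that each parent is listed with its dependents. The sketch below is illustrative only. It assumes the dependencies have already been collected into a plain mapping from each resource to the resources it depends on. The function name is hypothetical and the example mapping is hand-written, not output from any Oracle utility.

```python
from typing import Dict, List


def parents_with_multiple_dependents(
    depends_on: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Invert a child -> parents map; keep parents with more than one dependent."""
    dependents: Dict[str, List[str]] = {}
    for child, parents in depends_on.items():
        for parent in parents:
            dependents.setdefault(parent, []).append(child)
    return {p: sorted(c) for p, c in dependents.items() if len(c) > 1}


# Hand-written, hypothetical dependency map for illustration only.
example = {
    "ora.ons": ["ora.net1.network"],
    "ora.node1.vip": ["ora.net1.network"],
    "svc_a": ["ora.node1.vip"],
}

print(parents_with_multiple_dependents(example))
```

In this made-up mapping only ora.net1.network is reported, because it has two dependents, while the VIP has a single dependent service.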