APPAGENT Process Failover Frequently with Goldengate Vip Fails to Start Due to Wrong Interface in Resource Profile

(Doc ID 2310712.1)

Last updated on OCTOBER 04, 2017

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.4 and later
Information in this document applies to any platform.

Symptoms

Goldengate VIP keeps restarting and following messages are seen in different logs 

1. Clusterware alert log:

2017-09-01 00:04:19.422 [APPAGENT(78137)]CRS-8500: Oracle Clusterware APPAGENT process is starting with operating system process ID 78137
2017-09-01 00:05:20.750 [CRSD(69018)]CRS-2771: Maximum restart attempts reached for resource 'ggate-vip'; will not restart.
2017-09-01 00:06:23.191 [CRSD(69018)]CRS-2771: Maximum restart attempts reached for resource 'ggate-vip'; will not restart.

<<<< Same messages are repeated >>>>>

2017-09-01 01:05:23.224 [APPAGENT(85403)]CRS-8500: Oracle Clusterware APPAGENT process is starting with operating system process ID 85403
2017-09-01 01:06:24.532 [APPAGENT(88808)]CRS-8500: Oracle Clusterware APPAGENT process is starting with operating system process ID 88808
2017-09-01 01:07:25.755 [CRSD(69018)]CRS-2771: Maximum restart attempts reached for resource 'ggate-vip'; will not restart.
2017-09-01 01:09:28.318 [CRSD(69018)]CRS-2771: Maximum restart attempts reached for resource 'ggate-vip'; will not restart.
2017-09-01 01:09:28.340 [APPAGENT(96029)]CRS-8500: Oracle Clusterware APPAGENT process is starting with operating system process ID 96029
2017-09-01 01:10:29.578 [APPAGENT(98364)]CRS-8500: Oracle Clusterware APPAGENT process is starting with operating system process ID 98364
2017-09-01 01:11:30.847 [CRSD(69018)]CRS-2771: Maximum restart attempts reached for resource 'ggate-vip'; will not restart.
2017-09-01 01:13:33.354 [CRSD(69018)]CRS-2771: Maximum restart attempts reached for resource 'ggate-vip'; will not restart.
2017-09-01 01:13:33.390 [APPAGENT(106610)]CRS-8500: Oracle Clusterware APPAGENT process is starting with operating system process ID 106610

 

2. Appagent log (crsd_appagent_root.trc)

2017-09-01 00:04:05.858816 : AGFW:3967391488: {0:129:5} ggate-vip 1 1 state changed from: UNKNOWN to: STARTING
2017-09-01 00:04:05.859749 :CLSDYNAM:3971593984: [ggate-vip]{0:129:5} [start] START action called.
2017-09-01 00:04:05.859784 :CLSDYNAM:3971593984: [ggate-vip]{0:129:5} [start] Executing action script: /u01/app/12.1.0.2/grid/bin/usrvip[start]
2017-09-01 00:04:06.962596 : AGFW:3971593984: {0:129:5} Command: start for resource: ggate-vip 1 1 completed with status: SUCCESS
81328:2017-09-01 00:04:06.962793 :CLSFRAME:4135611968: TM [MultiThread] is changing desired thread # to 3. Current # is 2
2017-09-01 00:04:06.962930 :CLSDYNAM:3971593984: [ggate-vip]{0:129:5} [check] Check action called..
2017-09-01 00:04:06.963012 : AGFW:3967391488: {0:129:5} Agent sending reply for: RESOURCE_START[ggate-vip 1 1] ID 4098:449579
2017-09-01 00:04:06.963147 : AGFW:3967391488: {0:129:5} ggate-vip 1 1 state changed from: STARTING to: ONLINE
2017-09-01 00:04:06.963276 : AGFW:3967391488: {0:129:5} Started implicit monitor for [ggate-vip 1 1] interval=60000 delay=60000
2017-09-01 00:04:06.963350 : AGFW:3967391488: {0:129:5} Agent sending last reply for: RESOURCE_START[ggate-vip 1 1] ID 4098:449579
81334:2017-09-01 00:04:36.963714 :CLSFRAME:4135611968: TM [MultiThread] is changing desired thread # to 2. Current # is 3
2017-09-01 00:04:36.963777 :CLSFRAME:3971593984: {0:121:4} Worker thread is exiting in TM [MultiThread] to meet the desired count of 2. New count is 2
2017-09-01 00:05:05.814027 : AGFW:3967391488: {0:129:5} Agent received the message: AGENT_HB[Engine] ID 12293:449814
2017-09-01 00:05:06.965105 :CLSDYNAM:3975796480: [ggate-vip]{0:129:5} [check] Check action called..
2017-09-01 00:05:06.965133 :CLSDYNAM:3975796480: [ggate-vip]{0:129:5} [check] CHECK action called.
2017-09-01 00:05:06.965150 :CLSDYNAM:3975796480: [ggate-vip]{0:129:5} [check] Executing action script: /u01/app/12.1.0.2/grid/bin/usrvip[check]
2017-09-01 00:05:07.066501 :CLSDYNAM:3975796480: [ggate-vip]{0:129:5} [stop] Stop action called..
2017-09-01 00:05:07.066524 :CLSDYNAM:3975796480: [ggate-vip]{0:129:5} [stop] STOP action called.
2017-09-01 00:05:07.066538 :CLSDYNAM:3975796480: [ggate-vip]{0:129:5} [stop] Executing action script: /u01/app/12.1.0.2/grid/bin/usrvip[stop]
2017-09-01 00:05:07.117725 : AGFW:3967391488: {0:129:5} ggate-vip 1 1 state changed from: ONLINE to: OFFLINE
2017-09-01 00:05:07.117771 : AGFW:3967391488: {0:121:5} Generating new Tint for unplanned state change. Original Tint: {0:129:5}
2017-09-01 00:05:07.117803 : AGFW:3967391488: {0:121:5} Agent sending message to PE: RESOURCE_STATUS[Proxy] ID 20481:31
2017-09-01 00:05:07.117830 : AGFW:3967391488: {0:121:5} Agent has no resources to be monitored, Shutting down ..
2017-09-01 00:05:07.117850 : AGFW:3967391488: {0:121:5} Agent sending message to PE: AGENT_SHUTDOWN_REQUEST[Proxy] ID 20486:32
2017-09-01 00:05:07.118961 : AGFW:3967391488: {0:121:5} Agent is shutting down.
2017-09-01 00:05:07.118970 : AGENT:3967391488: {0:121:5} Agfw calling user exitCB, will exit on return
2017-09-01 00:05:07.118995 : USRTHRD:3967391488: {0:121:5} Application agent is exiting..

<<<< Same behavior is repeated >>>>>

2017-09-01 00:05:07.153460 : AGFW:3381503744: {0:121:5} ggate-vip 1 1 state changed from: UNKNOWN to: STARTING
2017-09-01 00:05:07.154210 :CLSDYNAM:3385706240: [ggate-vip]{0:121:5} [start] START action called.
2017-09-01 00:05:07.154236 :CLSDYNAM:3385706240: [ggate-vip]{0:121:5} [start] Executing action script: /u01/app/12.1.0.2/grid/bin/usrvip[start]
2017-09-01 00:05:08.206706 : AGFW:3385706240: {0:121:5} Command: start for resource: ggate-vip 1 1 completed with status: SUCCESS
81455:2017-09-01 00:05:08.207008 :CLSFRAME:3549724224: TM [MultiThread] is changing desired thread # to 3. Current # is 2
2017-09-01 00:05:08.207037 :CLSDYNAM:3385706240: [ggate-vip]{0:121:5} [check] Check action called..
2017-09-01 00:05:08.207379 : AGFW:3381503744: {0:121:5} Agent sending reply for: RESOURCE_START[ggate-vip 1 1] ID 4098:449842
2017-09-01 00:05:08.207552 : AGFW:3381503744: {0:121:5} ggate-vip 1 1 state changed from: STARTING to: ONLINE
2017-09-01 00:05:08.207655 : AGFW:3381503744: {0:121:5} Started implicit monitor for [ggate-vip 1 1] interval=60000 delay=60000
2017-09-01 00:05:08.207778 : AGFW:3381503744: {0:121:5} Agent sending last reply for: RESOURCE_START[ggate-vip 1 1] ID 4098:449842
81461:2017-09-01 00:05:38.208231 :CLSFRAME:3549724224: TM [MultiThread] is changing desired thread # to 2. Current # is 3
2017-09-01 00:05:38.208273 :CLSFRAME:3385706240: {0:123:4} Worker thread is exiting in TM [MultiThread] to meet the desired count of 2. New count is 2
2017-09-01 00:06:07.121149 : AGFW:3381503744: {1:31081:21296} Agent received the message: AGENT_HB[Engine] ID 12293:450090
2017-09-01 00:06:08.202437 :CLSDYNAM:3389908736: [ggate-vip]{0:121:5} [check] Check action called..
2017-09-01 00:06:08.202464 :CLSDYNAM:3389908736: [ggate-vip]{0:121:5} [check] CHECK action called.
2017-09-01 00:06:08.202498 :CLSDYNAM:3389908736: [ggate-vip]{0:121:5} [check] Executing action script: /u01/app/12.1.0.2/grid/bin/usrvip[check]
2017-09-01 00:06:08.303982 :CLSDYNAM:3389908736: [ggate-vip]{0:121:5} [stop] Stop action called..
2017-09-01 00:06:08.304006 :CLSDYNAM:3389908736: [ggate-vip]{0:121:5} [stop] STOP action called.
2017-09-01 00:06:08.304037 :CLSDYNAM:3389908736: [ggate-vip]{0:121:5} [stop] Executing action script: /u01/app/12.1.0.2/grid/bin/usrvip[stop]
2017-09-01 00:06:08.405727 : AGFW:3381503744: {0:121:5} ggate-vip 1 1 state changed from: ONLINE to: OFFLINE
2017-09-01 00:06:08.405784 : AGFW:3381503744: {0:123:5} Generating new Tint for unplanned state change. Original Tint: {0:121:5}
2017-09-01 00:06:08.405815 : AGFW:3381503744: {0:123:5} Agent sending message to PE: RESOURCE_STATUS[Proxy] ID 20481:31
2017-09-01 00:06:08.405841 : AGFW:3381503744: {0:123:5} Agent has no resources to be monitored, Shutting down ..
2017-09-01 00:06:08.405862 : AGFW:3381503744: {0:123:5} Agent sending message to PE: AGENT_SHUTDOWN_REQUEST[Proxy] ID 20486:32
2017-09-01 00:06:08.407021 : AGFW:3381503744: {0:123:5} Agent is shutting down.
2017-09-01 00:06:08.407033 : AGENT:3381503744: {0:123:5} Agfw calling user exitCB, will exit on return
2017-09-01 00:06:08.407063 : USRTHRD:3381503744: {0:123:5} Application agent is exiting.

 3. <Grid Infrastructure home>/log/$HOST/racg/ggate-vip.log:

bounce eth0 (host=rac1)
forever: error fetching interface information: Device not found
IP:xxx.xxx.201.180 is already up in the network (host=rac1)
forever: error fetching interface information: Device not found
bounce eth0 (host=rac1)
forever: error fetching interface information: Device not found
IP:xxx.xxx.201.180 is already up in the network (host=rac1)
forever: error fetching interface information: Device not found
bounce eth0 (host=rac1)

Changes

 Create application vip

Cause

Sign In with your My Oracle Support account

Don't have a My Oracle Support account? Click to get started

My Oracle Support provides customers with access to over a
Million Knowledge Articles and hundreds of Community platforms