Agent Upload Times Out And OMS Fails To Push Files Into Repository In HA Environment (Doc ID 1332728.1)

Last updated on DECEMBER 23, 2016

Applies to:

Enterprise Manager Base Platform - Version 10.2.0.4 to 11.1.0.1 [Release 10.2 to 11.1]
Information in this document applies to any platform.

Symptoms

The Agent is unable to upload data; the upload fails with timeout errors.

Errors observed in the emagent.trc file:

2011-06-20 14:14:51,444 Thread-4100959136 ERROR ssl: nzos_Handshake failed, ret=28862
2011-06-20 14:14:51,444 Thread-4100959136 ERROR http: 30: Unable to initialize ssl connection with server, aborting connection attempt
2011-06-20 14:14:51,444 Thread-4100959136 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://slb.abc.com:1159/em/upload: retStatus=-1
2011-06-20 14:14:51,486 Thread-4100959136 ERROR ssl: nzos_Handshake failed, ret=28864
2011-06-20 14:14:51,486 Thread-4100959136 ERROR http: 27: Unable to initialize ssl connection with server, aborting connection attempt
2011-06-20 14:14:51,486 Thread-4100959136 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://slb.abc.com:1159/em/upload: retStatus=-1
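
When emagent.trc is large, it can help to tally the SSL handshake failures by return code before reading the file line by line. The following is a minimal Python sketch, not an Oracle-supplied tool. The function name and the regular expression are illustrative assumptions derived only from the excerpt above.

```python
import re
from collections import Counter

# Pattern for the handshake failure lines shown in the excerpt above.
# Assumption: the line layout is inferred from this excerpt only.
HANDSHAKE_RE = re.compile(r"ERROR ssl: nzos_Handshake failed, ret=(\d+)")


def count_handshake_failures(lines):
    """Return a Counter mapping each ret code (as text) to its frequency."""
    counts = Counter()
    for line in lines:
        match = HANDSHAKE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "2011-06-20 14:14:51,444 Thread-4100959136 ERROR ssl: nzos_Handshake failed, ret=28862",
    "2011-06-20 14:14:51,444 Thread-4100959136 ERROR http: 30: Unable to initialize ssl connection with server, aborting connection attempt",
    "2011-06-20 14:14:51,486 Thread-4100959136 ERROR ssl: nzos_Handshake failed, ret=28864",
]

for code, count in sorted(count_handshake_failures(sample).items()):
    print(code, count)
```

To scan a real trace file, pass an open file object instead of the sample list, since the function accepts any iterable of lines.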



The output of the agent status and upload commands was as follows:

$ ./emctl status agent
Oracle Enterprise Manager 10g Release 4 Grid Control 10.2.0.4.0.
Copyright (c) 1996, 2007 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 10.2.0.4.0
OMS Version : 10.2.0.4.0
Protocol Version : 10.2.0.4.0
.
.
Last Reload : 2011-06-22 07:58:50
Last successful upload : (none)
Last attempted upload : (none)
Total Megabytes of XML files uploaded so far : 0.00
Number of XML files pending upload : 34
Size of XML files pending upload(MB) : 3.81
Available disk space on upload filesystem : 40.87%
Last successful heartbeat to OMS : 2011-06-22 07:59:53
---------------------------------------------------------------
Agent is Running and Ready

$ ./emctl upload
Oracle Enterprise Manager 10g Release 4 Grid Control 10.2.0.4.0.
Copyright (c) 1996, 2007 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload error: Upload timed out before completion.
Number of files to upload before the upload: 32, total size (MB): 3.81.
Remaining number of files to upload: 32, total size (MB): 3.81.
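
The combination that matters in the status output above is a non-zero count of pending XML files together with no successful upload. A rough Python sketch of that check follows. The function names are hypothetical, and the field labels are copied from the output shown in this note, so other agent versions may need adjustments.

```python
def parse_status(text):
    """Turn 'Label : value' lines into a dict; all other lines are skipped."""
    fields = {}
    for line in text.splitlines():
        label, sep, value = line.partition(" : ")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


def upload_is_stalled(fields):
    """True when files are pending and no upload has ever succeeded."""
    pending = int(fields.get("Number of XML files pending upload", "0"))
    last_ok = fields.get("Last successful upload", "")
    return pending > 0 and last_ok == "(none)"


# Excerpt of the 'emctl status agent' output from this note.
sample = """\
Agent Version : 10.2.0.4.0
Last Reload : 2011-06-22 07:58:50
Last successful upload : (none)
Last attempted upload : (none)
Number of XML files pending upload : 34
Last successful heartbeat to OMS : 2011-06-22 07:59:53
"""

status = parse_status(sample)
print(upload_is_stalled(status))
```

Note that the heartbeat field can still show a recent timestamp while uploads are stalled, as it does here, so the heartbeat alone is not a reliable indicator.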


The OMS is also unable to load the files into the Repository. XML files in the shared RECV directory are not uploaded to the Repository.


Errors observed in the emoms.trc file:

2011-06-18 00:14:07,487 [HealthMonitor] ERROR emd.main run.299 - HealthMonitor : Found errant task : TaskRegn:ID0,Callback:class oracle.sysman.emdrep.failover.OMSHeartbeatRecorder,Iterative:true,Duration:300,DueTime:1308370447485
2011-06-18 00:14:07,489 [HealthMonitor] ERROR emd.main restart.418 - HealthMonitor Jun 18, 2011 12:14:07 AM OMS Heartbeat Recorder error: Heartbeat recorder timed out. OMS id=4100
2011-06-18 00:14:07,497 [HealthMonitor] ERROR emd.main executeCommand.614 - HealthMonitor : Executing diagnostic command for module omsThread. Jun 18, 2011 12:14:07 AM
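
The OMS id reported in the heartbeat recorder error is useful for cross-checking against the failover entries in the Repository. As a hedged illustration, the sketch below pulls those ids out of emoms.trc lines. The function name and the pattern are assumptions based solely on the message text in the excerpt above.

```python
import re

# Matches the heartbeat timeout message shown in the excerpt above.
# Assumption: the wording is taken from this excerpt only.
TIMEOUT_RE = re.compile(r"Heartbeat recorder timed out\. OMS id=(\d+)")


def timed_out_oms_ids(lines):
    """Return the distinct OMS ids (as text) in order of first appearance."""
    ids = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


sample = [
    "2011-06-18 00:14:07,489 [HealthMonitor] ERROR emd.main restart.418 - "
    "HealthMonitor Jun 18, 2011 12:14:07 AM OMS Heartbeat Recorder error: "
    "Heartbeat recorder timed out. OMS id=4100",
    "2011-06-18 00:14:07,497 [HealthMonitor] ERROR emd.main executeCommand.614 - "
    "HealthMonitor : Executing diagnostic command for module omsThread. "
    "Jun 18, 2011 12:14:07 AM",
]

print(timed_out_oms_ids(sample))
```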


Invalid entries are observed in MGMT_FAILOVER_TABLE.

To verify this, run the following SQL as the SYSMAN user, connected to the Repository database:

select * from MGMT_FAILOVER_TABLE;

After restarting the OMS, the tok file under the shared receive directory is not updated with a new number.
This can also be seen in the following EMDIAG output:

Stale OMS id's

FAILOVER_ID OMS                      LAST_TIMESTAMP
---------------------------------------------------------
4099        oms2.abc.com:4889_Manag  18-JUN-2011 00:04:11
4101        oms2.abc.com:4889_Manag  18-JUN-2011 00:08:02
4100        oms1.abc.com:4889_Manag  18-JUN-2011 00:08:07
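
To make the listing above easier to read, the rows can be grouped by OMS name so that an OMS holding more than one failover id stands out. This is a minimal Python sketch with a hypothetical function name, and it assumes the three-column layout shown above. It only groups the rows; it does not decide which entry is stale.

```python
from collections import defaultdict


def group_ids_by_oms(rows):
    """Map each OMS name to the list of failover ids reported for it."""
    groups = defaultdict(list)
    for row in rows:
        parts = row.split(None, 2)
        # Skip the header and separator lines: data rows start with a number.
        if len(parts) == 3 and parts[0].isdigit():
            failover_id, oms_name, _timestamp = parts
            groups[oms_name].append(int(failover_id))
    return dict(groups)


sample = [
    "FAILOVER_ID OMS                      LAST_TIMESTAMP",
    "---------------------------------------------------------",
    "4099        oms2.abc.com:4889_Manag  18-JUN-2011 00:04:11",
    "4101        oms2.abc.com:4889_Manag  18-JUN-2011 00:08:02",
    "4100        oms1.abc.com:4889_Manag  18-JUN-2011 00:08:07",
]

for oms_name, ids in group_ids_by_oms(sample).items():
    print(oms_name, ids)
```

In the sample, oms2.abc.com is listed with two ids (4099 and 4101), while oms1.abc.com has only 4100, the id named in the heartbeat recorder error.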


Changes

 

Cause
