On Oracle Big Data Appliance BDR Hive Jobs Fail After Cluster Stop And Start
(Doc ID 2052548.1)
Last updated on NOVEMBER 08, 2022
Applies to:
Big Data Appliance Integrated Software - Version 3.1.0 to 4.1.0 [Release 3.1 to 4.1]
Linux x86-64
Symptoms
The Production and Disaster Recovery Oracle Big Data Appliances (BDAs) were stopped and restarted for a RAM upgrade.
After the restart, BDR Hive jobs scheduled on the Disaster Recovery cluster fail.
The message shown in Cloudera Manager (Replication section) for those job executions is:
"The remote replication task was not found, probably due to a server restart"
Restarting the Cloudera Manager Server on node03 does not resolve the issue.
The Cloudera Manager Server log files contain many messages like:
Replication result file '/var/lib/cloudera-scm-server/commands/nnnn/summarynnnnnnnnnnnnnnnnnnn.json' for command 'nnn' is missing.
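As a rough way to gauge how many replication commands are affected, the server log can be scanned for the message above. The following is a minimal sketch, not an Oracle or Cloudera tool: the function name find_missing_results is hypothetical, and the regular expression assumes the log wording matches the message quoted here exactly.

```python
import re
from typing import Iterable, List, Tuple

# Pattern modeled on the message quoted in this note. The exact wording in a
# given Cloudera Manager release is an assumption.
MISSING_RESULT = re.compile(
    r"Replication result file '(?P<path>[^']+)' "
    r"for command '(?P<cmd>[^']+)' is missing"
)


def find_missing_results(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (command_id, result_file_path) for each matching log line."""
    hits = []
    for line in lines:
        match = MISSING_RESULT.search(line)
        if match:
            hits.append((match.group("cmd"), match.group("path")))
    return hits


# Hypothetical usage: open the Cloudera Manager Server log on the node that
# hosts the server and pass the file object in.
#
#   with open(path_to_server_log, errors="replace") as fh:
#       for cmd, path in find_missing_results(fh):
#           print(cmd, path)
```

Reading the file line by line keeps memory use flat even for large logs; the log location itself is left as a variable because it depends on the installation.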
Trying to manually start one of the replication jobs fails with:
2015-08-28 15:00:00,022 ERROR [com.cloudera.cmf.scheduler-1_Worker-1:scheduler.CommandDispatcherJob@<JOBID>]
Skipping schedule execution since the state update failed due to an unexpected error. Schedule details: DbCommandSchedule{id=1, commandName=GlobalPoolsRefresh}
javax.persistence.RollbackException: Error while committing the transaction
at org.hibernate.ejb.TransactionImpl.commit(TransactionImpl.java:92)
at com.cloudera.enterprise.AbstractWrappedEntityManager.commit(AbstractWrappedEntityManager.java:110)
at com.cloudera.cmf.persist.CmfEntityManager.commit(CmfEntityManager.java:366)
at com.cloudera.cmf.scheduler.CommandDispatcherJob.execute(CommandDispatcherJob.java:135)
at org.quartz.core.JobRunShell.run(JobRunShell.java:206)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:548)
Caused by: javax.persistence.OptimisticLockException: org.hibernate.StaleObjectStateException:
Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect): [com.cloudera.cmf.model.DbCommandSchedule#1]
at org.hibernate.ejb.AbstractEntityManagerImpl.wrapStaleStateException(AbstractEntityManagerImpl.java:1416)
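To see which schedules are being skipped and how often, the "Skipping schedule execution" lines can be tallied by schedule id and command name. This is a sketch under the assumption that the log lines follow the format shown in the trace above; count_skipped_schedules is a hypothetical helper name, not part of Cloudera Manager.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the "Schedule details" fragment as it appears in the trace above.
# Other Cloudera Manager versions may format this differently (assumption).
SKIPPED = re.compile(
    r"Skipping schedule execution.*?"
    r"DbCommandSchedule\{id=(?P<id>\d+), commandName=(?P<name>\w+)\}"
)


def count_skipped_schedules(lines: Iterable[str]) -> Counter:
    """Count skipped executions per (schedule_id, command_name)."""
    counts: Counter = Counter()
    for line in lines:
        match = SKIPPED.search(line)
        if match:
            counts[(int(match.group("id")), match.group("name"))] += 1
    return counts
```

A schedule that appears many times in the result is repeatedly hitting the failed state update reported in the trace.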
Cause
To view full details, sign in with your My Oracle Support account.
In this Document
Symptoms
Cause
Solution