Oracle Big Data Appliance V3.0.1 and Higher Mammoth Install Step to Start Hadoop Services Fails - NameNodes report: NameNode is not formatted
(Doc ID 1903785.1)
Last updated on NOVEMBER 05, 2019
Applies to:
Big Data Appliance Integrated Software - Version 3.0.1 and later (x86_64)
Symptoms
1. Step 11 (Starting Hadoop Services) fails with:
Error [9910]: (//bdanode03//Stage[main]/Hadoop::Startsvc2/Exec[setup_scm]/returns) change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm.sh &>
/opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_1404122025.out returned 1 instead of one of [0] at /opt/oracle/BDAMammoth/puppet/modules/hadoop/manifests/startsvc2.pp:350
2. /var/log/messages shows:
Jun 30 11:54:30 bdanode03 puppet-agent[10454]: (/Stage[main]/Hadoop::Startsvc2/File[setup_'{md5}<MD5>' to '{md5}<MD5>'
Jun 30 12:16:07 bdanode03 puppet-agent[10454]: (/Stage[main]/Hadoop::Startsvc2/Exec[setup_scm]/returns) change from notrun to 0 failed:
/opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm.sh &> /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_1404122025.out returned 1 instead of one of [0] at
/opt/oracle/BDAMammoth/puppet/modules/hadoop/manifests/startsvc2.pp:350
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[create_user_dir]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[create_user_dir]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[create_oracle_user_sub_dir]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[create_oracle_user_sub_dir]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[change_oracle_user_dir_ownership]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[change_oracle_user_dir_ownership]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[create_oozie_user_sub_dir]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[create_oozie_user_sub_dir]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[change_oozie_user_dir_ownership]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[change_oozie_user_dir_ownership]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[change_hive_user_dir_ownership]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[change_hive_user_dir_ownership]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[unzip_oozie_sharelib]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[unzip_oozie_sharelib]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[put_oozie_sharelib]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[post]/Hadoop::Postcdhstart/Exec[put_oozie_sharelib]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[finish]/Miscsetup::Endstep/Notify[end_step]) Dependency Exec[setup_scm] has failures: true
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: (/Stage[finish]/Miscsetup::Endstep/Notify[end_step]) Skipping because of failed dependencies
Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: Finished catalog run in 1300.96 seconds
Jun 30 12:16:09 ed-bda-102 puppet-agent[10454]: triggered run
Jun 30 12:16:12 ed-bda-102 puppet-agent[10454]: Finished catalog run in 0.04 seconds
3. In Cloudera Manager (CM), both the hdfs and yarn services are in "Bad Health".
4. The log file for either NameNode shows: NameNode is not formatted
From CM: hdfs > Instances > Any Role Type of NameNode > Select a NameNode > Log File
NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:216)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:880)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:639)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:440)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:652)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:637)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1286)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1352)
Or, from CM commands:
The NameNode fails with "Supervisor returned FATAL. Please check the role log file, stderr, or stdout." because the NameNode is not formatted.
...
2015-03-05 05:45:25,063 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2015-03-05 05:45:25,063 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:212)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1005)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:735)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:531)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:587)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:754)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:738)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1427)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1493)
2015-03-05 05:45:25,065 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-03-05 05:45:25,066 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bdanode01.example.com/<PRIVATE_IP_HOST1>
************************************************************/
5. The log file for the JobHistory Server shows:
From CM: yarn > Instances > Any Role Type of JobHistory Server > Select jobhistory > Log File
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo. Not retrying because failovers (15) exceeded
maximum allowed (15) java.net.ConnectException: Call From bdanode03.example.com/<IP_ADDRESS> to
bdanode01.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:
...
com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:699)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy13.getFileInfo(Unknown Source)
...
org.apache.hadoop.fs.FileContext$Util.exists(FileContext.java:1528)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.mkdir(HistoryFileManager.java:640)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.tryCreatingHistoryDirs(HistoryFileManager.java:570)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.createHistoryDirs(HistoryFileManager.java:533)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:501)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.hs.JobHistory.serviceInit(JobHistory.java:94) at
...
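The symptoms above can be checked quickly from saved copies of the logs. The following is a minimal Python sketch, not an Oracle-supplied tool: the function names and regular expressions are assumptions written against the log lines quoted in this note. It reports which puppet resource failed, which resources were skipped as a result, and whether a NameNode log contains the "NameNode is not formatted" message.

```python
import re

# Puppet-agent syslog line for the resource that actually failed, for example:
# (/Stage[main]/Hadoop::Startsvc2/Exec[setup_scm]/returns) change from notrun to 0 failed:
FAILED_RE = re.compile(
    r"\((/Stage\[[^)]*?)/returns\) change from notrun to 0 failed"
)

# Puppet-agent syslog line for a resource skipped because a dependency failed.
DEPENDENCY_RE = re.compile(
    r"\((/Stage\[[^)]*)\) Dependency (\w+\[[^\]]+\]) has failures: true"
)

NOT_FORMATTED = "NameNode is not formatted"


def summarize_puppet_failures(lines):
    """Return (failed, skipped).

    failed  -- list of resource paths whose command returned a bad exit code
    skipped -- dict mapping a failed dependency (e.g. "Exec[setup_scm]")
               to the resource paths that were skipped because of it
    """
    failed = []
    skipped = {}
    for line in lines:
        match = FAILED_RE.search(line)
        if match and match.group(1) not in failed:
            failed.append(match.group(1))
        match = DEPENDENCY_RE.search(line)
        if match:
            skipped.setdefault(match.group(2), []).append(match.group(1))
    return failed, skipped


def namenode_not_formatted(lines):
    """Return True if any log line reports an unformatted NameNode."""
    return any(NOT_FORMATTED in line for line in lines)


if __name__ == "__main__":
    # Sample lines copied from the excerpts in this note.
    messages = [
        "Jun 30 12:16:07 bdanode03 puppet-agent[10454]: "
        "(/Stage[main]/Hadoop::Startsvc2/Exec[setup_scm]/returns) "
        "change from notrun to 0 failed:",
        "Jun 30 12:16:08 ed-bda-102 puppet-agent[10454]: "
        "(/Stage[post]/Hadoop::Postcdhstart/Exec[create_user_dir]) "
        "Dependency Exec[setup_scm] has failures: true",
    ]
    namenode_log = ["java.io.IOException: NameNode is not formatted."]

    failed, skipped = summarize_puppet_failures(messages)
    print("Failed resources:", failed)
    print("Skipped because of failures:", skipped)
    print("NameNode not formatted:", namenode_not_formatted(namenode_log))
```

To use it against real files, pass the lines of /var/log/messages and of a NameNode role log downloaded from CM to the two functions. The many "Skipping because of failed dependencies" entries all trace back to the single failed Exec[setup_scm], so the first list is the one to investigate.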
Cause