On Oracle Big Data Appliance CDH Cluster Encountering Space Issues on HDFS Partition due to Large dncp_block_verification Log Files
(Doc ID 1986420.1)
Last updated on AUGUST 03, 2021
Applies to:
Big Data Appliance Integrated Software - Version 4.0 and later
Linux x86-64
Symptoms
On an Oracle Big Data Appliance, the HDFS cluster will not start because of space issues on some of the nodes.
On some nodes, CDH services (such as NodeManager) won't start and fail with the error below because one of the u* partitions is full:
# df -k /u12
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdl1 2883715292 2877855916 4 100% /u12
Caused by: java.io.IOException: mkdir of /u12/hadoop/yarn/nm/usercache failed
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1062)
at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:157)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:721)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:717)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:717)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.serviceInit(ResourceLocalizationService.java:219)
... 9 more
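The check above can be repeated across all data partitions with a small helper. This is a minimal sketch, not the documented fix: the function names `full_mounts` and `large_scanner_logs`, the 95% threshold, and the 1 GiB size limit are illustrative assumptions. The file name pattern is taken from the title of this note; where the files sit under each u* mount is not stated here, so the search simply walks the whole mount point.

```shell
#!/bin/sh
# Sketch only: helper names and thresholds are assumptions, not part of the note.

# full_mounts THRESHOLD
# Reads `df -kP` output on stdin and prints "mountpoint use%" for every
# filesystem whose usage is at or above THRESHOLD percent.
full_mounts() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit) print $6, $5
    }'
}

# large_scanner_logs DIR MIN_KB
# Prints dncp_block_verification log files under DIR that are larger
# than MIN_KB kibibytes.
large_scanner_logs() {
    find "$1" -type f -name 'dncp_block_verification.log*' -size +"$2"k 2>/dev/null
}

# Example usage (mount points and limits are assumptions):
#   df -kP /u* | full_mounts 95
#   large_scanner_logs /u12 1048576
```

Keeping the two helpers separate lets the `df` filter run on every node first, so the slower `find` only runs on partitions that are actually full.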
On some nodes, the dfsadmin report indicates that DFS is 100% full and the DataNode won't start. For example:
Name:<HOSTNAME12> (<HOSTNAME12>.<DOMAIN>)
Hostname: <HOSTNAME12>.<DOMAIN>
Decommission Status : Normal
Configured Capacity: 0 (0 B)
DFS Used: 0 (0 B)
Non DFS Used: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used%: 100.00%
DFS Remaining%: 0.00%
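On a larger cluster, the affected nodes can be picked out of the report with a short filter. This is a minimal sketch that assumes the report layout shown above; the function name `zero_capacity_nodes` is hypothetical, and running the report as the `hdfs` user is an assumption about the local setup.

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption, not part of the note.

# zero_capacity_nodes
# Reads `hdfs dfsadmin -report` output on stdin and prints the hostname of
# every DataNode section that reports a configured capacity of 0 bytes.
# The cluster-wide summary at the top has no Hostname line and is skipped.
zero_capacity_nodes() {
    awk -F': *' '
        /^Hostname:/ { host = $2 }
        /^Configured Capacity:/ {
            split($2, parts, " ")
            if (host != "" && parts[1] == "0") print host
        }
    '
}

# Example usage (assumes the report is run as the hdfs user):
#   sudo -u hdfs hdfs dfsadmin -report | zero_capacity_nodes
```

The filter keys on `Configured Capacity` rather than `DFS Used%` because a node with no usable capacity reports 100% used even though it holds no data.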
Cause