
On Oracle Big Data Appliance, Critical Node Becomes Unresponsive After HDFS/NFS Gateway Is Manually Set Up on That Node (Doc ID 2075996.1)

Last updated on DECEMBER 18, 2019

Applies to:

Big Data Appliance Integrated Software - Version 4.1.0 and later
Linux x86-64

Symptoms

On Oracle Big Data Appliance, the HDFS/NFS Gateway is manually set up on a critical node, for example node03, where the CM, RM, DataNode, JournalNode, ZooKeeper, MySQL and MGMT services reside.

Configuring An NFS Gateway on BDA and Client Setup to Access NFS Gateway (Doc ID 1998215.1)

The customer chose node03 for the HDFS/NFS Gateway mainly because this node has large partitions (/u01 and /u02) that are not used by the DataNode. The 'Temporary Dump Directory' (dfs.nfs3.dump.dir) parameter for the HDFS/NFS Gateway is therefore set to point to /u02/tmp.

The 'Temporary Dump Directory' temporarily holds out-of-order writes before they are written to HDFS. This directory is needed because the NFS client often reorders writes, so sequential writes can arrive at the NFS gateway in random order and must be held until they can be put back in order. Once the out-of-order writes for any given file exceed 1 MB in memory, they are dumped to dfs.nfs3.dump.dir (the memory threshold is not currently configurable).
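The buffering behaviour described above can be illustrated with a small sketch. This is a toy model under stated assumptions, not the Hadoop NFS gateway implementation: the class name ReorderBuffer, the chunk file naming, and the in-process "committed" buffer standing in for the HDFS file are all hypothetical, and the demo uses an 8-byte threshold in place of the real 1 MB so that the spill is easy to see.

```python
import os
import tempfile


class ReorderBuffer:
    """Toy model of per-file write reordering (hypothetical, for illustration)."""

    def __init__(self, dump_dir, threshold=1024 * 1024):
        self.dump_dir = dump_dir        # plays the role of dfs.nfs3.dump.dir
        self.threshold = threshold      # in-memory limit before spilling
        self.next_offset = 0            # next byte offset that can be appended
        self.in_memory = {}             # offset -> bytes held in RAM
        self.on_disk = {}               # offset -> path of a dumped chunk
        self.committed = bytearray()    # stands in for the file in HDFS

    def _memory_bytes(self):
        return sum(len(data) for data in self.in_memory.values())

    def _dump_to_disk(self):
        # Spill every pending in-memory chunk into the dump directory.
        for offset, data in list(self.in_memory.items()):
            path = os.path.join(self.dump_dir, f"chunk-{offset}")
            with open(path, "wb") as handle:
                handle.write(data)
            self.on_disk[offset] = path
            del self.in_memory[offset]

    def _drain(self):
        # Append every chunk that is now contiguous with the committed data.
        while True:
            if self.next_offset in self.in_memory:
                data = self.in_memory.pop(self.next_offset)
            elif self.next_offset in self.on_disk:
                path = self.on_disk.pop(self.next_offset)
                with open(path, "rb") as handle:
                    data = handle.read()
                os.remove(path)
            else:
                return
            self.committed += data
            self.next_offset += len(data)

    def write(self, offset, data):
        # Assumes chunks neither overlap nor repeat.
        self.in_memory[offset] = data
        self._drain()
        if self._memory_bytes() > self.threshold:
            self._dump_to_disk()


def demo():
    # Writes arrive out of order; the tiny threshold forces a spill to disk.
    with tempfile.TemporaryDirectory() as dump_dir:
        buf = ReorderBuffer(dump_dir, threshold=8)
        buf.write(4, b"EFGH")
        buf.write(8, b"IJKLMNOP")
        spilled = len(os.listdir(dump_dir))   # chunks sitting in the dump dir
        buf.write(0, b"ABCD")                 # the missing first chunk arrives
        leftover = len(os.listdir(dump_dir))
        return bytes(buf.committed), spilled, leftover
```

In this model, every chunk that arrives ahead of its turn costs a write to, and later a read from, the dump directory. The sketch makes that extra local disk traffic visible without claiming anything about the gateway's actual I/O pattern.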

During a large file transfer from the client (a mainframe) to the HDFS/NFS Gateway, many alerts are generated reporting concerning health for the JournalNode, and the node ultimately becomes inaccessible. A sample alert:

The health test result for JOURNAL_NODE_FSYNC_LATENCY has become concerning: The 99th percentile fsync latency over the previous minute is 1 second(s). Warning threshold: 1 second(s).
Time: Oct 26, 2015 7:14:04 PM
View Details on <HOSTNAME3>.<DOMAIN>
Monitor Startup: false
Role: journalnode (<HOSTNAME3>)
Role Type: JournalNode
Cluster: bda1
Cluster Display Name: <CLUSTER_NAME>
Service: hdfs
Service Display Name: hdfs
Service Type: HDFS
Hosts: <HOSTNAME3>.<DOMAIN>
Health Test Results:
Health Test Name Event Code Severity Content
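To make the alert wording concrete, the following sketch shows one way a "99th percentile over the previous minute" check could be evaluated. It is an illustration only: the nearest-rank percentile method, the function names, and the "greater than or equal to" comparison against the warning threshold are assumptions and are not taken from Cloudera Manager.

```python
def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (integer pct, 1-100) by the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def fsync_health(latencies_s, warn_threshold_s=1.0):
    """Classify one minute of fsync latencies (seconds) against a warning threshold."""
    p99 = percentile_nearest_rank(latencies_s, 99)
    # Assumption: the test trips when the percentile reaches the threshold.
    return "CONCERNING" if p99 >= warn_threshold_s else "GOOD"


# 98 fast syncs and two slow ones within the same minute.
slow_minute = [0.01] * 98 + [1.0, 1.2]
quiet_minute = [0.01] * 100
```

With these sample values, fsync_health(slow_minute) reports CONCERNING and fsync_health(quiet_minute) reports GOOD. Under this model, a small number of slow syncs within a minute is enough to trip the test.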

 

Cause

To view full details, sign in with your My Oracle Support account.




