Oracle Big Data Appliance Frequently Asked Questions v2.x and Higher (FAQ) (Doc ID 1518525.1)

Last updated on OCTOBER 18, 2022

Applies to:

Big Data Appliance Integrated Software - Version 2.0.1 and later
Big Data Appliance X3-2 Hardware - Version All Versions and later
Big Data Appliance X4-2 Starter Rack - Version All Versions and later
Big Data Appliance X3-2 Full Rack - Version All Versions and later
Big Data Appliance Hardware - Version All Versions and later
Linux x86-64

Purpose

This document provides answers to frequently asked questions about the Oracle Big Data Appliance v2.x.

Questions and Answers


In this Document
Purpose
Questions and Answers
 What is Big Data?
 Is BDA Cloudera Certified?
 Is Big Data SQL 3.2.x Cloudera Certified?
 Is Cloudera Navigator Optimizer licensed for the BDA?
 What is the Oracle Big Data Appliance X3-2 or X4-2?
 What Hardware Components for Oracle Big Data Appliance X3-2 are included?
 What Hardware Components for Oracle Big Data Appliance X4-2 are included?
 What is the Oracle NoSQL Database?
 Can data in Avro format stored in NoSQL be read directly from a MapReduce job?
 What are the Oracle Big Data Connectors?
 Is it possible to upgrade the Big Data Connectors?
 What is the Oracle SQL Connector for Hadoop Distributed File System (OSCH)?
 What is the Oracle Loader for Hadoop?
 What is the Oracle Data Integrator Application Adapter for Hadoop?
 What is the Oracle R Advanced Analytics for Hadoop?
 What Is Oracle XQuery for Hadoop?
 What formats are available to Copy to Hadoop?
 What Software and versions are supported on Big Data Appliance?
 Can I upgrade or uninstall any of the software installed on the Oracle Big Data Appliance?
 What about CDH that is part of BDA, can I upgrade individual CDH components but not the entire CDH bundle?
 What about Solr?  Can I upgrade Solr independently of the CDH bundle?
 Can I install other software on the Oracle Big Data Appliance?
 If I install other software on the BDA how will it be upgraded?
 If I want to use a different version of Python or some other product than what is installed on the BDA can I do that?
 Are there any plans to include Python 3 as part of a BDA release?
 What about PyCharm which is a Python IDE detailed at "PyCharm"? Can I install this on the BDA?
 What about JDK?  Can I use a different JDK version than what is installed on the BDA?
 What about libgfortran3?  Can I install this on my BDA running OL 5.8?
 What about SBT (Scala Build Tool)?
 What about DataRobot?
 What about Luigi?  Would that fall into a different category than what is discussed here?
 Are Jupyter notebooks and Ipython supported on BDCS or BDA clusters?
 What about installing Spark Notebook in Hue on the BDA?
 What about the Livy Server on the BDA/CDH distribution? Is this considered a third party product?
 What about installing Apache Drill on the BDA?
 Can Apache James software be installed on a BDA cluster?
 Is it possible to upgrade the Apache http server 2.2 included with the BDA software to a newer version?
 What about installing Kudu on the BDA?
 Is VirtualBox used on the BDA?
 Is there any guidance regarding pip on the BDA?
 Is there any guidance on what procedures to follow when installing 3rd party python packages?
 Is it supported to remove data directories from HDFS to be used for Kudu?
 What about supporting MasterSAM on the BDA?
 What about installing Apache Maven on the BDA?
 We need to install Maven in order to install a particular tool on BDA 4.7.0. Which versions of Maven are compatible with Oracle BDA servers?
 Can Apache Atlas be installed on a BDA?
 What about installing subversion on the BDA?
 What about installing 'Google Cloud SDK' on the BDA cluster?
 What about installing 'Streamsets' on the BDA Cluster?
 What about Tez.  Is it supported on the BDA?
 What about confluent?  Is it supported on the BDA?
 What about udpipe?  Is it supported on the BDA?
 What about TensorFlow and Keras?  Is that supported on the BDA?
 What about iperf(3)?  Is it supported on the BDA?
 What about upgrading Apache on the BDA?
 What about upgrading datafu on the BDA?
 What about installing Nagios on the BDA?
 What about installing SmartSVN client on the BDA?
 What about installing Apache Zeppelin on the BDA?
 What about installing Elasticsearch on the BDA?
 If Elasticsearch, or other third party software, requires changing the Linux kernel parameter "vm.max_map_count" from its default value of 65530 to 262144, is this change permitted on a BDA and will it impact a running cluster?
 What about installing Orient DB - Graph Database, Scala - Language, Kamanja - Decisioning platform?
 What about DataRobot (Machine Learning Software)?
 What about installing Anaconda (a Python distribution for data science) through parcels?
 Is there any additional information about the Anaconda parcel that can be provided?
 Can the Anaconda Parcel install be rolled back if necessary?
 What about g++? Is it possible to update the version of g++ installed on BDA nodes?
 What about installing python 2.7 with spark?
 What about support for software collections (scl)?
 What about installing Malware protection software on the BDA?
 What about Apache NiFi? Is that supported on the BDA?
 What about Cloudera Dataflow?
 Is Cloudera Data Platform supported in BDA?
 Can xorg-x11-server-common be removed from the BDA?
 What about Auto-TLS?  Is Auto-TLS supported on the BDA?
 BDA V4.1 OL5.8 appears to have an old version of Apache.  Can it be upgraded?
 Can you clarify exactly when Oracle Linux packages can be upgraded on the BDA?
 Does that mean that if I am on BDA cluster with OL6.5 I can upgrade to OL6.8?
 Does that mean the Oracle Linux nss-softokn package can be upgraded to fix a bug?
 Please review the supported Python versions on BDA V4.2 and BDA V4.3?
 Is it possible to install OVM on a BDA?
 Is KVM supported on Edge Nodes?
 Is it ok to install the higher ORD packages on the BDA?
 Can you install an updated version of ORD without overwriting the currently installed version so you can go back and forth between the two?
 If you install the updated version and there is compatibility issue, can you rollback to the previous version and restore compatibility?
 What about RStudio? Is it ok to install RStudio? 
 On the BDA can I upgrade the nss-pam-ldapd-0.7.5-18.2.el6_4.x86_64 package to facilitate adding LDAP to the BDA v4.2 to connect to our AD server?
 Is downloading packages in EPEL (Extra Packages for Enterprise Linux) at https://fedoraproject.org/wiki/EPEL supported on the BDA?
 What about installing an official Oracle Linux 6.4 package which is missing on the BDA? Would that be supported?
 On the Oracle Big Data Appliance, can Oracle Enterprise Linux be replaced with Red Hat Enterprise Linux?
 Is running RedHat IPA for user management and Kerberos supported?
 What about installing the two OS libraries on the BDA: libxml2-dev and libcurl4-openssl-dev?  Is that supported?
 What about installing TMUX terminal multiplexer?  Is that supported?
 Which third-party tools may or may not be compatible with the BDA?
 Can a higher version of GIT be supported on BDA V4.1 with X4-2?
 What about installing GNOME rpms?
 What about GNOME-related desktop packages on a BDA node? For a better GUI mode, is it possible to make a groupinstall of "Desktop", "Desktop Platform", "X Window System", and "Fonts" packages?
 What about UEK 3 and OL7? Can I install them on BDA V4.2?
 Is it possible to install Hue on a node off the BDA?
 Is it possible to add an Edge node running a 7.1 OS (CentOS/OL/Red Hat) into the BDA Cluster based on OL5 or OL6?
 Does the kernel on an edge node need to be the same as the kernel on the rest of the BDA cluster?
 In the case of kernel instability is updating the baud rate in the BIOS in the ILOM recommended?
 On the BDA the nfsnobody user in the password file /etc/passwd has a user id of 5 digits.  Can it be changed to a user id of 4 digits?
 On the BDA is it possible to change the values in /etc/login.defs for PASS_MAX_DAYS, PASS_MIN_DAYS, PASS_MIN_LEN?
 Is it ok to remove older/weaker ciphers from sshd_config?
 What Hadoop version is Oracle using in the Oracle Big Data Appliance v2.x?
 Can I grow my BDA environment beyond the single rack?
 Are there scale limits in BDA?
 Is it possible to allocate more than 1 or 2 disks per server for Oracle NoSQL DB in v2.x?
 How can I monitor the Hadoop Cluster on my Oracle Big Data Appliance in v2.x?
 How can I monitor the Oracle Big Data Appliance?
 On the BDA is the recommendation to make changes to security configurations like enabling/disabling MIT or AD Kerberos with bdacli or in Cloudera Manager directly?
 How can non-CDH processes (e.g. MySQL master/backup server, Puppet master/agent, NTP server, sshd) be monitored on the BDA?
 How many NTP servers are recommended on the BDA?
 Is ASR (Auto Service Request) supported on Oracle Big Data Appliance v2?
 Where can I find the documentation for the Oracle Big Data Appliance v2.x?
 Where can I find more information for Oracle Big Data Appliance X4-2?
 What is the InfiniBand fabric and what are its advantages/features?
 Are we able to modify our /etc/hosts files to change private/InfiniBand addresses to public IP addresses?
 In the /etc/hosts file why does the private IP address have both the private host name and  the Client Interface host name?
 Is IPv6 supported on Big Data Appliance?
 If ssh  to a BDA server becomes noticeably slower what can be done?
 What could be the reason for being able to "ping" a BDA server but not "ssh" into it?
 Should anti-virus software be installed on the BDA?
 Is there an official re-racking service for the BDA?
 Does Security-Enhanced Linux (i.e. SE Linux) run on the BDA?
 Are there any firewall services running on the individual  BDA nodes?
 Where can I find information on licensing related to Oracle Big Data Appliance and the Oracle Connectors for Hadoop?
 Is the BDA v2.x tested and validated with ZFSSA?
 Is Impala part of the BDA software stack in V2.2.1?
 Is placing the ILOMs on a different subnet and VLAN than the admin network supported?
 Is it possible to divide the Admin network and the ILOM network?
 Is there any way possible to proceed without separate subnets for the management and client networks?
 Would the admin network ever be used by any tools or utilities?
 If the client network is not connected will that be a problem for the network setup scripts?
 Can BDA nodes receive jumbo frame packets (MTU=9000) via the Client Access Network?  What is the default MTU value of bondeth0?  Can we change it?
 If a BDA has been divided into multiple clusters can Mammoth support different CDH version on each?
 Is it possible to mix multiple BDA versions in the same cluster?
 Can BDA support more than one Cloudera Manager and cluster on a set of BDA nodes?
 Can Mammoth installation work when clusters span several racks?
 Can an X4-2 expansion kit be installed on an X3-2 starter rack?
 Are there any limitations when handling an in-rack expansion with X4-2 with 4TB disks and X5-2 8TB disks?
 What version of the OS image ships on a new X4-2 BDA?
 How can I find the space consumed by the OS on each node in the BDA? In particular, there is a specific partition on disks 1 and 2; how do I find the size of each?
 Is link aggregation supported on the BDA?
 Does BDA support encryption for the data at rest & data in transit?
 Please review how servers can be partitioned between Oracle NoSQL DB and HDFS?
 Can a BDA Starter Rack (6 nodes) support a Hadoop and NoSQL at the same time?
 In which BDA version did a cluster no longer support both Hadoop and NoSQL at the same time via a Mammoth install?
 What is the smallest recommended footprint on the BDA for running both NoSQL and Hadoop?
 Can a BDA Starter Rack (6 nodes) be split into two Hadoop clusters?
 Can a 9 server BDA  rack be ordered?
 Is it possible to get a starter rack and install 3 nodes of NoSQL and leave the rest empty?
 Can the BDA be re-IPed?
 Does BDA 2.4 support Hadoop 2.0 + YARN?
 Does reimagecluster make use of  /opt/oracle/bda/cluster-hosts-infiniband?
 How do I use "reimagerack --hosts n1,n2..."?
 If reimagerack/reimagecluster finishes but the node did not reimage where are the logs found?
 What are the general steps for installing two separate V2.2.1 clusters on a BDA 2/3 rack?
 Is there a mandatory order for shutting down servers in a BDA?
 In the case of a planned network outage where the BDA will be without network connectivity are there any planned actions to take?
 Before a cluster is deployed, can you specify the oinstall and dba group IDs with the configuration tool before running Mammoth?
 Is rebooting a BDA cluster expected to significantly boost performance?
 If a cluster reboot is seen to significantly boost performance what can be done to determine the contributing factors?
 Can a BDA based Hadoop cluster consist of multiple generations of hardware?
 Can custom parcels be used on the BDA?
 Can DCLI be used to manage servers outside of a BDA rack?
 Can dcli be used by a non-root user on the BDA?
 What is an overview of the BDA Patching Strategy?
 Now that BDA V4.1 and higher support parcels can any parcel patch be downloaded and applied?
 Is it possible to upgrade CDH to a higher major version than what is available with the BDA release?
 What about patching/upgrading Cloudera Manager?
 Is decommissioning or deleting a Cloudera Manager role deployed by Mammoth supported?
 Are official Oracle Linux patches supported?
 After installation, can the puppetmaster and puppet instances that come pre-installed on the BDA be used for application management?
 What troubleshooting steps can be followed if puppet is not starting?
 If a user is filling up the /home partition on any node, does the BDA have any utility to prevent that from happening? What would be the recommended approach to set quotas for all users to limit how much space they can consume on the /home partition?
 Is creating separate partitions for /tmp, /var, /var/tmp, /var/log, /var/log/audit, /home supported?
 Is there any additional information on why creating a new partition for /tmp on BDA is not supported?
 Can we increase the space allocated to root partition and / ?
 Can we move the /home to a separate partition; that way users filling up space on their home folder will not impact the operation of the system and cluster?
 What about removing files from /var/lib/cloudera-host-monitor, /var/lib/cloudera-scm-eventserver, /var/lib/cloudera-scm-navigator, /var/lib/cloudera-service-monitor?
 What about changing the location of all the log directories from /var/log/<service> to one of the data directories. Is that supported?
 What about adding an additional path in Cloudera Manager for: NameNode Data Directories (dfs.name.dir, dfs.namenode.name.dir)? Is that supported?
 Does Cloudera/Hadoop place any restrictions on the HDFS partitions sizes which can be present on a server?
 Are the /d<NN> (/d01 .. /d12) partitions used for HDFS? Can they be added to HDFS?
 What are partitions /dev/sda4 and /dev/sdb4 used for on BDA servers? Are they used for HDFS?
 Does  "mount -o remount /home" mean we have to move /home to a different file system (mount point)?
 Can I change the BDA client hostnames or private InfiniBand IP addresses after BDA Mammoth is Installed?
 How to Split an Existing Cluster?
 Can the BDA support a different storage configuration such that HDFS storage is reduced and there is additional local filesystem?
 If necessary can a disk be removed from hdfs on the BDA?
 If root '/' is filling up on Node 3 of the cluster, what can be done?
 If the directory /root/stage is filling up, what can be done?
 If the directory /boot is filling up, what can be done?
 What about swap size? Can it be increased on the BDA?
 What type of directories are used on BDA servers?
 What are /u01 - /u12?
 On the BDA where are the pid files: hadoop-hdfs-namenode.pid and hadoop-hdfs-datanode.pid?
 Is it possible to use cron on BDA servers?
 What happens if a BDA node needs to be rebuilt?
 Why can't I find /etc/syslog.conf on the BDA?
 Why are the Name Node data directories configured on the root ("/") mount point?
 Similar to the previous question why is the Zookeeper data directory configured to be on the root ("/") mount point?
 Again similar to the previous why are the Journal node Edits directory ("dfs.journalnode.edits.dir") configured to be on the root ("/") mount point and not in a dedicated mount point?
 In order to protect the BDA cluster from direct 'root' account login can the 'root' login be converted to a no login account?
 Is it ok to change the root password on the BDA Cluster?
 What is the password for the default users created for below databases?
 Is it ok to reset the passwords above?
 Does the BDA save the Cloudera Manager admin password somewhere?
 Are explicit password expiration dates set on the BDA?
 Is it possible for a single Cloudera Manager to support multiple clusters?
 On BDA V4.2 jstack and jmap seem to crash.  Is that a known issue?
 What is the purpose of /opt/sharedir on BDA v4.*?
 What is the "Oracle Instant Client" on the BDA?  Is it "Oracle Database Client"?
 Does a BDA upgrade support a rolling upgrade?
 If the cluster verification checks didn't run during install/upgrade should I rerun step 1?
 What hardware configurations is BDA certified with?
 I am upgrading from BDA V3.0 to V4.2 is there any problem if the V3.0 cluster is running OL4?
 When upgrading to BDA V4.2 can the Java version be downgraded to 1.7 from 1.8?
 When installing a third party software is it ok to change the configuration of the cluster?
 Is it possible to change the name of the dba group?
 Is there a way to find the Cloudera factory settings on a BDA?
 Reimaging generates /root/BDA_REBOOT_FAILED files with errors that have been resolved.  What can be done?
 Is there a 'Reboot System on NMI' option in the BIOS configuration menu on the BDA? If there is, what is its default value?
 Is Oracle Big Data Lite supported?
 Is it possible to schedule Preventative Maintenance for the BDA racks?
 Is the CDH 5.5.1 parcel that ships with BDA Mammoth v4.4 available publicly?
 Is OL6 compatible with CDH 5.5.1?
 Is a BDA upgrade possible if one of the servers is down?
 If one of the ILOMs is down is it possible that a BDA upgrade can proceed?
 Does a Mammoth upgrade generally upgrade the firmware?
 Why is email alerting required on the BDA?
 Is it possible to use the MySQL DB on BDA for work other than supporting the Hadoop cluster?
 Is it possible to install a separate instance of MySQL on a hybrid data/edge node?
 Is higher latency on one server ever normal?
 Why would bdacheckib -m fail on a single rack?
 Are DNS server settings needed for BDA install?
 Did BDA release a Base Image 4.3.0 and/or 4.4.0?
 Does Cloudera support IBM JDK?
 Can the PermitRootLogin parameter in /etc/ssh/sshd_config be set to "no"?
 How can unencrypted data on the filesystem be securely deleted?
 Is it possible to control services with "chkconfig" at the node startup?
 How can you implement resource management control on a group level for disk space (quota), CPU utilization, memory utilization, etc.?
 Is it supported to use Cloudera Manager (CM) with a load balancer on the BDA?
 Is there a general list of cluster health checks for a CDH cluster?
 Is it possible to add a non-BDA Edge node with RHEL7 to the BDA 4.5 cluster?
 Is it possible to have a non BDA server added to the HDFS cluster? 
 With BDD 1.2.2 on Node 5 of the BDA 4.5 is the communication between BDD and Hive performed by the Infiniband network or the client network?
 For how many racks does the distribution of services for multirack configuration hold?
 How do non-root users change their password on all the BDA nodes?
 Does OSWatcher run on all nodes of a BDA 4.5?
 What is the impact of high load on the client network i.e. the bondeth0 interface?
 What are the troubleshooting steps for any kind of network problem?
 What is the recommendation if rebooting a server stalls at an empty grub prompt?
 Is it possible to disable or bypass dnsmasq on BDA?
 Does the BDA support adding the timeout option to /etc/resolv.conf?
 Should prelink be enabled on BDA nodes?
 Why is it observed that prelink is running daily (/etc/cron.daily) on the BDA?
 What is the recommendation for decommissioning a cluster?
 How can ssh with an identity file be used from a client/edge node to the BDA?
 How is passwordless ssh from a client/edge node to a BDA server set up?
 On a BDA cluster is it supported to remove CDH and install Hortonworks HDP?
 Is it possible to connect an IB GW Switch in BDA rack with an external ZFSA and use it?
 What is the recommendation for enabling Transparent HugePages?
 Is it possible to merge /u01 to /u12 into four mount points on a BDA server configured as an edge node which is part of the BDA cluster?
 Is there a way to rename the cluster without rebuilding the rack?
 If /var/lib is filling up on Node 3 can that directory be increased?
 Can you remove “127.0.0.1” from the /etc/resolv.conf and disable “dnsmasq” from the server?
 Is creating symbolic links on the BDA supported?
 Is there any performance overhead with the NameNode being the same as a DataNode?
 Can BDA X6-2 servers and X4-2 servers be part of one hadoop cluster?
 Can X7-2 servers be added to an existing X4-2 cluster?
 Is BDA 4.13 supported on X3-4 nodes?
 Is there a support certification matrix describing supported BDA model types (i.e. X3-4, X4-4, etc.)?
 Can Firefox be uninstalled completely from the BDA?
 Is it ok to increase the open file limits on the BDA in /etc/limits.conf?
 Do all bdacli enable/disable commands require the Cloudera Manager API language to be in English?
 How does application data traverse the management network?
 Are all versions of BDA supported on all Hardware versions?
 Is AES-NI supported on all BDA hardware versions?
 Can MAPR be installed on an Oracle Big Data Appliance?
 On the BDA is it expected that "sysctl -a | grep bridge-nf-call" return any lines?
 Does BDA support standard remote logging and auditing that Oracle Linux supports such as syslog?
 Is there a preference for using AVDF or Navigator?
 Will modifying SNMP setting impact the Mammoth cluster?
 Can a BDA be configured such that an external edge node communicates with the BDA via the internal IP addresses instead of public IPs?
 Does Oracle support creating a Logical Volume Manager (LVM) on BDA in the root disk /dev/sda?
 What is the relationship between the Oracle Lifetime Support Policy and Cloudera's support lifecycle policy?
 What is the BDA policy for OL6 clusters when OL6 enters Extended Support?
 Does Cloudera now have restricted access to all their repositories for Cloudera Manager/CDH versions prior to 6.3.3?
 Will Oracle Java 11 be supported on the BDA?
 Does BDA 5.1/CDH 6.2.1 or BDA 5.2/CDH 6.3.4 provide a way to mask sensitive data?
 Where can I find more information on training?
References
