Impala on Oracle Big Data Appliance FAQ (Doc ID 1615639.1)

Last updated on FEBRUARY 13, 2022

Applies to:

Big Data Appliance Integrated Software - Version 2.0.1 and later
Linux x86-64

Purpose

Frequently Asked Questions for Impala on the Oracle Big Data Appliance (BDA).

Questions and Answers

In this Document
Purpose
Questions and Answers
 What is Cloudera Impala?
 Where is Impala metadata stored on the BDA?
 What is the impala-shell?
 What is an impalad?
 Is Impala V1.2.3 supported on BDA V2.3.1 with Cloudera Manager 4.7?
 Is Impala installed on BDA V2.3.1?
 Can Impala V1.1.1 be installed on versions of the BDA prior to V2.3.1?
 With Impala V1.1.1, why does the impala-shell work from all nodes of the Oracle Big Data Appliance (BDA) cluster, yet a table created in an impala-shell that was invoked on a node and connected to the impalad on that node is shown only in the impala-shell on that node? Further, why does the table also not show up when the impala-shell on the node where the table was created connects to the impalad on any other node?
 Is the use of INVALIDATE METADATA the same for Impala V1.2 and higher as with V1.1.1?
 Is the use of INVALIDATE METADATA the same for Impala V1.0.1?
 For Impala version 1.0 and above, is it necessary to install the impala-lzo libraries that match the version installed on the BDA cluster?
 How can I install Impala 1.0.1 on the BDA v2.1/v2.2?
 Does Impala support "org.apache.hadoop.hive.contrib.serde2.RegexSerDe" SerDe?
 What are the tradeoffs of using Impala and HBase together?
 If using Impala and HBase together, is there anything in particular to check in CM?
 When Impala is managed by Cloudera Manager, why is the impala-state-store service down at the OS level?
 When Impala is managed by CM, why does chkconfig report that the impala service is off at all runlevels?
 Why would the chkconfig output above report information on the impala-catalog service and impala-server service as well as on impala-state-store?
 Does Impala support har files?
 On the BDA, what versions can you upgrade Impala to?
 When upgrading Impala, why do we upgrade only impala and impala-shell but not impala-catalog, impala-state-store and impala-server?
 Details about using Parquet File Format with Impala Tables?
 What size should a Parquet file be for use with Impala?
 Is it possible to set up Impala/Llama on BDA 3.0.1 so that Impala queries utilize YARN resources?
 Is Llama supported for Impala in versions after CDH 5.1?
 Is using Oracle Clusterware instead of HAProxy and Keepalived a certified alternative for HA for Impala?
 Are there any best practices for running Impala queries?
 What is the recommendation on when to run COMPUTE STATS?
 If the Impala logs show a SIGSEGV, is there any suggestion for collecting more detailed diagnostics?
 If the Impala mem_limit is set, why are queries raising "memory limits exceeded"?
 Is there any documentation on configuring memory for Impala?
 How can I check the heap usage of the Impala catalog server?
 How can I see the debug Web UI of the Impala StateStore?
 Are there general guidelines for distributing vcores between YARN and Impala with static pools in mind?
 Does Impala Admission Control support sub-pool creation?
 Is there any option to refresh all Impala metadata periodically? Each time after creating or altering a table in Hive, I have to manually run INVALIDATE ALL METADATA and rebuild the index in Impala to get the updated result.
 What command is used to run INVALIDATE METADATA from the impala-shell command line for all databases or for a particular database, so that it can be scheduled to run periodically to refresh the metadata rather than doing it manually in the GUI?
 Is it possible to see the "explain plan" in Sentry and Impala before executing the SQL, or only after running it?
 If the Impala Catalog Daemon is found to create large logs, what could cause that?
 Is there a best practice or set of guidelines for implementing Impala load balancing on the BDA?
References