BDD 1.2 : Load from Hive Table in Multiple Subdirectories Fails with "User class threw exception: java.io.IOException: Not a file" (Doc ID 2175525.1)

Last updated on FEBRUARY 11, 2017

Applies to:

Oracle Big Data Discovery - Version 1.2.0.0.0 and later
Information in this document applies to any platform.

Symptoms

On Oracle Big Data Discovery (BDD), when using data_processing_CLI to load data from a Hive table, the following error is returned:

./data_processing_CLI -t apache_logs_accepted -d dwhprd
diagnostics: User class threw exception: java.io.IOException: Not a file: hdfs://bda-cl1-ns/prd/sources/logstash/apache/accepted/year=2016/month=06/day=16/20160620

The data for the Hive table is located in *multiple* subdirectories, so the path named in the error is a directory rather than a file.

Within Hive, the issue was resolved by setting the following two properties to "true" in hive-site.xml:
- mapred.input.dir.recursive
- hive.mapred.supports.subdirectories
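
For illustration, the two settings above might appear in hive-site.xml roughly as follows. This is a minimal sketch that assumes the standard Hadoop-style property layout; the description text is illustrative, and the location and management of the file depend on the installation.

```xml
<!-- Sketch only: the two properties named above, both set to true. -->
<configuration>
  <property>
    <name>mapred.input.dir.recursive</name>
    <value>true</value>
    <description>Read input files from nested subdirectories.</description>
  </property>
  <property>
    <name>hive.mapred.supports.subdirectories</name>
    <value>true</value>
    <description>Allow Hive tables whose data lives in subdirectories.</description>
  </property>
</configuration>
```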

However, the operation still fails within BDD.

Cause
