MSCK Repair Table Only Discovers Partition Directories if they are Empty on Oracle Big Data Appliance V2.0 (Doc ID 1539307.1)

Last updated on MARCH 25, 2013

Applies to:

Big Data Appliance Integrated Software - Version 2.0.1 and later
Information in this document applies to any platform.

Symptoms

After a Hadoop MapReduce job loads Hive data files into Hadoop Distributed File System (HDFS) partitions (HDFS sub-folders), those partitions can be added to the Hive Metastore manually with the "ALTER TABLE xxx ADD PARTITION ..." command. According to the Hive Data Definition Language section of the Apache Hive Language Manual, the "MSCK REPAIR TABLE table_name;" command adds to the Hive Metastore any partitions that exist in HDFS but are not yet registered. However, on BDA 2.0.1, MSCK REPAIR works only when the HDFS sub-folders are empty; it does not add sub-folders that contain data files.
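
For illustration, the two ways of registering partitions mentioned above might look as follows in HiveQL. The table name emp_part and the partition value are assumptions chosen for this sketch and are not taken from the Language Manual.

```sql
-- Hypothetical table name and partition value, for illustration only.

-- Option 1: register one partition explicitly in the Hive Metastore.
ALTER TABLE emp_part ADD PARTITION (dept_code = 20);

-- Option 2: ask Hive to compare the table's HDFS location with the
-- Metastore and register any partition directories that are missing.
MSCK REPAIR TABLE emp_part;
```

Option 1 needs one statement per partition, whereas Option 2 is intended to pick up all unregistered partition directories in a single command.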

Test case:

1. Create a Hive table with partitions: 

 
The two empty partitions (dept_code=30 and dept_code=40) are added, but the non-empty partition (dept_code=20) is not.
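
A test along these lines can be sketched in the Hive CLI as follows. Only the partition values dept_code=20, 30 and 40 come from the test case; the table name emp_part, its columns, the HDFS location, and the local file /tmp/dept20.csv are assumptions made for this sketch.

```sql
-- Hypothetical table; names, columns and location are assumptions.
CREATE EXTERNAL TABLE emp_part (
  emp_id   INT,
  emp_name STRING
)
PARTITIONED BY (dept_code INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hive/warehouse/emp_part';

-- Create the partition directories directly in HDFS, bypassing the Metastore.
dfs -mkdir /user/hive/warehouse/emp_part/dept_code=20;
dfs -mkdir /user/hive/warehouse/emp_part/dept_code=30;
dfs -mkdir /user/hive/warehouse/emp_part/dept_code=40;

-- Put a data file into one partition only (the local path is hypothetical).
dfs -put /tmp/dept20.csv /user/hive/warehouse/emp_part/dept_code=20/;

-- Ask Hive to register the directories, then list what the Metastore knows.
MSCK REPAIR TABLE emp_part;
SHOW PARTITIONS emp_part;
```

On an affected system, SHOW PARTITIONS would be expected to list dept_code=30 and dept_code=40 but not dept_code=20, matching the symptom described above.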

Cause
