Sqoop Import for Parquet Target with Hive Overwrite Option Fails with 'Metadata already exists for dataset' (Doc ID 2102209.1)

Last updated on OCTOBER 11, 2016

Applies to:

Big Data Appliance Integrated Software - Version 4.0 to 4.2.0 [Release 4.0 to 4.2]
Linux x86-64

Goal

On Oracle Big Data Appliance 4.2 / CDH 5.4.0, a Sqoop import command that writes Parquet data into a Hive table may fail when it is run a second time with the --hive-overwrite option set.

sqoop import --connect xxxxxxxxxxxx --username xxxxx --password xxxxx --query 'select f1,f2 from ZZZZ.YYYYY where $CONDITIONS and rownum < 10000' -m 4 --hive-database ZZZZ --hive-table YYYYY --hive-import --hive-drop-import-delims --hive-overwrite --delete-target-dir --target-dir tmp --split-by STUDENTID --as-parquetfile

ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetExistsException: Metadata already exists for dataset.
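
The exception is raised because metadata for the target dataset already exists when the import runs again. The sketch below shows one possible way around that, which is to drop the Hive table before rerunning the import. It is not taken from this note, and it rests on the assumption that dropping the table also clears the metadata the second import trips over. The helper names run and drop_and_reimport and the DRY_RUN variable are hypothetical, and the database, table and connection values are the placeholders from the command above. Dropping a managed table deletes its data, so the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch only: a possible workaround, NOT the solution from this note.
# Assumption: dropping the Hive table also clears the dataset metadata that
# the second import fails on. The drop destroys the existing table data.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: drop the target table, then rerun the import.
# $1 = Hive database, $2 = Hive table; remaining arguments go to sqoop import.
drop_and_reimport() {
    db="$1"; table="$2"; shift 2
    run hive -e "DROP TABLE IF EXISTS ${db}.${table}" || return 1
    run sqoop import --hive-database "$db" --hive-table "$table" "$@"
}

# Dry-run usage with the placeholder values from the failing command.
drop_and_reimport ZZZZ YYYYY \
    --connect xxxxxxxxxxxx --username xxxxx --password xxxxx \
    --query 'select f1,f2 from ZZZZ.YYYYY where $CONDITIONS and rownum < 10000' \
    -m 4 --hive-import --hive-drop-import-delims --hive-overwrite \
    --delete-target-dir --target-dir tmp --split-by STUDENTID --as-parquetfile
```

In dry-run mode the printed commands lose their original quoting, so they are meant for review rather than for copying back into a shell. Test the drop against a non-production table first.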

 

Solution
