Join on Parquet Tables Returns Wrong Results

(Doc ID 2316007.1)

Last updated on DECEMBER 07, 2017

Applies to:

Big Data Appliance Integrated Software - Version 4.8.0 and later
Linux x86-64

Symptoms

Common join on parquet tables returns wrong results. The sample is from the JIRA HIVE-12762

CREATE TABLE tbl1(id INT) STORED AS PARQUET;
 INSERT INTO tbl1 VALUES(1), (2);
 CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET;
 INSERT INTO tbl2 VALUES(1, 'value1');
 INSERT INTO tbl2 VALUES(1, 'value2');
 set hive.optimize.index.filter = true;
 set hive.auto.convert.join=false;
 select tbl1.id, t1.value, t2.value
 FROM tbl1
 JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id
 JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id;

 

Cause

Sign In with your My Oracle Support account

Don't have a My Oracle Support account? Click to get started

My Oracle Support provides customers with access to over a
Million Knowledge Articles and hundreds of Community platforms