My Oracle Support Banner

External Table Using COLUMN TRANSFORMS Doesn't Work When Database Character Set Is Multibyte (AL32UTF8) (Doc ID 1359244.1)

Last updated on NOVEMBER 08, 2019

Applies to:

Oracle Database - Enterprise Edition - Version 10.2.0.1 to 11.2.0.2 [Release 10.2 to 11.2]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Information in this document applies to any platform.



NOTE: In the images and/or the document content below,
the user information and environment data used
represents fictitious data from the Oracle sample
schema(s), Public Documentation delivered with an
Oracle database product or other training material.
Any similarity to actual environments, actual persons,
living or dead, is purely coincidental and not
intended in any manner.

Symptoms

You created an external table that uses the option COLUMN TRANSFORMS to load data from a flat file containing special characters. The select from external table doesn't show the special characters correctly. The database character set is multibyte (AL32UTF8), NLS_LANG environment variable is properly set.

Errors like:

ORA-29275: partial multibyte character


are also possible.

The next test case demonstrates this:

connect / as sysdba

create or replace directory data_dir as '/tmp';

-- create the environment
create user <schema_name> identified by <password> default tablespace <tablespace_name> temporary tablespace <temp_tablespace_name>;
grant connect, resource to <schema_name>;
grant read, write on directory data_dir to <schema_name>;

connect <schema_name>/<password>

create table extdata
(
   name varchar2(50)
)
organization external
(
   type oracle_loader
   default directory data_dir
   access parameters
   (
      records delimited by newline
      characterset we8iso8859p1
      badfile data_dir:'data.bad'
      logfile data_dir:'data.log'
      fields terminated by '|'
      (
         first_name char(20),
         last_name char(20)
      )
      column transforms
      (
         name from concat (constant '<', first_name, constant ' ', last_name, constant '>')
      )
   )
   location (data_dir:'data.txt')
)
reject limit unlimited noparallel nomonitoring;


The content of input file data.txt is the following one:

Hans|Müller
San|Sebastián


The codepoints of special characters 'ü' and 'á' are conform with specified character set WE8ISO8859P1. You can verify this either using a hex editor (on Windows platforms) or simply perform on Unix platforms:

#> od -c -t x1 data.txt

0000000 H  a  n  s  |  M  ü  l  l  e  r  \n S  a  n  |
        48 61 6e 73 7c 4d fc 6c 6c 65 72 0a 53 61 6e 7c
0000020 S  e  b  a  s  t  i  á  n  \n
        53 65 62 61 73 74 69 e1 6e 0a


The selects from external table return:

select * from extdata;

NAME
--------------------------------------------------
<Hans M¿ller>
<San Sebasti¿

select name, dump (name, 16) from extdata1;

NAME
--------------------------------------------------
DUMP(NAME,16)
--------------------------------------------------------------------------------
<Hans M¿ller>
Typ=1 Len=13: 3c,48,61,6e,73,20,4d,fc,6c,6c,65,72,3e

<San Sebasti¿
Typ=1 Len=15: 3c,53,61,6e,20,53,65,62,61,73,74,69,e1,6e,3e


=> The codepoints ('ü' = 0xFC and 'á' = 0xE1) verified by function DUMP are correct conform with character set WE8ISO8859P1) but these are not displayed correctly (the database character set is AL32UTF8 and the codepoints should be 'ü' = 0xC3BC and 'á' = 0xC3A1).

If you exclude from external table the option COLUMN TRANSFORMS, like:

create table extdata1
(
   first_name varchar2(20),
   last_name varchar2(20)
)
organization external
(
   type oracle_loader
   default directory data_dir
   access parameters
   (
      records delimited by newline
      characterset we8iso8859p1
      badfile data_dir:'data1.bad'
      logfile data_dir:'data1.log'
      fields terminated by '|'
      (
         first_name char(20),
         last_name char(20)
      )
   )
   location (data_dir:'data.txt')
)
reject limit unlimited noparallel nomonitoring;


then the selects show:

select * from extdata1;

FIRST_NAME           LAST_NAME
-------------------- --------------------
Hans                 Müller
San                  Sebastián

select '<'||first_name||' '||last_name||'>' name from extdata1;

NAME
-------------------------------------------
<Hans Müller>
<San Sebastián>

select '<'||first_name||' '||last_name||'>' name, dump ('<'||first_name||' '||last_name||'>', 16) dump from extdata1;

NAME
-------------------------------------------
DUMP
-----------------------------------------------------
<Hans Müller>
Typ=1 Len=14: 3c,48,61,6e,73,20,4d,c3,bc,6c,6c,65,72,3e

<San Sebastián>
Typ=1 Len=16: 3c,53,61,6e,20,53,65,62,61,73,74,69,c3,a1,6e,3e

Cause

To view full details, sign in with your My Oracle Support account.

Don't have a My Oracle Support account? Click to get started!


In this Document
Symptoms
Cause
Solution
References

My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.