Possible datacorruption with Impdp if importing NLS_CHARACTERSET is different from exporting NLS_CHARACTERSET
(Doc ID 457526.1)
Last updated on AUGUST 04, 2018
Applies to:Oracle Server - Enterprise Edition - Version 10.1.0.2 to 10.2.0.3 [Release 10.1 to 10.2]
Oracle Server - Enterprise Edition - Version 18.104.22.168 to 22.214.171.124 [Release 11.1]
Information in this document applies to any platform.
Oracle Server - Enterprise Edition - Version: 126.96.36.199 to 188.8.131.52
This problem can occur on any platform.
Affects Impdp (Import Datapump) only.
Database 10.1.0.2 to 10.1.0.5 and 10.2.0.1 to 10.2.0.3
Fixed in 10.2.0.4 (and 10.2.0.5) and 184.108.40.206 patchset
Fixed in 220.127.116.11 and up
***Checked for relevance on 22-Jun-2012***
The original import tool (Imp) is not affected.
When the target (impdp side) NLS_CHARACTERSET is different from the source (expdp side) NLS_CHARACTERSET then Impdp (Import Datapump) may provoke random silent corruption of some CHAR, VARCHAR2, CLOB and LONG data.
Please note that:
* The original imp tool is not affected by this problem.
* Impdp will not give errors or warnings, the corruption can only been seen on some rows when selecting the imported dataset.
* Not all data is corrupted, only a certain part. Typical figures are around 50 rows on 500.000 rows for 8 bit to multibyte (AL32UTF8), but this depends also on your dataset so it can be more or less. When going from 8bit to 8 bit the possibility to have a corruption is very low.
* This is a impdp problem, the expdp dmp files are not corrupted.
* It only happens during characterset conversions, EBCDIC characterset are not affected.
This bug may provoke two similar kinds of silent corruption:
A) Data may be corrupted around the first non-US7ASCII character but selecting will NOT provoke a error because the corrupted data is not longer then the defined column length.
Typically this will provoke corruption that repeats a part of the row's contents.
In the original database the data looks like:
After the impdp it may look like:
or even missing a part of the data:
B)In addition to the corrupted data some of the corrupted rows may contain more data then the column length alows, for example if you do a
select length(<colum>) from <table>;
you may see that for some of the corrupted rows the length is longer then the defined column length.
For example you have a row that has a length of 155 but the column is only defined as VARCHAR2 150.
When selecting this data or when creating indexes on this data you may see a ora-3113 or ORA-7445: EXCEPTION ENCOUNTERED: CORE DUMP [KDSTSRP0KM()+760]
This are rows that also suffer from the corruption in point A), it's just more visible because the length is longer then the defined column size.
A part of the corruption can be found by the script in <Note 428526.1> Baddata Script To Check Database For Corrupt column data
The problem occures when:
A) the dataset starts with a US7ASCII character and then contains non-US7ASCII characters in the same row
Note: If the entire row contains non-US7ASCII characters there will be no corruption, so this will mostly affect databases who store Latin languages like Spanish, French,Dutch etc or English combined with currency symbols like the £ or the € sign. The $ sign is a US7ASCII character and will not provoke a problem.
B) the source NLS_CHARACTERSET is different from the target NLS_CHARACTERSET and the non-US7ASCII character need conversion.
A other example is the € when going from WE8ISO8859P15 to WE8MSWIN1252 (or inverse). The euro is code 164 (decimal) in WE8ISO8859P15 but code 128 (decimal) in WE8MSWIN1252.
To find your NLS_CHARACTERSET use this select:
select value from NLS_DATABASE_PARAMETERS where parameter='NLS_CHARACTERSET';
There will be NO corruption if the source and target NLS_CHARACTERSET are the same.
The corruption may happen when you go from any NLS_CHARACTERSET to UTF8 / AL32UTF8.
The following table gives a overview of often used NLS_CHARACTERSET combinations:
|Source NLS_CHARACTERSET||Target NLS_CHARACTERSET||Corruption can happen?|
|NLS_CHARACTERSET is the same as target||NLS_CHARACTERSET is the same as source||No|
|US7ASCII||xx8ISO8859Px / xx8MSWIN12xx||No|
|US7ASCII||JA16xxxx / KO16xxxx / ZHS16xxxx / ZHT16xxxx||No|
|US7ASCII||AL32UTF8 / UTF8||No|
|xx8ISO8859Px||xx8MSWIN12xx||Yes, probability is very low. NOT possible for WE8ISO8859P1 to WE8MSWIN1252|
|xx8MSWIN12xx||xx8ISO8859Px||Yes, probability is very low.|
|xx8ISO8859Px / xx8MSWIN12xx||AL32UTF8 / UTF8||Yes|
|JA16xxxx / KO16xxxx / ZHS16xxxx / ZHT16xxxx||AL32UTF8 / UTF8||Yes|
|AL32UTF8 / UTF8||xx8ISO8859Px / xx8MSWIN12xx /JA16xxxx, KO16xxxx / ZHS16xxxx / ZHT16xxxx||Yes|
|AL32UTF8||UTF8||Normally NO, Yes if you have "convertible" in csscan - wich is very rare.|
|UTF8||AL32UTF8||Normally NO, Yes if you have "convertible" in csscan - which is very rare.|
To view full details, sign in with your My Oracle Support account.
Don't have a My Oracle Support account? Click to get started!
In this Document
|If you are planning to use impdp for a characterset change:|
|If you already have used expdp/impdp for a characterset migration and you are affected:|