Typo in the field name in loaderMap Causes Parse Error
(Doc ID 1456342.1)
Last updated on SEPTEMBER 07, 2021
Applies to:
Oracle Loader for Hadoop - Version 1.1.0 and laterLinux x86-64
Symptoms
Running a job using Oracle Loader for Hadoop, the job completes however there are parse errors occurring during insert and thus NO data is inserted into Oracle table.
Oracle Loader for Hadoop Release 1.1.0.0.1 - Production
Copyright (c) 2011, Oracle and/or its affiliates. All rights reserved.
12/07/19 16:19:13 INFO loader.OraLoader: Oracle Loader for Hadoop Release 1.1.0.0.1 - Production
Copyright (c) 2011, Oracle and/or its affiliates. All rights reserved.
...............
.........
12/07/19 16:19:23 WARN sampler.DBSplitSampler: Skipping record: field "FIRST_NAME" not found in input record with fields "F0", "F1", "F2", "F3", "F4"
12/07/19 16:19:23 WARN loader.OraLoader: Sampler error: need more samples for confident load balancing. Job will continue without sampled information.
12/07/19 16:19:23 INFO loader.OraLoader: Sampling time=0D:0h:0m:0s:0ms (601 ms)
12/07/19 16:19:23 INFO loader.OraLoader: Submitting OraLoader job OraLoader
12/07/19 16:19:34 INFO input.FileInputFormat: Total input paths to process : 1
12/07/19 16:19:36 INFO mapred.JobClient: Running job: job_201206270622_0127
12/07/19 16:19:37 INFO mapred.JobClient: map 0% reduce 0%
12/07/19 16:19:42 INFO mapred.JobClient: map 100% reduce 0%
12/07/19 16:19:50 INFO mapred.JobClient: map 100% reduce 16%
12/07/19 16:19:51 INFO mapred.JobClient: map 100% reduce 33%
12/07/19 16:19:52 INFO mapred.JobClient: map 100% reduce 66%
12/07/19 16:19:53 INFO mapred.JobClient: map 100% reduce 100%
12/07/19 16:19:54 INFO mapred.JobClient: Job complete: job_201206270622_0127
.........................
................
12/07/19 16:19:54 INFO mapred.JobClient: Rows skipped by input error
12/07/19 16:19:54 INFO mapred.JobClient: Parse Error=20
12/07/19 16:19:54 INFO mapred.JobClient: Total rows skipped by input error=20
12/07/19 16:19:54 INFO mapred.JobClient: Map-Reduce Framework
12/07/19 16:19:54 INFO mapred.JobClient: Map input records=20
12/07/19 16:19:54 INFO mapred.JobClient: Reduce shuffle bytes=32
12/07/19 16:19:54 INFO mapred.JobClient: Spilled Records=0
12/07/19 16:19:54 INFO mapred.JobClient: Map output bytes=0
12/07/19 16:19:54 INFO mapred.JobClient: CPU time spent (ms)=4790
12/07/19 16:19:54 INFO mapred.JobClient: Total committed heap usage (bytes)=2267283456
12/07/19 16:19:54 INFO mapred.JobClient: Combine input records=0
12/07/19 16:19:54 INFO mapred.JobClient: SPLIT_RAW_BYTES=133
12/07/19 16:19:54 INFO mapred.JobClient: Reduce input records=0
12/07/19 16:19:54 INFO mapred.JobClient: Reduce input groups=0
12/07/19 16:19:54 INFO mapred.JobClient: Combine output records=0
12/07/19 16:19:54 INFO mapred.JobClient: Physical memory (bytes) snapshot=833957888
12/07/19 16:19:54 INFO mapred.JobClient: Reduce output records=0
12/07/19 16:19:54 INFO mapred.JobClient: Virtual memory (bytes) snapshot=4463357952
12/07/19 16:19:54 INFO mapred.JobClient: Map output records=0
On Hadoop Cluster(CDH3) you can check MAP task logs to find out warnings and number of rows failed for the job Id reported in Oracle Loader output
1) Manually check for Warning/Error's in the hadoop logs (usually in /var/log/hadoop) directory for the job Id reported in Oracle Loader output
For example in /var/log/hadoop/userlogs/job_201206270622_0127/attempt_201206270622_0127_m_000000_0 you will find a syslog file which contains below warnings
2012-07-18 10:30:11,428 WARN oracle.hadoop.loader.OraLoaderMapper: skipping record...oracle.hadoop.loader.OraLoaderException: field "FIRST_NAME" not found in input record with fields "F0", "F1", "F2", "F3", "F4"
2012-07-18 10:30:11,428 WARN oracle.hadoop.loader.OraLoaderMapper: skipping record...oracle.hadoop.loader.OraLoaderException: field "FIRST_NAME" not found in input record with fields "F0", "F1", "F2", "F3", "F4"
2012-07-18 10:30:11,428 WARN oracle.hadoop.loader.OraLoaderMapper: skipping record...oracle.hadoop.loader.OraLoaderException: field "FIRST_NAME" not found in input record with fields "F0", "F1", "F2", "F3", "F4"
OR
2) Open JobTracker Web UI , which generally runs on port 50030.
For ex:-
http://<nodename>:50030/jobtracker.jsp
Click on the job Id
a) Check for the Rows Skipped
The Job Tracker UI shows the following detail for the job:
Rows skipped by input error | Parse Error | 20 | 0 | 20 |
Total rows skipped by input error | 20 | 0 | 20 |
b) To navigate to the map task log follow below steps
i) Click on map
Kind | % Complete | Num Tasks | Pending | Running | Complete | Killed | Failed/Killed Task Attempts | |
---|---|---|---|---|---|---|---|---|
map | 100.00%
|
1 | 0 | 0 | 1 | 0 | 0 / 0 | |
reduce | 100.00%
|
1 | 0 | 0 | 1 | 0 | 0 / 0 |
ii) Now click on the task associated with the Map
Task | Complete | Status | Start Time | Finish Time | Errors | Counters | |
---|---|---|---|---|---|---|---|
task_201206270622_0126_m_000000 | 100.00%
|
19-Jul-2012 08:50:34 | 19-Jul-2012 08:50:36 (1sec) | 15 |
iii) In the Task Attempts page click on ALL or 'Last 4KB' in 'Task Logs' column
Task Attempts | Maschine | Status | Progress | Start Time | Finish Time | Errors | Task Logs | Counters | Actions | |
---|---|---|---|---|---|---|---|---|---|---|
attempt_201206270622_0126_m_000000_0 | /switch1/scaj01/****.<DOMAIN> | SUCCEEDED | 100.00%
|
19-Jul-2012 08:50:34 | 19-Jul-2012 08:50:36 (1sec) | Last 4KB Last 8KB All |
15 |
|
iv) Review the syslog for WARN associated with oracle.hadoop.loader.OraLoaderMapper in page similar to below url ...
http://*****.<DOMAIN>:50060/tasklog?attemptid=attempt_201206270622_0126_m_000000_0&all=true
2012-07-19 08:50:36,233 WARN oracle.hadoop.loader.OraLoaderMapper: skipping record...oracle.hadoop.loader.OraLoaderException: field "FIRST_NAME" not found in input record with fields "F0", "F1", "F2", "F3", "F4"
2012-07-19 08:50:36,233 WARN oracle.hadoop.loader.OraLoaderMapper: skipping record...oracle.hadoop.loader.OraLoaderException: field "FIRST_NAME" not found in input record with fields "F0", "F1", "F2", "F3", "F4"
2012-07-19 08:50:36,233 WARN oracle.hadoop.loader.OraLoaderMapper: skipping record...oracle.hadoop.loader.OraLoaderException: field "FIRST_NAME" not found in input record with fields "F0", "F1", "F2", "F3", "F4"
OR
3) If using Cloudera Manager, follow below steps to find the syslog of a particular job.
Open Cloudera Manager Web UI , which generally runs on port 7180.
For ex:-
http://<nodename>:7180/cmf/services/status
Here are steps to navigate to syslog file
Cause
To view full details, sign in with your My Oracle Support account. |
|
Don't have a My Oracle Support account? Click to get started! |
In this Document
Symptoms |
Log Details |
Cause |
Solution |