Batch Job APAYCRET And SAACT Errors Out In Production Frequently.
(Doc ID 2241846.1)
Last updated on MARCH 13, 2017
Applies to: Oracle Utilities Customer Care and Billing - Version 220.127.116.11.0 and later
Information in this document applies to any platform.
In production, during nightly batch execution, the APAYCRET and SAACT jobs go into error: some threads remain in pending status while the others complete (for example, 3 out of 15 threads stuck in pending state). The pending threads complete when the batch is resubmitted online. The issue occurs in about 70% of runs; in the remaining runs the batch completes successfully without any issue.
Batch debug trace was enabled for APAYCRET with the submitjob -g option:
-b APAYCRET -d 2016-12-01 -e CLUSTERED -p MEDIUMBATCH -c 10 -g YYYY
After analyzing the submitjob, TPW, and hs_err_pid logs, it was concluded that the SIGSEGV error in hs_err_pid might be due to a missing patch (Patch 20805371: CCB 18.104.22.168 JVM THREADPOOLWORKERCRASH). However, the issue was reproduced after applying this patch, so the patch did not resolve it.
For a job that fails, it can help to activate a COBOL trace. The steps are described below (only the set for threadpoolworker is necessary):
1. Create a new folder for the COBOL mftrace log output
2. Create new file $SPLEBASE/ctf.cfg and paste the following lines into it
# IMPORTANT: Please note the space between the names and $ signs. Enter in the file exactly like this.
# A trace file will be rolled over at 130MB for a maximum of 20 files, so max 2.5GB in total.
mftrace.emitter.textfile#format=$(TIME) $(THREAD) $(COMPONENT) $(EVENT) $(LEVEL)$(DATA)
3. The following environment variables will activate the MF COBOL trace ...
For batch threadpoolworker:
-- Add the above export lines to $SPLEBASE/bin/threadpoolworker.sh
-- Restart threadpoolworker
-- Send the mftrace files as well as the tpw logs
For batch submitjob in THIN mode:
-- Add the above export lines to $SPLEBASE/bin/submitjob.sh
-- Run the job using submitjob in "-e THIN" mode
-- Send the mftrace files as well as the submitjob logs
For online (web application server child JVMs):
-- Add the above export lines to $SPLEBASE/bin/splcobjrun.sh, before the "exec $COMMAND $*" line at the bottom
-- The web server does not need to be restarted for this to take effect, as long as the child JVMs are recycled
-- Send the mftrace files as well as system logs
From the COBOL traces, you may notice the following:
1. A load error on
FYI: the trace shows that, just before this exception, 4 COBOL
threads (88, 87, 89, 90) started in the exact same millisecond, as shown below,
which led to the load error. This concurrent thread start is what can cause this issue:
12:45:26.822 88 MF.RTS 6 1 49 0 "CIPZBTPN"
12:45:26.822 87 MF.RTS 6 1 49 0 "CIPZBTPN"
12:45:26.822 89 MF.RTS 6 1 49 0 "CIPZBTPN"
12:45:26.822 90 MF.RTS 6 1 49 0 "CIPZBTPN"
NOTE: These are probably the same 4 threads that were in pending status on
the BRT because they never started.
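The pattern above can be looked for across a whole mftrace file rather than by eye. The following is a minimal Python sketch, not part of the original note. It assumes each trace line starts with TIME, THREAD and COMPONENT and ends with a quoted program name, as in the sample lines above. The function name find_concurrent_starts and the min_threads parameter are hypothetical. It flags any timestamp at which several threads reference the same program; it does not interpret the event codes.

```python
import re
from collections import defaultdict

# Assumed line layout, inferred from the sample lines only:
#   TIME THREAD COMPONENT <other fields> "PROGRAM"
LINE_RE = re.compile(
    r'^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+'
    r'(?P<thread>\d+)\s+'
    r'(?P<component>\S+)\s+'
    r'.*"(?P<program>[^"]+)"\s*$'
)

def find_concurrent_starts(lines, min_threads=2):
    """Group trace lines by (timestamp, program) and keep the groups
    that involve at least min_threads distinct thread ids."""
    groups = defaultdict(set)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            groups[(m["time"], m["program"])].add(int(m["thread"]))
    return {key: sorted(threads)
            for key, threads in groups.items()
            if len(threads) >= min_threads}

# Sample lines copied from the trace excerpt above.
sample = [
    '12:45:26.822 88 MF.RTS 6 1 49 0 "CIPZBTPN"',
    '12:45:26.822 87 MF.RTS 6 1 49 0 "CIPZBTPN"',
    '12:45:26.822 89 MF.RTS 6 1 49 0 "CIPZBTPN"',
    '12:45:26.822 90 MF.RTS 6 1 49 0 "CIPZBTPN"',
]

for (timestamp, program), threads in find_concurrent_starts(sample).items():
    print(timestamp, program, threads)
```

For a real trace, the lines of the mftrace file would be passed in place of sample. Any group reported is only a candidate for comparison with the threads left in pending status.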
2. You may also notice a few of the following errors:
MF.RTS 251 0 9 "C" "ld.so.1: cobjrun64: fatal: /|_#a1Ouqa: open failed: No such file or directory"
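Loader errors of this kind can be collected from the trace in the same way. The sketch below is illustrative only. It assumes the message appears on a single line in the actual trace file and has the shape shown above (ld.so.1, the binary name, "fatal", the path, "open failed", the reason). The function name find_load_errors is hypothetical.

```python
import re

# Assumed message shape, taken from the error shown above:
#   ld.so.1: <binary>: fatal: <path>: open failed: <reason>
LOAD_ERROR_RE = re.compile(
    r'ld\.so\.1: (?P<binary>\S+): fatal: (?P<path>.+?): '
    r'open failed: (?P<reason>[^"]+)'
)

def find_load_errors(lines):
    """Return (binary, path, reason) for every loader 'open failed' message."""
    errors = []
    for line in lines:
        m = LOAD_ERROR_RE.search(line)
        if m:
            errors.append((m["binary"], m["path"], m["reason"].strip()))
    return errors

# Sample line copied from the error above, joined onto one line.
sample = [
    'MF.RTS 251 0 9 "C" "ld.so.1: cobjrun64: fatal: /|_#a1Ouqa: '
    'open failed: No such file or directory"',
]

for binary, path, reason in find_load_errors(sample):
    print(binary, path, reason)
```

Listing the paths side by side makes it easier to see whether the loader was asked for a plausible file name or for a garbled one such as the path in the example.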