AAA Pipeline Consumes 100% CPU (Doc ID 1083994.1)

Last updated on SEPTEMBER 26, 2016

Applies to:

Oracle Communications Billing and Revenue Management - Version 7.3.0.0.1 and later
Information in this document applies to any platform.
***Checked for relevance on 30-Nov-2011***
***Checked for relevance on 11-Feb-2015***


Goal

The CM returns an error POID, and ifw produces a core dump and then begins consuming 100% of the CPU time on its processor. The AAA pipeline must be restarted; otherwise, performance remains degraded.

Errors in the pipeline log (ExceptionPipe.log):

19.03.2010 22:25:55 istctap01 ifw IFW CRITICAL [T:23] 00446 - (ifw.Pipelines.ExceptionPipeline.TransactionManager) Module 'ifw.Pipelines.ExceptionPipeline.Output' failed to commit transaction '9'.
19.03.2010 22:25:56 istctap01 ifw IFW MAJOR [T:23] 00332 - (ifw.Pipelines.ExceptionPipeline) The object 'ifw.Pipelines.ExceptionPipeline.TransactionManager' is not usable.

Error in the process log:

19.03.2010 22:25:55 istctap01 ifw IFW MAJOR [T:23] 00081 - (ifw.Pipelines.ExceptionPipeline.Output.OutputCollection.EdrOutput.Module.OutputStream.Module) 'No such file or directory': Cannot move temporary file './data/Serialize/in/tmpExceptionPipelineToReplayEdrSerialize_3446490291_9.edr' to output file './data/Serialize/in/ToReplayEdrSerialize_3446490291_9.edr'.
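
When triaging longer logs, it can help to split entries like the ones above into fields so they can be filtered by severity, thread, or module. The following is a minimal sketch, not an Oracle-provided tool. The field layout is inferred only from the sample lines in this note, and the field names (host, process, origin, code, module) are assumptions chosen for illustration.

```python
import re
from datetime import datetime

# Layout inferred from the sample lines in this note (assumption):
# DD.MM.YYYY HH:MM:SS host process origin SEVERITY [T:n] code - (module) message
LOG_LINE = re.compile(
    r"^(?P<date>\d{2}\.\d{2}\.\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<process>\S+) (?P<origin>\S+) (?P<severity>[A-Z]+) "
    r"\[T:(?P<thread>\d+)\] (?P<code>\d+) - \((?P<module>[^)]+)\) "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    date_part = record.pop("date")
    time_part = record.pop("time")
    record["timestamp"] = datetime.strptime(
        date_part + " " + time_part, "%d.%m.%Y %H:%M:%S"
    )
    record["thread"] = int(record["thread"])
    return record


def filter_by_severity(lines, severities=("CRITICAL", "MAJOR")):
    """Keep only parsed records whose severity is in the given set."""
    records = (parse_log_line(line) for line in lines)
    return [r for r in records if r is not None and r["severity"] in severities]
```

Grouping the resulting records by the thread field makes it easier to line them up with the thread number reported by prstat.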

Output from prstat and the stack trace from pstack:
PID   USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
23557 pin       98 0.0 0.0 0.0 0.0 2.4 0.0 0.0   0  32   0   0 ifw/23

----------------- lwp# 23 / thread# 23 --------------------
ffffffff78ecbf80 ???????? (ffffffff60fffaf8, ffffffff60fff890, 0, 3e8, ffffffff60fffaf8, 1)
ffffffff78d55f34 bool RWRunnableSelf::serviceInterrupt() (ffffffff60fffaf8, ffffffff7b1624c0, ffffffff60fff788, 17, ffffffff78ec90b8, ffffffff78ec90b8) + 4
ffffffff7b10edbc void PPL::InputController::execute() (10098aac0, ffffffff60fff890, 0, 3e8, ffffffff60fffaf8, 1) + 7ec
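
The prstat row above shows LWP 23 of ifw spending 98% of its time in user mode, which is how the spinning thread was identified. A check of this kind can be sketched as follows. This is only an illustration under stated assumptions: it expects per-LWP microstate output with the same column header as the sample in this note, and the function name busy_lwps and the 90% threshold are hypothetical choices, not part of any Oracle or Solaris tool.

```python
def busy_lwps(prstat_text, usr_threshold=90.0):
    """Return (pid, process, lwpid, usr_percent) for LWPs at or above the threshold.

    Assumes whitespace-separated rows under a header line that starts with
    "PID" and contains "USR" and "PROCESS/LWPID" columns, as in the sample.
    """
    header = None
    hits = []
    for line in prstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "PID":
            header = fields  # remember the column names
            continue
        if header is None or len(fields) != len(header):
            continue  # skip summary lines or anything that is not a data row
        row = dict(zip(header, fields))
        try:
            usr = float(row["USR"])
            pid = int(row["PID"])
            process, _, lwp = row["PROCESS/LWPID"].partition("/")
            lwpid = int(lwp)
        except (KeyError, ValueError):
            continue
        if usr >= usr_threshold:
            hits.append((pid, process, lwpid, usr))
    return hits


if __name__ == "__main__":
    sample = (
        "PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID\n"
        "23557 pin 98 0.0 0.0 0.0 0.0 2.4 0.0 0.0 0 32 0 0 ifw/23\n"
    )
    for pid, process, lwpid, usr in busy_lwps(sample):
        print(f"{process} pid={pid} lwp={lwpid} USR={usr}")
```

The PID and LWP number reported this way identify which thread's stack to inspect, as was done for lwp# 23 above.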


Steps to Reproduce:
The environment "systest" is a two-server configuration.
1. Stop the cm_aaa process on the other server to ensure that all traffic hits server 1.
2. Clear out all the logs in the ifw/log directory, as well as cm_aaa.pinlog.
3. Use a script to generate typical data call traffic. There is nothing exceptional about the quota requests themselves.
4. Stop the realtime discount pipeline, which causes numerous PIN_ERR_DM_CONNECT_FAILED errors.
5. Wait a few seconds and then restart the realtime discount pipeline.
6. Within about a minute of the restart, ifw generates the stack trace shown above and begins consuming all available CPU time again.

This sequence is not typical of what would happen in production, but it triggers what appears to be the same bug.

How to resolve this issue?

Solution
