How to Troubleshoot Impairment Issues for OCSM Probes (Doc ID 2272174.1)

Last updated on MAY 01, 2019

Applies to:

Oracle Communications Session Monitor - Version 3.1.x and later
Information in this document applies to any platform.

Purpose

This document is meant to help customers troubleshoot impairment issues for OCSM probes (including Napatech and DPDK probes). An impairment is a problem with the connection between the ME and the probe.

This document applies ONLY to separate OCSM probes. If an impairment is logged for a probe that runs on the same machine as the ME (local probe 127.0.0.1 on an ME+probe), the machine is most likely undersized (running out of resources). If the machine is properly sized and the local probe still has impairments, this indicates a potential defect.
This document does not apply to SBC probes. For SBC probes, see "Missing Data and Intermittent IPFIX Errors Seen: "ipfix msg version" for SBC Probes" (<Document 2285038.1>).

These issues typically show the following symptoms:

1. There is lost/missing traffic from this probe

2. The Probe Impairment KPI (under KPI/Metrics -> Probe Metrics -> select probe -> Impairment) shows values of 1, which means there was an impairment at that time

 

3. Log entries on the ME (logs can be found in /var/log/syslog* on 3.3 systems or /var/log/messages* on 3.4 systems):

For versions before 3.3.93.5.0 and 3.4.0.2.0:

Connection was reset due to timeout in the reply to ping:

2017-05-31T16:15:48.411665+02:00 ocsm apid: CRITICAL (probe_con.c:1290 probe_con_ping_em) Probe f5fbdd2d-2ff9-4f70-87d2-818fc4aba5f6 from <IP of Probe>, (id=2) does not answer ping and will be halted.

For versions starting with 3.3.93.5.0 and 3.4.0.2.0:

No answer to ping (this does not mean the connection was reset):

2017-08-25T13:22:26.186309+00:00 ocsm apid: WARNING (probe_con.c:392 probe_con_watch_thread) probe 1 did not answer ping in 4.000000000s

Connection was reset due to full inactivity:

2017-08-25T13:22:26.185486+00:00 ocsm apid: CRITICAL (probe_con.c:1351 probe_con_check_alive) Probe 1af71378-2df9-4918-bdf7-efc49a551031 from 10.165.74.73, (id=1) has been inactive for 6000ms and will be halted.

4. Log entries on the probe (logs can be found in /var/log/syslog* on 3.3 systems or /var/log/messages* on 3.4 systems):

Connection was reset due to timeout in the reply to ping:

2017-05-31T16:15:47.512497+02:00 ocsm rapid: CRITICAL [CONNECTION_000] Ping to d0 (<ME IP Address> at tcp://<ME IP Address>:4741) failed. Shutting down connection.
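
To check quickly whether a log file contains any of the entries shown above, the lines can be classified with a few regular expressions. The following Python sketch is only an illustration: the patterns are derived solely from the sample log lines quoted in this document, the category names and the functions classify and summarize are hypothetical, and the exact message wording may differ between releases.

```python
import re

# Patterns are assumptions based only on the sample log lines quoted in
# this document; message wording may differ between OCSM releases.
PATTERNS = {
    # ME, before 3.3.93.5.0 / 3.4.0.2.0: ping timeout, connection halted
    "me_ping_timeout_halt": re.compile(
        r"\bapid: CRITICAL \(probe_con\.c:\d+ probe_con_ping_em\)"),
    # ME, from 3.3.93.5.0 / 3.4.0.2.0: unanswered ping, no reset implied
    "me_ping_unanswered": re.compile(
        r"\bapid: WARNING .*did not answer ping in"),
    # ME, from 3.3.93.5.0 / 3.4.0.2.0: full inactivity, connection halted
    "me_inactivity_halt": re.compile(
        r"\bapid: CRITICAL \(probe_con\.c:\d+ probe_con_check_alive\)"),
    # Probe: ping to the ME failed, connection shut down
    "probe_ping_failed": re.compile(
        r"\brapid: CRITICAL \[CONNECTION_000\] Ping to .* failed"),
}


def classify(line):
    """Return the hypothetical category name for a log line, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines fall into each category."""
    counts = {}
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] = counts.get(kind, 0) + 1
    return counts
```

For example, summarize(open("/var/log/messages", errors="replace")) would return a count per category for the current log file on a 3.4 system. Rotated log files may be compressed and would need to be opened accordingly.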

  

Troubleshooting Steps


In this Document
Purpose
Troubleshooting Steps
 Understanding the impairment detection feature
 The timeout settings
 How often are the pings occurring
 Additional log entries
 1. Try adjusting the timers
 2. Network issues
 ME<->Probe bandwidth must be at least as large as the peak signaling bandwidth + 10%
 On standard setups, ME<->Probe path latency must not be lower than 1 second
 Two network interfaces on the same subnet
 3. Performance issue on ME or probe
 4. Tracing
References