Lag In Writing Metrics to InfluxDB (Doc ID 3062214.1)

Last updated on DECEMBER 10, 2024

Applies to:

Oracle Communications Unified Assurance - Version 6.0.5 and later
Information in this document applies to any platform.

Symptoms

On Oracle Communications Unified Assurance version 6.0.5, Core:

ACTUAL BEHAVIOR
---------------

As a system health metric, we constantly track the number of metrics written to each measurement. To give Telegraf/Kafka time to process the metrics, we check the metric counts 10 minutes after the desired time period. For example, metrics for 08:00 are counted at 08:10.
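The 10-minute offset described above can be sketched as follows. This is only an illustration of the arithmetic, not the actual health-check code; the function name `target_minute`, the constant `GRACE`, and the calendar date are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Grace period given to Telegraf/Kafka before counting (10 minutes, per the text).
GRACE = timedelta(minutes=10)

def target_minute(check_time: datetime, grace: timedelta = GRACE) -> datetime:
    """Return the minute whose metrics are counted when the check runs at check_time.

    Hypothetical helper: subtracts the grace period and truncates to the minute.
    """
    return (check_time - grace).replace(second=0, microsecond=0)

# The calendar date is arbitrary; only the time of day matters here.
check = datetime(2024, 1, 1, 8, 10, 0)
print(target_minute(check).strftime("%H:%M"))  # 08:00
```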

On multiple occasions we have noticed an anomaly in which the Interface metrics are written to the database late.

We start counting metrics for the 18:15 window at 18:16:48. As expected, the count begins low. What is not expected is how long it takes for the full count to appear. The correct metric count for the 18:15:00 window is 70104, which is not seen until 18:47:22. Some of those metrics therefore took roughly 30 minutes to be written to InfluxDB.
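For reference, the delay implied by these timestamps can be worked out as below. This is a small sketch of the arithmetic only; the helper `elapsed` and the variable names are assumptions for the example, and all three timestamps are taken to fall on the same day.

```python
from datetime import datetime

FMT = "%H:%M:%S"

def elapsed(start: str, end: str):
    """Return the timedelta between two same-day HH:MM:SS timestamps."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)

window_start = "18:15:00"     # start of the time window being counted
first_count = "18:16:48"      # when counting for that window began
full_count_seen = "18:47:22"  # when the full count of 70104 was first seen

lag_from_window = elapsed(window_start, full_count_seen)
lag_from_first_count = elapsed(first_count, full_count_seen)

print(lag_from_window)       # 0:32:22
print(lag_from_first_count)  # 0:30:34
```

Measured from the first count, the full total appears after about 30 and a half minutes; measured from the start of the window, after about 32 minutes.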

We have 98 different Interface Polling services. We checked historical performance metrics Poll Queue Length, DB Queue Length, Polled Duration, and Polled Devices for ALL of these services. All services report metrics well within expected norms.

We performed a full STOP and START on all Interface polling services. This did not affect the metric write lag.

We are tracking Kafka lag between Primary and Backup Consumers, but this metric has remained consistent.
We are also tracking InfluxDB active query count, but this metric has also remained consistent.


Changes

 

Cause




