Hi,
We have been running an older version of kafka_consumer (2.16.4) because we ran into issues with metrics when we tried to upgrade before. We finally tried again and rolled out 6.5.2, which contains the improvements made to address the issue mentioned in #19564. However, the check still takes much longer than it did on the previous version, where a run took between 5 and 10 seconds:
sudo datadog-agent status Collector | grep -A 8 kafka
kafka_consumer (6.5.2)
----------------------
Instance ID: kafka_consumer:6806d01930984041 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/kafka_consumer.d/conf.yaml
Total Runs: 1
Metric Samples: Last Run: 74,352, Total: 74,352
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 33m40.531s
Last Execution Date : 2025-06-26 14:09:52 UTC (1750946992000)
Last Successful Execution Date : 2025-06-26 14:09:52 UTC (1750946992000)
There are possible improvements that can be made to reduce this time. The two main ones are:
- In get_highwater_offsets, we fetch the list of topic partitions for each consumer group. However, the highwater mark is not a per-consumer-group value, so fetching the topic and partition info once should be enough: https://github.com/DataDog/integrations-core/blob/master/kafka_consumer/datadog_checks/kafka_consumer/kafka_consumer.py#L348
- We can also cache the result of the list_topics method and reuse it for the duration of the check run, instead of calling it every time we need this topic partition info.
I made these changes and tried them locally, and they brought the time back down to an acceptable range:
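Roughly, the idea behind both points can be sketched as below. This is a simplified illustration, not the actual patch or the integration's real code: `FakeClient`, `TopicPartitionCache`, and `partitions_for_highwater` are hypothetical names, and the client is a stand-in that only counts how often metadata is requested.

```python
class FakeClient:
    """Hypothetical stand-in for the Kafka client; counts metadata requests."""

    def __init__(self, topics):
        self._topics = topics  # {topic_name: [partition_ids]}
        self.list_topics_calls = 0

    def list_topics(self):
        # In the real check this would be a (slow) round trip to the cluster.
        self.list_topics_calls += 1
        return {topic: list(parts) for topic, parts in self._topics.items()}


class TopicPartitionCache:
    """Fetches topic/partition metadata at most once per check run."""

    def __init__(self, client):
        self._client = client
        self._cached = None

    def get(self):
        if self._cached is None:
            self._cached = self._client.list_topics()
        return self._cached

    def reset(self):
        # Call at the start of each check run so metadata does not go stale.
        self._cached = None


def partitions_for_highwater(cache, consumer_offsets):
    """Collect the unique (topic, partition) pairs to query for highwater offsets.

    consumer_offsets maps (consumer_group, topic, partition) -> offset.
    The consumer group is ignored because the highwater mark does not
    depend on it, so each pair is looked up only once.
    """
    metadata = cache.get()  # one metadata fetch, shared by all groups
    wanted = set()
    for _group, topic, partition in consumer_offsets:
        if partition in metadata.get(topic, []):
            wanted.add((topic, partition))
    return sorted(wanted)
```

With many consumer groups reading the same topics, this turns one metadata request per group into one request per check run, and the set removes duplicate topic partitions before any highwater lookup happens.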
sudo datadog-agent status Collector | grep -A 8 kafka
kafka_consumer (6.5.2)
----------------------
Instance ID: kafka_consumer:6c446270ca0a8da1 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/kafka_consumer.d/conf.yaml
Total Runs: 1
Metric Samples: Last Run: 76,129, Total: 76,129
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 22.72s
Last Execution Date : 2025-06-27 20:59:48 UTC (1751057988000)
Last Successful Execution Date : 2025-06-27 20:59:48 UTC (1751057988000)