JMX metrics endpoint is slow #7670
Replies: 1 comment
-
Hello Abhinav! Your issue is really similar to the one we have with Strimzi metrics. From what I've seen so far, it usually appears when the Kafka cluster has a large number of topics (10k-20k): metrics are prepared and exposed for each and every topic, which enlarges the output drastically, up to ~33.96 MiB in our case (please see the logs below). It also looks like most of the time is spent calculating/generating the output rather than downloading it (the request goes to localhost, not to any remote server), yet the VM's utilization patterns don't change much while the request is being handled.
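For reference, here is roughly where the per-topic series come from. The broker's JMX Prometheus Exporter gets its rules from the ConfigMap referenced by `spec.kafka.metricsConfig`, and rules of this shape capture the topic name as a label, so every topic adds its own set of series. This is a paraphrase of the kind of rule found in Strimzi's example `kafka-metrics.yaml`, not a verbatim quote:

```yaml
rules:
  # One BrokerTopicMetrics MBean exists per topic, so a rule that
  # captures the topic name into a label emits one series per topic
  # per metric -- with 10k-20k topics the output grows accordingly.
  - pattern: "kafka.server<type=BrokerTopicMetrics, name=(.+), topic=(.+)><>Count"
    name: "kafka_server_brokertopicmetrics_$1_total"
    type: COUNTER
    labels:
      topic: "$2"
```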
Most of the time these requests fail after approximately 1 minute, as in the example below:
I've found some information on how to limit the metrics output here, but it seems to apply only to the Strimzi Kafka Exporter, while the main issue appears to be with the metrics generated and exposed by the Kafka process running on the broker pods. Is there any way to exclude some topics from these metrics on the pods, so that metrics are prepared and exposed only for the topics we are interested in?
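One approach we are considering, sketched below but not yet verified on our cluster: since the broker-side rules live in the ConfigMap referenced by `spec.kafka.metricsConfig` in the Kafka custom resource, tightening the capture groups there should drop topics that match no rule, and `excludeObjectNames` (named `blacklistObjectNames` in older jmx_exporter releases, so check the version Strimzi bundles) should skip querying those MBeans entirely, which is presumably where the generation time goes. The cluster name `my-cluster` and the `orders-` topic prefix below are placeholders:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster                       # placeholder cluster name
spec:
  kafka:
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics            # the metrics ConfigMap below
          key: kafka-metrics-config.yml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kafka-metrics
data:
  kafka-metrics-config.yml: |
    # Skip querying per-topic MBeans for internal topics entirely;
    # this saves the JMX query itself, not just the output bytes.
    # (Key is blacklistObjectNames in older jmx_exporter versions.)
    excludeObjectNames:
      - "kafka.server:type=BrokerTopicMetrics,topic=__*,*"
    rules:
      # Only emit per-topic series for topics with the placeholder
      # prefix "orders-"; attributes matching no rule are not exported.
      - pattern: "kafka.server<type=BrokerTopicMetrics, name=(.+), topic=(orders-.+)><>Count"
        name: "kafka_server_brokertopicmetrics_$1_total"
        type: COUNTER
        labels:
          topic: "$2"
      # ...keep the rest of your existing rules here...
```

Note that filtering with `rules` alone still queries every MBean during collection, so it mostly shrinks the payload; the exclude list is what should actually cut generation time.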
-
There was a metrics mismatch on Grafana compared to what we actually saw on the CLI (e.g. the number of brokers up). To investigate further, we exec'd into the pod and ran `curl localhost:9404/metrics`; it returned some data, but only after about 5-6 minutes. What should we scale to get metrics faster?
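For what it's worth, that mismatch would be consistent with Prometheus scrapes timing out: the default scrape_timeout is 10s, so a /metrics endpoint that takes minutes to respond never gets scraped successfully and Grafana keeps rendering stale series. Raising the timeout only buys time while the per-topic output is cut down (see above), but assuming monitoring is deployed via the Prometheus Operator PodMonitors from Strimzi's example files, the knob looks roughly like this:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-resources-metrics      # name used in Strimzi's examples
spec:
  selector:
    matchLabels:
      strimzi.io/kind: Kafka
  podMetricsEndpoints:
    - path: /metrics
      port: tcp-prometheus           # Strimzi's metrics port (9404)
      interval: 60s
      scrapeTimeout: 55s             # must stay below the interval
```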