@skl (Collaborator) commented Jan 31, 2025

Resolves the duplicate series error in the KubeletTooManyPods alert:

  • The previous query counted pods per node, which is expensive and prone to duplicate-series errors when pods share a name but have different uid values
  • I wanted to avoid depending on uid, since many people drop this label because of its high cardinality
  • The kubelet_running_pods metric already exists and gives a per-node count of running pods, so there is no need to count the pods per node in the first place
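
To make the failure mode above concrete, here is a minimal Python sketch of a many-to-one label match. It is not the alert's actual PromQL: the helper `join_group_left`, the sample label sets, the pod counts, and the 0.95 threshold are all assumptions made up for illustration. It shows why two info series with the same namespace and pod but different uid break the join, and why a per-node gauge such as kubelet_running_pods avoids the join altogether.

```python
from collections import defaultdict


def join_group_left(left, right, on):
    """Hypothetical helper mimicking a many-to-one match:
    each key built from `on` may identify at most one right-hand series."""
    index = {}
    for labels in right:
        key = tuple(labels[k] for k in on)
        if key in index:
            raise ValueError(f"found duplicate series for match group {dict(zip(on, key))}")
        index[key] = labels
    joined = []
    for labels in left:
        key = tuple(labels[k] for k in on)
        if key in index:
            # Copy the node label from the matching right-hand series.
            joined.append({**labels, "node": index[key]["node"]})
    return joined


def count_pods_per_node(running, pod_info):
    """Old approach in spirit: join each running pod to its node, then count."""
    counts = defaultdict(int)
    for series in join_group_left(running, pod_info, on=("namespace", "pod")):
        counts[series["node"]] += 1
    return dict(counts)


# Assumed data: pod "web-0" was recreated, so two info series overlap.
# They differ only in uid and node, which the join key does not include.
pod_info = [
    {"namespace": "default", "pod": "web-0", "uid": "aaa", "node": "node-a"},
    {"namespace": "default", "pod": "web-0", "uid": "bbb", "node": "node-b"},
]
running = [{"namespace": "default", "pod": "web-0"}]

try:
    count_pods_per_node(running, pod_info)
    join_error = None
except ValueError as exc:
    join_error = str(exc)

# New approach in spirit: a per-node gauge needs no per-pod join at all.
running_pods_by_node = {"node-a": 108, "node-b": 12}  # assumed samples
pod_capacity_by_node = {"node-a": 110, "node-b": 110}  # assumed capacity
THRESHOLD = 0.95  # illustrative value, not taken from this PR

nodes_over_threshold = [
    node
    for node, pods in running_pods_by_node.items()
    if pods / pod_capacity_by_node[node] > THRESHOLD
]
```

Adding uid to the join key would also remove the ambiguity, but only where the uid label is still present, which is the dependency the description above wants to avoid.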

Added a test for a multi-node scenario.

Fixes #1015

@skl added the bug (Something isn't working) label Jan 31, 2025
@skl self-assigned this Jan 31, 2025
@skl requested a review from povilasv as a code owner January 31, 2025 19:22
@skl merged commit 234c773 into master Feb 3, 2025
18 checks passed
@skl deleted the skl/fix-KubeletTooManyPods-duplicate-series branch February 3, 2025 09:33
@jonasbadstuebner

#1015 (comment)

I'm sorry I didn't answer over the weekend. The label_replace in your proposed solution is incorrect: it replaces the node name with the IP of the node. kubelet_running_pods already has a node label whose value is the node name, so you can drop the label_replace.
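
The effect described in this comment can be sketched in Python. The function below only approximates PromQL's label_replace for a single label set (full-string regex match, `$1`-style capture groups, series unchanged when the regex does not match). The exact expression proposed in #1015 is not quoted here, so the source label `instance`, its `IP:port` format, and the regex are assumptions chosen to reproduce the symptom.

```python
import re


def label_replace(labels, dst, replacement, src, regex):
    """Approximate PromQL label_replace for one series' label set."""
    match = re.fullmatch(regex, labels.get(src, ""))
    if match is None:
        # No match: the series is returned unchanged.
        return dict(labels)
    # PromQL writes capture groups as $1; Python's expand() expects \g<1>.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    value = match.expand(template)
    out = dict(labels)
    if value:
        out[dst] = value
    else:
        # An empty result drops the destination label.
        out.pop(dst, None)
    return out


# Assumed sample: the series already carries the node name in `node`.
series = {
    "__name__": "kubelet_running_pods",
    "instance": "10.0.0.7:10250",  # assumed IP:port format
    "node": "node-a",
}

# Writing into `node` from `instance` overwrites the existing node name.
replaced = label_replace(series, "node", "$1", "instance", "(.*):.*")
```

Under these assumptions `replaced["node"]` holds the IP rather than `node-a`, so later matching on node against metrics keyed by node name would find nothing. Leaving the existing node label untouched avoids that.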

rexagod pushed a commit to rexagod/kubernetes-mixin that referenced this pull request Feb 10, 2025

Linked issue: [Bug]: found duplicate series in KubeletTooManyPods alert