Commit 26402f2

Change the instance name for standard pod scraping to be unique
Any of the potentially many containers in a pod can expose one or more ports with Prometheus metrics. However, with our current target labels, all of these targets get the same instance label (just the pod name), which leads to the dreaded `PrometheusOutOfOrderTimestamps` alert, see grafana/deployment_tools#3441. (In fact, if we get the alert, we are already lucky, because the problem can go unnoticed until someone actually needs one of the time series that receive samples from different targets, rendering them useless.)

In practice, we rarely have more than one port to scrape per pod, but it does happen, and it's totally within the intended usage pattern of K8s, which means more cases can show up at any time. The two examples I'm aware of:

- Kube-state-metrics (KSM) has only one container in its pod, but that container exposes two metrics ports (http-metrics and self-metrics).
- Consul pods run a container with the consul-exporter and a container with the statsd-exporter, each exposing their metrics on a different port. Both ports are named http-metrics, which is possible because they are exposed by different containers. (This is the case that triggered the above linked issue.)

To avoid the metric duplication, we could add a container and a port label, but it is a Prometheus convention that the instance label alone should be unique within a job.

Which brings us to what I'm proposing in this commit: Create the instance label by joining pod name, container name, and port name with `:` in between.

In most cases, the resulting instance value will appear redundant, but I believe the consistency has some value. Applying some magic to shorten the instance label when possible would add complexity and remove the consistency.
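To make the collision concrete, here is a minimal Python sketch of what the `replace` relabel action does when regex and replacement are left at their defaults: it joins the source label values with the separator. The pod and container names are hypothetical stand-ins modeled on the Consul and KSM cases described above, and `instance_label` is an illustrative helper, not part of Prometheus.

```python
# Simplified model of a Prometheus `replace` relabel rule with the default
# regex and replacement: the target label becomes the source label values
# joined by the separator (Prometheus defaults the separator to ';').
def instance_label(meta, source_labels, separator=";"):
    return separator.join(meta.get(name, "") for name in source_labels)

POD = "__meta_kubernetes_pod_name"
CONTAINER = "__meta_kubernetes_pod_container_name"
PORT = "__meta_kubernetes_pod_container_port_name"

# Hypothetical discovered targets (names are assumptions for illustration).
targets = [
    # Consul-like pod: two containers, both with a port named http-metrics.
    {POD: "consul-0", CONTAINER: "consul-exporter", PORT: "http-metrics"},
    {POD: "consul-0", CONTAINER: "statsd-exporter", PORT: "http-metrics"},
    # KSM-like pod: one container exposing two differently named ports.
    {POD: "ksm-0", CONTAINER: "kube-state-metrics", PORT: "http-metrics"},
    {POD: "ksm-0", CONTAINER: "kube-state-metrics", PORT: "self-metrics"},
]

pod_only = [instance_label(t, [POD]) for t in targets]
pod_port = [instance_label(t, [POD, PORT], ":") for t in targets]
pod_container_port = [instance_label(t, [POD, CONTAINER, PORT], ":") for t in targets]

for name, values in [
    ("pod", pod_only),
    ("pod:port", pod_port),
    ("pod:container:port", pod_container_port),
]:
    print(f"{name}: {len(set(values))} unique instance values for {len(values)} targets")
```

In this sketch, pod name alone collapses the four targets into two instance values, and pod plus port still collides for the Consul-like pod. Only the three-part join keeps every target distinct.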
1 parent 809d4c4 commit 26402f2

1 file changed: +17 -6 lines changed

prometheus-ksonnet/lib/prometheus-config.libsonnet

Lines changed: 17 additions & 6 deletions
@@ -141,10 +141,16 @@
       target_label: 'namespace',
     },

-    // Rename instances to be the pod name
+    // Rename instances to the concatenation of pod:container:port.
+    // All three components are needed to guarantee a unique instance label.
     {
-      source_labels: ['__meta_kubernetes_pod_name'],
+      source_labels: [
+        '__meta_kubernetes_pod_name',
+        '__meta_kubernetes_pod_container_name',
+        '__meta_kubernetes_pod_container_port_name',
+      ],
       action: 'replace',
+      separator: ':',
       target_label: 'instance',
     },

@@ -192,11 +198,16 @@
       action: 'keep',
     },

-    // Rename instances to be the pod name.
-    // As the scrape two ports of KSM, include the port name in the instance
-    // name. Otherwise alerts about scrape failures and timeouts won't work.
+    // Rename instances to the concatenation of pod:container:port.
+    // In the specific case of KSM, we could leave out the container
+    // name and still have a unique instance label, but we leave it
+    // in here for consistency with the normal pod scraping.
     {
-      source_labels: ['__meta_kubernetes_pod_name', '__meta_kubernetes_pod_container_port_name'],
+      source_labels: [
+        '__meta_kubernetes_pod_name',
+        '__meta_kubernetes_pod_container_name',
+        '__meta_kubernetes_pod_container_port_name',
+      ],
       action: 'replace',
       separator: ':',
       target_label: 'instance',
