From b49a0e1cca2ee2123a513646e0c6cfb3229da522 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Fri, 14 Nov 2025 17:44:22 +0100 Subject: [PATCH 01/26] Add info function blog post Signed-off-by: Arve Knudsen --- .../2025-11-14-introducing-info-function.md | 298 ++++++++++++++++++ 1 file changed, 298 insertions(+) create mode 100644 blog/posts/2025-11-14-introducing-info-function.md diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md new file mode 100644 index 000000000..bf895dac2 --- /dev/null +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -0,0 +1,298 @@ +--- +title: Introducing the Experimental info() Function +created_at: 2025-11-14 +kind: article +author_name: Arve Knudsen +--- + +Enriching metrics with metadata labels can be surprisingly tricky in Prometheus, even if you're a PromQL wiz! +Traditionally, complex PromQL join syntax is required in Prometheus to add even basic information like Kubernetes cluster names or cloud provider regions to queries. +The new, still experimental `info()` function, promises a simpler way, making label enrichment as simple as wrapping your query in a single function call. + +In Prometheus 3.0, we introduced the [`info()`](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) function, a powerful new way to enrich your time series with labels from info metrics. +`info` doesn't only suffer simpler syntax however. +It also solves a subtle yet critical problem that has plagued join queries for years: The "churn problem" that causes queries to fail when "non-identifying" info metric labels change. +Identifying labels here in practice means those that are joined on. + +Whether you're working with OpenTelemetry resource attributes, Kubernetes labels, or any other metadata, the `info()` function makes your PromQL queries cleaner, more reliable, and easier to understand. 
+ + + +## The Problem: Complex Joins and The Churn Problem + +Let us start by looking at what we have had to do until now. +Imagine you're monitoring HTTP request durations via OpenTelemetry and want to break them down by Kubernetes cluster. +Your metrics have `job` and `instance` labels, but the cluster name lives in a separate `target_info` metric, as the `k8s_cluster_name` label. +Here's what the traditional approach looks like: + +```promql +sum by (k8s_cluster_name, http_status_code) ( + rate(http_server_request_duration_seconds_count[2m]) + * on (job, instance) group_left (k8s_cluster_name) + target_info +) +``` + +While this works, there are several issues: + +**1. Complexity:** You need to know: +- Which info metric contains your labels (`target_info`) +- Which labels are the "identifying" labels to join on (`job`, `instance`) +- Which data labels you want to add (`k8s_cluster_name`) +- The proper PromQL join syntax (`on`, `group_left`) + +This requires expert-level PromQL knowledge and makes queries harder to read and maintain. + +**2. The Churn Problem (The Critical Issue):** + +Here's the subtle but serious problem: What happens when a Kubernetes pod gets recreated? +The `k8s_pod_name` label in `target_info` changes, and Prometheus sees this as a completely new time series. + +If the old `target_info` series isn't properly marked as stale immediately, both the old and new series can exist simultaneously for up to 5 minutes (the default lookback delta). +During this overlap period, your join query finds **two distinct matching `target_info` time series** and fails with a "many-to-many matching" error. + +This could in practice mean your dashboards break and your alerts stop firing when infrastructure changes are happening, perhaps precisely when you would need visibility the most. + +## The Solution: Simple, Reliable Label Enrichment + +The `info()` function solves both problems at once. 
+Here's the same query using `info()`: + +```promql +sum by (k8s_cluster_name, http_status_code) ( + info( + rate(http_server_request_duration_seconds_count[2m]), + {k8s_cluster_name=~".+"} + ) +) +``` + +Much more comprehensible, no? +The real magic happens under the hood though: **`info()` automatically selects the time series with the latest sample**, eliminating churn-related join failures entirely. + +### Basic Syntax + +```promql +info(v instant-vector, [data-label-selector instant-vector]) +``` + +- **`v`**: The instant vector to enrich with metadata labels +- **`data-label-selector`** (optional): Label matchers in curly braces to filter which labels to include + +If you omit the second parameter, `info()` adds **all** data labels from `target_info`: + +```promql +info(rate(http_server_request_duration_seconds_count[2m])) +``` + +### Selecting Different Info Metrics + +By default, `info()` uses the `target_info` metric. +However, you can select different info metrics (like `build_info`, `node_uname_info`, or `kube_pod_labels`) by including a `__name__` matcher in the data-label-selector: + +```promql +# Use build_info instead of target_info +info(up, {__name__="build_info"}) + +# Use multiple info metrics (combines labels from both) +info(up, {__name__=~"(target|build)_info"}) + +# Select build_info and only include the version label +info(up, {__name__="build_info", version=~".+"}) +``` + +**Note:** The current implementation always uses `job` and `instance` as the identifying labels for joining, regardless of which info metric you select. +This works well for most standard info metrics but may have limitations with custom info metrics that use different identifying labels. + +## Real-World Use Cases + +### OpenTelemetry Integration + +The primary driver for the `info()` function is [OpenTelemetry](https://prometheus.io/blog/2024/03/14/commitment-to-opentelemetry/) (OTel) integration. 
+When using Prometheus as an OTel backend, resource attributes (metadata about the metrics producer) are automatically converted to the `target_info` metric: + +- `service.instance.id` → `instance` label +- `service.name` → `job` label +- `service.namespace` → prefixed to `job` (i.e., `<service.namespace>/<service.name>`) +- All other resource attributes → data labels on `target_info` + +This means that, so long as at least either the `service.instance.id` or the `service.name` resource attribute is included, every OTel metric you send to Prometheus over OTLP can be enriched with resource attributes using `info()`: + +```promql +# Add all OTel resource attributes +info(rate(http_server_request_duration_seconds_sum[5m])) + +# Add only specific attributes +info( + rate(http_server_request_duration_seconds_sum[5m]), + {k8s_cluster_name=~".+", k8s_namespace_name=~".+", k8s_pod_name=~".+"} +) +``` + +### Kubernetes Metadata + +Enrich your metrics with Kubernetes-specific information: + +```promql +# Add cluster and namespace information to request rates +info( + sum by (job, http_status_code) ( + rate(http_server_request_duration_seconds_count[2m]) + ), + {k8s_cluster_name=~".+", k8s_namespace_name=~".+"} +) +``` + +### Cloud Provider Metadata + +Add cloud provider information to understand costs and performance by region: + +```promql +# Enrich with AWS/GCP/Azure region and availability zone +info( + rate(cloud_storage_request_count[5m]), + {cloud_provider=~".+", cloud_region=~".+", cloud_availability_zone=~".+"} +) +``` + +## Before and After: Side-by-Side Comparison + +Let's see how the `info()` function simplifies real queries: + +### Example 1: Basic Label Enrichment + +**Traditional approach:** +```promql +rate(http_server_request_duration_seconds_count[2m]) + * on (job, instance) group_left (k8s_cluster_name) + target_info +``` + +**With info():** +```promql +info( + rate(http_server_request_duration_seconds_count[2m]), + {k8s_cluster_name=~".+"} +) +``` + +### Example 2: Aggregation with 
Multiple Labels + +**Traditional approach:** +```promql +sum by (k8s_cluster_name, k8s_namespace_name, http_status_code) ( + rate(http_server_request_duration_seconds_count[2m]) + * on (job, instance) group_left (k8s_cluster_name, k8s_namespace_name) + target_info +) +``` + +**With info():** +```promql +sum by (k8s_cluster_name, k8s_namespace_name, http_status_code) ( + info( + rate(http_server_request_duration_seconds_count[2m]), + {k8s_cluster_name=~".+", k8s_namespace_name=~".+"} + ) +) +``` + +The intent is much clearer with `info`: We're enriching `http_server_request_duration_seconds_count` with cluster and namespace information, then aggregating by those labels and `http_status_code`. + +## Technical Benefits + +Beyond cleaner syntax, the `info()` function provides several technical advantages: + +### 1. Automatic Churn Handling + +As previously mentioned, `info()` automatically picks the matching info time series with the latest sample when multiple versions exist. +This eliminates the "many-to-many matching" errors that plague traditional join queries during churn. + +**How it works:** When non-identifying info metric labels change (e.g., a pod is re-created), there's a brief period where both old and new series might exist. +The `info()` function simply selects whichever has the most recent sample, ensuring your queries keep working. + +### 2. Better Performance + +The `info()` function is more efficient than traditional joins: +- Only selects matching info series +- Avoids unnecessary label matching operations +- Optimized query execution path + +## Getting Started + +The `info()` function is experimental and must be enabled via a feature flag: + +```bash +prometheus --enable-feature=promql-experimental-functions +``` + +Once enabled, you can start using it immediately. 
+Here are some simple examples to try: + +```promql +# Basic usage - add all target_info labels +info(up) + +# Selective enrichment - add only cluster name +info(up, {k8s_cluster_name=~".+"}) + +# In a real query +info( + rate(http_server_request_duration_seconds_count[5m]), + {k8s_cluster_name=~".+"} +) + +# With aggregation +sum by (k8s_cluster_name) ( + info(up, {k8s_cluster_name=~".+"}) +) +``` + +## Current Limitations and Future Plans + +The current implementation is an **MVP (Minimum Viable Product)** designed to validate the approach and gather user feedback. +It has some intentional limitations: + +### Current Constraints + +1. **Default info metric:** Only considers `target_info` by default + - Workaround: You can use `__name__` matchers like `{__name__=~"(target|build)_info"}` in the data-label-selector, though this still assumes `job` and `instance` as identifying labels + +2. **Fixed identifying labels:** Always assumes `job` and `instance` are the identifying labels for joining + - This works for most use cases but may not be suitable for all scenarios + +### Future Development + +These limitations are meant to be temporary. +The experimental status allows us to: +- Gather real-world usage feedback +- Understand which use cases matter the most +- Iterate on the design before committing to a final API + +A future version of the `info()` function should: +- Support all info metrics (not just `target_info`) +- Dynamically determine identifying labels based on the info metric's structure + +**Important:** Because this is an experimental feature, the behavior may change in future Prometheus versions, or the function could potentially be removed from PromQL entirely based on user feedback. + +## Conclusion + +The experimental `info()` function represents a significant step forward in making PromQL more accessible and reliable. 
+By simplifying metadata label enrichment and automatically handling the churn problem, it removes two major pain points for Prometheus users, especially those adopting OpenTelemetry. + +We encourage you to try the `info()` function and share your feedback: +- What use cases does it solve for you? +- What additional functionality would you like to see? +- How could the API be improved? +- Do you see improved performance? + +Your feedback will directly shape the future of this feature and help us determine whether it should become a permanent part of PromQL. + +To learn more: +- [PromQL functions documentation](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) +- [OpenTelemetry guide (includes detailed info() usage)](https://prometheus.io/docs/guides/opentelemetry/) +- [Feature proposal](https://github.com/prometheus/proposals/blob/main/proposals/0037-native-support-for-info-metrics-metadata.md) + +Please feel welcome to share your thoughts with the Prometheus community on [GitHub Discussions](https://github.com/prometheus/prometheus/discussions) or get in touch with us on the [CNCF Slack #prometheus channel](https://cloud-native.slack.com/). + +Happy querying! 
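As a rough illustration of the churn handling described in the post, the selection rule can be sketched in Python. This is a minimal sketch under stated assumptions, not Prometheus' actual implementation: the `InfoSeries` type, the `enrich` helper, and the sample timestamps are hypothetical, and `job` and `instance` are assumed to be the identifying labels, as the post states for the current implementation.

```python
from dataclasses import dataclass

# Assumption taken from the post: the current implementation joins on these labels.
IDENTIFYING_LABELS = ("job", "instance")


@dataclass
class InfoSeries:
    """Hypothetical stand-in for one info-metric time series."""
    labels: dict[str, str]
    last_sample_ts: float  # timestamp of the newest sample, in seconds


def identity(labels: dict[str, str]) -> tuple:
    """Values of the identifying labels, used as the join key."""
    return tuple(labels.get(name) for name in IDENTIFYING_LABELS)


def enrich(base_labels: dict[str, str], info_series: list[InfoSeries]) -> dict[str, str]:
    """Copy data labels from the matching info series with the newest sample."""
    matches = [s for s in info_series if identity(s.labels) == identity(base_labels)]
    if not matches:
        return dict(base_labels)
    # During churn there can be several matches; pick the most recent one
    # instead of failing the way a plain join would.
    newest = max(matches, key=lambda s: s.last_sample_ts)
    enriched = dict(base_labels)
    for name, value in newest.labels.items():
        if name == "__name__" or name in IDENTIFYING_LABELS:
            continue
        # Simplification for this sketch: labels already on the base series are kept.
        enriched.setdefault(name, value)
    return enriched


# Two target_info series that differ only in a data label (made-up values).
common = {"__name__": "target_info", "job": "api", "instance": "a:1",
          "k8s_cluster_name": "us-east-0"}
old = InfoSeries({**common, "version": "1.0"}, last_sample_ts=100.0)
new = InfoSeries({**common, "version": "1.1"}, last_sample_ts=130.0)

result = enrich({"job": "api", "instance": "a:1", "http_status_code": "200"}, [old, new])
```

With both series present, `result` carries the data labels of the newer series, whereas the equivalent `group_left` join would see two candidates for the same `job` and `instance`.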
From 102f34da41f9a3daa4708cb94e8e817fac96cf5e Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Mon, 17 Nov 2025 17:30:41 +0100 Subject: [PATCH 02/26] Update blog/posts/2025-11-14-introducing-info-function.md Co-authored-by: Owen Williams Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index bf895dac2..169b72bd0 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -10,7 +10,7 @@ Traditionally, complex PromQL join syntax is required in Prometheus to add even The new, still experimental `info()` function, promises a simpler way, making label enrichment as simple as wrapping your query in a single function call. In Prometheus 3.0, we introduced the [`info()`](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) function, a powerful new way to enrich your time series with labels from info metrics. -`info` doesn't only suffer simpler syntax however. +`info` doesn't only offer a simpler syntax however. It also solves a subtle yet critical problem that has plagued join queries for years: The "churn problem" that causes queries to fail when "non-identifying" info metric labels change. Identifying labels here in practice means those that are joined on. 
From 933a579d8f111ed274d5ab4d96412446d0d7b7ff Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Mon, 17 Nov 2025 17:30:54 +0100 Subject: [PATCH 03/26] Update blog/posts/2025-11-14-introducing-info-function.md Co-authored-by: Owen Williams Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 169b72bd0..f981121d8 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -12,7 +12,7 @@ The new, still experimental `info()` function, promises a simpler way, making la In Prometheus 3.0, we introduced the [`info()`](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) function, a powerful new way to enrich your time series with labels from info metrics. `info` doesn't only offer a simpler syntax however. It also solves a subtle yet critical problem that has plagued join queries for years: The "churn problem" that causes queries to fail when "non-identifying" info metric labels change. -Identifying labels here in practice means those that are joined on. +In practice, "identifying labels" refers to those labels that the join is performed on. Whether you're working with OpenTelemetry resource attributes, Kubernetes labels, or any other metadata, the `info()` function makes your PromQL queries cleaner, more reliable, and easier to understand. 
From e0a6e062c45447dec1ebb72edd82cc6b784cad27 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 09:34:13 +0100 Subject: [PATCH 04/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index f981121d8..a1c7cdc28 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -6,7 +6,7 @@ author_name: Arve Knudsen --- Enriching metrics with metadata labels can be surprisingly tricky in Prometheus, even if you're a PromQL wiz! -Traditionally, complex PromQL join syntax is required in Prometheus to add even basic information like Kubernetes cluster names or cloud provider regions to queries. +The PromQL join query traditionally used for this is inherently quite complex because it has to specify the labels to join on, the info metric to join with, and the labels to enrich with. The new, still experimental `info()` function, promises a simpler way, making label enrichment as simple as wrapping your query in a single function call. In Prometheus 3.0, we introduced the [`info()`](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) function, a powerful new way to enrich your time series with labels from info metrics. 
From 3fb75923a120693be693dc751b3933314280a8fd Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 09:35:42 +0100 Subject: [PATCH 05/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index a1c7cdc28..ebae25c98 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -60,10 +60,7 @@ Here's the same query using `info()`: ```promql sum by (k8s_cluster_name, http_status_code) ( - info( - rate(http_server_request_duration_seconds_count[2m]), - {k8s_cluster_name=~".+"} - ) + info(rate(http_server_request_duration_seconds_count[2m])) ) ``` From b7f466e390ef33d105fc88c564d2b78465a0a800 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 09:38:24 +0100 Subject: [PATCH 06/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index ebae25c98..515c3cfe8 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -267,7 +267,7 @@ The experimental status allows us to: - Iterate on the design before committing to a final API A future version of the `info()` function should: -- Support all info metrics (not just `target_info`) +- Consider all info metrics by 
default (not just `target_info`) - Dynamically determine identifying labels based on the info metric's structure **Important:** Because this is an experimental feature, the behavior may change in future Prometheus versions, or the function could potentially be removed from PromQL entirely based on user feedback. From 6bf6f399f42afa3d3dd179cf9d074dc47c09ec6d Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 09:38:49 +0100 Subject: [PATCH 07/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 515c3cfe8..6490d0030 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -56,9 +56,6 @@ This could in practice mean your dashboards break and your alerts stop firing wh ## The Solution: Simple, Reliable Label Enrichment The `info()` function solves both problems at once. 
-Here's the same query using `info()`: - -```promql sum by (k8s_cluster_name, http_status_code) ( info(rate(http_server_request_duration_seconds_count[2m])) ) From 220ade118d46d4f982cb3207f05e7de8769dc7eb Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 10:30:45 +0100 Subject: [PATCH 08/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 6490d0030..382cb6086 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -10,9 +10,12 @@ The PromQL join query traditionally used for this is inherently quite complex be The new, still experimental `info()` function, promises a simpler way, making label enrichment as simple as wrapping your query in a single function call. In Prometheus 3.0, we introduced the [`info()`](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) function, a powerful new way to enrich your time series with labels from info metrics. -`info` doesn't only offer a simpler syntax however. -It also solves a subtle yet critical problem that has plagued join queries for years: The "churn problem" that causes queries to fail when "non-identifying" info metric labels change. -In practice, "identifying labels" refers to those labels that the join is performed on. +What's special about `info()` versus the traditional join query technique is that it relieves you from having to specify _identifying labels_, which info metric(s) to join with, and the (non-identifying) labels to enrich with. +Note that "identifying labels" in this particular context refers to the set of labels that identify the info metrics in question, and are shared with associated non-info metrics. 
+They are the labels you would join on in a Prometheus [join query](https://grafana.com/blog/2021/08/04/how-to-use-promql-joins-for-more-effective-queries-of-prometheus-metrics-at-scale). +Conceptually, they can be compared to [foreign keys](https://en.wikipedia.org/wiki/Foreign_key) in relational databases. + +Beyond the main functionality, `info()` also solves a subtle yet critical problem that has plagued join queries for years: The "churn problem" that causes queries to fail when non-identifying info metric labels change, combined with missing staleness marking (as is the case with OTLP ingestion). Whether you're working with OpenTelemetry resource attributes, Kubernetes labels, or any other metadata, the `info()` function makes your PromQL queries cleaner, more reliable, and easier to understand. From 729a7552b69ce10f4878c0156ae6fad4696f5d97 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 10:46:35 +0100 Subject: [PATCH 09/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 382cb6086..31706ea70 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -10,7 +10,7 @@ The PromQL join query traditionally used for this is inherently quite complex be The new, still experimental `info()` function, promises a simpler way, making label enrichment as simple as wrapping your query in a single function call. In Prometheus 3.0, we introduced the [`info()`](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) function, a powerful new way to enrich your time series with labels from info metrics. 
-What's special about `info()` versus the traditional join query technique is that it relieves you from having to specify _identifying labels_, which info metric(s) to join with, and the (non-identifying) labels to enrich with. +What's special about `info()` versus the traditional join query technique is that it relieves you from having to specify _identifying labels_, which info metric(s) to join with, and the ("data" or "non-identifying") labels to enrich with. Note that "identifying labels" in this particular context refers to the set of labels that identify the info metrics in question, and are shared with associated non-info metrics. They are the labels you would join on in a Prometheus [join query](https://grafana.com/blog/2021/08/04/how-to-use-promql-joins-for-more-effective-queries-of-prometheus-metrics-at-scale). Conceptually, they can be compared to [foreign keys](https://en.wikipedia.org/wiki/Foreign_key) in relational databases. @@ -25,6 +25,7 @@ Whether you're working with OpenTelemetry resource attributes, Kubernetes labels Let us start by looking at what we have had to do until now. Imagine you're monitoring HTTP request durations via OpenTelemetry and want to break them down by Kubernetes cluster. +You push your metrics to Prometheus' OTLP endpoint. Your metrics have `job` and `instance` labels, but the cluster name lives in a separate `target_info` metric, as the `k8s_cluster_name` label. Here's what the traditional approach looks like: @@ -48,10 +49,11 @@ This requires expert-level PromQL knowledge and makes queries harder to read and **2. The Churn Problem (The Critical Issue):** -Here's the subtle but serious problem: What happens when a Kubernetes pod gets recreated? -The `k8s_pod_name` label in `target_info` changes, and Prometheus sees this as a completely new time series. 
+Here's the subtle but serious problem: What happens when an OTel resource attribute changes in a Kubernetes container, while the identifying resource attributes stay the same? +An example could be the resource attribute `k8s.pod.labels.app.kubernetes.io/version`. +Then the corresponding `target_info` label `k8s.pod.labels.app.kubernetes.io/version` changes, and Prometheus sees a completely new `target_info` time series. -If the old `target_info` series isn't properly marked as stale immediately, both the old and new series can exist simultaneously for up to 5 minutes (the default lookback delta). +As the OTLP endpoint doesn't mark the old `target_info` series as stale, both the old and new series can exist simultaneously for up to 5 minutes (the default lookback delta). During this overlap period, your join query finds **two distinct matching `target_info` time series** and fails with a "many-to-many matching" error. This could in practice mean your dashboards break and your alerts stop firing when infrastructure changes are happening, perhaps precisely when you would need visibility the most. 
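The overlap window described in this patch can be worked through with a small Python sketch. It assumes the default 5-minute lookback delta mentioned above and models visibility as "the newest sample falls inside the lookback window". The function names and timestamps are made up for illustration, and the exact boundary handling in Prometheus may differ.

```python
LOOKBACK_DELTA_S = 300.0  # default lookback delta of 5 minutes, per the text above


def is_visible(last_sample_ts: float, eval_ts: float,
               lookback: float = LOOKBACK_DELTA_S) -> bool:
    """Without a staleness marker, a series is still selected while its
    newest sample lies inside the lookback window before eval_ts."""
    return eval_ts - lookback < last_sample_ts <= eval_ts


def visible_series(last_sample_by_series: dict[str, float],
                   eval_ts: float) -> list[str]:
    """Names of the series an instant query at eval_ts would still see."""
    return sorted(name for name, ts in last_sample_by_series.items()
                  if is_visible(ts, eval_ts))


def at(eval_ts: float) -> list[str]:
    # Hypothetical timeline (seconds): the old target_info series received its
    # last sample at t=1000; the new series is current at every evaluation time.
    return visible_series({"target_info{version=1.0}": 1000.0,
                           "target_info{version=1.1}": eval_ts}, eval_ts)


during_overlap = at(1200.0)  # 200 s after the old series stopped
after_overlap = at(1301.0)   # just over 300 s after the old series stopped
```

During the overlap, a join on `job` and `instance` has two candidates on the `target_info` side, which is the situation that triggers the matching error; once the old series ages out of the lookback window, only the new series remains.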
From 4358dfc589daa58a11762b991749772c3f687dcf Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 10:52:24 +0100 Subject: [PATCH 10/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 31706ea70..2de3be470 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -58,9 +58,6 @@ During this overlap period, your join query finds **two distinct matching `targe This could in practice mean your dashboards break and your alerts stop firing when infrastructure changes are happening, perhaps precisely when you would need visibility the most. -## The Solution: Simple, Reliable Label Enrichment - -The `info()` function solves both problems at once. 
sum by (k8s_cluster_name, http_status_code) ( info(rate(http_server_request_duration_seconds_count[2m])) ) From 2f68dcb6cfab003b25e85bbc029702b3a5942486 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 10:56:19 +0100 Subject: [PATCH 11/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 2de3be470..493ecd181 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -58,13 +58,15 @@ During this overlap period, your join query finds **two distinct matching `targe This could in practice mean your dashboards break and your alerts stop firing when infrastructure changes are happening, perhaps precisely when you would need visibility the most. +```promql sum by (k8s_cluster_name, http_status_code) ( info(rate(http_server_request_duration_seconds_count[2m])) ) ``` Much more comprehensible, no? -The real magic happens under the hood though: **`info()` automatically selects the time series with the latest sample**, eliminating churn-related join failures entirely. +Note that this call to `info()` returns all data labels from `target_info`, but it doesn't matter because we aggregate them away with `sum`. +As regards solving the churn problem, the real magic happens under the hood: **`info()` automatically selects the time series with the latest sample**, eliminating churn-related join failures entirely. 
### Basic Syntax From c6df22a16910711096c6178910e562b36569fd60 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 11:07:47 +0100 Subject: [PATCH 12/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- .../2025-11-14-introducing-info-function.md | 23 ++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 493ecd181..9a61c8699 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -77,12 +77,33 @@ info(v instant-vector, [data-label-selector instant-vector]) - **`v`**: The instant vector to enrich with metadata labels - **`data-label-selector`** (optional): Label matchers in curly braces to filter which labels to include -If you omit the second parameter, `info()` adds **all** data labels from `target_info`: +In its most basic form, omitting the second parameter, `info()` adds **all** data labels from `target_info`: ```promql info(rate(http_server_request_duration_seconds_count[2m])) ``` +Through the second parameter on the other hand, you can control which data labels to include from `target_info`: + +```promql +info( + rate(http_server_request_duration_seconds_count[2m]), + {k8s_cluster_name=~".+"} +) +``` + +In the example above, `info()` includes the `k8s_cluster_name` data label from `target_info`. +Because the selector matches any non-empty string, it will include any `k8s_cluster_name` label value. + +It's also possible to filter which `k8s_cluster_name` label values to include: + +```promql +info( + rate(http_server_request_duration_seconds_count[2m]), + {k8s_cluster_name="us-east-0"} +) +``` + ### Selecting Different Info Metrics By default, `info()` uses the `target_info` metric. 
From 39a186415778d16cea9b61a66f8b93d0844af61a Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 11:10:59 +0100 Subject: [PATCH 13/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 1 + 1 file changed, 1 insertion(+) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 9a61c8699..2d8c2643e 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -122,6 +122,7 @@ info(up, {__name__="build_info", version=~".+"}) **Note:** The current implementation always uses `job` and `instance` as the identifying labels for joining, regardless of which info metric you select. This works well for most standard info metrics but may have limitations with custom info metrics that use different identifying labels. +The intention is that `info()` in the future knows which metrics in the TSDB are info metrics and automatically uses all of them, unless the selection is explicitly restricted by a name matcher like the above, and which are the identifying labels for each info metric. ## Real-World Use Cases From 151a16ed9d08f780336753394cfdfc2c3709da53 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 11:14:56 +0100 Subject: [PATCH 14/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 1 + 1 file changed, 1 insertion(+) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 2d8c2643e..8ab416bb6 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -122,6 +122,7 @@ info(up, {__name__="build_info", version=~".+"}) **Note:** The current implementation always uses `job` and `instance` as the identifying labels for joining, regardless of which info metric you select. 
This works well for most standard info metrics but may have limitations with custom info metrics that use different identifying labels. +An example of an info metric that has different identifying labels than `job` and `instance` is `kube_pod_labels`; its identifying labels are instead `namespace` and `pod`. The intention is that `info()` in the future knows which metrics in the TSDB are info metrics and automatically uses all of them, unless the selection is explicitly restricted by a name matcher like the above, and which are the identifying labels for each info metric. ## Real-World Use Cases From d8c4afb1c06a53ce3a7b6945a64117f5dec26318 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 15:52:47 +0100 Subject: [PATCH 15/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- .../2025-11-14-introducing-info-function.md | 27 +++++++++---------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 8ab416bb6..536d24dc7 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -107,7 +107,7 @@ info( ### Selecting Different Info Metrics By default, `info()` uses the `target_info` metric.
-However, you can select different info metrics (like `build_info`, `node_uname_info`, or `kube_pod_labels`) by including a `__name__` matcher in the data-label-selector: +However, you can select different info metrics (like `build_info` or `node_uname_info`) by including a `__name__` matcher in the data-label-selector: ```promql # Use build_info instead of target_info @@ -150,29 +150,26 @@ info( ) ``` -### Kubernetes Metadata +### Build Information -Enrich your metrics with Kubernetes-specific information: +Enrich your metrics with build-time information: ```promql -# Add cluster and namespace information to request rates -info( - sum by (job, http_status_code) ( - rate(http_server_request_duration_seconds_count[2m]) - ), - {k8s_cluster_name=~".+", k8s_namespace_name=~".+"} +# Add version and branch information to request rates +sum by (job, http_status_code, version, branch) info( + rate(http_server_request_duration_seconds_count[2m]), + {__name__="build_info"} ) ``` -### Cloud Provider Metadata +### Filter on Producer Version -Add cloud provider information to understand costs and performance by region: +Pick only metrics from certain producer versions: ```promql -# Enrich with AWS/GCP/Azure region and availability zone -info( - rate(cloud_storage_request_count[5m]), - {cloud_provider=~".+", cloud_region=~".+", cloud_availability_zone=~".+"} +sum by (job, http_status_code, version) info( + rate(http_server_request_duration_seconds_count[2m]), + {__name__="build_info", version=~"2\..+"} ) ``` From 61033aeed7f0b5df45d284e1acaecce59795d679 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 16:11:24 +0100 Subject: [PATCH 16/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- .../2025-11-14-introducing-info-function.md | 55 +++++++------------ 1 file changed, 19 insertions(+), 36 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 536d24dc7..8086398c7 100644 
--- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -177,49 +177,52 @@ sum by (job, http_status_code, version) info( Let's see how the `info()` function simplifies real queries: -### Example 1: Basic Label Enrichment +### Example 1: OpenTelemetry Resource Attribute Enrichment **Traditional approach:** ```promql -rate(http_server_request_duration_seconds_count[2m]) - * on (job, instance) group_left (k8s_cluster_name) +sum by (http_status_code, k8s_cluster_name, k8s_container_name, k8s_cronjob_name, k8s_daemonset_name, k8s_deployment_name, k8s_job_name, k8s_namespace_name, k8s_pod_name, k8s_replicaset_name, k8s_statefulset_name) ( + rate(http_server_request_duration_seconds_count[2m]) + * on (job, instance) group_left (k8s_cluster_name, k8s_container_name, k8s_cronjob_name, k8s_daemonset_name, k8s_deployment_name, k8s_job_name, k8s_namespace_name, k8s_pod_name, k8s_replicaset_name, k8s_statefulset_name) target_info +) ``` **With info():** ```promql -info( - rate(http_server_request_duration_seconds_count[2m]), - {k8s_cluster_name=~".+"} +sum by (http_status_code, k8s_cluster_name, k8s_container_name, k8s_cronjob_name, k8s_daemonset_name, k8s_deployment_name, k8s_job_name, k8s_namespace_name, k8s_pod_name, k8s_replicaset_name, k8s_statefulset_name) ( + info(rate(http_server_request_duration_seconds_count[2m])) ) ``` -### Example 2: Aggregation with Multiple Labels +The intent is much clearer with `info`: We're enriching `http_server_request_duration_seconds_count` with Kubernetes related OpenTelemetry resource attributes. 
+ +### Example 2: Filtering by Label Value **Traditional approach:** ```promql -sum by (k8s_cluster_name, k8s_namespace_name, http_status_code) ( +sum by (http_status_code) ( rate(http_server_request_duration_seconds_count[2m]) - * on (job, instance) group_left (k8s_cluster_name, k8s_namespace_name) - target_info + * on (job, instance) group_left () + target_info{k8s_cluster_name="us-east-1"} ) ``` **With info():** ```promql -sum by (k8s_cluster_name, k8s_namespace_name, http_status_code) ( +sum by (http_status_code) ( info( rate(http_server_request_duration_seconds_count[2m]), - {k8s_cluster_name=~".+", k8s_namespace_name=~".+"} + {k8s_cluster_name="us-east-1"} ) ) ``` -The intent is much clearer with `info`: We're enriching `http_server_request_duration_seconds_count` with cluster and namespace information, then aggregating by those labels and `http_status_code`. +Here we filter to only include metrics from the `us-east-1` cluster. The `info()` version integrates the filter naturally into the data-label-selector. ## Technical Benefits -Beyond cleaner syntax, the `info()` function provides several technical advantages: +Beyond the fundamental UX benefits, the `info()` function provides several technical advantages: ### 1. Automatic Churn Handling @@ -245,26 +248,6 @@ prometheus --enable-feature=promql-experimental-functions ``` Once enabled, you can start using it immediately. 
-Here are some simple examples to try: - -```promql -# Basic usage - add all target_info labels -info(up) - -# Selective enrichment - add only cluster name -info(up, {k8s_cluster_name=~".+"}) - -# In a real query -info( - rate(http_server_request_duration_seconds_count[5m]), - {k8s_cluster_name=~".+"} -) - -# With aggregation -sum by (k8s_cluster_name) ( - info(up, {k8s_cluster_name=~".+"}) -) -``` ## Current Limitations and Future Plans @@ -277,7 +260,7 @@ It has some intentional limitations: - Workaround: You can use `__name__` matchers like `{__name__=~"(target|build)_info"}` in the data-label-selector, though this still assumes `job` and `instance` as identifying labels 2. **Fixed identifying labels:** Always assumes `job` and `instance` are the identifying labels for joining - - This works for most use cases but may not be suitable for all scenarios + - This unfortunately makes `info()` unsuitable for certain scenarios, e.g. including data labels from `kube_pod_labels`, but it's a problem we want to solve in the future ### Future Development @@ -289,7 +272,7 @@ The experimental status allows us to: A future version of the `info()` function should: - Consider all info metrics by default (not just `target_info`) -- Dynamically determine identifying labels based on the info metric's structure +- Automatically understand identifying labels based on info metric metadata **Important:** Because this is an experimental feature, the behavior may change in future Prometheus versions, or the function could potentially be removed from PromQL entirely based on user feedback. 
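
The two constraints above can be illustrated with a short sketch. The first query is the `__name__` workaround quoted in the post. The second falls back to a traditional join for `kube_pod_labels`, because `info()` currently always joins on `job` and `instance`. Both `kube_pod_container_resource_requests` and the `label_team` data label are assumptions chosen for illustration (they depend on your kube-state-metrics setup), not something the post prescribes:

```promql
# Workaround: consider both target_info and build_info
# (still joins on job and instance)
info(up, {__name__=~"(target|build)_info"})

# kube_pod_labels is identified by namespace and pod, so an explicit join is still needed
# (label_team is a hypothetical pod label)
sum by (namespace, label_team) (
    kube_pod_container_resource_requests{resource="cpu"}
  * on (namespace, pod) group_left (label_team)
    kube_pod_labels
)
```
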
From a083bca4c2152baa920313a3992352242e8df2a9 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Thu, 27 Nov 2025 17:08:11 +0100 Subject: [PATCH 17/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 8086398c7..34d6201c2 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -252,7 +252,8 @@ Once enabled, you can start using it immediately. ## Current Limitations and Future Plans The current implementation is an **MVP (Minimum Viable Product)** designed to validate the approach and gather user feedback. -It has some intentional limitations: +You may provide feedback through e.g. our [community connections](https://prometheus.io/community/#community-connections) or by opening a [Prometheus issue](https://github.com/prometheus/prometheus/issues). +The implementation has some intentional limitations: ### Current Constraints @@ -288,6 +289,7 @@ We encourage you to try the `info()` function and share your feedback: - Do you see improved performance? Your feedback will directly shape the future of this feature and help us determine whether it should become a permanent part of PromQL. +Feedback may be provided e.g. through our [community connections](https://prometheus.io/community/#community-connections) or by opening a [Prometheus issue](https://github.com/prometheus/prometheus/issues). 
To learn more: - [PromQL functions documentation](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) From aa82bae29c1e80e6c0796c2532c4f05235f68418 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Tue, 2 Dec 2025 19:28:58 +0100 Subject: [PATCH 18/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 34d6201c2..c5e8f26a9 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -64,7 +64,7 @@ sum by (k8s_cluster_name, http_status_code) ( ) ``` -Much more comprehensible, no? +Much more comprehensible, isn't it? Note that this call to `info()` returns all data labels from `target_info`, but it doesn't matter because we aggregate them away with `sum`. As regards solving the churn problem, the real magic happens under the hood: **`info()` automatically selects the time series with the latest sample**, eliminating churn-related join failures entirely. 
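
To check whether churn currently affects your own targets, a query along these lines may help. It is only a diagnostic sketch built from standard PromQL aggregation: it lists every `job`/`instance` pair that has more than one `target_info` series at evaluation time, which is the situation in which the traditional join fails.

```promql
# job/instance pairs with more than one target_info series at evaluation time
count by (job, instance) (target_info) > 1
```
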
From f8b572d469645ffae8c9ca5619da0f4a2e9eeb30 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Tue, 2 Dec 2025 19:30:49 +0100 Subject: [PATCH 19/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index c5e8f26a9..42d18acc2 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -201,10 +201,10 @@ The intent is much clearer with `info`: We're enriching `http_server_request_dur **Traditional approach:** ```promql -sum by (http_status_code) ( +sum by (http_status_code, k8s_cluster_name) ( rate(http_server_request_duration_seconds_count[2m]) - * on (job, instance) group_left () - target_info{k8s_cluster_name="us-east-1"} + * on (job, instance) group_left (k8s_cluster_name) + target_info{k8s_cluster_name=~"us-.*"} ) ``` From 893356af14b1b1fe9e1e36e8cb133f9564b3dcf2 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Tue, 2 Dec 2025 19:31:45 +0100 Subject: [PATCH 20/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 42d18acc2..258ef9d0f 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -210,10 +210,10 @@ sum by (http_status_code, k8s_cluster_name) ( **With 
info():** ```promql -sum by (http_status_code) ( +sum by (http_status_code, k8s_cluster_name) ( info( rate(http_server_request_duration_seconds_count[2m]), - {k8s_cluster_name="us-east-1"} + {k8s_cluster_name=~"us-.*"} ) ) ``` From 6b2367c51daf721ca4be7446598dac5e74aebe0b Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Tue, 2 Dec 2025 19:32:21 +0100 Subject: [PATCH 21/26] Update blog/posts/2025-11-14-introducing-info-function.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Björn Rabenstein Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 258ef9d0f..fe09be924 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -218,7 +218,7 @@ sum by (http_status_code, k8s_cluster_name) ( ) ``` -Here we filter to only include metrics from the `us-east-1` cluster. The `info()` version integrates the filter naturally into the data-label-selector. +Here we filter to only include metrics from clusters in the US (whose name starts with `us-`). The `info()` version integrates the filter naturally into the data-label-selector. 
## Technical Benefits From 61621b6c39d551e839416508caaaf6235a7a3a12 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Tue, 2 Dec 2025 19:36:40 +0100 Subject: [PATCH 22/26] Apply reviewer feedback Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index fe09be924..3bb8b4321 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -181,16 +181,16 @@ Let's see how the `info()` function simplifies real queries: **Traditional approach:** ```promql -sum by (http_status_code, k8s_cluster_name, k8s_container_name, k8s_cronjob_name, k8s_daemonset_name, k8s_deployment_name, k8s_job_name, k8s_namespace_name, k8s_pod_name, k8s_replicaset_name, k8s_statefulset_name) ( +sum by (http_status_code, k8s_cluster_name, k8s_namespace_name, k8s_container_name) ( rate(http_server_request_duration_seconds_count[2m]) - * on (job, instance) group_left (k8s_cluster_name, k8s_container_name, k8s_cronjob_name, k8s_daemonset_name, k8s_deployment_name, k8s_job_name, k8s_namespace_name, k8s_pod_name, k8s_replicaset_name, k8s_statefulset_name) + * on (job, instance) group_left (k8s_cluster_name, k8s_namespace_name, k8s_container_name) target_info ) ``` **With info():** ```promql -sum by (http_status_code, k8s_cluster_name, k8s_container_name, k8s_cronjob_name, k8s_daemonset_name, k8s_deployment_name, k8s_job_name, k8s_namespace_name, k8s_pod_name, k8s_replicaset_name, k8s_statefulset_name) ( +sum by (http_status_code, k8s_cluster_name, k8s_namespace_name, k8s_container_name) ( info(rate(http_server_request_duration_seconds_count[2m])) ) ``` From 386253e2e54fc03b08fae13991cd453ff31b0e94 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Tue, 2 Dec 2025 19:58:32 +0100 Subject: [PATCH 23/26] Apply reviewer feedback Signed-off-by: 
Arve Knudsen --- .../2025-11-14-introducing-info-function.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 3bb8b4321..9d1d4d34f 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -30,7 +30,7 @@ Your metrics have `job` and `instance` labels, but the cluster name lives in a s Here's what the traditional approach looks like: ```promql -sum by (k8s_cluster_name, http_status_code) ( +sum by (http_status_code, k8s_cluster_name) ( rate(http_server_request_duration_seconds_count[2m]) * on (job, instance) group_left (k8s_cluster_name) target_info @@ -51,24 +51,28 @@ This requires expert-level PromQL knowledge and makes queries harder to read and Here's the subtle but serious problem: What happens when an OTel resource attribute changes in a Kubernetes container, while the identifying resource attributes stay the same? An example could be the resource attribute `k8s.pod.labels.app.kubernetes.io/version`. -Then the corresponding `target_info` label `k8s.pod.labels.app.kubernetes.io/version` changes, and Prometheus sees a completely new `target_info` time series. +Then the corresponding `target_info` label `k8s_pod_labels_app_kubernetes_io_version` changes, and Prometheus sees a completely new `target_info` time series. As the OTLP endpoint doesn't mark the old `target_info` series as stale, both the old and new series can exist simultaneously for up to 5 minutes (the default lookback delta). During this overlap period, your join query finds **two distinct matching `target_info` time series** and fails with a "many-to-many matching" error. This could in practice mean your dashboards break and your alerts stop firing when infrastructure changes are happening, perhaps precisely when you would need visibility the most. 
+### How the Info Function Solves the Problem + +The previous join query can be converted to use the `info` function as follows: + ```promql -sum by (k8s_cluster_name, http_status_code) ( +sum by (http_status_code, k8s_cluster_name) ( info(rate(http_server_request_duration_seconds_count[2m])) ) ``` Much more comprehensible, isn't it? -Note that this call to `info()` returns all data labels from `target_info`, but it doesn't matter because we aggregate them away with `sum`. As regards solving the churn problem, the real magic happens under the hood: **`info()` automatically selects the time series with the latest sample**, eliminating churn-related join failures entirely. +Note that this call to `info()` returns all data labels from `target_info`, but it doesn't matter because we aggregate them away with `sum`. -### Basic Syntax +## Basic Syntax ```promql info(v instant-vector, [data-label-selector instant-vector]) @@ -104,7 +108,7 @@ info( ) ``` -### Selecting Different Info Metrics +## Selecting Different Info Metrics By default, `info()` uses the `target_info` metric. 
However, you can select different info metrics (like `build_info` or `node_uname_info`) by including a `__name__` matcher in the data-label-selector: From 018760be3aaeddc75aa5d3d252c3f4ede1d406ec Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Tue, 2 Dec 2025 20:18:58 +0100 Subject: [PATCH 24/26] Tweak language Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 9d1d4d34f..36eab0d00 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -58,7 +58,7 @@ During this overlap period, your join query finds **two distinct matching `targe This could in practice mean your dashboards break and your alerts stop firing when infrastructure changes are happening, perhaps precisely when you would need visibility the most. -### How the Info Function Solves the Problem +### The Info Function Presents a Solution The previous join query can be converted to use the `info` function as follows: @@ -222,7 +222,7 @@ sum by (http_status_code, k8s_cluster_name) ( ) ``` -Here we filter to only include metrics from clusters in the US (whose name starts with `us-`). The `info()` version integrates the filter naturally into the data-label-selector. +Here we filter to only include metrics from clusters in the US (whose names start with `us-`). The `info()` version integrates the filter naturally into the data-label-selector.
## Technical Benefits From 823a03d422b5011fd42b313df57705dab1a5ca0f Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Wed, 3 Dec 2025 09:10:51 +0100 Subject: [PATCH 25/26] Move feedback to its own section Signed-off-by: Arve Knudsen --- blog/posts/2025-11-14-introducing-info-function.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 36eab0d00..544b41d9e 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -256,7 +256,6 @@ Once enabled, you can start using it immediately. ## Current Limitations and Future Plans The current implementation is an **MVP (Minimum Viable Product)** designed to validate the approach and gather user feedback. -You may provide feedback through e.g. our [community connections](https://prometheus.io/community/#community-connections) or by opening a [Prometheus issue](https://github.com/prometheus/prometheus/issues). The implementation has some intentional limitations: ### Current Constraints @@ -292,9 +291,6 @@ We encourage you to try the `info()` function and share your feedback: - How could the API be improved? - Do you see improved performance? -Your feedback will directly shape the future of this feature and help us determine whether it should become a permanent part of PromQL. -Feedback may be provided e.g. through our [community connections](https://prometheus.io/community/#community-connections) or by opening a [Prometheus issue](https://github.com/prometheus/prometheus/issues). 
- To learn more: - [PromQL functions documentation](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) - [OpenTelemetry guide (includes detailed info() usage)](https://prometheus.io/docs/guides/opentelemetry/) @@ -303,3 +299,8 @@ To learn more: Please feel welcome to share your thoughts with the Prometheus community on [GitHub Discussions](https://github.com/prometheus/prometheus/discussions) or get in touch with us on the [CNCF Slack #prometheus channel](https://cloud-native.slack.com/). Happy querying! + +## Giving Feedback + +Your feedback will directly shape the future of this feature and help us determine whether it should become a permanent part of PromQL. +Feedback may be provided e.g. through our [community connections](https://prometheus.io/community/#community-connections) or by opening a [Prometheus issue](https://github.com/prometheus/prometheus/issues). From 41da4ff0ce333dbf3deae4d0da25643ae070fdf5 Mon Sep 17 00:00:00 2001 From: Arve Knudsen Date: Wed, 3 Dec 2025 09:15:17 +0100 Subject: [PATCH 26/26] Move Giving Feedback section before Conclusion Signed-off-by: Arve Knudsen --- .../2025-11-14-introducing-info-function.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/blog/posts/2025-11-14-introducing-info-function.md b/blog/posts/2025-11-14-introducing-info-function.md index 544b41d9e..442dc4cb0 100644 --- a/blog/posts/2025-11-14-introducing-info-function.md +++ b/blog/posts/2025-11-14-introducing-info-function.md @@ -280,10 +280,10 @@ A future version of the `info()` function should: **Important:** Because this is an experimental feature, the behavior may change in future Prometheus versions, or the function could potentially be removed from PromQL entirely based on user feedback. -## Conclusion +## Giving Feedback -The experimental `info()` function represents a significant step forward in making PromQL more accessible and reliable. 
-By simplifying metadata label enrichment and automatically handling the churn problem, it removes two major pain points for Prometheus users, especially those adopting OpenTelemetry. +Your feedback will directly shape the future of this feature and help us determine whether it should become a permanent part of PromQL. +Feedback may be provided e.g. through our [community connections](https://prometheus.io/community/#community-connections) or by opening a [Prometheus issue](https://github.com/prometheus/prometheus/issues). We encourage you to try the `info()` function and share your feedback: - What use cases does it solve for you? @@ -291,6 +291,11 @@ We encourage you to try the `info()` function and share your feedback: - How could the API be improved? - Do you see improved performance? +## Conclusion + +The experimental `info()` function represents a significant step forward in making PromQL more accessible and reliable. +By simplifying metadata label enrichment and automatically handling the churn problem, it removes two major pain points for Prometheus users, especially those adopting OpenTelemetry. + To learn more: - [PromQL functions documentation](https://prometheus.io/docs/prometheus/latest/querying/functions/#info) - [OpenTelemetry guide (includes detailed info() usage)](https://prometheus.io/docs/guides/opentelemetry/) @@ -299,8 +304,3 @@ To learn more: Please feel welcome to share your thoughts with the Prometheus community on [GitHub Discussions](https://github.com/prometheus/prometheus/discussions) or get in touch with us on the [CNCF Slack #prometheus channel](https://cloud-native.slack.com/). Happy querying! - -## Giving Feedback - -Your feedback will directly shape the future of this feature and help us determine whether it should become a permanent part of PromQL. -Feedback may be provided e.g. 
through our [community connections](https://prometheus.io/community/#community-connections) or by opening a [Prometheus issue](https://github.com/prometheus/prometheus/issues).