Add Guidance wrt Labelling to Naming and Rules Best Practices #2691
base: main
Changes from 6 commits
e57a291
a40367a
ccf009a
0e87e09
a79aa9b
a0ebfa9
7694523
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -82,6 +82,17 @@ of unit and type information in the metric name will cause certain series to col | |||||
|
||||||
## Labels | ||||||
|
||||||
* `job` | ||||||
* The `job` label is one of the few ubiquitous labels, set at scrape time, and is used to identify metrics scraped from the same target/exporter. | ||||||
|
||||||
* If `job` is not specified in a PromQL expression, the expression will also match unrelated metrics with the same name. This is especially likely in a multi-system or multi-tenant installation. | ||||||
I'm not sure this is really a useful note here, as this applies to all label matching.
It applies to all labels. But
Yes, but it's not related to job, but related to "target labels" and discovery. That is a different thing and related to querying, not creating labels. |
||||||
|
||||||
WARNING: When using `without`, be careful not to strip out the `job` label accidentally. | ||||||
|
||||||
Comment on lines +93 to +94
This warning doesn't make a lot of sense to me. It has a high probability of being quoted as copy-pasta without being understood. Let's just drop it.
I don't want to encourage copy-pasta, but this is an important point. If using alerting expressions like I'll polish the wording here
This doesn't seem related to naming practices, which is what this guide is about. |
||||||
* `instance` | ||||||
* The `instance` label includes the `ip:port` that was scraped, providing a crucial breadcrumb for debugging scrape-time issues. | ||||||
|
||||||
|
||||||
### General Labelling Advice | ||||||
|
||||||
Use labels to differentiate the characteristics of the thing that is being measured: | ||||||
|
||||||
* `api_http_requests_total` - differentiate request types: `operation="create|update|delete"` | ||||||
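
As a rough sketch of how such a label is used at query time (the 5-minute window and the `instance` label being aggregated away are illustrative assumptions, not part of the proposed text), a single metric name can be either filtered or broken down by `operation`:

```promql
# Request rate for one request type only.
rate(api_http_requests_total{operation="create"}[5m])

# Request rate per request type, aggregating away only the per-target
# `instance` label so that `operation` (and `job`) are preserved.
sum without (instance) (rate(api_http_requests_total[5m]))
```

Keeping the request type in a label rather than in the metric name is what makes the second query possible without listing every metric by hand.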
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,6 +19,8 @@ This page documents proper naming conventions and aggregation for recording rule | |
Keeping the metric name unchanged makes it easy to know what a metric is and | ||
easy to find in the codebase. | ||
|
||
IMPORTANT: `job` label acts as a primary key. It is **strongly** recommended that you use it to scope your PromQL expressions to the system you are monitoring. | ||
This is misleading. Prometheus doesn't have the concept of "primary key". Not even metric names are a "primary key".
Fair, especially since folks used to SQL DBs will jump to the conclusion that it's a SQL DB, which it isn't. Iterated on the language to avoid creating ambiguity |
||
|
||
To keep the operations clean, `_sum` is omitted if there are other operations, | ||
as `sum()`. Associative operations can be merged (for example `min_min` is the | ||
same as `min`). | ||
|
@@ -27,6 +29,18 @@ If there is no obvious operation to use, use `sum`. When taking a ratio by | |
doing division, separate the metrics using `_per_` and call the operation | ||
`ratio`. | ||
|
||
## Labels | ||
|
||
NOTE: Omitting a label in a PromQL expression is functionally equivalent to matching any value for it, i.e. `label=~".*"`. | ||
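
To illustrate the note above, here is a minimal sketch; the metric `http_requests_total` and the job name `api-server` are hypothetical. The first two selectors return the same series, while the third is scoped to one job:

```promql
# No `job` matcher: every series named http_requests_total, from any job.
http_requests_total

# Equivalent: a regex matcher that accepts any value, including a missing label.
http_requests_total{job=~".*"}

# Scoped: only series scraped under the (hypothetical) job "api-server".
http_requests_total{job="api-server"}
```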
|
||
* In both recording rules and alerting expressions, always specify a `job` label to prevent expression mismatches from occurring. | ||
This is especially important in multi-tenant systems where the same metric names may be exported by different jobs or the | ||
same job (e.g. `node_exporter`) in multiple, distinct deployments. | ||
|
||
* Always specify a `without` clause with the labels you are aggregating away. | ||
This is to preserve all the other labels such as `job`, which will avoid | ||
conflicts and give you more useful metrics and alerts. | ||
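
The two recommendations above can be sketched together as follows. The metric `requests_total`, the job name `myjob`, and the `path` label are placeholders chosen for illustration, and the expressions are only one plausible way to apply the advice:

```promql
# Scoped to one job, and aggregating away only `instance`:
# `job`, `path`, and any other labels survive on the result.
sum without (instance) (rate(requests_total{job="myjob"}[5m]))

# By contrast, `by` keeps only the labels listed, so `job` is dropped
# from the result unless it is named explicitly.
sum by (path) (rate(requests_total{job="myjob"}[5m]))
```

With the `without` form, labels added to the metric later are carried through automatically, whereas the `by` form silently discards them.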
|
||
## Aggregation | ||
|
||
* When aggregating up ratios, aggregate up the numerator and denominator | ||
|
@@ -40,10 +54,6 @@ Instead keep the metric name without the `_count` or `_sum` suffix and replace | |
the `rate` in the operation with `mean`. This represents the average | ||
observation size over that time period. | ||
|
||
* Always specify a `without` clause with the labels you are aggregating away. | ||
This is to preserve all the other labels such as `job`, which will avoid | ||
conflicts and give you more useful metrics and alerts. | ||
|
||
## Examples | ||
|
||
_Note the indentation style with outdented operators on their own line between | ||
|