Commit fd84e87

Merge pull request #76402 from eromanova97/OBSDOCS-977
OBSDOCS-977: Document improved scrape sample alerts
2 parents: e6cacad + 2f3ec0f

1 file changed: +23 -17 lines changed

modules/monitoring-creating-scrape-sample-alerts.adoc

@@ -30,38 +30,38 @@ metadata:
   labels:
     prometheus: k8s
     role: alert-rules
-  name: monitoring-stack-alerts <1>
-  namespace: ns1 <2>
+  name: monitoring-stack-alerts #<1>
+  namespace: ns1 #<2>
 spec:
   groups:
   - name: general.rules
     rules:
-    - alert: TargetDown <3>
+    - alert: TargetDown #<3>
       annotations:
         message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service
-          }} targets in {{ $labels.namespace }} namespace are down.' <4>
+          }} targets in {{ $labels.namespace }} namespace are down.' #<4>
       expr: 100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job,
         namespace, service)) > 10
-      for: 10m <5>
+      for: 10m #<5>
       labels:
-        severity: warning <6>
-    - alert: ApproachingEnforcedSamplesLimit <7>
+        severity: warning #<6>
+    - alert: ApproachingEnforcedSamplesLimit #<7>
       annotations:
-        message: '{{ $labels.container }} container of the {{ $labels.pod }} pod in the {{ $labels.namespace }} namespace consumes {{ $value | humanizePercentage }} of the samples limit budget.' <8>
-      expr: scrape_samples_scraped/50000 > 0.8 <9>
-      for: 10m <10>
+        message: '{{ $labels.container }} container of the {{ $labels.pod }} pod in the {{ $labels.namespace }} namespace consumes {{ $value | humanizePercentage }} of the samples limit budget.' #<8>
+      expr: (scrape_samples_post_metric_relabeling / (scrape_sample_limit > 0)) > 0.9 #<9>
+      for: 10m #<10>
       labels:
-        severity: warning <11>
+        severity: warning #<11>
 ----
 <1> Defines the name of the alerting rule.
-<2> Specifies the user-defined project where the alerting rule will be deployed.
-<3> The `TargetDown` alert will fire if the target cannot be scraped or is not available for the `for` duration.
-<4> The message that will be output when the `TargetDown` alert fires.
+<2> Specifies the user-defined project where the alerting rule is deployed.
+<3> The `TargetDown` alert fires if the target cannot be scraped and is not available for the `for` duration.
+<4> The message that is displayed when the `TargetDown` alert fires.
 <5> The conditions for the `TargetDown` alert must be true for this duration before the alert is fired.
 <6> Defines the severity for the `TargetDown` alert.
-<7> The `ApproachingEnforcedSamplesLimit` alert will fire when the defined scrape sample threshold is reached or exceeded for the specified `for` duration.
-<8> The message that will be output when the `ApproachingEnforcedSamplesLimit` alert fires.
-<9> The threshold for the `ApproachingEnforcedSamplesLimit` alert. In this example the alert will fire when the number of samples per target scrape has exceeded 80% of the enforced sample limit of `50000`. The `for` duration must also have passed before the alert will fire. The `<number>` in the expression `scrape_samples_scraped/<number> > <threshold>` must match the `enforcedSampleLimit` value defined in the `user-workload-monitoring-config` `ConfigMap` object.
+<7> The `ApproachingEnforcedSamplesLimit` alert fires when the defined scrape sample threshold is exceeded and lasts for the specified `for` duration.
+<8> The message that is displayed when the `ApproachingEnforcedSamplesLimit` alert fires.
+<9> The threshold for the `ApproachingEnforcedSamplesLimit` alert. In this example, the alert fires when the number of ingested samples exceeds 90% of the configured limit.
 <10> The conditions for the `ApproachingEnforcedSamplesLimit` alert must be true for this duration before the alert is fired.
 <11> Defines the severity for the `ApproachingEnforcedSamplesLimit` alert.
 
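As an aside (not part of the commit), the arithmetic behind the two alert expressions in the diff can be sketched in plain Python. The function names and sample numbers below are illustrative only; the `(scrape_sample_limit > 0)` guard in the new PromQL expression drops targets with no enforced limit, which the second function mirrors with an explicit check.

```python
def target_down(down_targets: int, total_targets: int) -> bool:
    """TargetDown: fire when more than 10% of a job's targets are down.
    Mirrors: 100 * (count(up == 0) / count(up)) > 10"""
    return 100 * (down_targets / total_targets) > 10


def approaching_samples_limit(samples: float, limit: float) -> bool:
    """ApproachingEnforcedSamplesLimit: fire when ingested samples exceed
    90% of the enforced sample limit. Mirrors:
    (scrape_samples_post_metric_relabeling / (scrape_sample_limit > 0)) > 0.9"""
    if limit <= 0:
        return False  # no enforced limit for this target, never fire
    return samples / limit > 0.9


print(target_down(2, 10))                       # 20% of targets down -> True
print(approaching_samples_limit(46000, 50000))  # 92% of the budget -> True
print(approaching_samples_limit(46000, 0))      # no limit configured -> False
```

In the real rules, each condition must additionally hold for the full `for: 10m` duration before the alert fires.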

@@ -71,3 +71,9 @@ spec:
 ----
 $ oc apply -f monitoring-stack-alerts.yaml
 ----
+
+. Additionally, you can check if a target has hit the configured limit:
+
+.. In the *Administrator* perspective of the web console, go to *Observe* -> *Targets* and select an endpoint with a `Down` status that you want to check.
++
+The *Scrape failed: sample limit exceeded* message is displayed if the endpoint failed because of an exceeded sample limit.
