modules/monitoring-creating-scrape-sample-alerts.adoc
23 additions & 17 deletions
@@ -30,38 +30,38 @@ metadata:
30
30
labels:
31
31
prometheus: k8s
32
32
role: alert-rules
33
-
name: monitoring-stack-alerts <1>
34
-
namespace: ns1 <2>
33
+
name: monitoring-stack-alerts #<1>
34
+
namespace: ns1 #<2>
35
35
spec:
36
36
groups:
37
37
- name: general.rules
38
38
rules:
39
-
- alert: TargetDown <3>
39
+
- alert: TargetDown #<3>
40
40
annotations:
41
41
message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service
42
-
}} targets in {{ $labels.namespace }} namespace are down.' <4>
42
+
}} targets in {{ $labels.namespace }} namespace are down.' #<4>
43
43
expr: 100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job,
44
44
namespace, service)) > 10
45
-
for: 10m <5>
45
+
for: 10m #<5>
46
46
labels:
47
-
severity: warning <6>
48
-
- alert: ApproachingEnforcedSamplesLimit <7>
47
+
severity: warning #<6>
48
+
- alert: ApproachingEnforcedSamplesLimit #<7>
49
49
annotations:
50
-
message: '{{ $labels.container }} container of the {{ $labels.pod }} pod in the {{ $labels.namespace }} namespace consumes {{ $value | humanizePercentage }} of the samples limit budget.' <8>
51
-
expr: scrape_samples_scraped/50000 > 0.8 <9>
52
-
for: 10m <10>
50
+
message: '{{ $labels.container }} container of the {{ $labels.pod }} pod in the {{ $labels.namespace }} namespace consumes {{ $value | humanizePercentage }} of the samples limit budget.'#<8>
<2> Specifies the user-defined project where the alerting rule will be deployed.
58
-
<3> The `TargetDown` alert will fire if the target cannot be scraped or is not available for the `for` duration.
59
-
<4> The message that will be output when the `TargetDown` alert fires.
57
+
<2> Specifies the user-defined project where the alerting rule is deployed.
58
+
<3> The `TargetDown` alert fires if the target cannot be scraped and is not available for the `for` duration.
59
+
<4> The message that is displayed when the `TargetDown` alert fires.
60
60
<5> The conditions for the `TargetDown` alert must be true for this duration before the alert is fired.
61
61
<6> Defines the severity for the `TargetDown` alert.
62
-
<7> The `ApproachingEnforcedSamplesLimit` alert will fire when the defined scrape sample threshold is reached or exceeded for the specified `for` duration.
63
-
<8> The message that will be output when the `ApproachingEnforcedSamplesLimit` alert fires.
64
-
<9> The threshold for the `ApproachingEnforcedSamplesLimit` alert. In this example the alert will fire when the number of samples per target scrape has exceeded 80% of the enforced sample limit of `50000`. The `for` duration must also have passed before the alert will fire. The `<number>` in the expression `scrape_samples_scraped/<number> > <threshold>` must match the `enforcedSampleLimit` value defined in the `user-workload-monitoring-config` `ConfigMap` object.
62
+
<7> The `ApproachingEnforcedSamplesLimit` alert fires when the defined scrape sample threshold is exceeded and lasts for the specified `for` duration.
63
+
<8> The message that is displayed when the `ApproachingEnforcedSamplesLimit` alert fires.
64
+
<9> The threshold for the `ApproachingEnforcedSamplesLimit` alert. In this example, the alert fires when the number of ingested samples exceeds 90% of the configured limit.
65
65
<10> The conditions for the `ApproachingEnforcedSamplesLimit` alert must be true for this duration before the alert is fired.
66
66
<11> Defines the severity for the `ApproachingEnforcedSamplesLimit` alert.
67
67
@@ -71,3 +71,9 @@ spec:
71
71
----
72
72
$ oc apply -f monitoring-stack-alerts.yaml
73
73
----
74
+
75
+
. Additionally, you can check if a target has hit the configured limit:
76
+
77
+
.. In the *Administrator* perspective of the web console, go to *Observe* -> *Targets* and select an endpoint with a `Down` status that you want to check.
78
+
+
79
+
The *Scrape failed: sample limit exceeded* message is displayed if the endpoint failed because of an exceeded sample limit.
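
The threshold arithmetic behind the `ApproachingEnforcedSamplesLimit` expression can be checked by hand. The following is a minimal sketch only, not part of the documented procedure: the function names are hypothetical, and the default values `50000` and `0.8` are taken from the removed `expr: scrape_samples_scraped/50000 > 0.8` line in this diff. Substitute your own `enforcedSampleLimit` value and threshold.

```python
def samples_budget_ratio(samples_scraped: int, enforced_sample_limit: int) -> float:
    """Return the fraction of the sample limit consumed by one target scrape."""
    if enforced_sample_limit <= 0:
        raise ValueError("enforced_sample_limit must be a positive integer")
    return samples_scraped / enforced_sample_limit


def approaching_limit(
    samples_scraped: int,
    enforced_sample_limit: int = 50000,  # assumed: value from the example expression
    threshold: float = 0.8,              # assumed: value from the example expression
) -> bool:
    """Mirror the comparison `scrape_samples_scraped/<number> > <threshold>`.

    This only evaluates the instantaneous condition. The real alert also
    requires the condition to hold for the whole `for` duration.
    """
    return samples_budget_ratio(samples_scraped, enforced_sample_limit) > threshold


if __name__ == "__main__":
    for scraped in (30000, 45000):
        ratio = samples_budget_ratio(scraped, 50000)
        print(f"{scraped} samples: {ratio:.0%} of budget, "
              f"condition met: {approaching_limit(scraped)}")
```

With these assumed values, a scrape that returns 45000 samples satisfies the condition and a scrape that returns 30000 samples does not.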