Commit 342e8d3

Merge pull request #48754 from bburt-rh/RHDEVDOCS-3423-managing-user-defined-alerting-rules
RHDEVDOCS-3423-managing-user-defined-alerting-rules
2 parents 70e55ae + 110f634 commit 342e8d3

5 files changed: +158 -10 lines changed
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * monitoring/managing-alerts.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="creating-new-alerting-rules_{context}"]
7+
= Creating new alerting rules
8+
9+
As a cluster administrator, you can create new alerting rules based on platform metrics.
10+
These alerting rules trigger alerts based on the values of chosen metrics.
11+
12+
[NOTE]
13+
====
14+
If you create a customized `AlertingRule` resource based on an existing platform alerting rule, silence the original alert to avoid receiving conflicting alerts.
15+
====
16+
17+
.Prerequisites
18+
19+
* You are logged in as a user that has the `cluster-admin` role.
20+
* You have installed the OpenShift CLI (`oc`).
21+
* You have enabled Technology Preview features, and all nodes in the cluster are ready.
22+
23+
24+
.Procedure
25+
26+
. Create a new YAML configuration file named `example-alerting-rule.yaml` for a resource in the `openshift-monitoring` namespace.
27+
28+
. Add an `AlertingRule` resource to the YAML file.
29+
The following example creates a new alerting rule named `example`, similar to the default `Watchdog` alert:
30+
+
31+
[source,yaml]
32+
----
33+
apiVersion: monitoring.openshift.io/v1alpha1
34+
kind: AlertingRule
35+
metadata:
36+
  name: example
37+
  namespace: openshift-monitoring
38+
spec:
39+
  groups:
40+
  - name: example-rules
41+
    rules:
42+
    - alert: ExampleAlert <1>
43+
      expr: vector(1) <2>
44+
----
45+
<1> The name of the alerting rule you want to create.
46+
<2> The PromQL query expression that defines the new rule.
47+
48+
. Apply the configuration file to the cluster:
49+
+
50+
[source,terminal]
51+
----
52+
$ oc apply -f example-alerting-rule.yaml
53+
----
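
The nesting of the `AlertingRule` resource matters, because `rules` belongs to a group and `groups` belongs to `spec`. As a minimal sketch, the same structure can be written as a Python dictionary and serialized to JSON, which `oc apply -f` should also accept in place of YAML. The helper name `alerting_rule` is hypothetical and only illustrates the layout from the example above.

```python
import json


def alerting_rule(name, group, alert, expr, namespace="openshift-monitoring"):
    """Build an AlertingRule manifest as a plain dict (illustrative helper)."""
    return {
        "apiVersion": "monitoring.openshift.io/v1alpha1",
        "kind": "AlertingRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": group,
                    # Each rule pairs an alert name with a PromQL expression.
                    "rules": [{"alert": alert, "expr": expr}],
                }
            ]
        },
    }


manifest = alerting_rule("example", "example-rules", "ExampleAlert", "vector(1)")
print(json.dumps(manifest, indent=2))
```

The printed JSON mirrors the YAML example field for field, so it can serve as a check that the indentation of a hand-written YAML file matches the intended structure.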

modules/monitoring-managing-alerting-rules.adoc renamed to modules/monitoring-managing-alerting-rules-for-user-defined-projects.adoc

Lines changed: 2 additions & 2 deletions
@@ -3,8 +3,8 @@
33
// * monitoring/managing-alerts.adoc
44

55
:_content-type: CONCEPT
6-
[id="managing-alerting-rules_{context}"]
7-
= Managing alerting rules
6+
[id="managing-alerting-rules-for-user-defined-projects_{context}"]
7+
= Managing alerting rules for user-defined projects
88

99
{product-title} monitoring ships with a set of default alerting rules. As a cluster administrator, you can view the default alerting rules.
1010

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * monitoring/managing-alerts.adoc
4+
5+
:_content-type: CONCEPT
6+
[id="managing-core-platform-alerting-rules_{context}"]
7+
= Managing alerting rules for core platform monitoring
8+
9+
:FeatureName: Creating and modifying alerting rules for core platform monitoring
10+
include::snippets/technology-preview.adoc[leveloffset=+1]
11+
12+
{product-title} {product-version} monitoring ships with a large set of default alerting rules for platform metrics.
13+
As a cluster administrator, you can customize this set of rules in two ways:
14+
15+
* Modify the settings for existing platform alerting rules by adjusting thresholds or by adding and modifying labels.
16+
For example, you can change the `severity` label for an alert from `warning` to `critical` to help you route and triage issues flagged by an alert.
17+
18+
* Define and add new custom alerting rules by constructing a query expression based on core platform metrics in the `openshift-monitoring` namespace.
19+
20+
.Core platform alerting rule considerations
21+
22+
* New alerting rules must be based on the default {product-title} monitoring metrics.
23+
24+
* You can only add and modify alerting rules. You cannot create new recording rules or modify existing recording rules.
25+
26+
* If you modify existing platform alerting rules by using an `AlertRelabelConfig` object, your modifications are not reflected in the Prometheus alerts API.
27+
Therefore, any dropped alerts still appear in the {product-title} web console even though they are no longer forwarded to Alertmanager.
28+
Additionally, any modifications to alerts, such as a changed `severity` label, do not appear in the web console.
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * monitoring/managing-alerts.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="modifying-core-platform-alerting-rules_{context}"]
7+
= Modifying core platform alerting rules
8+
9+
As a cluster administrator, you can modify core platform alerts before Alertmanager routes them to a receiver.
10+
For example, you can change the severity label of an alert, add a custom label, or exclude an alert from being sent to Alertmanager.
11+
12+
.Prerequisites
13+
14+
* You have access to the cluster as a user with the `cluster-admin` role.
15+
* You have installed the OpenShift CLI (`oc`).
16+
* You have enabled Technology Preview features, and all nodes in the cluster are ready.
17+
18+
19+
.Procedure
20+
21+
. Create a new YAML configuration file named `example-modified-alerting-rule.yaml` for a resource in the `openshift-monitoring` namespace.
22+
23+
. Add an `AlertRelabelConfig` resource to the YAML file.
24+
The following example changes the `severity` label to `critical` for the default platform `Watchdog` alerting rule:
25+
+
26+
[source,yaml]
27+
----
28+
apiVersion: monitoring.openshift.io/v1alpha1
29+
kind: AlertRelabelConfig
30+
metadata:
31+
  name: watchdog
32+
  namespace: openshift-monitoring
33+
spec:
34+
  configs:
35+
  - sourceLabels: [alertname,severity] <1>
36+
    regex: "Watchdog;none" <2>
37+
    targetLabel: severity <3>
38+
    replacement: critical <4>
39+
    action: Replace <5>
40+
----
41+
<1> The source labels for the values you want to modify.
42+
<2> The regular expression against which the values of the `sourceLabels`, joined with a semicolon (`;`), are matched.
43+
<3> The target label of the value you want to modify.
44+
<4> The new value of the target label.
45+
<5> The relabel action that replaces the old value based on regex matching.
46+
The default action is `Replace`.
47+
Other possible values are `Keep`, `Drop`, `HashMod`, `LabelMap`, `LabelDrop`, and `LabelKeep`.
48+
49+
. Apply the configuration file to the cluster:
50+
+
51+
[source,terminal]
52+
----
53+
$ oc apply -f example-modified-alerting-rule.yaml
54+
----
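
To make the relabeling semantics concrete, the following is a simplified Python model of how a `Replace`, `Drop`, or `Keep` config might be evaluated against an alert's labels. It is a sketch under assumptions, not the Cluster Monitoring Operator or Prometheus implementation: it assumes the default `;` separator, treats the regex as fully anchored, uses Python's `re` module rather than RE2, and treats `replacement` as a literal string without capture-group expansion. The function name `relabel` is hypothetical.

```python
import re


def relabel(labels, config):
    """Apply one relabel config to a label dict.

    Returns the resulting labels, or None if the alert is dropped.
    """
    separator = config.get("separator", ";")
    # Join the values of the source labels in the order they are listed.
    source = separator.join(
        labels.get(name, "") for name in config.get("sourceLabels", [])
    )
    # The whole joined string must match, so fullmatch is used here.
    matched = re.fullmatch(config.get("regex", "(.*)"), source) is not None
    action = config.get("action", "Replace")

    if action == "Drop":
        return None if matched else dict(labels)
    if action == "Keep":
        return dict(labels) if matched else None
    if action == "Replace":
        updated = dict(labels)
        if matched:
            updated[config["targetLabel"]] = config["replacement"]
        return updated
    raise ValueError(f"action {action!r} is not modelled in this sketch")


config = {
    "sourceLabels": ["alertname", "severity"],
    "regex": "Watchdog;none",
    "targetLabel": "severity",
    "replacement": "critical",
    "action": "Replace",
}
print(relabel({"alertname": "Watchdog", "severity": "none"}, config))
# {'alertname': 'Watchdog', 'severity': 'critical'}
```

In this model, an alert whose joined source labels do not match `Watchdog;none` passes through unchanged, which is why the regex in the example includes both the alert name and its current severity.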

monitoring/managing-alerts.adoc

Lines changed: 21 additions & 8 deletions
@@ -26,13 +26,18 @@ include::modules/monitoring-searching-alerts-silences-and-alerting-rules.adoc[le
2626
// Getting information about alerts, silences and alerting rules
2727
include::modules/monitoring-getting-information-about-alerts-silences-and-alerting-rules.adoc[leveloffset=+1]
2828

29-
// Managing alerting rules
30-
include::modules/monitoring-managing-alerting-rules.adoc[leveloffset=+1]
29+
// Managing silences
30+
include::modules/monitoring-managing-silences.adoc[leveloffset=+1]
31+
include::modules/monitoring-silencing-alerts.adoc[leveloffset=+2]
32+
include::modules/monitoring-editing-silences.adoc[leveloffset=+2]
33+
include::modules/monitoring-expiring-silences.adoc[leveloffset=+2]
34+
35+
// Managing alerting rules for user-defined projects
36+
include::modules/monitoring-managing-alerting-rules-for-user-defined-projects.adoc[leveloffset=+1]
3137
include::modules/monitoring-optimizing-alerting-for-user-defined-projects.adoc[leveloffset=+2]
3238

3339
[role="_additional-resources"]
3440
.Additional resources
35-
3641
* See the link:https://prometheus.io/docs/practices/alerting/[Prometheus alerting documentation] for further guidelines on optimizing alerts
3742
* See xref:../monitoring/monitoring-overview.adoc#monitoring-overview[Monitoring overview] for details about {product-title} {product-version} monitoring architecture
3843
@@ -50,11 +55,19 @@ include::modules/monitoring-removing-alerting-rules-for-user-defined-projects.ad
5055

5156
* See the link:https://prometheus.io/docs/alerting/alertmanager/[Alertmanager documentation]
5257
53-
// Managing silences
54-
include::modules/monitoring-managing-silences.adoc[leveloffset=+1]
55-
include::modules/monitoring-silencing-alerts.adoc[leveloffset=+2]
56-
include::modules/monitoring-editing-silences.adoc[leveloffset=+2]
57-
include::modules/monitoring-expiring-silences.adoc[leveloffset=+2]
58+
// Managing core platform alerting rules
59+
include::modules/monitoring-managing-core-platform-alerting-rules.adoc[leveloffset=+1]
60+
include::modules/monitoring-modifying-core-platform-alerting-rules.adoc[leveloffset=+2]
61+
include::modules/monitoring-creating-new-alerting-rules.adoc[leveloffset=+2]
62+
63+
[role="_additional-resources"]
64+
.Additional resources
65+
* See xref:../monitoring/monitoring-overview.adoc#monitoring-overview[Monitoring overview] for details about {product-title} {product-version} monitoring architecture.
66+
* See the link:https://prometheus.io/docs/alerting/alertmanager/[Alertmanager documentation] for information about alerting rules.
67+
* See the link:https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config[Prometheus relabeling documentation] for information about how relabeling works.
68+
* See the link:https://prometheus.io/docs/practices/alerting/[Prometheus alerting documentation] for further guidelines on optimizing alerts.
69+
70+
5871
5972
// Sending notifications to external systems
6073
include::modules/monitoring-sending-notifications-to-external-systems.adoc[leveloffset=+1]
