= Tips for optimizing alerting rules for core platform monitoring

If you customize core platform alerting rules to meet your organization's specific needs, follow these guidelines to help ensure that the customized rules are efficient and effective.

* *Minimize the number of new rules*.
Create only rules that are essential to your specific requirements.
By minimizing the number of rules, you create a more manageable and focused alerting system in your monitoring environment.

* *Focus on symptoms rather than causes*.
Create rules that notify users of symptoms instead of underlying causes.
This approach ensures that users are promptly notified of a relevant symptom so that they can investigate the root cause after an alert has triggered.
Because a single symptom can have many possible causes, this tactic also reduces the overall number of rules you need to create.

* *Plan and assess your needs before implementing changes*.
First, decide which symptoms are important and what actions you want users to take if these symptoms occur.
Then, assess existing rules and decide whether you can modify any of them to meet your needs instead of creating entirely new rules for each symptom.
By modifying existing rules and creating new ones judiciously, you help to streamline your alerting system.

* *Provide clear alert messaging*.
When you create alert messages, describe the symptom, possible causes, and recommended actions.
Include unambiguous, concise explanations along with troubleshooting steps or links to more information.
Doing so helps users quickly assess the situation and respond appropriately.

* *Include severity levels*.
Assign severity levels to your rules to indicate how a user needs to react when a symptom occurs and triggers an alert.
For example, classifying an alert as *Critical* signals that an individual or a critical response team needs to respond immediately.
By defining severity levels, you help users know how to respond to an alert and help ensure that the most urgent issues receive prompt attention.

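
The guidelines above can be illustrated with a single rule. The following is a minimal sketch of a Prometheus-style rule group, not a definitive configuration. The group name, alert name, 5% threshold, 10-minute duration, message text, and runbook URL are all assumptions chosen for illustration, and the resource that wraps the rule group in your cluster is not shown. The sketch alerts on a symptom (a sustained share of failing API requests) rather than on a specific cause, sets a severity label, and uses annotations to describe the symptom, possible causes, and a recommended next step.

[source,yaml]
----
groups:
- name: example-platform-symptoms          # hypothetical group name
  rules:
  - alert: ExampleAPIServerHighErrorRate   # hypothetical alert name
    # Symptom: a sustained share of API requests fail with 5xx responses.
    # The 5% threshold and the 10m duration are illustrative values only.
    expr: |
      sum(rate(apiserver_request_total{code=~"5.."}[5m]))
        /
      sum(rate(apiserver_request_total[5m])) > 0.05
    for: 10m
    labels:
      severity: critical                   # tells responders how urgently to react
    annotations:
      summary: More than 5% of API server requests are failing.
      description: >-
        {{ $value | humanizePercentage }} of API server requests returned
        5xx errors over the last 5 minutes. Possible causes include an
        overloaded API server or an unavailable backend. Check the API
        server logs and recent changes to the cluster.
      runbook_url: https://example.com/runbooks/api-server-errors  # placeholder URL
----

A single symptom-based rule such as this one can stand in for several cause-specific rules, which keeps the overall rule count low.
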
// Tech Preview features are not documented in the ROSA/OSD docs. However, even when GA, ROSA/OSD generally doesn't include information about core platform monitoring.
* See xref:../monitoring/monitoring-overview.adoc#monitoring-overview[Monitoring overview] for details about {product-title} {product-version} monitoring architecture.
* See the link:https://prometheus.io/docs/alerting/alertmanager/[Alertmanager documentation] for information about alerting rules.
* See the link:https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config[Prometheus relabeling documentation] for information about how relabeling works.
* See the link:https://prometheus.io/docs/practices/alerting/[Prometheus alerting documentation] for further guidelines on optimizing alerts.
endif::openshift-dedicated,openshift-rosa[]

// Managing alerting rules for user-defined projects
* See the link:https://prometheus.io/docs/alerting/alertmanager/[Alertmanager documentation]

// Managing core platform alerting rules
ifndef::openshift-dedicated,openshift-rosa[]
// Tech Preview features are not documented in the ROSA/OSD docs. However, even when GA, ROSA/OSD generally doesn't include information about core platform monitoring.
* See xref:../monitoring/monitoring-overview.adoc#monitoring-overview[Monitoring overview] for details about {product-title} {product-version} monitoring architecture.
* See the link:https://prometheus.io/docs/alerting/alertmanager/[Alertmanager documentation] for information about alerting rules.
* See the link:https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config[Prometheus relabeling documentation] for information about how relabeling works.
* See the link:https://prometheus.io/docs/practices/alerting/[Prometheus alerting documentation] for further guidelines on optimizing alerts.