---
title: How to configure custom alerts in Grafana
description: Learn how to configure custom alerts for Scaleway resources in Grafana. Follow the steps to create alert rules, define conditions, and set up notifications for your monitored resources.
dates:
  validation: 2025-08-20
  posted: 2023-11-06
Switch between the tabs below to create alerts for a Scaleway Instance, an Object Storage bucket, a Kubernetes cluster Pod, or Cockpit logs.

<Tabs id="install">
  <TabsTab label="Scaleway Instance">
    The steps below explain how to create the metric selection and configure an alert condition that triggers when **your Instance consumes more than 10% of a single CPU core over the past 5 minutes.**

    1. In the query field next to the **Loading metrics... >** button, paste the following query. Make sure that the values for the labels you have selected (for example, `resource_id`) correspond to those of the target resource.
        ```bash
        rate(instance_server_cpu_seconds_total{resource_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"}[5m]) > 0.1
        ```
        <Message type="tip">
          The `instance_server_cpu_seconds_total` metric records how many seconds of CPU time your Instance has used in total. Taking its `rate()` over 5 minutes gives the average CPU seconds consumed per second, so a value above 0.1 means more than 10% of one core. This is helpful for detecting unexpected CPU usage spikes. A percentage-based variant of this query is sketched after the steps below.
        </Message>
    2. In the **Set alert evaluation behavior** section, specify how long the condition must be met before triggering the alert.
    3. Enter a name in the **Namespace** and **Group** fields to categorize and manage your alert rules. Rules that share the same group will use the same configuration, including the evaluation interval, which determines how often the rule is evaluated (by default: every 1 minute). You can modify this interval later in the group settings.
        <Message type="note">
          The evaluation interval is different from the pending period set in step 2. The evaluation interval controls how often the rule is checked, while the pending period defines how long the condition must be continuously met before the alert fires.
        </Message>
    4. In the **Configure labels and notifications** section, click **+ Add labels**. A pop-up appears.
    5. Enter a label name and value, then click **Save**. You can skip this step if you want your alerts to be sent to the contacts you may already have created in the Scaleway console.
        <Message type="note">
          In Grafana, notifications are sent by matching alerts to notification policies based on labels. This step is about deciding how alerts reach you or your team (Slack, email, etc.) based on the labels you attach to them. You can then set up rules that define who receives notifications on the **Notification policies** page.
          For example, if your alert named `alert-for-high-cpu-usage` has the label `team = instances-team`, you are telling Grafana to send a notification to the Instances team when the alert gets triggered. Find out how to [configure notification policies in Grafana](/tutorials/configure-slack-alerting/#configuring-a-notification-policy).
        </Message>
    6. Click **Save rule and exit** in the top right corner of your screen to save and activate your alert.
    7. Optionally, check that your configuration works by temporarily lowering the threshold. This will trigger the alert and notify your [contacts](/cockpit/concepts/#contact-points).
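
    As an optional variation, the same condition can be expressed as a percentage of a single core, which can be easier to read in dashboards and alert messages. This is a minimal sketch reusing the metric and label from step 1; the 80% threshold is illustrative, so adjust it and the `resource_id` to your own Instance.

    ```bash
    # Average CPU usage over the last 5 minutes, expressed as a percentage of one core
    100 * rate(instance_server_cpu_seconds_total{resource_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"}[5m]) > 80
    ```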
  </TabsTab>
  <TabsTab label="Object Storage bucket">
    The steps below explain how to create the metric selection and configure an alert condition that triggers when **the object count in your bucket exceeds a specific threshold**.

    1. In the query field next to the **Loading metrics... >** button, paste the following query. Make sure that the values for the labels you have selected (for example, `resource_id` and `region`) correspond to those of the target resource.
        ```bash
        object_storage_bucket_objects_total{region="fr-par", resource_id="my-bucket"} > 2000
        ```
        <Message type="tip">
          The `object_storage_bucket_objects_total` metric indicates the total number of objects stored in a given Object Storage bucket. It is useful for monitoring and controlling object growth in your bucket and avoiding hitting limits unexpectedly. A growth-based variant of this query is sketched after the steps below.
        </Message>
    2. In the **Set alert evaluation behavior** section, specify how long the condition must be met before triggering the alert.
    3. Enter a name in the **Namespace** and **Group** fields to categorize and manage your alert rules. Rules that share the same group will use the same configuration, including the evaluation interval, which determines how often the rule is evaluated (by default: every 1 minute). You can modify this interval later in the group settings.
        <Message type="note">
          The evaluation interval is different from the pending period set in step 2. The evaluation interval controls how often the rule is checked, while the pending period defines how long the condition must be continuously met before the alert fires.
        </Message>
    4. In the **Configure labels and notifications** section, click **+ Add labels**. A pop-up appears.
    5. Enter a label name and value, then click **Save**. You can skip this step if you want your alerts to be sent to the contacts you may already have created in the Scaleway console.
        <Message type="note">
          In Grafana, notifications are sent by matching alerts to notification policies based on labels. This step is about deciding how alerts reach you or your team (Slack, email, etc.) based on the labels you attach to them. You can then set up rules that define who receives notifications on the **Notification policies** page.
          For example, if an alert has the label `team = object-storage-team`, you are telling Grafana to send a notification to the Object Storage team when your alert is firing. Find out how to [configure notification policies in Grafana](/tutorials/configure-slack-alerting/#configuring-a-notification-policy).
        </Message>
    6. Click **Save rule and exit** in the top right corner of your screen to save and activate your alert.
    7. Optionally, check that your configuration works by temporarily lowering the threshold. This will trigger the alert and notify your [contacts](/cockpit/concepts/#contact-points).
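
    As an optional variation, you can alert on how quickly the bucket grows rather than on its absolute size. This is a minimal sketch reusing the metric and labels from step 1 with `delta()`; the threshold of 500 new objects per hour is illustrative, so adjust it and the labels to your own bucket.

    ```bash
    # Fires when more than 500 objects were added to the bucket over the last hour
    delta(object_storage_bucket_objects_total{region="fr-par", resource_id="my-bucket"}[1h]) > 500
    ```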
  </TabsTab>
  <TabsTab label="Kubernetes Pod">
    The steps below explain how to create the metric selection and configure an alert condition that triggers when **no new Pod activity occurs, which could mean your cluster is stuck or unresponsive.**

    1. In the query field next to the **Loading metrics... >** button, paste the following query. Make sure that the values for the labels you have selected (for example, `resource_name`) correspond to those of the target resource.
        ```bash
        rate(kubernetes_cluster_k8s_shoot_nodes_pods_usage_total{resource_name="k8s-par-quizzical-chatelet"}[15m]) == 0
        ```
        <Message type="tip">
          The `kubernetes_cluster_k8s_shoot_nodes_pods_usage_total` metric represents the total number of Pods currently running across all nodes in your Kubernetes cluster. It is helpful for monitoring current Pod consumption per node pool or cluster, and for tracking resource saturation or unexpected workload spikes. A saturation-oriented variant of this query is sketched after the steps below.
        </Message>
    2. In the **Set alert evaluation behavior** section, specify how long the condition must be true before triggering the alert.
    3. Enter a name in the **Namespace** and **Group** fields to categorize and manage your alert rules. Rules that share the same group will use the same configuration, including the evaluation interval, which determines how often the rule is evaluated (by default: every 1 minute). You can modify this interval later in the group settings.
        <Message type="note">
          The evaluation interval is different from the pending period set in step 2. The evaluation interval controls how often the rule is checked, while the pending period defines how long the condition must be continuously met before the alert fires.
        </Message>
    4. In the **Configure labels and notifications** section, click **+ Add labels**. A pop-up appears.
    5. Enter a label name and value, then click **Save**. You can skip this step if you want your alerts to be sent to the contacts you may already have created in the Scaleway console.
        <Message type="note">
          In Grafana, notifications are sent by matching alerts to notification policies based on labels. This step is about deciding how alerts reach you or your team (Slack, email, etc.) based on the labels you attach to them. You can then set up rules that define who receives notifications on the **Notification policies** page.
          For example, if an alert has the label `team = kubernetes-team`, you are telling Grafana to send a notification to the Kubernetes team when your alert is firing. Find out how to [configure notification policies in Grafana](/tutorials/configure-slack-alerting/#configuring-a-notification-policy).
        </Message>
    6. Click **Save rule and exit** in the top right corner of your screen to save and activate your alert.
    7. Optionally, check that your configuration works by temporarily relaxing the condition (for example, changing `== 0` to `>= 0`). This will trigger the alert and notify your [contacts](/cockpit/concepts/#contact-points).
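
    As an optional variation, you can alert on Pod saturation rather than on the absence of activity. This is a minimal sketch that sums the metric from step 1 across the cluster; the threshold of 200 running Pods is illustrative, so adjust it and the `resource_name` to your own cluster and node pool sizes.

    ```bash
    # Fires when the total number of running Pods in the cluster exceeds 200
    sum(kubernetes_cluster_k8s_shoot_nodes_pods_usage_total{resource_name="k8s-par-quizzical-chatelet"}) > 200
    ```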
  </TabsTab>
  <TabsTab label="Cockpit logs">
    The steps below explain how to create the metric selection and configure an alert condition that triggers when **no logs are stored for 5 minutes, which may indicate your app or system is broken**.

    1. In the query field next to the **Loading metrics... >** button, paste the following query. Make sure that the values for the labels you have selected (for example, `resource_id`) correspond to those of the target resource.
        ```bash
        observability_cockpit_loki_chunk_store_stored_chunks_total:increase5m{resource_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"} == 0
        ```
        <Message type="tip">
          The `observability_cockpit_loki_chunk_store_stored_chunks_total:increase5m` metric represents the number of chunks (log storage blocks) that have been written over the last 5 minutes for a specific resource. It is useful for monitoring log ingestion activity and detecting issues such as a crash of the logging agent, or your application no longer producing logs. A variant that catches a sharp drop in ingestion is sketched after the steps below.
        </Message>
    2. In the **Set alert evaluation behavior** section, specify how long the condition must be true before triggering the alert.
    3. Enter a name in the **Namespace** and **Group** fields to categorize and manage your alert rules. Rules that share the same group will use the same configuration, including the evaluation interval, which determines how often the rule is evaluated (by default: every 1 minute). You can modify this interval later in the group settings.
        <Message type="note">
          The evaluation interval is different from the pending period set in step 2. The evaluation interval controls how often the rule is checked, while the pending period defines how long the condition must be continuously met before the alert fires.
        </Message>
    4. In the **Configure labels and notifications** section, click **+ Add labels**. A pop-up appears.
    5. Enter a label name and value, then click **Save**. You can skip this step if you want your alerts to be sent to the contacts you may already have created in the Scaleway console.
        <Message type="note">
          In Grafana, notifications are sent by matching alerts to notification policies based on labels. This step is about deciding how alerts reach you or your team (Slack, email, etc.) based on the labels you attach to them. You can then set up rules that define who receives notifications on the **Notification policies** page.
          For example, if an alert has the label `team = cockpit-team`, you are telling Grafana to send a notification to the Cockpit team when your alert is firing. Find out how to [configure notification policies in Grafana](/tutorials/configure-slack-alerting/#configuring-a-notification-policy).
        </Message>
    6. Click **Save rule and exit** in the top right corner of your screen to save and activate your alert.
    7. Optionally, check that your configuration works by temporarily relaxing the condition (for example, changing `== 0` to `>= 0`). This will trigger the alert and notify your [contacts](/cockpit/concepts/#contact-points).
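
    As an optional variation, you can detect a sharp drop in log ingestion instead of a complete stop. This is a minimal sketch that compares the metric from step 1 against its own value one hour earlier using `offset`; the 50% factor is illustrative, so adjust it and the `resource_id` to your own resource.

    ```bash
    # Fires when fewer than half as many log chunks were written as one hour earlier
    observability_cockpit_loki_chunk_store_stored_chunks_total:increase5m{resource_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"} < 0.5 * (observability_cockpit_loki_chunk_store_stored_chunks_total:increase5m{resource_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"} offset 1h)
    ```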
  </TabsTab>
</Tabs>

**You can configure up to 10 alerts** for the `Scaleway Metrics` data source.