Commit cac9f98

docs(cpt): add alert examples
1 parent 7970719

2 files changed: +32 additions, -28 deletions
pages/cockpit/how-to/configure-alerts-for-scw-resources.mdx

Lines changed: 32 additions & 28 deletions
@@ -8,17 +8,14 @@ content:
 categories:
 - observability cockpit
 dates:
-validation: 2025-05-07
+validation: 2025-05-09
 posted: 2023-11-06
 ---

-This page shows you how to create alert rules in Grafana for monitoring Scaleway resources like Instances, Object Storage, Kubernetes, and Cockpit. It explains how to use the `Scaleway Metrics` data source, interpret metrics, set alert conditions, and activate alerts.

-<Message type="important">
-Cockpit does not support Grafana's alerting system. This means that:
-- Grafana's built-in contact points and notification policies will not trigger any emails or notifications. You **must enable the Scaleway alert manager and create contact points to receive notifications**.
-- You must use the **Switch to data source-managed alert rule** button in Grafana, and use PromQL queries for alerting.
-</Message>
+Cockpit does not support Grafana-managed alerting. It integrates with Grafana to visualize metrics, but alerts are managed through the Scaleway alert manager. You should use Grafana only to define alert rules, not to evaluate them or to receive notifications. The Scaleway alert manager evaluates your alert rules and, once their conditions are met, sends notifications to the **contact points you have configured in the Scaleway console**.
+
+This page shows you how to create alert rules in Grafana for monitoring Scaleway resources integrated with Cockpit, such as Instances, Object Storage, and Kubernetes. These alerts rely on Scaleway-provided metrics, which are preconfigured and available in the **Metrics browser** drop-down when using the **Scaleway Metrics data source** in the Grafana interface. It also explains how to use the `Scaleway Metrics` data source, interpret metrics, set alert conditions, and activate alerts.

 <Macro id="requirements" />

@@ -27,17 +24,17 @@ This page shows you how to create alert rules in Grafana for monitoring Scalewa
 - Scaleway resources you can monitor
 - [Created Grafana credentials](/cockpit/how-to/retrieve-grafana-credentials/) with the **Editor** role
 - [Enabled](/cockpit/how-to/enable-alert-manager/) the Scaleway alert manager
-- [Created](/cockpit/how-to/add-contact-points/) at least one contact point **in the Scaleway console**
+- [Created](/cockpit/how-to/add-contact-points/) at least one contact point **in the Scaleway console**; otherwise, alerts will not be delivered
 - Selected the **Scaleway Alerting** alert manager in Grafana

-## Use data source managed alerts rules
+## Switch to data source managed alert rules

 Data source managed alert rules allow you to configure alerts managed by the data source of your choice, instead of using Grafana's managed alerting system, which is not supported by Cockpit.

 1. [Log in to Grafana](/cockpit/how-to/access-grafana-and-managed-dashboards/) using your credentials.
 2. Click the **Toggle menu** then click **Alerting**.
 3. Click **Alert rules** and **+ New alert rule**.
-4. In the **Define query and alert condition** section, scroll to the **Grafana-managed alert rule** information banner and click **Switch to data source-managed alert rule**. You are redirected to the alert creation process.
+4. In the **Define query and alert condition** section, scroll to the **Grafana-managed alert rule** information banner and click **Switch to data source-managed alert rule**. This step is **required** because Cockpit does not support Grafana’s built-in alerting system; it only supports alerts configured and evaluated by the data source itself. You are redirected to the alert creation process.
 <Lightbox src="scaleway-switch-to-managed-alerts-button.webp" alt="" />

 ## Define your metric and alert conditions
@@ -53,7 +50,7 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 3. Click the **Metrics browser** drop-down.
 <Lightbox src="scaleway-metrics-browser.webp" alt="" />
 <Lightbox src="scaleway-metrics-displayed.webp" alt="" />
-4. Select the metric you want to configure an alert for. For the sake of this documentation, we are choosing the `instance_server_cpu_seconds_total` metric.
+4. Select the metric you want to configure an alert for. For example, `instance_server_cpu_seconds_total`.
 <Message type="tip">
 The `instance_server_cpu_seconds_total` metric records how many seconds of CPU time your Instance has used in total. It is helpful to detect unexpected CPU usage spikes.
 </Message>
@@ -65,15 +62,15 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 ```bash
 rate(instance_server_cpu_seconds_total{resource_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",resource_name="name-of-your-resource"}[5m]) > 0.1
 ```
-<Lightbox src="scaleway-instance-grafana-alert.webp" alt="" />
 9. In the **Set alert evaluation behavior** field, specify how long the condition must be true before triggering the alert.
 <Message type="note">
 For example, to wait until the condition has been met continuously for 5 minutes, type `5` and select `minutes` in the drop-down.
 </Message>
-10. Enter a namespace for your alert in the **Namespace** field and click **Enter**.
-11. Enter a name for your alert's group in the **Group** field and click **Enter**.
+10. Enter a namespace in the **Namespace** field to help you categorize and manage your alert, then click **Enter**.
+11. Enter a name in the **Group** field to help you categorize and manage your alert, then click **Enter**.
 12. Optionally, add a summary and a description.
-13. Click **Save rule** in the top right corner of your screen to save and activate your alert. Once the alert meets the conditions you have configured, your [contact point](/cockpit/concepts/#contact-points) will receive an email informing them that the alert is firing.
+13. Click **Save rule** in the top right corner of your screen to save and activate your alert.
+14. Optionally, check that your configuration works by temporarily lowering the threshold. This will trigger the alert and your [contact point](/cockpit/concepts/#contact-points) should receive an email informing them that the alert is firing.
 </TabsTab>
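Note: in the Instance query above, `rate(instance_server_cpu_seconds_total{...}[5m])` returns the average per-second increase of the CPU-time counter over the trailing 5 minutes, so a value above `0.1` roughly corresponds to the Instance using more than 10% of one CPU core on average during that window.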
 <TabsTab label="Object Storage bucket">
 The steps below explain how to create the metric selection and configure an alert condition that triggers when **the object count in your bucket exceeds a specific threshold**.
@@ -83,7 +80,7 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 3. Click the **Metrics browser** drop-down.
 <Lightbox src="scaleway-metrics-browser.webp" alt="" />
 <Lightbox src="scaleway-metrics-displayed.webp" alt="" />
-4. Select the metric you want to configure an alert for. For the sake of this documentation, we are choosing the `object_storage_bucket_objects_total` metric.
+4. Select the metric you want to configure an alert for. For example, `object_storage_bucket_objects_total`.
 <Message type="tip">
 The `object_storage_bucket_objects_total` metric indicates the total number of objects stored in a given Object Storage bucket. It is useful to monitor and control object growth in your bucket and avoid hitting thresholds.
 </Message>
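Note: for reference, a threshold condition on this metric could look like the sketch below. The `resource_name` label and the `100000` threshold are illustrative placeholders (the label names are assumed to match those used in the Instance example), not values from the documentation.

```bash
# Illustrative sketch only: fires when the bucket holds more than 100,000 objects.
# "name-of-your-bucket" is a placeholder; adjust the selector and threshold to your bucket.
object_storage_bucket_objects_total{resource_name="name-of-your-bucket"} > 100000
```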
@@ -98,10 +95,11 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 <Message type="note">
 For example, to wait until the condition has been met continuously for 5 minutes, type `5` and select `minutes` in the drop-down.
 </Message>
-10. Enter a namespace for your alert in the **Namespace** field and click **Enter**.
-11. Enter a name for your alert's group in the **Group** field and click **Enter**.
+10. Enter a namespace in the **Namespace** field to help you categorize and manage your alert, then click **Enter**.
+11. Enter a name in the **Group** field to help you categorize and manage your alert, then click **Enter**.
 12. Optionally, add a summary and a description.
-13. Click **Save rule** in the top right corner of your screen to save and activate your alert. Once the alert meets the conditions you have configured, your [contact point](/cockpit/concepts/#contact-points) will receive an email informing them that the alert is firing.
+13. Click **Save rule** in the top right corner of your screen to save and activate your alert.
+14. Optionally, check that your configuration works by temporarily lowering the threshold. This will trigger the alert and your [contact point](/cockpit/concepts/#contact-points) should receive an email informing them that the alert is firing.
 </TabsTab>
 <TabsTab label="Kubernetes pod">
 The steps below explain how to create the metric selection and configure an alert condition that triggers when **no new pod activity occurs, which could mean your cluster is stuck or unresponsive.**
@@ -111,7 +109,7 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 3. Click the **Metrics browser** drop-down.
 <Lightbox src="scaleway-metrics-browser.webp" alt="" />
 <Lightbox src="scaleway-metrics-displayed.webp" alt="" />
-4. Select the metric you want to configure an alert for. For the sake of this documentation, we are choosing the `kubernetes_cluster_k8s_shoot_nodes_pods_usage_total` metric.
+4. Select the metric you want to configure an alert for. For example, `kubernetes_cluster_k8s_shoot_nodes_pods_usage_total`.
 <Message type="tip">
 The `kubernetes_cluster_k8s_shoot_nodes_pods_usage_total` metric represents the total number of pods currently running across all nodes in your Kubernetes cluster. It is helpful to monitor current pod consumption per node pool or cluster, and to track resource saturation or unexpected workload spikes.
 </Message>
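Note: for reference, a condition matching the "no new pod activity" scenario described above could look like the sketch below. Using `changes()` over a 10-minute window and the `resource_name` label are illustrative assumptions, not the query documented by Scaleway.

```bash
# Illustrative sketch only: fires when the pod count has not changed at all over the last 10 minutes,
# which may indicate a stuck or unresponsive cluster. "name-of-your-cluster" is a placeholder.
changes(kubernetes_cluster_k8s_shoot_nodes_pods_usage_total{resource_name="name-of-your-cluster"}[10m]) == 0
```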
@@ -126,10 +124,11 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 <Message type="note">
 For example, to wait until the condition has been met continuously for 5 minutes, type `5` and select `minutes` in the drop-down.
 </Message>
-10. Enter a namespace for your alert in the **Namespace** field and click **Enter**.
-11. Enter a name for your alert's group in the **Group** field and click **Enter**.
+10. Enter a namespace in the **Namespace** field to help you categorize and manage your alert, then click **Enter**.
+11. Enter a name in the **Group** field to help you categorize and manage your alert, then click **Enter**.
 12. Optionally, add a summary and a description.
-13. Click **Save rule** in the top right corner of your screen to save and activate your alert. Once the alert meets the conditions you have configured, your [contact point](/cockpit/concepts/#contact-points) will receive an email informing them that the alert is firing.
+13. Click **Save rule** in the top right corner of your screen to save and activate your alert.
+14. Optionally, check that your configuration works by temporarily lowering the threshold. This will trigger the alert and your [contact point](/cockpit/concepts/#contact-points) should receive an email informing them that the alert is firing.
 </TabsTab>
 <TabsTab label="Cockpit logs">
 The steps below explain how to create the metric selection and configure an alert condition that triggers when **no logs are stored for 5 minutes, which may indicate your app or system is broken**.
@@ -139,7 +138,7 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 3. Click the **Metrics browser** drop-down.
 <Lightbox src="scaleway-metrics-browser.webp" alt="" />
 <Lightbox src="scaleway-metrics-displayed.webp" alt="" />
-4. Select the metric you want to configure an alert for. For the sake of this documentation, we are choosing the `observability_cockpit_loki_chunk_store_stored_chunks_total:increase5m` metric.
+4. Select the metric you want to configure an alert for. For example, `observability_cockpit_loki_chunk_store_stored_chunks_total:increase5m`.
 <Message type="tip">
 The `observability_cockpit_loki_chunk_store_stored_chunks_total:increase5m` metric represents the number of chunks (log storage blocks) that have been written over the last 5 minutes for a specific resource. It is useful to monitor log ingestion activity and detect issues such as a crash of the logging agent or your application not producing logs.
 </Message>
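Note: since this metric is already expressed as a 5-minute increase, the "no logs stored for 5 minutes" condition described above could be written as the sketch below. The `resource_name` label is an illustrative placeholder assumed to match the Instance example.

```bash
# Illustrative sketch only: fires when no log chunks have been written for the resource in the last 5 minutes.
observability_cockpit_loki_chunk_store_stored_chunks_total:increase5m{resource_name="name-of-your-resource"} == 0
```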
@@ -154,15 +153,20 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 <Message type="note">
 For example, to wait until the condition has been met continuously for 5 minutes, type `5` and select `minutes` in the drop-down.
 </Message>
-10. Enter a namespace for your alert in the **Namespace** field and click **Enter**.
-11. Enter a name for your alert's group in the **Group** field and click **Enter**.
+10. Enter a namespace in the **Namespace** field to help you categorize and manage your alert, then click **Enter**.
+11. Enter a name in the **Group** field to help you categorize and manage your alert, then click **Enter**.
 12. Optionally, add a summary and a description.
-13. Click **Save rule** in the top right corner of your screen to save and activate your alert. Once the alert meets the conditions you have configured, your [contact point](/cockpit/concepts/#contact-points) will receive an email informing them that the alert is firing.
+13. Click **Save rule** in the top right corner of your screen to save and activate your alert. The rule is then evaluated based on the conditions you have defined.
+14. Optionally, check that your configuration works by temporarily lowering the threshold. This will trigger the alert and your [contact point](/cockpit/concepts/#contact-points) should receive an email informing them that the alert is firing.
 </TabsTab>
 </Tabs>

+You can view your firing alerts in the **Alert rules** section of Grafana (Home > Alerting > Alert rules).
+
+<Lightbox src="scaleway-alerts-firing.webp" alt="" />
+
 <Message type="important">
-You can configure up to a maximum of 10 alerts for the `Scaleway Metrics` data source.
+You can configure up to a **maximum of 10 alerts** for the `Scaleway Metrics` data source.
 </Message>

 <Message type="tip">
