Commit 0b4f7d6

Add alertmanager rules for Docker
Docker has a built-in Prometheus exporter that we do not currently enable. This change adds alerts for stopped or paused containers and for failed health checks. The patch requires changes to Docker's configuration to export the metrics and to Prometheus's configuration to consume them, so Kayobe and Kolla-Ansible should both be updated to their latest versions.
1 parent fe96cb4 commit 0b4f7d6

3 files changed, +36 -0 lines changed
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+---
+# Address for prometheus metrics endpoint
+docker_metrics_addr: "{{ internal_net_name | net_ip + ':9323'}}"
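The commit message notes that Prometheus must be configured to consume the Docker metrics. As a rough sketch only, and not the configuration that Kolla-Ansible generates, a hand-written scrape job for the endpoint on port 9323 might look like the following. The job name `docker` and the target address `192.0.2.10` are placeholders chosen for illustration; neither appears in this commit.

```yaml
# Illustrative Prometheus scrape job for the Docker engine metrics endpoint.
# Assumption: the Docker daemon is already exporting metrics on port 9323.
scrape_configs:
  - job_name: docker              # hypothetical job name
    static_configs:
      - targets:
          - "192.0.2.10:9323"     # placeholder host; substitute the address docker_metrics_addr resolves to
```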
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+
+groups:
+- name: Docker
+  rules:
+
+  - alert: DockerContainerStopped
+    expr: 'engine_daemon_container_states_containers{state="stopped"} > 0'
+    labels:
+      severity: warning
+    annotations:
+      summary: "Containers not running (instance {{ $labels.instance }})"
+      description: "One or more containers are stopped"
+
+  - alert: DockerContainerPaused
+    expr: 'engine_daemon_container_states_containers{state="paused"} > 0'
+    labels:
+      severity: warning
+    annotations:
+      summary: "Containers paused (instance {{ $labels.instance }})"
+      description: "One or more containers are paused"
+
+  - alert: DockerContainerHealthCheckFail
+    expr: rate(engine_daemon_health_checks_failed_total[1m]) > 1
+    labels:
+      severity: warning
+    annotations:
+      summary: "Container health check failed (instance {{ $labels.instance }})"
+      description: "One or more container health checks failed"
+
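One point worth noting about the `DockerContainerHealthCheckFail` expression: `rate()` returns a per-second average, so `rate(...[1m]) > 1` only holds when health checks fail more than once per second on average over the last minute. If the intent is to alert on any failure, a variant based on `increase()` may be closer. The rule below is an illustrative sketch and is not part of this commit; the group name, the alert name, the 5m window, and the `for` duration are all assumptions.

```yaml
# Hypothetical variant: fire when any health check failure is recorded in the window.
groups:
- name: DockerSketch                                  # hypothetical group name
  rules:
  - alert: DockerContainerHealthCheckFailAny          # hypothetical alert name
    # increase() estimates the counter's growth over the window, so > 0 means at least one failure.
    expr: increase(engine_daemon_health_checks_failed_total[5m]) > 0
    for: 1m                                           # assumed hold time to damp one-off blips
    labels:
      severity: warning
    annotations:
      summary: "Container health check failed (instance {{ $labels.instance }})"
      description: "One or more container health checks failed"
```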
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+---
+features:
+  - |
+    Added new default alerting rules for containers being unhealthy or stopped.
