This document provides detailed information about the monitoring stack in the homelab.
The monitoring stack provides health checking and uptime monitoring for the homelab infrastructure and applications using Gatus.
The monitoring stack works as follows:
- Gatus performs health checks on services at regular intervals
- Health check results are stored in a PostgreSQL database
- The Gatus dashboard displays service status and history
- Alerts are sent when services become unhealthy
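The workflow above can be sketched in a few lines of Python. This is only an illustration of the check, store, alert cycle, not how Gatus is implemented: the names `CheckResult`, `http_probe`, `run_check`, and `check_all` are hypothetical, the in-memory list stands in for the PostgreSQL database, and "healthy" is assumed to mean "answered with HTTP 200".

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class CheckResult:
    url: str
    healthy: bool
    status: Optional[int]   # HTTP status, or None if the request failed
    duration_ms: float
    timestamp: float


def http_probe(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code      # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None          # DNS failure, refused connection, or timeout


def run_check(url: str,
              probe: Callable[[str], Optional[int]] = http_probe) -> CheckResult:
    """Probe one endpoint and record whether it answered with HTTP 200."""
    started = time.monotonic()
    status = probe(url)
    duration_ms = (time.monotonic() - started) * 1000
    return CheckResult(url, status == 200, status, duration_ms, time.time())


def check_all(urls: Iterable[str],
              store: List[CheckResult],
              alert: Callable[[CheckResult], None],
              probe: Callable[[str], Optional[int]] = http_probe) -> None:
    """One pass of the workflow: check, store, and alert on unhealthy results."""
    for url in urls:
        result = run_check(url, probe)
        store.append(result)   # stand-in for the PostgreSQL write
        if not result.healthy:
            alert(result)      # stand-in for a notification
```

Calling `check_all` on a timer (for example once a minute) would approximate the "regular intervals" step. The `probe` parameter is injectable so the cycle can be exercised without touching the network.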
Gatus is a health dashboard that probes services and sends alerts when it detects issues.
- URL: uptime.layertwo.dev
- Storage: PostgreSQL database (see Gatus PostgreSQL Backend for details)
- Endpoints:
  - Internal services
  - External services
  - APIs
- Alerting:
  - Pushover notifications
The monitoring stack uses persistent storage for data:
- Gatus: PostgreSQL database (CloudNativePG cluster)
The monitoring stack is exposed through the internal Traefik instance:
- Gatus is accessible at uptime.layertwo.dev
- Authentication is handled by Authentik
Gatus provides alerting capabilities to notify administrators when services become unhealthy:
- Pushover: Mobile notifications for service failures
- Email: Email notifications (if configured)
- Webhook: Integration with other systems (if configured)
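To make the alerting behaviour concrete, here is a rough Python sketch of threshold-based alerting plus a Pushover request. It is an assumption-laden illustration, not the configuration used in this homelab: `AlertState` and `build_pushover_request` are hypothetical names, the thresholds are modelled on Gatus's `failure-threshold` and `success-threshold` alert settings with example values, and the token and user key are placeholders. The request targets the public Pushover message endpoint and is only built here, never sent.

```python
import urllib.parse
import urllib.request
from typing import Optional

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


class AlertState:
    """Tracks consecutive results for one endpoint and decides when to notify."""

    def __init__(self, failure_threshold: int = 3, success_threshold: int = 2):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.failures = 0
        self.successes = 0
        self.triggered = False

    def record(self, healthy: bool) -> Optional[str]:
        """Return "triggered", "resolved", or None for this check result."""
        if healthy:
            self.successes += 1
            self.failures = 0
            if self.triggered and self.successes >= self.success_threshold:
                self.triggered = False
                return "resolved"
        else:
            self.failures += 1
            self.successes = 0
            if not self.triggered and self.failures >= self.failure_threshold:
                self.triggered = True
                return "triggered"
        return None


def build_pushover_request(app_token: str, user_key: str,
                           message: str) -> urllib.request.Request:
    """Build (but do not send) a form-encoded Pushover message request."""
    body = urllib.parse.urlencode(
        {"token": app_token, "user": user_key, "message": message}
    ).encode()
    return urllib.request.Request(PUSHOVER_URL, data=body, method="POST")
```

Requiring several consecutive failures before notifying avoids a push notification for every transient blip. Passing the built request to `urllib.request.urlopen` would actually deliver the message.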
The Gatus dashboard provides visualization of service health and status:
- Service health status (up/down)
- Response time metrics
- Status history and uptime percentage
- Endpoint-specific details
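As a rough illustration of how an uptime percentage and a response time figure can be derived from stored check results, consider the sketch below. The `summarize` function and its input shape (pairs of healthy flag and response time in milliseconds) are hypothetical. Gatus may compute its figures differently.

```python
from typing import Dict, Iterable, Optional, Tuple


def summarize(results: Iterable[Tuple[bool, float]]) -> Dict[str, Optional[float]]:
    """Summarize (healthy, response_ms) pairs into uptime and mean latency."""
    rows = list(results)
    if not rows:
        # No checks recorded yet, so no meaningful percentage or average.
        return {"checks": 0, "uptime_pct": None, "avg_response_ms": None}
    up = sum(1 for healthy, _ in rows if healthy)
    avg_ms = sum(ms for _, ms in rows) / len(rows)
    return {
        "checks": len(rows),
        "uptime_pct": 100.0 * up / len(rows),
        "avg_response_ms": avg_ms,
    }
```

For example, three healthy checks out of four give an uptime of 75%.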
The applications are updated automatically through Flux CD when new versions are available in the Helm repositories.
- The Gatus PostgreSQL database is backed up to Cloudflare R2 using CloudNativePG backups
If Gatus is not performing health checks:
- Check that Gatus is running and accessible
- Verify that endpoints are configured correctly
- Check that services are reachable
- Check Gatus logs for errors
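When checking whether a service is reachable, it helps to know at which stage the connection fails. The following sketch is a hypothetical helper, not part of Gatus. It separates a DNS resolution failure from a refused or timed-out TCP connection. It does not send an HTTP request, so a result of "ok" only means the port accepts connections.

```python
import socket
from urllib.parse import urlsplit


def diagnose(url: str, timeout: float = 3.0) -> str:
    """Return the first stage at which url fails: "dns", "tcp", or "ok"."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        raise ValueError(f"no hostname in URL: {url!r}")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        return "dns"    # the name does not resolve
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"  # the port accepted a TCP connection
    except OSError:
        return "tcp"    # resolved, but refused or timed out
```

Running `diagnose` from the same network that Gatus runs in gives the most relevant answer, since internal services may resolve or route differently from elsewhere.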