theodorismeko/docker-monitor

🐳 Docker Container Monitoring with Slack Integration

This Python script monitors Docker containers and sends status reports to Slack with real-time alerting capabilities.

✨ Features

  • 🐳 Comprehensive Container Monitoring: Track all containers with detailed metrics
  • ⚡ Real-time Monitoring: Instant alerts when containers go down, restart, or change status
  • 📊 Performance Analytics: CPU, memory, network, and disk I/O statistics
  • 🔔 Rich Slack Integration: Beautiful formatted notifications with status indicators
  • 🔄 Advanced Restart Detection: Detects both manual and automatic container restarts
  • 📅 Scheduled Reports: Daily summary reports at configured times
  • ⚙️ Flexible Configuration: Environment-based configuration with sensible defaults
  • 🕐 Multiple Execution Modes: One-time, scheduled, continuous, or real-time monitoring
  • 🧪 Built-in Testing: Connection testing and validation tools
  • 🎯 Container Filtering: Regex-based container name filtering
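
The regex-based name filtering in the last bullet could work roughly as follows. This is a minimal sketch, not the project's code: the helper `filter_container_names` is hypothetical, and applying the pattern with `re.search` (match anywhere in the name) is an assumption.

```python
import re
from typing import Iterable, List, Optional


def filter_container_names(names: Iterable[str], pattern: Optional[str]) -> List[str]:
    """Return the names matching `pattern`; an empty or missing pattern keeps all."""
    if not pattern:
        return list(names)
    regex = re.compile(pattern)
    # re.search matches anywhere in the name; anchor with ^ and $ for exact matches.
    return [name for name in names if regex.search(name)]


names = ["nginx-web", "api-service", "api-worker", "postgres"]
print(filter_container_names(names, r"^api-"))  # ['api-service', 'api-worker']
```

Under this assumption, a value such as `^api-` for CONTAINER_NAME_FILTER would restrict reports to containers whose names start with `api-`.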

🏗️ Architecture

This project follows clean architecture principles with proper separation of concerns:

docker-services-monitoring/
├── docker_monitor/              # Main package
│   ├── core/                    # Core business logic
│   │   ├── docker_client.py     # Thread-safe Docker daemon interaction
│   │   ├── docker_monitor.py    # Main orchestrator for scheduled
│   │   ├── realtime_monitor.py  # Real-time monitoring orchestrator
│   │   ├── state_tracker.py     # Container state persistence and retrieval
│   │   ├── change_detector.py   # State difference analysis and change
│   │   ├── notification_formatter.py # Message creation and formatting
│   │   ├── notification_manager.py   # Notification coordination and
│   │   ├── cooldown_manager.py  # Notification timing and rate limiting
│   │   └── monitoring_thread.py # Background monitoring loop management
│   ├── integrations/            # External service integrations
│   │   └── slack.py             # Slack notifications
│   ├── utils/                   # Utilities and helpers
│   │   ├── config.py            # Configuration management
│   │   ├── formatters.py        # Data formatting utilities
│   │   └── logging_config.py    # Logging setup
│   ├── cli/                     # Command-line interface
│   │   └── main.py              # CLI entry point
│   ├── exceptions.py            # Custom exception hierarchy
│   └── docker_monitor.py        # Legacy compatibility module
├── scripts/                     # Executable scripts
│   └── run_monitor.py           # Main execution script
├── config/                      # Configuration templates
│   └── env.example              # Environment configuration template
├── tests/                       # Test suite
│   ├── test_config.py           # Configuration tests
│   ├── test_restart_detection.py # Restart detection tests
│   ├── test_slack_integration.py # Slack integration tests
│   └── test_threading.py        # Threading safety tests
├── docker-compose.yml           # Docker Compose configuration
├── Dockerfile                   # Docker image definition
├── setup.sh                     # Universal setup script
└── requirements.txt             # Python dependencies

🧩 Core Components

📊 State Management:

  • StateTracker: Manages container state persistence, retrieval, and historical tracking
  • ChangeDetector: Analyzes state differences and classifies change types (start/stop/restart)
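
To illustrate the kind of state comparison described above, here is a minimal sketch that diffs two snapshots. The snapshot shape (container name mapped to a status string) and the function `detect_changes` are assumptions made for illustration; the project's `ChangeDetector` may track more fields.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot shape: container name -> status string.
Snapshot = Dict[str, str]


def detect_changes(previous: Snapshot, current: Snapshot) -> List[Tuple[str, str, str]]:
    """Return (container, change_type, detail) tuples describing the differences."""
    changes = []
    for name in sorted(previous.keys() | current.keys()):
        old, new = previous.get(name), current.get(name)
        if old is None:
            changes.append((name, "added", f"now {new}"))
        elif new is None:
            changes.append((name, "removed", f"was {old}"))
        elif old != new:
            changes.append((name, "status_changed", f"{old} -> {new}"))
    return changes


before = {"nginx-web": "running", "api-service": "running"}
after = {"nginx-web": "exited", "worker": "running"}
for change in detect_changes(before, after):
    print(change)
```

Keeping the comparison a pure function of two snapshots makes it easy to test without a Docker daemon.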

🔔 Notification System:

  • NotificationFormatter: Creates and formats notification messages for different event types
  • NotificationManager: Coordinates notification delivery and handles business logic
  • CooldownManager: Manages notification timing, rate limiting, and prevents spam
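
The rate-limiting idea behind a cooldown can be sketched as below. This is an illustrative assumption, not the project's `CooldownManager`: the class name `CooldownSketch`, the per-key bookkeeping, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, Dict


class CooldownSketch:
    """Allow at most one notification per key within `cooldown_seconds`."""

    def __init__(self, cooldown_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._last_sent: Dict[str, float] = {}

    def should_notify(self, key: str) -> bool:
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still cooling down; suppress the duplicate
        self._last_sent[key] = now
        return True


cooldown = CooldownSketch(cooldown_seconds=60)
print(cooldown.should_notify("nginx-web:exited"))  # first alert for this key goes out
print(cooldown.should_notify("nginx-web:exited"))  # repeated within 60 s, suppressed
```

Keying on both container and event type (as in `"nginx-web:exited"`) is one possible choice; it lets a different event for the same container through while still suppressing repeats.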

🔄 Monitoring Engine:

  • MonitoringThread: Handles background monitoring loops with proper thread management
  • RealTimeMonitor: Orchestrates real-time monitoring components
  • DockerMonitor: Orchestrates scheduled monitoring workflows
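
A background monitoring loop with clean shutdown is commonly built around `threading.Event`. The sketch below shows that pattern under assumed names (`MonitoringLoopSketch`, `check`); it is not the project's `MonitoringThread`.

```python
import threading
from typing import Callable


class MonitoringLoopSketch:
    """Run `check` every `interval` seconds on a background thread until stopped."""

    def __init__(self, check: Callable[[], None], interval: float):
        self._check = check
        self._interval = interval
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        while not self._stop_event.is_set():
            self._check()
            # wait() returns early when stop() sets the event, so shutdown is prompt.
            self._stop_event.wait(self._interval)

    def start(self) -> None:
        self._thread.start()

    def stop(self, timeout: float = 5.0) -> None:
        self._stop_event.set()
        self._thread.join(timeout)
```

Using `Event.wait` instead of `time.sleep` means a stop request does not have to wait out the full check interval.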

🐳 Docker Integration:

  • DockerClient: Thread-safe Docker daemon interaction with connection pooling

⚡ Real-time Monitoring

What's Implemented

  • Continuous container monitoring every 10 seconds (configurable)
  • Instant Slack alerts for container status changes
  • Smart restart detection distinguishing manual vs automatic restarts
  • Thread-safe operations with proper locking and resource cleanup

Alert Types

🚨 Critical Alerts:

  • Container failures (running → exited/stopped/dead)
  • Unexpected container removal
  • Health check failures

⚠️ Warning Alerts:

  • Container restart events
  • Status transitions
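
The mapping from a status transition to an alert level could be expressed as a small function. This sketch covers only status transitions and removal; the function name, the `"removed"` pseudo-status, and the `"info"` fallback are assumptions, and health-check handling is left out.

```python
CRITICAL_TARGETS = {"exited", "stopped", "dead"}


def classify_severity(old_status: str, new_status: str) -> str:
    """Map a status transition to 'critical', 'warning', or 'info'."""
    if old_status == "running" and new_status in CRITICAL_TARGETS:
        return "critical"  # a running container went down
    if new_status == "removed":
        return "critical"  # hypothetical marker for a container that disappeared
    if old_status != new_status:
        return "warning"   # any other transition, e.g. running -> restarting
    return "info"


print(classify_severity("running", "exited"))      # critical
print(classify_severity("running", "restarting"))  # warning
```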

Usage

# Real-time monitoring with immediate alerts
docker compose --profile realtime up -d docker-monitor-realtime

# Custom check interval (seconds)
python3 scripts/run_monitor.py --realtime 15

# Combined with daily reports
docker compose --profile realtime up -d  # Runs both services

Restart Detection

The system automatically detects:

  • Manual restarts: docker restart <container> commands
  • Automatic restarts: Docker policy-based restarts (on-failure, unless-stopped)
  • Failed restarts: When containers don't come back up
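
One way to tell these cases apart is to compare two `docker inspect` snapshots of the same container. The sketch below assumes that `RestartCount` increases only when a restart policy restarts the container and that `State.StartedAt` changes on every restart. The `ContainerState` dataclass and `classify_restart` are hypothetical, and the project's actual heuristics may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContainerState:
    """Hypothetical snapshot built from `docker inspect` fields."""
    running: bool        # State.Running
    started_at: str      # State.StartedAt
    restart_count: int   # RestartCount


def classify_restart(previous: ContainerState, current: ContainerState) -> Optional[str]:
    """Return 'automatic', 'manual', 'failed', or None when nothing restarted."""
    if not current.running:
        # Only call it a failed restart if Docker already tried to bring it back.
        return "failed" if current.restart_count > previous.restart_count else None
    if current.started_at == previous.started_at:
        return None  # same start time: the container was not restarted
    if current.restart_count > previous.restart_count:
        return "automatic"  # the restart policy incremented the counter
    return "manual"


before = ContainerState(True, "2024-01-15T14:00:00Z", 0)
print(classify_restart(before, ContainerState(True, "2024-01-15T14:23:45Z", 0)))  # manual
print(classify_restart(before, ContainerState(True, "2024-01-15T14:23:45Z", 1)))  # automatic
```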

Example Real-time Alerts

🚨 Container Status Alert - CRITICAL
Container: nginx-web
Status Change: running → exited
Time: 2024-01-15 14:23:45

🔄 Container Restart Detected
Container: api-service
Type: Automatic restart
Status: running ✅

🚀 Quick Start

Universal Automated Setup

Get monitoring running in minutes:

# 1. Clone the project
git clone <repo> docker-services-monitoring
cd docker-services-monitoring

# 2. Run the universal setup
./setup.sh

The setup script automatically handles:

  • ✅ Environment Detection - Works on local dev, cloud VMs, production servers
  • ✅ Docker Installation - Installs Docker if missing
  • ✅ Configuration Setup - Guides through Slack webhook setup with validation
  • ✅ Container Deployment - Builds and deploys with restart policies
  • ✅ Testing - Verifies monitoring and Slack integration work

Manual Setup

# 1. Install dependencies
pip install -r requirements.txt

# 2. Configure environment
cp config/env.example .env
# Edit .env with your Slack webhook URL

# 3. Choose your monitoring approach:

# Real-time monitoring (recommended for production)
docker compose --profile realtime up -d

# Daily reports only
docker compose up -d docker-monitor

# Test the setup
python3 scripts/run_monitor.py --test

🔗 Setting Up Slack Webhook

Before running the monitoring system, you need a Slack webhook URL to receive notifications.

📱 Quick Setup Guide

Step 1: Create a Slack App

  • Go to https://api.slack.com/apps
  • Click the big green "Create New App" button
  • Select "From scratch"
  • Enter app name: Docker Monitor
  • Choose your Slack workspace from dropdown
  • Click "Create App"

Step 2: Enable Incoming Webhooks

  • In your new app's settings, find "Incoming Webhooks" in the left menu
  • Click the toggle switch to turn it ON (it should turn green)
  • Click the "Add New Webhook to Workspace" button

Step 3: Choose Channel & Authorize

  • Select the channel where you want alerts (create #docker-alerts if needed)
  • Click "Allow" to give the app permission

Step 4: Copy Your Webhook URL

  • You'll see a webhook URL that looks like this:
    https://hooks.slack.com/services/T1234567890/B1234567890/abcdefghijklmnopqrstuvwx
    
  • Copy this entire URL - you'll need it for configuration

🔧 Testing Your Webhook

Once you have the webhook URL, test it:

# Test with curl
curl -X POST -H 'Content-type: application/json' \
--data '{"text":"🧪 Docker Monitor Test - Webhook is working!"}' \
YOUR_WEBHOOK_URL

# Or use the built-in test
python3 scripts/run_monitor.py --test
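
The curl test above can also be reproduced in Python with only the standard library. This is a sketch: `build_slack_request` is a hypothetical helper, not part of the project, and the example only builds the request so it can be inspected before sending.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a Slack text message."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_slack_request(
    "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX",
    "Docker Monitor Test - Webhook is working!",
)
# To actually send it, replace the placeholder URL and call:
# urllib.request.urlopen(request, timeout=10)
```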

📝 Adding to Configuration

Add your webhook URL to the .env file:

# In your .env file
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX

💡 Security Tip: Never commit webhook URLs to version control. Always use environment variables or .env files (which should be in .gitignore).

🧪 Testing the System

Running Integration Tests

The project includes different types of test files:

Pytest Test Suites

These require pytest to run:

# Install pytest if not already installed
pip install pytest

# Run pytest-based test files
python3 -m pytest tests/test_config.py -v              # Configuration tests
python3 -m pytest tests/test_slack_integration.py -v   # Slack integration tests

Standalone Test Scripts

These can be run directly with Python:

# Test restart detection functionality
python3 tests/test_restart_detection.py

# Test threading safety improvements
python3 tests/test_threading.py

System Integration Testing

Test the complete monitoring pipeline:

# Test Docker connection and basic monitoring
python3 scripts/run_monitor.py --test

# Test Slack webhook integration
python3 scripts/run_monitor.py --test-notification

# Test inside Docker container
docker-compose exec docker-monitor python3 scripts/run_monitor.py --test

Test Results Example

❯ python3 -m pytest tests/test_config.py -v
========================================= test session starts ==========================================
collected 5 items                                                                                      

tests/test_config.py::TestConfig::test_config_initialization_with_required_env PASSED            [ 20%]
tests/test_config.py::TestConfig::test_config_missing_required_env_raises_error PASSED           [ 40%]
tests/test_config.py::TestConfig::test_default_values PASSED                                     [ 60%]
tests/test_config.py::TestConfig::test_custom_values PASSED                                      [ 80%]
tests/test_config.py::TestConfig::test_get_all_returns_dict PASSED                               [100%]

========================================== 5 passed in 0.09s ===========================================

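Note: The pytest-based test files (tests/test_config.py and tests/test_slack_integration.py) must be run with python3 -m pytest rather than executed directly with Python. The standalone scripts listed above can be run directly.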

📊 Monitoring Modes

The setup script offers three monitoring modes to suit different needs:

1. Scheduled Monitoring (Default)

  • ✅ Best for: Most users, development environments, regular health checks
  • 📅 Frequency: Daily reports at specified time (default: 9:00 AM)
  • 💬 Notifications: Comprehensive daily status reports
  • 🔋 Resource Usage: Minimal - only runs once per day

# Runs daily at 9 AM
docker compose up -d docker-monitor
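
A scheduler for a daily HH:MM report essentially needs to know how long to wait until the next run. The sketch below shows that calculation. The function `seconds_until_next_run` is hypothetical and uses naive datetimes, so timezone handling is deliberately left out.

```python
from datetime import datetime, timedelta


def seconds_until_next_run(check_time: str, now: datetime) -> float:
    """Seconds from `now` until the next occurrence of `check_time` (HH:MM)."""
    hour, minute = (int(part) for part in check_time.split(":"))
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)  # today's slot has passed; use tomorrow's
    return (target - now).total_seconds()


print(seconds_until_next_run("09:00", datetime(2024, 1, 15, 8, 30)))  # 1800.0
print(seconds_until_next_run("09:00", datetime(2024, 1, 15, 14, 0)))  # 68400.0
```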

2. Real-time Monitoring

  • ✅ Best for: Production environments, critical services, immediate alerts
  • ⚡ Frequency: Continuous monitoring every 10 seconds
  • 🚨 Notifications: Immediate alerts when containers go down, restart, or fail
  • 🔋 Resource Usage: Low - efficient state change detection

# Real-time monitoring with immediate alerts
docker compose --profile realtime up -d docker-monitor-realtime

3. Both Modes

  • ✅ Best for: Comprehensive monitoring
  • 📊 Combines: Daily reports + immediate failure alerts
  • 💪 Coverage: Complete monitoring solution
  • 🔋 Resource Usage: Moderate - runs both services

# Run both scheduled and real-time monitoring
docker compose --profile realtime up -d

🚨 Real-time Alert Examples

When using real-time monitoring, you'll receive immediate Slack notifications for:

Critical Alerts (🚨):

  • Container goes from running → exited
  • Container goes from running → stopped
  • Container goes from running → dead
  • Container is unexpectedly removed
  • Container restart fails (container doesn't come back up)

Warning Alerts (⚠️):

  • Container status becomes restarting
  • Container goes from healthy → unhealthy
  • Container restarts successfully (manual or automatic)

Restart Detection: The system automatically detects and notifies about:

  • 🔄 Manual Restarts: When someone runs docker restart <container>
  • 🔄 Automatic Restarts: When Docker restarts a container due to restart policies
  • 🚨 Failed Restarts: When restart attempts fail and container doesn't recover

Sample Real-time Alerts:

Container Failure:

🚨 Container Status Alert - CRITICAL

Container: nginx-web
Status Change: running → exited
Image: nginx:latest
Time: 2024-01-15 14:23:45
Ports: 80→80/tcp, 443→443/tcp

Container Restart:

🚨 Container Removed - CRITICAL

Container: nginx-web
Previous Status: running
Time: 2024-01-15 14:25:10

ℹ️ Container Added

Container: nginx-web
Status: running
Image: nginx:latest
Time: 2024-01-15 14:25:15

Health Check Failure:

⚠️ Container Status Alert - WARNING

Container: api-server
Status Change: running → unhealthy
Image: myapp:latest
Time: 2024-01-15 14:30:22
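
Messages like the samples above can be assembled into a Slack incoming-webhook payload, which accepts a JSON object with a `text` field. The formatter below is a sketch with a hypothetical name and signature; the project's `NotificationFormatter` may produce richer output.

```python
from datetime import datetime
from typing import Dict


def format_status_alert(container: str, old: str, new: str, image: str,
                        when: datetime, severity: str = "CRITICAL") -> Dict[str, str]:
    """Build a Slack webhook payload resembling the sample alerts."""
    lines = [
        f"Container Status Alert - {severity}",
        f"Container: {container}",
        f"Status Change: {old} \u2192 {new}",  # \u2192 is the right arrow
        f"Image: {image}",
        f"Time: {when:%Y-%m-%d %H:%M:%S}",
    ]
    return {"text": "\n".join(lines)}


payload = format_status_alert(
    "nginx-web", "running", "exited", "nginx:latest", datetime(2024, 1, 15, 14, 23, 45)
)
print(payload["text"])
```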

🔧 Configuration Options

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| SLACK_WEBHOOK_URL | Required | Slack incoming webhook URL |
| DAILY_CHECK_TIME | 09:00 | Daily check time (HH:MM format) |
| REALTIME_CHECK_INTERVAL | 10 | Real-time monitoring interval (seconds) |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
| DOCKER_SOCKET | unix://var/run/docker.sock | Docker daemon socket |
| NOTIFICATION_ENABLED | true | Enable/disable Slack notifications |
| INCLUDE_STOPPED_CONTAINERS | true | Include stopped containers in reports |
| CONTAINER_NAME_FILTER | - | Regex pattern to filter container names |
| TIMEZONE | UTC | Timezone for scheduling |
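
Environment-based configuration with these defaults could be loaded along the following lines. This is a sketch only: `load_config`, the `DEFAULTS` mapping, and the accepted boolean spellings are assumptions, and the project's `config.py` may validate values differently.

```python
import os
from typing import Dict, Mapping

# Defaults copied from the table above; SLACK_WEBHOOK_URL has none because it is required.
DEFAULTS = {
    "DAILY_CHECK_TIME": "09:00",
    "REALTIME_CHECK_INTERVAL": "10",
    "LOG_LEVEL": "INFO",
    "DOCKER_SOCKET": "unix://var/run/docker.sock",
    "NOTIFICATION_ENABLED": "true",
    "INCLUDE_STOPPED_CONTAINERS": "true",
    "CONTAINER_NAME_FILTER": "",
    "TIMEZONE": "UTC",
}


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Merge environment values over the documented defaults and coerce types."""
    if not env.get("SLACK_WEBHOOK_URL"):
        raise ValueError("SLACK_WEBHOOK_URL is required")
    raw = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    config: Dict[str, object] = dict(raw)
    config["SLACK_WEBHOOK_URL"] = env["SLACK_WEBHOOK_URL"]
    config["REALTIME_CHECK_INTERVAL"] = int(raw["REALTIME_CHECK_INTERVAL"])
    for flag in ("NOTIFICATION_ENABLED", "INCLUDE_STOPPED_CONTAINERS"):
        # Assumed spellings for "true"; anything else counts as false.
        config[flag] = raw[flag].strip().lower() in ("1", "true", "yes")
    return config
```

Passing the environment as a parameter keeps the function testable with a plain dictionary instead of the real process environment.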

Universal Cron Job (Recommended)

# Edit crontab
crontab -e

# Add this line for daily 9 AM reports:
0 9 * * * cd $HOME/docker-services-monitoring && python3 scripts/run_monitor.py --once

Benefits of this approach:

  • ✅ Works on any system with any username
  • ✅ Uses environment variable $HOME
  • ✅ Easy to deploy across different servers

Alternative Cron Schedules

# Every day at 8:30 AM
30 8 * * * cd $HOME/docker-services-monitoring && python3 scripts/run_monitor.py --once

# Every Monday at 9 AM
0 9 * * 1 cd $HOME/docker-services-monitoring && python3 scripts/run_monitor.py --once

# Every 6 hours
0 */6 * * * cd $HOME/docker-services-monitoring && python3 scripts/run_monitor.py --once

# Twice daily: 9 AM and 6 PM
0 9,18 * * * cd $HOME/docker-services-monitoring && python3 scripts/run_monitor.py --once

🏃‍♂️ Production Deployment

Docker Compose (Recommended)

The easiest way to deploy in production is using Docker Compose with automatic restarts:

1. Setup

# Ensure you have your .env file configured
cp config/env.example .env
nano .env  # Add your Slack webhook URL

# Create logs directory
mkdir -p logs

2. Build and Run

# Build and start the service
docker-compose up -d

# View logs
docker-compose logs -f docker-monitor

# Check status
docker-compose ps

3. Management Commands

# Stop the service
docker-compose down

# Restart the service
docker-compose restart docker-monitor

# Rebuild after code changes
docker-compose up -d --build

# View real-time logs
docker-compose logs -f docker-monitor

# Run one-time check
docker-compose exec docker-monitor python3 scripts/run_monitor.py --once

# Test notifications
docker-compose exec docker-monitor python3 scripts/run_monitor.py --test-notification

4. Configuration

The Docker Compose setup includes:

  • ✅ Automatic restarts with restart: unless-stopped
  • ✅ Health checks to ensure service is running properly
  • ✅ Docker socket mounting for container monitoring
  • ✅ Persistent logs in ./logs directory
  • ✅ Environment variable support from .env file
  • ✅ Isolated network for security

5. Customization

You can customize the deployment by editing docker-compose.yml:

# Change the schedule or run mode (keep exactly one command: entry; YAML does not allow duplicate keys)
services:
  docker-monitor:
    # ... other config ...
    command: ["python3", "scripts/run_monitor.py", "--continuous", "30"]  # Every 30 minutes
    # OR
    # command: ["python3", "scripts/run_monitor.py", "--once"]  # Run once and exit
