DoMaLi94/docker-compose-mlflow

MLflow with PostgreSQL and MinIO

A complete Docker Compose setup for an MLflow tracking server with a PostgreSQL backend and MinIO artifact storage. The stack provides a robust, production-ready MLOps platform for tracking machine learning experiments, models, and artifacts.

Features

  • MLflow Tracking Server: Web UI for experiment tracking and model management
  • PostgreSQL 17.6: Reliable backend database for metadata storage
  • MinIO: S3-compatible object storage for ML artifacts
  • Nginx: Reverse proxy with optimized configurations
  • Automated Setup: Auto-creates buckets and handles initialization
  • Data Persistence: Named volumes ensure data survives container restarts

Credits

Based on "Practical Deep Learning at Scale with MLflow" by Packt Publishing (original repo).

Key Improvements:

  • Switched the backend database from MySQL to PostgreSQL 17.6 for better performance and compatibility
  • Latest software versions: MLflow 3.3.1, MinIO 2025-07-23, Python 3.11

Prerequisites

  • Docker and Docker Compose installed
  • At least 4GB RAM available for containers
  • Ports 80, 9000, and 9001 available on your system

Quick Start

  1. Clone the repository

    git clone https://github.com/DoMaLi94/docker-compose-mlflow.git
    cd docker-compose-mlflow
  2. Set up environment variables

    cp .env.example .env
    # Edit .env with your preferred credentials
  3. Start all services

    docker compose up -d
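
After the stack starts, it can take a little while before every service answers. The sketch below polls a URL until it responds. The function name wait_for_url is hypothetical, and the two health paths in the usage comments (MLflow's /health and MinIO's /minio/health/live) assume that Nginx forwards them unchanged, which this README does not confirm.

```shell
# Poll URL until it returns an HTTP success status or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() (
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors such as 502 from the proxy.
    if curl --silent --fail --output /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out: $url" >&2
  return 1
)

# Example calls (paths are assumptions, see the note above):
#   wait_for_url "http://localhost/health"
#   wait_for_url "http://localhost:9000/minio/health/live"
```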

Services & Access Points

| Service | URL | Purpose | Credentials |
|---|---|---|---|
| MLflow UI | http://localhost | Experiment tracking & model management | None required |
| MinIO Console | http://localhost:9001 | Object storage management | Use AWS credentials from .env |
| MinIO API | http://localhost:9000 | S3-compatible API endpoint | Use AWS credentials from .env |
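
A client that logs to this stack needs to know these endpoints. The sketch below prints export lines for a client shell, using the URLs from the table above and the credentials from .env. MLFLOW_TRACKING_URI and MLFLOW_S3_ENDPOINT_URL are standard MLflow client variables; the helper names env_value and mlflow_client_env are hypothetical. Whether clients talk to MinIO directly depends on how the server is configured, so treat the S3 settings as an assumption.

```shell
# Print the value assigned to KEY in a KEY=VALUE file (last assignment wins).
env_value() (
  key=$1
  file=$2
  sed -n "s/^${key}=//p" "$file" | tail -n 1
)

# Print export lines that point an MLflow client at this stack.
# Assumes the values in the file contain no spaces or quotes.
mlflow_client_env() (
  file=${1:-.env}
  echo "export MLFLOW_TRACKING_URI=http://localhost"
  echo "export MLFLOW_S3_ENDPOINT_URL=http://localhost:9000"
  echo "export AWS_ACCESS_KEY_ID=$(env_value AWS_ACCESS_KEY_ID "$file")"
  echo "export AWS_SECRET_ACCESS_KEY=$(env_value AWS_SECRET_ACCESS_KEY "$file")"
)

# Example: load the settings into the current shell
#   eval "$(mlflow_client_env .env)"
```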

Configuration

Environment Variables

Create a .env file in the root directory:

# Image versions
MINIO_VERSION=RELEASE.2025-07-23T15-54-02Z
MINIO_MC_VERSION=RELEASE.2025-07-21T05-28-08Z
POSTGRES_VERSION=17.6
PYTHON_VERSION=3.11-slim
MLFLOW_VERSION=3.3.1
NGINX_VERSION=1.29-alpine

# MinIO configuration
AWS_ACCESS_KEY_ID=minio
AWS_SECRET_ACCESS_KEY=minio123
MLFLOW_BUCKET_NAME=mlflow
DATA_REPO_BUCKET_NAME=data

# PostgreSQL configuration
POSTGRES_DATABASE=mlflow_database
POSTGRES_USER=mlflow_user
POSTGRES_PASSWORD=mlflow

Variable Explanations

Image Versions

  • MINIO_VERSION: MinIO server version tag (S3-compatible object storage)
  • MINIO_MC_VERSION: MinIO client version tag (for bucket initialization)
  • POSTGRES_VERSION: PostgreSQL database version tag
  • PYTHON_VERSION: Python base image version for MLflow container
  • MLFLOW_VERSION: MLflow package version to install
  • NGINX_VERSION: Nginx reverse proxy version tag

Service Configuration

  • AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY: MinIO credentials for S3-compatible storage
  • MLFLOW_BUCKET_NAME: Bucket that stores MLflow artifacts (models, plots, and other logged files)
  • DATA_REPO_BUCKET_NAME: Additional bucket for datasets and custom files
  • POSTGRES_DATABASE: Database name for MLflow metadata storage
  • POSTGRES_USER: PostgreSQL username for MLflow database access
  • POSTGRES_PASSWORD: PostgreSQL password for MLflow database access
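
A missing or empty variable tends to show up only as a failing container, so it may help to check the file before starting the stack. This is a minimal sketch: the list mirrors the variables documented above, and the function name missing_env_vars is hypothetical.

```shell
# Variables the README documents for .env.
REQUIRED_VARS="MINIO_VERSION MINIO_MC_VERSION POSTGRES_VERSION PYTHON_VERSION
MLFLOW_VERSION NGINX_VERSION AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
MLFLOW_BUCKET_NAME DATA_REPO_BUCKET_NAME POSTGRES_DATABASE POSTGRES_USER
POSTGRES_PASSWORD"

# Print the name of every required variable that is absent or empty in FILE.
missing_env_vars() (
  file=${1:-.env}
  for name in $REQUIRED_VARS; do
    # A variable counts as set when a line starts with NAME= plus at least one character.
    grep -q "^${name}=." "$file" || echo "$name"
  done
)

# Example: print any gaps in the default .env
#   missing_env_vars .env
```

An empty result means every documented variable has a value; the check does not validate the values themselves.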

Version Management

All service versions are centrally managed through environment variables in the .env file. To update any service version:

  1. Edit the .env file with your desired versions
  2. Rebuild the affected services:
    # Rebuild specific services
    docker compose build mlflow nginx
    
    # Or rebuild all services
    docker compose up --build

Note: When updating versions, always check compatibility between MLflow and Python versions, and review the changelog for breaking changes.
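
Editing .env by hand works fine; for scripted updates, the first step can be sketched as a small helper. The name set_env_var is hypothetical, and the sed expression assumes the value contains no |, & or backslash characters, which holds for ordinary version tags.

```shell
# Set KEY to VALUE in a KEY=VALUE file, replacing the existing line or appending a new one.
# Usage: set_env_var KEY VALUE [FILE]
set_env_var() (
  key=$1
  value=$2
  file=${3:-.env}
  tmp=$(mktemp) || return 1
  if grep -q "^${key}=" "$file"; then
    # Rewrite only the line that assigns KEY; every other line passes through unchanged.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp"
  else
    cat "$file" > "$tmp"
    echo "${key}=${value}" >> "$tmp"
  fi
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
)

# Example (replace <new-version> with a real tag), followed by the rebuild step above:
#   set_env_var MLFLOW_VERSION <new-version> .env
#   docker compose build mlflow
```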

Default Network Ports

  • Port 80: MLflow web interface (via Nginx)
  • Port 9000: MinIO S3 API (via Nginx)
  • Port 9001: MinIO web console (via Nginx)
  • Port 5432: PostgreSQL (internal network only)

Usage

Basic Operations

# Start all services
docker compose up -d

# View real-time logs
docker compose logs -f

# View logs for specific service
docker compose logs -f mlflow

# Check service status
docker compose ps

# Stop all services
docker compose down

# Stop all services and delete the named volumes (removes all stored data)
docker compose down -v

Data Persistence & Storage

Volume Management

The setup uses Docker named volumes for persistent data:

  • postgres_data: Contains all MLflow metadata

    • Experiments, runs, parameters, metrics
    • Model registry information
    • User data and configurations
  • minio_data: Contains all artifact storage

    • Trained models and model artifacts
    • Plots, images, and visualizations
    • Datasets and custom files
    • Logged artifacts from experiments
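
Because both volumes hold state that cannot be regenerated, a backup before upgrades or before docker compose down -v is worth considering. The sketch below archives a named volume with a throwaway Alpine container. The function names are hypothetical, and the project name in the example assumes Compose's default of using the directory name; docker volume ls shows the actual volume names.

```shell
# Compose prefixes named volumes with the project name (by default the directory name).
# Usage: compose_volume_name PROJECT VOLUME
compose_volume_name() {
  echo "${1}_${2}"
}

# Archive a named volume into DEST_DIR as <volume>.tar.gz.
# DEST_DIR must be an absolute path because it is bind-mounted into the container.
# Usage: backup_volume VOLUME [DEST_DIR]
backup_volume() (
  volume=$1
  dest_dir=${2:-$PWD}
  docker run --rm \
    -v "${volume}:/source:ro" \
    -v "${dest_dir}:/backup" \
    alpine tar czf "/backup/${volume}.tar.gz" -C /source .
)

# Example (project name is an assumption):
#   docker compose stop
#   backup_volume "$(compose_volume_name docker-compose-mlflow postgres_data)"
#   backup_volume "$(compose_volume_name docker-compose-mlflow minio_data)"
#   docker compose start
```

Stopping the services first avoids copying the PostgreSQL data directory while it is being written, which could otherwise produce an inconsistent archive.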

About

MLOps infrastructure stack featuring MLflow, PostgreSQL, and MinIO, orchestrated with Docker Compose.
