An MLflow HTTP API service for AI inference, designed to serve HuggingFace models through a robust, scalable web service for the Exasol RDBMS.
- Fast HTTP API - FastAPI-based service with automatic OpenAPI documentation
- HuggingFace Integration - Seamless integration with HuggingFace Transformers
- Model Hot-Swapping - Dynamic model loading and switching
- Concurrent Processing - Built-in request limiting and parallel processing
- MLflow Integration - Full MLflow model registry and tracking support
- JWT Authentication - Optional token-based authentication with model access control
- Docker Ready - Production-ready containerization
- Comprehensive Testing - Full test suite with coverage reporting
- Production Grade - Security scanning, linting, and CI/CD pipeline
# Install the package
pip install exasol-mlflow-server
# Or for development
git clone https://github.com/exasol/exasol-labs-mlflow-server.git
cd exasol-labs-mlflow-server
pip install -e .[dev]

# Start the service with default configuration
mlflow-server
# Or with custom configuration
python -m mlflow_service.server --config configs/models.yaml --api-port 50051

The service starts the following server:
- AI API Server: http://localhost:50051 (model inference API)
from mlflow_service.client import AIClient
import pandas as pd
# Connect to the service
client = AIClient(host="localhost", port=50051)
# Prepare input data
data = pd.DataFrame({"text": ["I love this product!", "This is terrible."]})
# Get predictions
predictions = client.predict("small", data)
print(predictions)
# Output: [{"label": "POSITIVE", "score": 0.95}, {"label": "NEGATIVE", "score": 0.89}]

# Check service status
curl http://localhost:50051/status
# Run inference on specific model
curl -X POST http://localhost:50051/model/small/infer \
-H "Content-Type: application/json" \
-d '{"text": ["I love this!", "Not great."]}'
# List available models
curl http://localhost:50051/models

The service supports optional JWT-based authentication with fine-grained model access control.
Set environment variables to enable authentication:
export MLFLOW_AUTH_ENABLED=true
export MLFLOW_JWT_SECRET_KEY="your-secret-key-change-this-in-production"
export MLFLOW_TOKEN_EXPIRE_MINUTES=1440  # 24 hours (optional)

Generate JWT access tokens with model permissions.
Request:
{
"subject": "user123",
"models": ["small", "medium"],
"admin": false,
"expire_minutes": 1440
}

Response:
{
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer",
"expires_in": 86400
}

Note: This endpoint requires admin privileges when authentication is enabled.
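For illustration, a request of this shape could be sent with curl; this sketch assumes the /auth/token route mentioned in the bootstrap tip further below and the default API port used throughout this README:

curl -X POST http://localhost:50051/auth/token \
  -H "Authorization: Bearer YOUR_ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{"subject": "user123", "models": ["small", "medium"], "admin": false, "expire_minutes": 1440}'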
Get current token information and permissions.
Response:
{
"subject": "user123",
"models": ["small", "medium"],
"admin": false,
"expires_at": 1640995200,
"issued_at": 1640908800
}

Include the JWT token in the Authorization header:
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"text": ["Hello world"]}' \
http://localhost:50051/model/small/infer

- Model Access: Tokens specify which models the user can access
- Admin Privileges: Admin tokens can access all models and manage authentication
- Wildcard Access: Use ["*"] in the models array for access to all models
- Change the default JWT secret key in production
- Use HTTPS in production environments
- Tokens expire automatically (default: 24 hours)
- Admin tokens should be carefully managed and rotated regularly
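For example, one way to generate a random secret before starting the service (any sufficiently long random string works; the command below is just one option):

# Generate a 256-bit random hex string and use it as the JWT secret
export MLFLOW_JWT_SECRET_KEY="$(openssl rand -hex 32)"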
Models are configured in configs/models.yaml:
# Example model configuration
small:
  hf_model_name: "cardiffnlp/twitter-roberta-base-sentiment-latest"
  mlflow_class: "HFSentimentModel"
  batch_size: 8
medium:
  hf_model_name: "nlptown/bert-base-multilingual-uncased-sentiment"
  mlflow_class: "HFSentimentModel"
  batch_size: 4

python -m mlflow_service.server --help
Options:
--config PATH Model configuration file (default: configs/models.yaml)
--api-port INT AI API port (default: 50051)
--max-parallel-requests Maximum concurrent requests (default: 2)
--memory-limit-mb INT Memory limit in MB (default: 0, unlimited)
--gpu-memory-fraction   GPU memory fraction (default: 0.0, auto-growth)

The service provides a comprehensive REST API with full OpenAPI documentation available at http://localhost:50051/docs when running.
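As an illustration, the CLI options listed above can be combined in a single invocation (the values below are arbitrary examples, not recommendations):

python -m mlflow_service.server \
  --config configs/models.yaml \
  --api-port 50051 \
  --max-parallel-requests 4 \
  --memory-limit-mb 4096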
Run AI model inference on a specific model.
Parameters:
model_tag - Model identifier (e.g., "small", "medium")
Request:
{
"text": ["I love this product!", "This is terrible."]
}

Response:
{
"predictions": [
{"label": "POSITIVE", "score": 0.95},
{"label": "NEGATIVE", "score": 0.89}
],
"model_used": "small"
}

Features:
- Automatic model loading if not currently active
- Thread-safe model switching
- Structured response format with confidence scores
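As a sketch, the same inference call can be made from any HTTP client, for example with the Python requests package (add an Authorization header when authentication is enabled):

import requests

# POST text to the inference endpoint of the "small" model
resp = requests.post(
    "http://localhost:50051/model/small/infer",
    json={"text": ["I love this product!", "This is terrible."]},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["predictions"])  # e.g. [{"label": "POSITIVE", "score": 0.95}, ...]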
Explicitly load or reload a specific model.
Parameters:
model_tag - Model identifier to load
Request: Empty body
Response:
{
"status": "model loaded",
"model_uri": "models:/small/1",
"tag": "small"
}

List available models with enriched configuration and registry details.
Response (example):
{
"default": "small",
"models": {"small": "models:/small/1", "medium": "models:/medium/1"},
"current": "small",
"details": {
"small": {
"tag": "small",
"model_uri": "models:/small/1",
"is_default": true,
"is_loaded": true,
"exists_in_registry": true,
"mlflow_class": "HFSentimentModel",
"hf_model_name": "cardiffnlp/twitter-roberta-base-sentiment-latest",
"batch_size": 8,
"registry_versions": [
{
"version": "1",
"stage": "Staging",
"status": "READY",
"run_id": "abc123",
"source": "runs:/abc123/small",
"last_updated_timestamp": 1700000000,
"size_bytes": 123456789
}
]
}
}
}

Get service status and performance metrics.
Response:
{
"max_parallel_requests": 2,
"active_requests": 1,
"waiting_requests": 0,
"total_requests": 42,
"current_model": "small",
"queue_available": 1
}

Add a new model to the service at runtime.
Parameters:
model_tag - Unique identifier for the new model
Request:
{
"model_uri": "models:/custom-model/1",
"hf_model_name": "distilbert-base-uncased-finetuned-sst-2-english",
"mlflow_class": "HFSentimentModel",
"batch_size": 1
}

Response:
{
"status": "success",
"message": "Model 'custom-model' added successfully",
"tag": "custom-model",
"model_uri": "models:/custom-model/1"
}

Remove a model from the service.
Parameters:
model_tag - Model identifier to remove
Response:
{
"status": "success",
"message": "Model 'custom-model' removed successfully",
"tag": "custom-model",
"model_uri": "models:/custom-model/1"
}

Note: Cannot remove the default model or currently loaded model.
Register an external model class at runtime.
Parameters:
class_name - Name to register the class under
Request:
{
"module_name": "examples.custom_models",
"class_name": "CustomSentimentModel"
}

Response:
{
"status": "success",
"message": "Successfully registered model class: CustomSentiment",
"class_name": "CustomSentiment"
}

Remove a model class from the service.
Parameters:
class_name - Name of the class to remove
Response:
{
"status": "success",
"message": "Successfully removed model class: CustomSentiment",
"class_name": "CustomSentiment"
}

Note: Built-in model classes cannot be removed.
List all registered model classes.
Response:
{
"model_classes": ["HFSentimentModel", "CustomSentiment"],
"details": {
"HFSentimentModel": "HFSentimentModel",
"CustomSentiment": "CustomSentimentModel"
}
}

- Swagger UI: http://localhost:50051/docs
- ReDoc: http://localhost:50051/redoc
- OpenAPI Spec: http://localhost:50051/openapi.json
# Clone the repository
git clone https://github.com/exasol/exasol-labs-mlflow-server.git
cd exasol-labs-mlflow-server
# Install development dependencies
pip install -e .[dev]
# Install pre-commit hooks
pre-commit install

# Run all tests
pytest
# Run with coverage
pytest --cov=mlflow_service --cov-report=html

The project uses several tools for code quality:
# Format code
make format
# Lint code
make lint
# Run all pre-commit checks
pre-commit run --all-files

mlflow_service/          # Main service implementation
    __init__.py
    server.py            # FastAPI server and MLflow integration
    client.py            # HTTP client for the API
    models.py            # MLflow model wrappers
    sql/                 # UDF SQL files
configs/                 # Configuration files
    models.yaml          # Model definitions
tests/                   # Test suite
.github/workflows/       # CI/CD pipelines
Dockerfile               # Container definition
pyproject.toml           # Project configuration
docker build -t mlflow-server .

# Run with default configuration
docker run -p 50051:50051 mlflow-server
# Run with custom configuration
docker run -p 50051:50051 \
-v $(pwd)/configs:/app/configs \
  mlflow-server --config configs/models.yaml

version: '3.8'
services:
  mlflow-server:
    image: mlflow-server
    ports:
      - "5000:5000"      # MLflow UI
      - "50051:50051"    # API Server
    volumes:
      - ./configs:/app/configs
      - ./mlruns:/app/mlruns
    environment:
      - MLFLOW_BACKEND_STORE_URI=sqlite:///mlflow.db

The MLflow server supports loading custom model classes externally in two ways:
Load external model classes when starting the server:
# Load a single external model class
python -m mlflow_service.server \
--external-models "examples.custom_models:CustomSentimentModel"
# Load multiple classes with custom names
python -m mlflow_service.server \
--external-models \
"examples.custom_models:CustomSentimentModel:CustomSentiment" \
"my_package.models:AdvancedClassifier:Advanced"The package ships a command-line client mlflow-client to help with operations, Exasol integration, and token management.
- Install: make install (or pip install -e .)
- Build wheel: make build → creates dist/*.whl
- The client and server both support loading a .env file via --env .env.
- Example variables (see .env.example):
  - Server/auth: MLFLOW_AUTH_ENABLED, MLFLOW_JWT_SECRET_KEY, MLFLOW_JWT_ALGORITHM, MLFLOW_TOKEN_EXPIRE_MINUTES, MLFLOW_HTTP_PORT, MLFLOW_CONFIG_PATH
  - Client/API: MLFLOW_API_HOST, MLFLOW_API_PORT
  - Exasol: EXA_DSN, EXA_USER, EXA_PASSWORD, EXA_SCHEMA, EXA_CONNECTION_NAME
  - Token: MLFLOW_ADMIN_TOKEN (only where needed; avoid committing!)
  - BucketFS (preferred single URL): EXA_BUCKETFS_URL=http://USER:BASE64_PASSWORD@HOST:PORT/buckets/BUCKET/PATH
    - Example: http://w:dw==@127.0.0.1:6583/buckets/default/mlflow (dw== is base64("w"))
    - Or use components: EXA_BUCKETFS_HOST, EXA_BUCKETFS_PORT, EXA_BUCKETFS_BUCKET, EXA_BUCKETFS_PATH, EXA_BUCKETFS_USER, EXA_BUCKETFS_PASSWORD
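A minimal .env sketch using the variables listed above (all values are placeholders; adjust them for your environment):

# Server / auth
MLFLOW_AUTH_ENABLED=true
MLFLOW_JWT_SECRET_KEY=change-this-in-production
MLFLOW_TOKEN_EXPIRE_MINUTES=1440
MLFLOW_CONFIG_PATH=configs/models.yaml

# Client / API
MLFLOW_API_HOST=localhost
MLFLOW_API_PORT=50051

# Exasol
EXA_DSN=127.0.0.1:8563
EXA_USER=sys
EXA_PASSWORD=exasol
EXA_SCHEMA=MLFLOW
EXA_CONNECTION_NAME=MLFLOW_ADMIN_TOKEN

# BucketFS (single-URL form)
EXA_BUCKETFS_URL=http://w:dw==@127.0.0.1:6583/buckets/default/mlflow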
Create tokens by calling the server (preferred) or by signing offline.
- Server-side (requires admin token when auth is enabled):
  mlflow-client create-token --subject admin --models '*' --admin --expire-minutes 43200 --env .env
  - If auth is enabled, provide an admin token: --token "$MLFLOW_ADMIN_TOKEN" or set MLFLOW_ADMIN_TOKEN in .env.
- Offline signing (no server call; needs server secret locally):
  mlflow-client create-token --offline --subject user1 --models small,large --expire-minutes 1440 --env .env
  - Reads MLFLOW_JWT_SECRET_KEY and MLFLOW_JWT_ALGORITHM from .env unless --secret/--algorithm are given.
  - Output is a JWT printed to stdout.
Bootstrap tip: If you don’t have an admin token yet, you can temporarily start the server with MLFLOW_AUTH_ENABLED=false to mint the first admin token via /auth/token, then restart with auth enabled.
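A sketch of that bootstrap flow, using only commands that appear elsewhere in this README (values are illustrative):

# 1. Start the server once with authentication disabled
MLFLOW_AUTH_ENABLED=false mlflow-server

# 2. Mint the first admin token against the running server and save it
mlflow-client create-token --subject admin --models '*' --admin --env .env > admin.token

# 3. Restart the server with authentication enabled
export MLFLOW_AUTH_ENABLED=true
mlflow-server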
Store a token securely in Exasol using CREATE CONNECTION:
mlflow-client store-admin-token --env .env
# Or pass explicit flags: --dsn --user --password --connection --token

UDFs will read the token from the connection specified by EXA_CONNECTION_NAME (default MLFLOW_ADMIN_TOKEN).
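For orientation, the stored connection is conceptually similar to the following Exasol SQL (a sketch only; the exact fields store-admin-token populates may differ, and the token is assumed here to live in the connection's password field):

-- Hypothetical equivalent of what store-admin-token creates
CREATE OR REPLACE CONNECTION MLFLOW_ADMIN_TOKEN
  TO ''
  USER 'mlflow'
  IDENTIFIED BY '<YOUR_JWT_TOKEN>';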
Create Python UDFs to call the MLflow service:
mlflow-client create-udfs --env .env
# Or pass: --dsn --user --password --schema --connection

This creates scripts in the target schema:
- MLFLOW_INFER_JSON(model_tag VARCHAR, text VARCHAR) RETURNS JSON
- MLFLOW_LOAD_MODEL(model_tag VARCHAR) RETURNS JSON
- MLFLOW_LIST_MODELS() RETURNS JSON
- MLFLOW_STATUS() RETURNS JSON
These UDFs call http://MLFLOW_API_HOST:MLFLOW_API_PORT and add Authorization: Bearer <token> if the token connection exists.
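Once created, the UDFs can be called from SQL like ordinary scalar functions; a sketch assuming the scripts live in a schema named MLFLOW and a hypothetical REVIEWS table:

-- Check the service and run a single inference from SQL
SELECT MLFLOW.MLFLOW_STATUS();
SELECT MLFLOW.MLFLOW_INFER_JSON('small', 'I love this product!');

-- Apply the model to a column of a (hypothetical) table
SELECT review_text, MLFLOW.MLFLOW_INFER_JSON('small', review_text)
FROM REVIEWS;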
Upload the client/server wheel to BucketFS for Exasol environments:
# Upload newest dist/*.whl using EXA_BUCKETFS_URL from .env
mlflow-client bucketfs-upload --env .env
# Or specify a file and components explicitly
mlflow-client bucketfs-upload --file dist/exasol_mlflow_server-0.1.0-py3-none-any.whl \
  --host 127.0.0.1 --port 6583 --bucket default --path mlflow --user w --password w

The programmatic client requires a model tag in calls, matching the API:
from mlflow_service.client import AIClient
import pandas as pd
client = AIClient() # reads MLFLOW_API_HOST/PORT if set
client.token = "<JWT>" # optional
client.load("small")
resp = client.predict("small", pd.DataFrame({"text": ["great", "bad"]}))
print(resp)

Set MLFLOW_API_HOST and MLFLOW_API_PORT in .env or pass host/port to AIClient.
Register model classes at runtime using the REST API:
# Register a new model class
curl -X POST http://localhost:50051/register-model-class \
-H "Content-Type: application/json" \
-d '{
"module_name": "examples.custom_models",
"class_name": "CustomSentimentModel",
"register_name": "CustomSentiment"
}'
# List all registered model classes
curl http://localhost:50051/model-classes

from mlflow_service.models import register_model_class, load_external_model_class
# Method 1: Register an already imported class
from examples.custom_models import CustomSentimentModel
register_model_class("CustomSentiment", CustomSentimentModel)
# Method 2: Load and register from module
load_external_model_class(
"examples.custom_models",
"CustomSentimentModel",
"CustomSentiment"
)

- Create a custom model class that inherits from HFModel:
from mlflow_service.models import HFModel
import pandas as pd
class MyCustomModel(HFModel):
    def _load_pipeline(self):
        """Load your HuggingFace pipeline."""
        self.pipeline = self._pipeline_fn(
            "text-classification",  # or your task
            model=self.hf_model_name,
            device=self.device,
            batch_size=self.batch_size,
        )

    def predict(self, context, model_input, params=None):
        """Implement your prediction logic."""
        texts = model_input["text"].astype(str).tolist()
        outputs = self.pipeline(texts, batch_size=self.batch_size)
        # Process outputs as needed
        results = []
        for output in outputs:
            results.append({
                "label": output["label"],
                "score": output["score"],
                # Add custom fields
                "custom_field": "custom_value",
            })
        return pd.DataFrame(results)

    def input_example(self):
        """Return example input for signature inference."""
        return pd.DataFrame({"text": ["example text"]})

- Save your model in a Python module (e.g., my_models.py)
- Load it using one of the methods above
- Configure it in your models.yaml:
my_custom_model:
hf_model_name: "your-model-name"
mlflow_class: "MyCustomModel" # Use the registered name
batch_size: 4See examples/custom_models.py for complete examples including:
- CustomSentimentModel: Enhanced sentiment analysis with preprocessing/postprocessing
- TextClassificationModel: Generic text classification for various tasks
All custom model classes must:
- Inherit from mlflow_service.models.HFModel
- Implement the abstract methods:
  - _load_pipeline(): Initialize your HuggingFace pipeline
  - predict(): Process input and return predictions
  - input_example(): Return example input DataFrame
- Follow the expected input/output format (DataFrame with "text" column)
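To make that contract concrete, here is a minimal sketch of the input a custom model receives and the shape of the result it is expected to return (column names follow the examples in this README):

import pandas as pd

# Input: a DataFrame with a "text" column
model_input = pd.DataFrame({"text": ["I love this product!", "This is terrible."]})

# Output: one row per input text, e.g. a label plus a confidence score
expected_output = pd.DataFrame([
    {"label": "POSITIVE", "score": 0.95},
    {"label": "NEGATIVE", "score": 0.89},
])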
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests and ensure they pass (pytest)
- Run code quality checks (pre-commit run --all-files)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with MLflow for model management
- Powered by FastAPI for HTTP service
- Integrated with HuggingFace Transformers for model inference
- Developed by Exasol Labs