Blog Post Submission: mlflow-modal - Extending MLflow Deployments to Serverless GPU Infrastructure #462

@debu-sinha

Blog Post Submission

Post Type

  • Deep Dive
  • How-To
  • Use Case
  • Tips / Best Practices
  • Features

Topics

  • GenAI
  • Advanced
  • Deployment
  • Core

Title

mlflow-modal: Extending MLflow Deployments to Serverless GPU Infrastructure

Abstract

MLflow is downloaded approximately 29 million times per month (PyPI Stats, Feb 2026). This post announces and documents mlflow-modal, a deployment plugin I built to extend MLflow's deployment ecosystem to Modal's serverless GPU infrastructure.

The plugin adds a new deployment target so MLflow models can run on serverless GPUs without managing infrastructure:

  1. New deployment target - deploy MLflow models to Modal with mlflow deployments create -t modal
  2. Full Deployments API support - create, update, delete, predict, list endpoints
  3. Production features - GPU selection (T4 through H100), autoscaling, streaming predictions
  4. Environment handling - automatic detection of uv/pip/conda, private PyPI support
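
As a rough illustration of item 1, the sketch below assembles the CLI invocation as an argument list using only the standard library. The `-t`, `--name`, `-m`, and `-C` flags are the standard `mlflow deployments create` options. The deployment name, the model URI, and the `gpu` config key are placeholders assumed for this example and are not taken from the plugin's documentation. `build_create_command` is a hypothetical helper, not part of mlflow-modal.

```python
import shlex


def build_create_command(name, model_uri, config=None):
    """Build an `mlflow deployments create` argument list for the modal target.

    `config` maps plugin config keys to values; each pair becomes a
    `-C key=value` option. The keys themselves are plugin-specific and
    are assumptions in this sketch.
    """
    cmd = [
        "mlflow", "deployments", "create",
        "-t", "modal",        # deployment target registered by the plugin
        "--name", name,       # deployment name
        "-m", model_uri,      # MLflow model URI
    ]
    for key, value in (config or {}).items():
        cmd += ["-C", f"{key}={value}"]
    return cmd


# Placeholder name, model URI, and config key, for illustration only.
cmd = build_create_command("iris-classifier", "models:/iris/1", {"gpu": "T4"})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once MLflow and the plugin are installed and Modal credentials are configured.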

The plugin is published on PyPI (pip install mlflow-modal, v0.5.0) and validated against Modal's 1.0 API.

Target Length

~3000 words (flagship length per maintainer guidance)

Related Artifacts

Provenance

  • Plugin architecture, design, and implementation: Debu Sinha (@debu-sinha)
  • Technical validation: Modal team

Consent Acknowledgment

  • I will request technical acknowledgment from Modal before PR

Additional Context

This plugin fills a gap in MLflow's deployment ecosystem: until now there was no serverless GPU deployment target. For developers who want pay-per-use GPU inference with scale-to-zero and sub-second cold starts, it offers a straightforward path from an MLflow model to a production endpoint.

An MLflow blog maintainer specifically requested this as a "2-3 page" flagship post.
