Feature Request
Problem / Use Case
Currently, vLLM Playground manages a single vLLM server instance at a time. The architecture connects one Web UI to one vLLM container, and the CLI (start/stop/status) operates on a single server. While the v0.1.5 remote server feature allows connecting to one external vLLM instance, there is no support for managing multiple models or server instances simultaneously.
For production and development workflows, it is common to need multiple models running concurrently: for example, running a coding assistant model alongside a general-purpose chat model, or A/B testing two versions of the same model.
Proposed Feature
Support for managing multiple vLLM server instances simultaneously, including:
- Multi-instance dashboard - Start, stop, and monitor multiple vLLM servers from a single UI, each serving a different model on different ports/GPUs
- Model registry / catalog - A central view of available models with one-click deploy, showing which are currently running and their resource usage
- Per-model configuration - Independent configuration (GPU assignment, quantization, context length, etc.) for each running instance
- Dynamic model switching - Ability to route chat sessions to different running models without restarting the playground
- Resource-aware scheduling - Visibility into GPU/CPU/memory utilization to help decide which models can be co-located
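To make the registry and scheduling ideas above concrete, here is a minimal sketch of what a resource-aware instance registry could look like. Everything in it is a hypothetical illustration, not existing vLLM Playground code: the class names (`ModelInstance`, `InstanceRegistry`), the per-GPU memory budget, and the placement check are all assumptions about one possible design.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: track running vLLM instances and refuse to
# co-locate a model on a GPU that lacks memory headroom. All names
# and numbers here are illustrative, not part of vLLM Playground.

@dataclass
class ModelInstance:
    name: str          # model identifier, e.g. "coder-7b"
    port: int          # port the vLLM server would listen on
    gpu: int           # GPU index the instance is pinned to
    gpu_mem_gb: float  # estimated GPU memory footprint

@dataclass
class InstanceRegistry:
    gpu_capacity_gb: dict[int, float]            # per-GPU memory budget
    instances: list[ModelInstance] = field(default_factory=list)

    def used_gb(self, gpu: int) -> float:
        # Sum the footprints of instances already placed on this GPU.
        return sum(i.gpu_mem_gb for i in self.instances if i.gpu == gpu)

    def can_place(self, inst: ModelInstance) -> bool:
        # Resource-aware check: only co-locate if the GPU has headroom.
        budget = self.gpu_capacity_gb.get(inst.gpu, 0.0)
        return self.used_gb(inst.gpu) + inst.gpu_mem_gb <= budget

    def deploy(self, inst: ModelInstance) -> bool:
        # One-click deploy from the catalog would go through a check
        # like this before starting a new server container.
        if not self.can_place(inst):
            return False
        self.instances.append(inst)
        return True

# Example: a single 24 GB GPU can host one 16 GB model but not two.
registry = InstanceRegistry(gpu_capacity_gb={0: 24.0})
print(registry.deploy(ModelInstance("coder-7b", 8001, 0, 16.0)))  # fits
print(registry.deploy(ModelInstance("chat-7b", 8002, 0, 16.0)))   # rejected
```

A real implementation would read live utilization from the GPUs rather than static estimates, but even a simple budget check like this would let the dashboard grey out models that cannot currently be co-located.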
Why This Matters
As teams scale their use of local LLM serving, the ability to orchestrate multiple models from a single interface becomes essential. This would complement the existing OpenShift/K8s deployment support and make vLLM Playground a more complete solution for both individual developers and enterprise teams.
Environment
- vLLM Playground v0.1.5
- Feature request (not a bug)