Description
When Data Designer is used on a SLURM-managed GPU cluster, it should be able to automatically manage the model servers required to run generation and preview jobs.
What this feature should do
Automatically spin up and tear down model servers on SLURM
- Launch model servers (e.g. via vLLM) as SLURM jobs when needed (see the sketch after this list).
- Shut them down when they are no longer in use.
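As a rough illustration of the lifecycle piece, a launcher could submit an sbatch script that runs vLLM's OpenAI-compatible server and cancel the job on teardown. This is a minimal sketch assuming vLLM's `vllm serve` entrypoint; the function names, resource values, and time limit are illustrative, not a proposed API.

```python
import subprocess
import textwrap

def launch_model_server(model: str, gpus: int, port: int = 8000) -> str:
    """Submit a SLURM job that serves `model` with vLLM and return the job ID."""
    script = textwrap.dedent(f"""\
        #!/bin/bash
        #SBATCH --job-name=dd-model-server
        #SBATCH --gres=gpu:{gpus}
        #SBATCH --time=04:00:00
        vllm serve {model} --port {port} --tensor-parallel-size {gpus}
    """)
    # `sbatch --parsable` reads the script from stdin and prints only the job ID.
    result = subprocess.run(
        ["sbatch", "--parsable"],
        input=script, text=True, capture_output=True, check=True,
    )
    return result.stdout.strip()

def shutdown_model_server(job_id: str) -> None:
    """Tear the server down once no generation or preview work references it."""
    subprocess.run(["scancel", job_id], check=True)
```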
Support interactive preview workflows
- Allow users to interactively query models for Data Designer preview jobs.
- Support streaming responses (a streaming sketch follows this list).
- Keep model servers alive for the duration of an interactive session, then clean them up.
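For the preview path, one option is to talk to the vLLM server's OpenAI-compatible endpoint and stream tokens back as they are generated. A minimal sketch, assuming the server from the previous snippet is reachable at a known node and port (both hypothetical here):

```python
from openai import OpenAI

# Hypothetical address of a vLLM server running as a SLURM job.
client = OpenAI(base_url="http://node0123:8000/v1", api_key="not-needed")

def preview_stream(prompt: str, model: str):
    """Yield response chunks as they arrive so previews feel interactive."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta

# The server stays up for the whole interactive session; when the user is
# done previewing, the SLURM job can be cancelled (e.g. via scancel).
```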
Support large-scale batch generation
- Scale model servers up and down to efficiently execute Data Designer jobs.
- Execute work within a user-defined GPU budget for the job.
  - Users explicitly specify how many GPUs they are making available to a Data Designer job.
  - Data Designer uses only those GPUs and does not require manual placement or provisioning.
Data Designer determines how to:
- Split work across models.
- Scale model replicas.
- Assign GPUs to each model instance (a simple allocation sketch follows this list).
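To make the allocation step concrete, here is one possible heuristic, stated only as an illustration: start with one replica per model, then add replicas round-robin until the GPU budget is exhausted. The function name and the strategy are assumptions, not a committed design.

```python
def plan_replicas(gpu_budget: int, gpus_per_model: dict[str, int]) -> dict[str, int]:
    """Choose replica counts per model so total GPU use stays within the budget."""
    if sum(gpus_per_model.values()) > gpu_budget:
        raise ValueError("GPU budget too small to run one replica of each model")
    # One replica per model to start, then add replicas while the budget allows.
    replicas = {name: 1 for name in gpus_per_model}
    used = sum(gpus_per_model.values())
    progress = True
    while progress:
        progress = False
        for name, need in gpus_per_model.items():
            if used + need <= gpu_budget:
                replicas[name] += 1
                used += need
                progress = True
    return replicas

# Example: a 16-GPU budget shared by a 4-GPU model and a 2-GPU model
# yields 3 and 2 replicas respectively (3*4 + 2*2 = 16).
```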
Provide a simple user-facing configuration
Users specify:
- Which models they want to use.
- The total number of GPUs available to the job (and optionally per-model GPU needs).
Data Designer handles model lifecycle, scaling, and GPU utilization automatically; a hypothetical configuration sketch follows.
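As a strawman for how small that configuration could be (every key and model name below is illustrative only):

```python
# Hypothetical configuration surface; key names are illustrative, not an API.
slurm_config = {
    "gpu_budget": 16,  # total GPUs the job may use
    "models": {
        "meta-llama/Llama-3.1-8B-Instruct": {"gpus_per_replica": 1},
        "mistralai/Mixtral-8x7B-Instruct-v0.1": {"gpus_per_replica": 4},
    },
}
# Data Designer takes it from here: launching servers as SLURM jobs, scaling
# replicas within the budget, and cleaning everything up when the job ends.
```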
Outcome
From the user’s perspective, running Data Designer on SLURM should require no manual model orchestration. Users declare their model needs and GPU budget, and Data Designer automatically provisions, scales, and cleans up model servers within those constraints.