Change default `vllm_mode` to `"colocate"` and add v0→v1 migration guide by qgallouedec · Pull Request #5255 · huggingface/trl

qgallouedec · 2026-03-10T04:57:38Z

Summary

Default vllm_mode changed from "server" to "colocate" across all configs (GRPOConfig, RLOOConfig, GOLDConfig, OnlineDPOConfig). Colocate is a better default — it works out of the box without spinning up a separate vLLM server, which is the most common friction point for new users. Users who need server mode can still set vllm_mode="server" explicitly.
Updated docs and tests that implicitly assumed the old "server" default to explicitly set vllm_mode="server" where needed (vllm_integration.md, grpo_trainer.md, rloo_trainer.md, speeding_up_training.md, test_online_dpo_trainer.py).
Added a minimal MIGRATION.md at the repo root for v0 → v1. It should stay small — we have already done most of the breaking work in v0.29 (trainers moved to experimental, model classes removed, etc.), so the guide just highlights what is genuinely new and keeps the v0.29 changes in collapsible sections for anyone migrating from an older version.

Note

Medium Risk
Changes a widely-used default that can affect training/inference GPU memory usage and runtime behavior for users relying on implicit vLLM server mode; code changes are small but behaviorally impactful.

Overview
Switches the default vLLM integration mode from server to colocate across configs (GRPOConfig, RLOOConfig, OnlineDPOConfig, GOLDConfig) and updates VLLMGeneration’s default accordingly, so use_vllm=True now runs vLLM in-process unless vllm_mode="server" is explicitly set.

Documentation and examples are updated to reflect the new default and to explicitly opt into server mode where a separate vLLM server is assumed (grpo_trainer.md, rloo_trainer.md, speeding_up_training.md, vllm_integration.md), and the Online DPO vLLM test is adjusted to run in server mode explicitly and to assert the new default. Adds a small root MIGRATION.md highlighting the breaking default change for v0→v1.

^{Written by Cursor Bugbot for commit 40beb9d. This will update automatically on new commits. Configure here.}

Colocate mode is a better default: it works out of the box without requiring a separate vLLM server process, which is the most common friction point for new users. Users who need server mode can still set `vllm_mode="server"`. Also adds a minimal migration guide for v0 to v1. Most breaking changes (trainers moved to experimental, removed model classes, etc.) already shipped in v0.29, so the guide focuses on what's new and keeps v0.29 changes in a collapsible section for reference. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

HuggingFaceDocBuilderDev · 2026-03-10T05:00:18Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 424fc21035

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

MIGRATION.md

- Add explicit vllm_mode="server" in docs/tests that rely on server mode - Remove stale "default value, can be omitted" comments - Move migration guide from docs to MIGRATION.md at repo root Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Colocate is now Option 1 (default), server is Option 2. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…mments Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

albertvillanova

Thanks. A couple of questions below.

albertvillanova · 2026-03-10T13:39:52Z

trl/generation/vllm_generation.py

-        mode (`str`, *optional*, defaults to `"server"`): vLLM mode. Must be one of `"server"` or
-            `"colocate"`.
+        mode (`str`, *optional*, defaults to `"colocate"`):
+            vLLM mode. Must be one of `"colocate"` or `"server"`.


Shouldn't we follow a deprecation cycle instead? This would give users time to explicitly set the mode until v1 is released.

I considered a deprecation cycle, but since v1 is right ahead, I’d recommend we skip it. Best practice is usually a deprecation cycle when you’ve got more pre-releases coming, but here, v0.29 was the last one, and v1 should be our stable baseline.
Let’s just set the final defaults at v1, communicate it clearly with a migration guide.

albertvillanova · 2026-03-10T13:46:30Z

MIGRATION.md

Not sure of fully understanding this: are we creating a migration guide for changes that were already released in the previous minor version?

Yeah, I was a bit hesitant about including those pre-0.29 changes. I realize now it might be confusing, since they’re already effective. I see how that could mislead, so I’ll take them out and keep the guide focused on what’s truly new.

Thanks for your clarifying explanation: that makes sense to me.

That said, I also understand the original motivation. I think having a place summarizing the recent breaking changes can be useful for users upgrading after skipping a few versions.

What about this compromise?

Keep MIGRATION.md focused only on the v0 to v1 migration

Document the v0.29 breaking changes elsewhere

For now, we document these in the release notes: is this sufficient?

What about introducing a CHANGELOG.md to track changes across versions?

Even though for v1.x series we will try to avoid breaking changes, maybe we could have a clear place to document them in case any become unavoidable

We could optionally add a short note in the migration guide pointing users to the v0.29 release notes in case they are upgrading from an earlier version

+1 on the CHANGELOG.md idea.

I usually think about migration from a major to another if they were breaking changes. So I think this file is more of "nice to have" for users and mandatory for now ?

albertvillanova

Thanks for your explanation. Just a comment below.

albertvillanova · 2026-03-11T06:54:36Z

MIGRATION.md

Thanks for your clarifying explanation: that makes sense to me.

That said, I also understand the original motivation. I think having a place summarizing the recent breaking changes can be useful for users upgrading after skipping a few versions.

What about this compromise?

Keep MIGRATION.md focused only on the v0 to v1 migration

Document the v0.29 breaking changes elsewhere

For now, we document these in the release notes: is this sufficient?

What about introducing a CHANGELOG.md to track changes across versions?

Even though for v1.x series we will try to avoid breaking changes, maybe we could have a clear place to document them in case any become unavoidable

We could optionally add a short note in the migration guide pointing users to the v0.29 release notes in case they are upgrading from an earlier version

chatgpt-codex-connector bot reviewed Mar 10, 2026

View reviewed changes

MIGRATION.md Show resolved Hide resolved

qgallouedec and others added 3 commits March 10, 2026 05:03

Swap colocate/server order in docs to reflect new default

7c7d1b3

Colocate is now Option 1 (default), server is Option 2. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Swap colocate/server order in vllm_integration.md and remove stale co…

918635b

…mments Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

qgallouedec requested review from AmineDiro and albertvillanova March 10, 2026 05:18

albertvillanova reviewed Mar 10, 2026

View reviewed changes

qgallouedec and others added 2 commits March 10, 2026 20:46

Remove already shipped changes in MIGRATION.md

7e7047d

Merge branch 'main' into default-vllm-colocate

40beb9d

albertvillanova approved these changes Mar 11, 2026

View reviewed changes

AmineDiro approved these changes Mar 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Change default `vllm_mode` to `"colocate"` and add v0→v1 migration guide#5255

Change default `vllm_mode` to `"colocate"` and add v0→v1 migration guide#5255
qgallouedec wants to merge 6 commits intomainfrom
default-vllm-colocate

qgallouedec commented Mar 10, 2026 •

edited by cursor bot

Loading

Uh oh!

HuggingFaceDocBuilderDev commented Mar 10, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

albertvillanova left a comment

Uh oh!

albertvillanova Mar 10, 2026

Uh oh!

qgallouedec Mar 10, 2026

Uh oh!

albertvillanova Mar 10, 2026

Uh oh!

qgallouedec Mar 10, 2026

Uh oh!

albertvillanova Mar 11, 2026

Uh oh!

AmineDiro Mar 11, 2026

Uh oh!

albertvillanova left a comment

Uh oh!

albertvillanova Mar 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

qgallouedec commented Mar 10, 2026 • edited by cursor bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Uh oh!

HuggingFaceDocBuilderDev commented Mar 10, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

albertvillanova left a comment

Choose a reason for hiding this comment

Uh oh!

albertvillanova Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

qgallouedec Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

albertvillanova Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

qgallouedec Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

albertvillanova Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

AmineDiro Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

albertvillanova left a comment

Choose a reason for hiding this comment

Uh oh!

albertvillanova Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

qgallouedec commented Mar 10, 2026 •

edited by cursor bot

Loading