Kd vllm generation #5351

Open
cmpatino wants to merge 7 commits into huggingface:main from cmpatino:kd-vllm-generation

@cmpatino cmpatino commented Mar 23, 2026

What does this PR do?

Addresses the comment from #5137 to use trl.generation.VLLMGeneration instead of the separate vLLM logic.

Before submitting

  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.


Note

Medium Risk
Refactors GOLD’s vLLM integration and weight-sync path, which can affect distributed generation correctness/performance (server vs colocated, TP/FSDP/ZeRO/PEFT). Risk is moderate due to changed generation inputs (token IDs) and new sync scheduling, but scope is limited to GOLD’s vLLM code paths and new config knobs.

Overview
Refactors GOLD’s vLLM integration to use the shared trl.generation.VLLMGeneration helper instead of maintaining trainer-local vLLM engine/client, weight-sync, and sampling logic.

Adds new vLLM configuration options in GOLDConfig (e.g., vllm_server_base_url, vllm_group_port, vllm_max_model_length, vllm_model_impl) and updates on-policy generation to pass prompt token IDs directly, with periodic sync_weights() driven by vllm_sync_frequency rather than a custom Trainer callback.
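The periodic sync described above can be illustrated with a minimal sketch. It is not the PR's implementation: `StubGenerator`, `should_sync`, and `run_steps` are hypothetical names, and the stub only counts calls instead of talking to vLLM. The only details taken from the description are that generation receives prompt token IDs and that `sync_weights()` is called on a schedule controlled by a frequency setting such as `vllm_sync_frequency`.

```python
from dataclasses import dataclass


@dataclass
class StubGenerator:
    """Hypothetical stand-in for a vLLM generation helper; it only counts calls."""

    sync_calls: int = 0

    def sync_weights(self) -> None:
        # A real helper would push the current policy weights to vLLM here.
        self.sync_calls += 1

    def generate(self, prompt_token_ids: list[list[int]]) -> list[list[int]]:
        # Takes token IDs directly (no re-tokenization) and appends a dummy token.
        return [ids + [0] for ids in prompt_token_ids]


def should_sync(step: int, sync_frequency: int, last_synced_step: int) -> bool:
    """Sync every `sync_frequency` steps, at most once per step."""
    return step != last_synced_step and step % sync_frequency == 0


def run_steps(generator: StubGenerator, num_steps: int, sync_frequency: int,
              prompt_token_ids: list[list[int]]) -> list[int]:
    """Run a toy loop and return the steps at which weights were synced."""
    synced_at: list[int] = []
    last_synced_step = -1
    for step in range(num_steps):
        if should_sync(step, sync_frequency, last_synced_step):
            generator.sync_weights()
            last_synced_step = step
            synced_at.append(step)
        generator.generate(prompt_token_ids)
    return synced_at
```

With a frequency of 3 over 7 steps, this toy loop syncs at steps 0, 3, and 6. Tracking `last_synced_step` guards against a duplicate sync if generation is invoked more than once within the same step, for example under gradient accumulation.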

Written by Cursor Bugbot for commit 91715cb. This will update automatically on new commits.

@cmpatino cmpatino marked this pull request as ready for review March 23, 2026 12:15
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

3 participants