Open
Description
Hello again, the pinned versions of vllm and transformers are quite outdated, and as a result it is not simple to run newer models / architectures with PipelineRL. If I recall correctly, one reason for pinning vllm was the generator / trainer mismatch, but this can largely be solved in one of three ways: bitwise-consistent training and inference (https://blog.vllm.ai/2025/11/10/bitwise-consistent-train-inference.html), upcasting the LM head to FP32, or applying importance sampling.
It would be nice if the deps could be pinned to the latest versions compatible with the codebase :)
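
For reference, here is roughly what I mean by the importance sampling option. This is only a minimal sketch in plain Python, not PipelineRL's actual code: the function names, the `clip_max` value, and the per-token truncated-weight form are all my own assumptions for illustration. The idea is to reweight each token's policy-gradient term by the ratio of trainer probability to generator probability, capped so a large mismatch cannot blow up the update.

```python
import math


def truncated_is_weights(trainer_logprobs, generator_logprobs, clip_max=2.0):
    """Per-token weights w = min(p_trainer / p_generator, clip_max).

    Hypothetical helper: both inputs are log-probabilities of the sampled
    tokens, one list from the trainer and one from the generator (e.g. vllm).
    """
    if len(trainer_logprobs) != len(generator_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    return [
        min(math.exp(lp_train - lp_gen), clip_max)
        for lp_train, lp_gen in zip(trainer_logprobs, generator_logprobs)
    ]


def weighted_pg_loss(trainer_logprobs, generator_logprobs, advantages, clip_max=2.0):
    """Mean of -w * advantage * trainer_logprob over tokens.

    The weights are treated as constants here; in an autograd framework
    they would be detached so no gradient flows through the ratio.
    """
    if not trainer_logprobs:
        raise ValueError("need at least one token")
    weights = truncated_is_weights(trainer_logprobs, generator_logprobs, clip_max)
    terms = [
        -w * adv * lp
        for w, adv, lp in zip(weights, advantages, trainer_logprobs)
    ]
    return sum(terms) / len(terms)
```

When the two engines agree exactly, every weight is 1 and this reduces to the usual policy-gradient loss, so the correction only matters where the generator and trainer actually diverge.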
rafapi