This issue is a request for comments (RFC) from veRL maintainers on an implementation that would natively support synchronizing sharded weights from the training instance to the inference instance.
Background
If the training instance and the inference instance (VerlEngine) are launched separately, i.e. they are not in the same process, model weights need to be transported from the training instance to the inference engine. This is a large overhead, especially for models like DeepSeek V3 (671B parameters) in bf16, where the full weights occupy roughly 1.34 TB at 2 bytes per parameter. If the TP size is 32 for inference, each inference instance only needs about 671/32 ≈ 21B parameters.
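To make the saving concrete, here is a back-of-the-envelope calculation of the bytes moved per inference instance. It is only a sketch: the helper name `transfer_size_gb` is hypothetical, and it assumes weights split evenly across TP ranks, which a real model only approximates (some parameters are typically replicated rather than sharded).

```python
def transfer_size_gb(params_billion: float, bytes_per_param: int = 2, tp_size: int = 1) -> float:
    """Approximate size in GB (10^9 bytes) of the weights one inference rank receives.

    Assumes an even split across TP ranks; bf16 uses 2 bytes per parameter.
    """
    if tp_size < 1:
        raise ValueError("tp_size must be at least 1")
    return params_billion * bytes_per_param / tp_size


# Full model sent to every inference instance versus only the shard it needs.
full_gb = transfer_size_gb(671)
shard_gb = transfer_size_gb(671, tp_size=32)
print(f"full: {full_gb:.0f} GB, per-rank shard: {shard_gb:.1f} GB")
```

Under these assumptions, sending only the shard cuts the per-instance transfer by a factor equal to the TP size.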
Use-case:
Inference Side
Build sync weights process group with training instance
Receive shard weights from training instance and update weights in place
Training Side
Build sync weights process group with inference instance
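The shard selection and in-place update described above can be sketched without any communication library. This is a toy illustration, not veRL's implementation: the function names (`shard_bounds`, `select_shard`, `update_in_place`) are hypothetical, weights are plain nested lists, and the send/receive over the sync process group is replaced by a direct function call. It also assumes the weight is split evenly along its first dimension.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def shard_bounds(dim_size: int, tp_size: int, tp_rank: int) -> Tuple[int, int]:
    """Return the [start, end) row range owned by one inference TP rank."""
    if dim_size % tp_size != 0:
        raise ValueError("dim_size must be divisible by tp_size in this sketch")
    if not 0 <= tp_rank < tp_size:
        raise ValueError("tp_rank out of range")
    width = dim_size // tp_size
    return tp_rank * width, (tp_rank + 1) * width


def select_shard(full_weight: Matrix, tp_size: int, tp_rank: int) -> Matrix:
    """Training side: pick only the rows that the target inference rank needs."""
    start, end = shard_bounds(len(full_weight), tp_size, tp_rank)
    return full_weight[start:end]


def update_in_place(local_shard: Matrix, received: Matrix) -> None:
    """Inference side: overwrite the existing buffers instead of reallocating."""
    if len(local_shard) != len(received):
        raise ValueError("shard shape mismatch")
    for local_row, new_row in zip(local_shard, received):
        local_row[:] = new_row  # slice assignment keeps the same list object


# Toy run: an 8x2 weight on the training side, inference TP size 4, rank 1.
full = [[float(r)] * 2 for r in range(8)]
local = [[0.0, 0.0] for _ in range(2)]
first_row = local[0]
update_in_place(local, select_shard(full, tp_size=4, tp_rank=1))
print(local)
```

In a real deployment the call to `select_shard` would run on the training side and the result would travel over the sync process group before `update_in_place` runs on the inference side.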