Co-authored-by: zhaochenyang20 <[email protected]>
After a training step ends, the latest weights are synchronized back to the inference engine (this is what the term *refit* refers to). In `update_weight_utils.py`, we support both modes: `colocated` and `distributed`. In the former, training and rollout alternate on the same set of GPUs; in the latter, training and rollout run on disjoint GPUs. In both modes we adopt a bucketed asynchronous update strategy ([reference](https://hebiao064.github.io/rl-weight-sync)), synchronizing chunked weights to the inference engine bucket by bucket to keep peak memory usage as low as possible.
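As a rough illustration of the bucketing idea, here is a minimal sketch in plain Python. The parameter list, the bucket size, and the `send_bucket` transport callback are all hypothetical stand-ins; the real logic lives in `update_weight_utils.py` and streams actual tensors (e.g. via broadcast or IPC handles) to the inference engine.

```python
def make_buckets(named_sizes, bucket_bytes):
    """Greedily pack (name, nbytes) pairs into buckets no larger than bucket_bytes."""
    buckets, current, current_size = [], [], 0
    for name, nbytes in named_sizes:
        # Flush the current bucket before it would overflow the size cap.
        if current and current_size + nbytes > bucket_bytes:
            buckets.append(current)
            current, current_size = [], 0
        current.append((name, nbytes))
        current_size += nbytes
    if current:
        buckets.append(current)
    return buckets


def sync_weights(named_sizes, bucket_bytes, send_bucket):
    """Send one bucket at a time so staging memory peaks near bucket_bytes,
    not at the full model size. Returns the peak bytes staged at once."""
    peak = 0
    for bucket in make_buckets(named_sizes, bucket_bytes):
        staged = sum(nbytes for _, nbytes in bucket)  # memory held during this send
        peak = max(peak, staged)
        send_bucket(bucket)  # hypothetical transport to the inference engine
    return peak
```

The point of the greedy packing is that only one bucket's worth of weights needs to be materialized on the sending side at any moment, which is why the peak memory overhead is bounded by the bucket size rather than the total parameter count (a single tensor larger than the cap still forms its own bucket).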
<p align="center">
<img src="/images/blog/miles-fsdp/4_fsdp_refit.png" alt="Update weights from training to inference with async tensor handle and bucket" width="50%" />
</p>
> ✅ For the specific mechanics of weight updates, see the SGLang RL group's earlier blog: [**RL System Deep Thinking: Weight Update Mechanisms**](https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/rlhf/sys-design/readme-1-EN.md)
Experimental environment: single-node H100, Miles 0.5.5post1
Configurations compared: Megatron, FSDP colocated with a reference model, and FSDP colocated without a reference model.