Commit b257760

[VLM, FSDP] Update Experiment Readme (#1079)
1 parent a9cfd75 commit b257760

3 files changed: 16 additions and 3 deletions
examples/true_on_policy_vlm/README.md

Lines changed: 14 additions & 1 deletion
@@ -2,8 +2,21 @@
 
 This example demonstrates true on-policy training with Qwen3-VL dense model on FSDP. The core concepts and expected observations are the same as [true_on_policy](../true_on_policy/README.md).
 
+<p align="center">
+<img src="diff.png" alt="Training Inference Log Prob Diff" width="800">
+</p>
 ## Usage
 
 ```bash
-python examples/true_on_policy_vlm/run_simple.py
+SLIME_SCRIPT_NUM_GPUS=8 python examples/true_on_policy_vlm/run_simple.py
 ```
+
+## How it is Implemented
+
+For the text backbone, please refer to [true_on_policy for the text-only model](../true_on_policy/README.md).
+
+For the VLM, we only need to ensure that the image encoder behaves as expected. Please refer to [SGLang#14636](https://github.com/sgl-project/sglang/pull/14636). We need to align numeric operation details between the two systems, so that the ViT forward pass matches the behavior in both SGLang and transformers.
+
+## Notes
+
+It is expected that the true-on-policy version is slower.
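
Reviewer note: the figure added to the README is labeled "Training Inference Log Prob Diff". As a rough illustration of what such a comparison measures, the sketch below compares per-token log probabilities from a training engine and an inference engine. The function name `logprob_diff_stats` and its inputs are hypothetical and are not part of slime, SGLang, or transformers; how the real metric is computed is not shown in this commit.

```python
from typing import Sequence


def logprob_diff_stats(
    train_logprobs: Sequence[float],
    rollout_logprobs: Sequence[float],
) -> dict:
    """Summarize the gap between training-side and inference-side log probs.

    Hypothetical helper for illustration only; both inputs are assumed to
    hold one log prob per generated token, in the same token order.
    """
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("both sequences must cover the same tokens")
    if not train_logprobs:
        raise ValueError("need at least one token")

    # Per-token absolute difference between the two engines.
    abs_diffs = [abs(t - r) for t, r in zip(train_logprobs, rollout_logprobs)]
    return {
        "max_abs_diff": max(abs_diffs),
        "mean_abs_diff": sum(abs_diffs) / len(abs_diffs),
        # True only if every pair of values compares exactly equal.
        "exactly_equal": all(t == r for t, r in zip(train_logprobs, rollout_logprobs)),
    }


# Identical values give a zero gap; any mismatch shows up in the stats.
matched = logprob_diff_stats([-0.5, -1.25], [-0.5, -1.25])
mismatched = logprob_diff_stats([-0.5, -1.0], [-0.5, -1.5])
```

If the two engines were fully aligned, one would expect `exactly_equal` to hold for every sampled token; a nonzero `max_abs_diff` points at a numeric mismatch somewhere in the forward pass.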

slime/backends/fsdp_utils/actor.py

Lines changed: 2 additions & 2 deletions
@@ -149,9 +149,9 @@ def init(self, args: Namespace, role: str, with_ref: bool = False) -> int: # ty
     def get_model_cls(self):
         # Vision models have `vision_config` in the config
         if hasattr(self.hf_config, "vision_config"):
-            from transformers import AutoModelForVision2Seq
+            from transformers import AutoModelForImageTextToText
 
-            return AutoModelForVision2Seq
+            return AutoModelForImageTextToText
         else:
             from transformers import AutoModelForCausalLM
 
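
Reviewer note: the hunk above picks the Auto class by checking whether the Hugging Face config has a `vision_config` attribute. The standalone sketch below mirrors that dispatch so it can be exercised without loading a model. The helper names `auto_class_name` and `resolve_auto_class` are hypothetical, and `resolve_auto_class` assumes the installed transformers version exposes `AutoModelForImageTextToText`.

```python
from types import SimpleNamespace


def auto_class_name(hf_config) -> str:
    """Return the name of the transformers Auto class for this config.

    Hypothetical helper; it mirrors the `hasattr` check in the diff above.
    """
    # Vision-language configs carry a nested `vision_config`; text-only ones do not.
    if hasattr(hf_config, "vision_config"):
        return "AutoModelForImageTextToText"
    return "AutoModelForCausalLM"


def resolve_auto_class(hf_config):
    """Import transformers lazily and return the matching Auto class.

    Assumes the installed transformers version provides the named class.
    """
    import transformers  # imported here so auto_class_name stays dependency-free

    return getattr(transformers, auto_class_name(hf_config))


# Stand-in configs: only the presence of `vision_config` matters here.
vlm_config = SimpleNamespace(vision_config=SimpleNamespace())
text_config = SimpleNamespace()

print(auto_class_name(vlm_config))   # AutoModelForImageTextToText
print(auto_class_name(text_config))  # AutoModelForCausalLM
```

Keeping the name lookup separate from the import makes the branch easy to unit test, while the actual class is only resolved when a model is about to be loaded.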