Commit af31a16
fix ddp for nvfp4 on A100 (vllm-project#2404)
depends on vllm-project/compressed-tensors#603
Summary:
NCCL does not allow broadcasting fp8 on A100, but we can work around it with
this util.
Test Plan:
<details>
Test Script
</details>
Signed-off-by: HDCharles <charlesdavidhernandez@gmail.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>

1 parent 6b8a224 commit af31a16
1 file changed, +4 -2 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 2 | 2 | |
| 3 | 3 | |
| 4 | 4 | |
| 5 | | - |
| | 5 | + |
| 6 | 6 | |
| 7 | 7 | |
| 8 | 8 | |
| | | |
| 358 | 358 | |
| 359 | 359 | |
| 360 | 360 | |
| 361 | | - |
| | 361 | + |
| | 362 | + |
| | 363 | + |
| 362 | 364 | |
| 363 | 365 | |
| 364 | 366 | |