Commit 2924d85

[docker] fix r3 gather buffer (#1129)
1 parent 8a825f7 commit 2924d85

1 file changed: +7 -3 lines changed


docker/patch/latest/sglang.patch

Lines changed: 7 additions & 3 deletions
@@ -301,10 +301,10 @@ index e7d5a67cc..639e47163 100644
 out_hidden_states[begin_chunk_idx:end_chunk_idx],
 diff --git a/python/sglang/srt/layers/moe/routed_experts_capturer.py b/python/sglang/srt/layers/moe/routed_experts_capturer.py
 new file mode 100644
-index 000000000..732f7859d
+index 000000000..7369f9dc9
 --- /dev/null
 +++ b/python/sglang/srt/layers/moe/routed_experts_capturer.py
-@@ -0,0 +1,304 @@
+@@ -0,0 +1,308 @@
 +import logging
 +from abc import ABC
 +from contextlib import contextmanager
@@ -496,8 +496,12 @@ index 000000000..732f7859d
 +        )
 +
 +        if get_moe_a2a_backend().is_deepep():
++            attn_tp_size = get_attention_tp_size() if is_dp_attention_enabled() else 1
 +            self.gather_buffer = torch.empty(
-+                (self.device_cache.buffer.shape[0], self.device_cache.buffer.shape[2]),
++                (
++                    self.device_cache.buffer.shape[0] * attn_tp_size,
++                    self.device_cache.buffer.shape[2],
++                ),
 +                dtype=torch.int32,
 +                device=device,
 +            )
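
Note on the sizing change: the hunk above multiplies the first dimension of the gather buffer by `attn_tp_size` when DP attention is enabled, and leaves it unchanged otherwise. A plausible reading, not confirmed by the commit itself, is that the gather collects rows from every attention-TP rank, so the destination needs room for all of them. The sketch below reproduces only the shape arithmetic in plain Python. The helper name `gather_buffer_shape` and the example dimensions are hypothetical; `cache_shape` stands in for `self.device_cache.buffer.shape`.

```python
from typing import Tuple


def gather_buffer_shape(
    cache_shape: Tuple[int, int, int],
    attn_tp_size: int,
    dp_attention_enabled: bool,
) -> Tuple[int, int]:
    """Return the 2-D gather-buffer shape computed as in the patched code.

    Hypothetical helper: cache_shape stands in for
    self.device_cache.buffer.shape; only dims 0 and 2 are used, as in the diff.
    """
    # Mirrors: get_attention_tp_size() if is_dp_attention_enabled() else 1
    scale = attn_tp_size if dp_attention_enabled else 1
    return (cache_shape[0] * scale, cache_shape[2])


# Example dimensions are made up for illustration.
without_dp = gather_buffer_shape((1024, 48, 8), attn_tp_size=4, dp_attention_enabled=False)
with_dp = gather_buffer_shape((1024, 48, 8), attn_tp_size=4, dp_attention_enabled=True)
print(without_dp, with_dp)  # (1024, 8) (4096, 8)
```

With DP attention disabled the result matches the pre-fix allocation, so the change only affects the DP-attention path.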
