Commit d72add1

[Deepseek] Pass hidden_states_fp4 to shared_experts (NVIDIA#3819)
Signed-off-by: Hao Lu <[email protected]@users.noreply.github.com>
1 parent ccd1eb6 commit d72add1

1 file changed: +2 -1 lines changed

tensorrt_llm/_torch/models/modeling_deepseekv3.py

Lines changed: 2 additions & 1 deletion
@@ -466,7 +466,8 @@ def forward(
             assert not self.use_dp
 
         def _compute_shared_output():
-            shared_output = self.shared_experts(hidden_states)
+            shared_output = self.shared_experts(hidden_states_fp4
+                                                or hidden_states)
             if self.shared_output_scale is not None:
                 shared_output *= self.shared_output_scale
             return shared_output
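
The change relies on Python's `or` fallback: the shared experts receive `hidden_states_fp4` when it is set and `hidden_states` otherwise. The sketch below illustrates that pattern in isolation. `FakeQuantized`, `pick_shared_input`, and `pick_shared_input_explicit` are hypothetical names for illustration only, and the sketch assumes the FP4 value is either `None` or a plain wrapper object; it is not the TensorRT-LLM implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeQuantized:
    """Hypothetical stand-in for a pre-quantized FP4 activation wrapper."""
    data: list
    scale: float


def pick_shared_input(hidden_states: Any,
                      hidden_states_fp4: Optional[FakeQuantized] = None) -> Any:
    # Same expression shape as the diff. An object that defines neither
    # __bool__ nor __len__ is always truthy, so the quantized input is
    # chosen whenever it is not None.
    return hidden_states_fp4 or hidden_states


def pick_shared_input_explicit(hidden_states: Any,
                               hidden_states_fp4: Optional[FakeQuantized] = None) -> Any:
    # Equivalent selection with an explicit None check, which does not
    # depend on the truthiness of the quantized object.
    return hidden_states if hidden_states_fp4 is None else hidden_states_fp4
```

Both helpers return the same object under the assumption above. The explicit form may be preferable if the optional value could ever be a plain multi-element `torch.Tensor`, since evaluating such a tensor in a boolean context raises an error.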
