Commit 71c5576

[TRTLLM-8734][feat] AutoDeploy: Enable the nvfp4 for Nemotron MOE (#8737)
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
1 parent ec31363 commit 71c5576

1 file changed: +0 -2 lines changed

tensorrt_llm/_torch/auto_deploy/transform/library/quantization.py

Lines changed: 0 additions & 2 deletions
@@ -375,12 +375,10 @@ def load_hook(self, state_dict, prefix, *args, weight_name):
             )
             state_dict[input_scale_name] = 1 / state_dict[input_scale_name]
             weight_scale = state_dict[weight_name + "_scale"].view(float4_sf_dtype)
-            ori_shape = weight_scale.shape
             state_dict[weight_name + "_scale"] = (
                 torch.ops.trtllm.block_scale_interleave(
                     weight_scale.view(torch.uint8).cpu().contiguous()
                 )
-                .reshape(ori_shape)
                 .view(float4_sf_dtype)
                 .reshape(-1)
             )
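
The commit message does not say why the intermediate reshape was dropped. One plausible reading, assumed here and not confirmed by the source, is that the interleave step can return a buffer padded up to block multiples, in which case reshaping back to ori_shape fails while flattening with reshape(-1) still works. The sketch below only compares element counts in plain Python; padded_numel, can_reshape_back, and the block sizes 128 and 4 are hypothetical stand-ins, not values or APIs taken from this diff.

```python
import math


def padded_numel(rows, cols, row_block=128, col_block=4):
    """Element count after rounding each dimension up to its block multiple.

    Hypothetical stand-in for an op that pads its output; the block sizes
    are assumptions chosen for illustration only.
    """
    padded_rows = math.ceil(rows / row_block) * row_block
    padded_cols = math.ceil(cols / col_block) * col_block
    return padded_rows * padded_cols


def can_reshape_back(rows, cols):
    """A reshape to the original (rows, cols) needs an unchanged element count."""
    return padded_numel(rows, cols) == rows * cols


# Already aligned: no padding is added, so reshaping back would succeed.
aligned_ok = can_reshape_back(256, 8)

# Not aligned: padding adds elements, so reshaping back would fail,
# while flattening the padded buffer is still valid.
unaligned_ok = can_reshape_back(200, 6)
```

Under these assumptions, skipping the reshape to the original shape and flattening directly gives the same result for aligned shapes and avoids an error for unaligned ones.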
