Commit 7e3e379

Switch back to triton backend for fuse_rmsnorm
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
1 parent 57c5794 commit 7e3e379

1 file changed: +3 -0 lines changed

tests/integration/defs/examples/test_ad_speculative_decoding.py

Lines changed: 3 additions & 0 deletions
@@ -81,6 +81,9 @@ def run_with_autodeploy(model, speculative_model_dir, batch_size):
     "world_size": 1,
     "kv_cache_config": kv_cache_config,
     "disable_overlap_scheduler": True,
+    "transforms": {
+        "fuse_rmsnorm": {"rmsnorm_backend": "triton"},
+    },
     "max_num_tokens": 64,
 }
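
The change above pins the `fuse_rmsnorm` transform to the `triton` backend through a nested `transforms` entry in the argument dict. As a minimal sketch of the same override applied programmatically, assuming the arguments are a plain Python dict: the helper `with_transform_override` and the empty `kv_cache_config` placeholder are hypothetical and not part of the test or of any library API; only the key names come from the diff.

```python
import copy


def with_transform_override(llm_args, transform, **options):
    """Return a copy of llm_args with options set under transforms[transform].

    Hypothetical helper for illustration; the commit edits the dict literal directly.
    """
    merged = copy.deepcopy(llm_args)  # leave the caller's dict untouched
    merged.setdefault("transforms", {}).setdefault(transform, {}).update(options)
    return merged


# Keys mirror the dict in the diff; the kv_cache_config value is a placeholder.
base_args = {
    "world_size": 1,
    "kv_cache_config": {},
    "disable_overlap_scheduler": True,
    "max_num_tokens": 64,
}

llm_args = with_transform_override(base_args, "fuse_rmsnorm", rmsnorm_backend="triton")
print(llm_args["transforms"])
```

Copying before merging keeps the base arguments reusable for other test cases that should run with the default backend.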

0 commit comments