Commit 96704b5

Update on "add module level benchmark for gemma3 model"
This diff adds a module-level benchmark for the Gemma3 model. It also introduces mutlmodal_benchmark.cpp, which replaces the original voxtral_runner.cpp and benchmarks both the Gemma3 and Voxtral models at the module level. Differential Revision: [D84958564](https://our.internmc.facebook.com/intern/diff/D84958564/) [ghstack-poisoned]
1 parent b325341 commit 96704b5

1 file changed: +3 -0 lines changed


backends/cuda/cuda_backend.py

Lines changed: 3 additions & 0 deletions
@@ -140,6 +140,9 @@ def preprocess(
                 user_input_placeholders.append(node.meta["val"])
 
         options: dict[str, typing.Any] = {
+            # Disable this to support sdpa decomposition
+            # TODO(gasoonjia): remove it after pin bump to latest pytorch
+            "loop_ordering_after_fusion": False,
             # Better model precision
             "emulate_precision_casts": True,
             # Embed CUDA kernel binaries directly into the compiled shared object
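
The TODO in the hunk above marks the new option as temporary. One way to keep such a workaround easy to remove is to gate it behind a single flag. This is a minimal sketch, not code from cuda_backend.py: the helper name `build_inductor_options` and its `disable_loop_reordering` parameter are hypothetical, and only the two option keys are taken from the diff.

```python
import typing


def build_inductor_options(
    disable_loop_reordering: bool = True,
) -> dict[str, typing.Any]:
    """Hypothetical helper (not part of cuda_backend.py) that assembles
    the option keys visible in the diff above."""
    options: dict[str, typing.Any] = {
        # Better model precision (key taken from the diff context).
        "emulate_precision_casts": True,
    }
    if disable_loop_reordering:
        # Temporary workaround added by this commit to support SDPA
        # decomposition; meant to be dropped after the PyTorch pin bump.
        options["loop_ordering_after_fusion"] = False
    return options
```

With this shape, removing the workaround later means flipping one default or deleting one branch, and the rest of the options dictionary stays untouched.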
