Commit 41901d1

clip input shape to max tunable token count
Signed-off-by: Anthony Chang <[email protected]>
1 parent 4121b67 commit 41901d1

1 file changed: +4 -1 lines changed

tensorrt_llm/_torch/autotuner.py

Lines changed: 4 additions & 1 deletion
@@ -1141,7 +1141,10 @@ def _optimization_profiles(
             # Add the current input value as one of the opt values
             opt_shapes = set(opt_shapes)
             opt_shapes.add(
-                base_profile.shapes[spec.input_idx][spec.dim_idx].val)
+                min(
+                    tuning_config.tune_max_num_tokens,
+                    base_profile.shapes[spec.input_idx][spec.dim_idx].val,
+                ))
             opt_shapes = sorted(list(opt_shapes))
             opt_shapes_max = tuple(opt_shapes[1:]) + (float('inf'), )
             opt_shapes_max = {
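
To illustrate the effect of the change, here is a minimal standalone sketch of the clipping step. It is not the autotuner code: the function name `build_opt_ranges` and its plain-integer arguments are hypothetical stand-ins for `opt_shapes`, `base_profile.shapes[spec.input_idx][spec.dim_idx].val`, and `tuning_config.tune_max_num_tokens`, and it assumes the token limit is always an integer.

```python
def build_opt_ranges(opt_shapes, current_val, tune_max_num_tokens):
    """Hypothetical helper mirroring the lines touched by this commit.

    opt_shapes:          candidate tuning sizes for one dynamic dimension
    current_val:         size of that dimension in the current input
    tune_max_num_tokens: largest size that should ever be tuned
                         (assumed here to be an int)
    """
    shapes = set(opt_shapes)
    # Add the current input value as one of the opt values, clipped so
    # that an oversized input does not add a bucket above the limit.
    shapes.add(min(tune_max_num_tokens, current_val))
    shapes = sorted(shapes)
    # Each bucket's upper bound is the next opt value; the last bucket
    # is open-ended.
    upper_bounds = tuple(shapes[1:]) + (float("inf"),)
    return shapes, upper_bounds


# An input of 100 tokens with a limit of 8 adds no new bucket.
clipped_shapes, clipped_upper = build_opt_ranges([1, 2, 4, 8], 100, 8)

# An input below the limit is still added as its own opt value.
small_shapes, small_upper = build_opt_ranges([1, 2, 4, 8], 3, 8)
```

In this sketch, the pre-commit behaviour corresponds to adding `current_val` directly, which would have inserted 100 as an extra opt value in the first call.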
