Commit 87551df

Turn off the use_fp8_context_fmha plugin for all FP8 Qwen models
1 parent 743ae21 commit 87551df

6 files changed: 6 additions, 6 deletions

qwen/engine-qwen-2-5-14b-coder-instruct/config.yaml

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ trt_llm:
     tensor_parallel_count: 1
     plugin_configuration:
       use_paged_context_fmha: true
-      use_fp8_context_fmha: true
+      use_fp8_context_fmha: false
       paged_kv_cache: true
   runtime:
     batch_scheduler_policy: max_utilization

qwen/engine-qwen-2-5-14b-instruct/config.yaml

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ trt_llm:
     tensor_parallel_count: 1
     plugin_configuration:
       use_paged_context_fmha: true
-      use_fp8_context_fmha: true
+      use_fp8_context_fmha: false
       paged_kv_cache: true
   runtime:
     batch_scheduler_policy: max_utilization

qwen/engine-qwen-2-5-32b-coder-instruct/config.yaml

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ trt_llm:
     tensor_parallel_count: 1
     plugin_configuration:
       use_paged_context_fmha: true
-      use_fp8_context_fmha: true
+      use_fp8_context_fmha: false
       paged_kv_cache: true
   runtime:
     batch_scheduler_policy: max_utilization

qwen/engine-qwen-2-5-32b-instruct/config.yaml

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ trt_llm:
     tensor_parallel_count: 1
     plugin_configuration:
       use_paged_context_fmha: true
-      use_fp8_context_fmha: true
+      use_fp8_context_fmha: false
       paged_kv_cache: true
   runtime:
     batch_scheduler_policy: max_utilization

qwen/engine-qwen-2-5-72b-instruct/config.yaml

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ trt_llm:
     tensor_parallel_count: 2
     plugin_configuration:
       use_paged_context_fmha: true
-      use_fp8_context_fmha: true
+      use_fp8_context_fmha: false
       paged_kv_cache: true
   runtime:
     batch_scheduler_policy: max_utilization

qwen/engine-qwen-2-5-72b-math-instruct/config.yaml

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ trt_llm:
     tensor_parallel_count: 2
     plugin_configuration:
       use_paged_context_fmha: true
-      use_fp8_context_fmha: true
+      use_fp8_context_fmha: false
       paged_kv_cache: true
   runtime:
     batch_scheduler_policy: max_utilization
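
The same one-line change is repeated across all six config files, so it could be applied or audited mechanically. The following is a minimal sketch of that idea and not the method used to make this commit. The helper names `disable_fp8_context_fmha` and `fp8_context_fmha_enabled` are hypothetical, and the regex-based line edit is an assumption chosen so the rest of each file stays untouched.

```python
import re

# Matches the flag line at any indentation and captures the "key: " prefix and the value.
# [ \t]* is used instead of \s* so the pattern never spans a line break.
_FLAG = re.compile(
    r"^([ \t]*use_fp8_context_fmha:[ \t]*)(true|false)[ \t]*$",
    re.MULTILINE,
)


def fp8_context_fmha_enabled(config_text: str) -> bool:
    """Return True if the config text sets use_fp8_context_fmha to true."""
    match = _FLAG.search(config_text)
    return match is not None and match.group(2) == "true"


def disable_fp8_context_fmha(config_text: str) -> str:
    """Return the config text with use_fp8_context_fmha set to false.

    Every other line, including indentation, is left unchanged.
    """
    return _FLAG.sub(lambda m: m.group(1) + "false", config_text)


# Illustrative fragment modeled on the changed hunk (indentation is assumed).
sample = (
    "plugin_configuration:\n"
    "  use_paged_context_fmha: true\n"
    "  use_fp8_context_fmha: true\n"
    "  paged_kv_cache: true\n"
)
updated = disable_fp8_context_fmha(sample)
```

To audit a checkout, one could read each `qwen/*/config.yaml` with `pathlib.Path.read_text()` and report any file for which `fp8_context_fmha_enabled` still returns `True`.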
