Commit 50378ba

Revert "hack: acti - mid num blocks"
This reverts commit f670fa4.
1 parent f670fa4 commit 50378ba

1 file changed (+1, -1 lines changed)

csrc/fused_moe/cutlass_backend/cutlass_fused_moe_kernels.cuh

Lines changed: 1 addition & 1 deletion
@@ -2439,7 +2439,7 @@ void doActivation(T* output, GemmOutputType const* gemm_result, float const* fp8

 static int64_t const smCount = tensorrt_llm::common::getMultiProcessorCount();
 // Note: Launching 8 blocks per SM can fully leverage the memory bandwidth (tested on B200).
-int64_t const blocks = std::min(smCount * 128, std::max(expanded_num_tokens, num_padding_tokens));
+int64_t const blocks = std::min(smCount * 16384, std::max(expanded_num_tokens, num_padding_tokens));
 int64_t const threads = ACTIVATION_THREADS_PER_BLOCK;

 auto fn = [&]() {
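
The changed line caps the activation kernel's grid size at `smCount` times a constant multiplier. The sketch below isolates that expression to show how the two multipliers in this diff (128 and 16384) behave. It is only an illustration: the helper `activationBlocks`, its parameter names, and the SM count of 100 are hypothetical and are not taken from the repository, and no device is queried. Note that the in-code comment mentions 8 blocks per SM, which matches neither multiplier.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper mirroring the grid-size expression in the diff.
// `multiplier` stands in for the literal constant (128 before this commit,
// 16384 after it).
constexpr int64_t activationBlocks(
    int64_t smCount, int64_t multiplier, int64_t expandedNumTokens, int64_t numPaddingTokens)
{
    // One block per token (or per padding token, whichever count is larger),
    // capped at smCount * multiplier blocks.
    return std::min(smCount * multiplier, std::max(expandedNumTokens, numPaddingTokens));
}

// Illustrative inputs: 100 SMs (an assumed value) and 1,000,000 expanded tokens.
// With the multiplier 128 the cap applies: 100 * 128 = 12800 blocks.
static_assert(activationBlocks(100, 128, 1000000, 0) == 12800, "cap of 12800 applies");
// With the multiplier 16384 the cap is 1638400, so the token count decides.
static_assert(activationBlocks(100, 16384, 1000000, 0) == 1000000, "token count applies");
// For small inputs both multipliers give the same grid size.
static_assert(activationBlocks(100, 128, 512, 64) == 512, "small input, cap irrelevant");
```

Under these assumed inputs, the larger multiplier effectively removes the cap for any realistic token count, so the grid size tracks `max(expanded_num_tokens, num_padding_tokens)` directly.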
