Commit b6a7f8f

smallv0221 and FrostML authored
topk kernel optimization (#811)
* topk kernel memory optimization

Co-authored-by: liu zhengxi <[email protected]>
1 parent 79eaa2b commit b6a7f8f

1 file changed: +1 -2 lines changed

paddlenlp/ops/patches/FasterTransformer/cuda/topk_kernels.cu

Lines changed: 1 addition & 2 deletions
@@ -604,8 +604,7 @@ void topK_sampling_kernel_kernelLauncher(void* workspace,
 
   int topk_tmp_ids_buf_size =
       args.batch_size_ * args.candidate_num_;  // type int
-  int temp_log_probs_buf_size =
-      args.batch_size_ * args.candidate_num_ * vocab_size;
+  int temp_log_probs_buf_size = args.batch_size_ * vocab_size;
   int topk_tmp_val_buf_size = args.batch_size_ * args.candidate_num_;  // type T
 
   temp_log_probs_buf_size = (int)(ceil(temp_log_probs_buf_size / 4.)) * 4;
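
The change drops the `candidate_num_` factor from `temp_log_probs_buf_size`, so the element count of that buffer shrinks by roughly that factor before the round-up to a multiple of 4. The sketch below only reproduces the size arithmetic from the diff so the two values can be compared. The helper names (`round_up_to_4`, `temp_log_probs_elems_before`, `temp_log_probs_elems_after`) are hypothetical and are not part of the patch or of FasterTransformer.

```cpp
#include <cassert>
#include <cmath>

// Round an element count up to the next multiple of 4, mirroring the
// `(int)(ceil(n / 4.)) * 4` expression kept unchanged by the commit.
inline int round_up_to_4(int n) {
  return static_cast<int>(std::ceil(n / 4.)) * 4;
}

// Hypothetical helper: element count of temp_log_probs BEFORE the commit
// (batch_size * candidate_num * vocab_size, then rounded up).
inline int temp_log_probs_elems_before(int batch_size, int candidate_num,
                                       int vocab_size) {
  assert(batch_size > 0 && candidate_num > 0 && vocab_size > 0);
  return round_up_to_4(batch_size * candidate_num * vocab_size);
}

// Hypothetical helper: element count of temp_log_probs AFTER the commit
// (batch_size * vocab_size, then rounded up).
inline int temp_log_probs_elems_after(int batch_size, int vocab_size) {
  assert(batch_size > 0 && vocab_size > 0);
  return round_up_to_4(batch_size * vocab_size);
}
```

With illustrative values that do not come from the commit (batch size 8, 4 candidates, vocabulary size 50257), `temp_log_probs_elems_before(8, 4, 50257)` gives 1608224 elements and `temp_log_probs_elems_after(8, 50257)` gives 402056, a 4x reduction for this one buffer. Like the original code, the sketch does the multiplication in `int`, so very large products could overflow.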
