Commit 3b7c20a

[Bugfix] Apply same sampling parameters for both n=1 and n>1 (vllm-project#26005)
Signed-off-by: Kenichi Maehashi <[email protected]>
1 parent f9e7148 commit 3b7c20a

1 file changed (+1, -1 lines)

vllm/v1/engine/async_llm.py

Lines changed: 1 addition & 1 deletion
@@ -290,7 +290,7 @@ async def add_request(
             return queue
 
         # Fan out child requests (for n>1).
-        parent_request = ParentRequest(request_id, params)
+        parent_request = ParentRequest(request_id, request.sampling_params)
         for idx in range(params.n):
             request_id, params = parent_request.get_child_info(idx)
             child_request = request if idx == params.n - 1 else copy(request)
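
Illustration (not part of the commit): a minimal, self-contained sketch of why the argument passed to the parent matters. It assumes, without confirmation from this page, that request preprocessing produces a normalized copy of the sampling parameters on the request object, so that child requests derived from the caller's raw `params` could differ from what an n=1 request receives. Every name here (`ToySamplingParams`, `ToyRequest`, `toy_process`, `ToyParentRequest`, `fan_out`, the `default_max_tokens` value, the child ID format) is hypothetical and is not vLLM's API.

```python
from copy import copy
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class ToySamplingParams:
    # Hypothetical stand-in for sampling parameters; not vLLM's class.
    n: int = 1
    max_tokens: Optional[int] = None


@dataclass
class ToyRequest:
    # Hypothetical processed request carrying its own params copy.
    request_id: str
    sampling_params: ToySamplingParams


def toy_process(request_id, params, default_max_tokens=16):
    """Assumed preprocessing step: copy the params and fill in defaults."""
    processed = replace(params)  # fresh copy; caller's object is untouched
    if processed.max_tokens is None:
        processed.max_tokens = default_max_tokens
    return ToyRequest(request_id, processed)


class ToyParentRequest:
    """Hypothetical parent that derives per-child IDs and n=1 params."""

    def __init__(self, request_id, sampling_params):
        self.request_id = request_id
        self.sampling_params = sampling_params

    def get_child_info(self, idx):
        child_params = replace(self.sampling_params, n=1)
        return f"{idx}_{self.request_id}", child_params


def fan_out(request_id, params, use_processed):
    """Return the request(s) that would be submitted for n=1 or n>1."""
    request = toy_process(request_id, params)
    if params.n == 1:
        return [request]  # n=1 path always uses the processed params

    # use_processed=False mirrors the removed line, True mirrors the added one.
    source = request.sampling_params if use_processed else params
    parent = ToyParentRequest(request_id, source)
    children = []
    for idx in range(params.n):
        child_id, child_params = parent.get_child_info(idx)
        # Reuse the original request object for the last child.
        child = request if idx == params.n - 1 else copy(request)
        child.request_id = child_id
        child.sampling_params = child_params
        children.append(child)
    return children


if __name__ == "__main__":
    for flag in (False, True):
        kids = fan_out("req", ToySamplingParams(n=3), use_processed=flag)
        print(flag, [k.sampling_params.max_tokens for k in kids])
```

In this toy model, children built from the raw parameters miss the default applied during preprocessing, while children built from `request.sampling_params` match the n=1 path.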
