Commit f2dd0ee

[None][chore] Correct sorting order for attention DP scheduling to prioritize non-relaxed requests (#11106)
Signed-off-by: Lance Liao <108499334+lancelly@users.noreply.github.com>
1 parent 322471c commit f2dd0ee

1 file changed: +1 -1 lines changed

tensorrt_llm/_torch/pyexecutor/executor_request_queue.py

Lines changed: 1 addition & 1 deletion
@@ -445,7 +445,7 @@ def get_relax_value(req_item):
                 return True
             return scheduling_params.attention_dp_relax
 
-        new_requests = sorted(new_requests, key=get_relax_value, reverse=True)
+        new_requests = sorted(new_requests, key=get_relax_value)
 
         # Try to put the requests to the target dp rank until the max_num_active_requests is reached
         remaining_unscheduled = []
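
A minimal sketch of why dropping `reverse=True` puts non-relaxed requests first: Python orders `False` before `True`, so an ascending sort on the relax flag places requests with `attention_dp_relax == False` at the front, and `sorted` is stable, so arrival order is kept within each group. `FakeRequest` and the sample data are hypothetical stand-ins for illustration, not TensorRT-LLM types, and this `get_relax_value` is a simplified version of the helper in the diff.

```python
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a queued request item (not a TensorRT-LLM type)."""
    name: str
    attention_dp_relax: bool


def get_relax_value(req: FakeRequest) -> bool:
    # Simplified: the real helper also returns True when no scheduling params exist.
    return req.attention_dp_relax


# Requests in arrival order: a and c are relaxed, b and d are not.
requests = [
    FakeRequest("a", True),
    FakeRequest("b", False),
    FakeRequest("c", True),
    FakeRequest("d", False),
]

# Before the change: descending sort, relaxed (True) requests come first.
old_order = [r.name for r in sorted(requests, key=get_relax_value, reverse=True)]

# After the change: ascending sort, non-relaxed (False) requests come first.
new_order = [r.name for r in sorted(requests, key=get_relax_value)]

print(old_order)
print(new_order)
```

Under these assumptions, the non-relaxed requests `b` and `d` are considered for their target DP rank before the relaxed ones, which matches the intent stated in the commit title.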
