Commit 36c2d4a
[rollout] fix: correct heap-based load balancing in AsyncLLMServerManager (verl-project#4505)
### What does this PR do?
This PR fixes the load balancing issue in AsyncLLMServerManager where
the heap-based server selection was using hash values instead of
indices, causing unpredictable server selection order after shuffling.
Problem:
- The original implementation used hash(server) as the secondary sort key in the heap.
- When all servers had the same request count (0), the heap would select the server with the minimum hash value, not the first server in the shuffled list.
- This resulted in poor load distribution and defeated the purpose of random shuffling.
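The tie-breaking behavior described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `FakeServer` is a hypothetical stand-in for a server handle, and its hash values are fixed by hand so the outcome is repeatable (real handle hashes are arbitrary).

```python
import heapq


class FakeServer:
    """Hypothetical stand-in for a server handle with a hand-picked hash."""

    def __init__(self, name, fixed_hash):
        self.name = name
        self._fixed_hash = fixed_hash

    def __hash__(self):
        return self._fixed_hash


# Pretend this is the order produced by random.shuffle().
shuffled = [
    FakeServer("Server_C", 30),
    FakeServer("Server_A", 10),
    FakeServer("Server_B", 20),
]

# Old layout: [request_count, hash, server]. All counts are 0, so the
# hash decides which entry ends up at the top of the heap.
weighted_servers = [[0, hash(s), s] for s in shuffled]
heapq.heapify(weighted_servers)

first_by_hash = weighted_servers[0][2].name
print(first_by_hash)  # Server_A, although Server_C is first in the shuffled list
```

With equal request counts, the shuffle has no influence on which server is picked first; only the hash values do.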
Solution:
- Replace hash(server) with explicit indices in the heap structure.
- Heap now sorts by (request_count, index, server) instead of (request_count, hash, server).
- Ensures deterministic selection: when request counts are equal, the server with the lowest index (first in the shuffled list) is always chosen.
Example:
```python
# Before (❌ Broken):
server_handles = [Server_A, Server_B, Server_C]
random.shuffle(server_handles)  # → [Server_C, Server_A, Server_B]
weighted_servers = [[0, hash(s), s] for s in server_handles]
# Heap might select Server_A first (min hash), not Server_C!

# After (✅ Fixed):
server_handles = [Server_A, Server_B, Server_C]
random.shuffle(server_handles)  # → [Server_C, Server_A, Server_B]
weighted_servers = [[0, idx, s] for idx, s in enumerate(server_handles)]
# Heap correctly selects Server_C first (idx=0)
```
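A runnable version of the fixed layout may help show how index tie-breaking behaves over several requests. This is a sketch under assumptions: `build_heap` and `choose_server` are hypothetical helper names, plain strings stand in for server handles, and the "increment the count, then `heapq.heapreplace`" update is one plausible way to do least-loaded selection, not something the commit message confirms.

```python
import heapq


def build_heap(server_handles):
    """Build a min-heap of [request_count, index, server] entries."""
    heap = [[0, idx, s] for idx, s in enumerate(server_handles)]
    heapq.heapify(heap)
    return heap


def choose_server(heap):
    """Pick the least-loaded server; ties go to the lowest index."""
    entry = heap[0]
    entry[0] += 1  # account for the request being routed
    heapq.heapreplace(heap, entry)  # restore the heap order
    return entry[2]


# Pretend this is the order produced by random.shuffle().
shuffled = ["Server_C", "Server_A", "Server_B"]
heap = build_heap(shuffled)

picks = [choose_server(heap) for _ in range(6)]
print(picks[0])  # Server_C, the first entry of the shuffled list
```

Because every index is unique, comparisons are always settled by the count or the index and never fall through to the server object itself, so the handles do not need to be orderable.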
Co-authored-by: Aleksandr Semikin <aesemikin@alice-a100.sas.yp-c.yandex.net>
1 parent ec14a87 commit 36c2d4a
1 file changed: +2 -2 lines changed (lines 73 and 84 each replaced)