Commit 685c99e

Yue Zhang and njhill authored
[KV offload] Offloading connector async scheduling support (vllm-project#27648)
Signed-off-by: KevinCheung2259 <2651309292@qq.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
1 parent 1e88fb7 commit 685c99e

1 file changed: +2 -2 lines changed


vllm/distributed/kv_transfer/kv_connector/v1/offloading_connector.py

Lines changed: 2 additions & 2 deletions
@@ -274,8 +274,8 @@ def _get_reqs_to_store(self, scheduler_output: SchedulerOutput):
             if num_new_blocks <= 0:
                 continue
 
-            num_gpu_blocks = num_blocks * self.block_size_factor
-            assert len(req.block_hashes) >= num_gpu_blocks
+            # NOTE: In async scheduling, placeholders may temporarily make
+            # len(req.block_hashes) < num_blocks * self.block_size_factor.
 
             new_block_hashes = self._get_block_hashes(
                 req, start_idx=start_block_idx, end_idx=num_blocks
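
The condition described in the NOTE can be illustrated with a small standalone sketch. It assumes that each offloaded block is identified by the hash of the last GPU block in its group of `block_size_factor` blocks. The helper `get_offload_block_hashes` is hypothetical and is not the connector's `_get_block_hashes`; the values are made up for illustration.

```python
from itertools import islice


def get_offload_block_hashes(gpu_block_hashes, block_size_factor, start_idx, end_idx):
    """Hypothetical stand-in: pick one hash per offloaded block.

    Assumption: the hash of the last GPU block in each group of
    `block_size_factor` blocks identifies the offloaded block.
    """
    first = start_idx * block_size_factor + block_size_factor - 1
    stop = end_idx * block_size_factor
    # islice stops quietly when the input runs out, so a short hash
    # list yields fewer results instead of raising an error.
    return list(islice(gpu_block_hashes, first, stop, block_size_factor))


block_size_factor = 2
num_blocks = 3  # block count that includes not-yet-filled placeholders

# Only four GPU block hashes exist so far; the third offloaded block
# has no hashes yet.
known_hashes = ["h0", "h1", "h2", "h3"]

# This is the condition the removed assert required; it is False here.
removed_assert_holds = len(known_hashes) >= num_blocks * block_size_factor

# Hashes are returned only for the blocks that are fully hashed.
selected = get_offload_block_hashes(known_hashes, block_size_factor, 0, num_blocks)
print(removed_assert_holds, selected)
```

In this sketch the removed assertion would have failed even though the request state is valid, while the slice-based selection simply returns fewer hashes until the remaining ones become available.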