v1: Offloading connector #22595

Open: orozery wants to merge 4 commits into main from offloading-connector

Conversation

@orozery (Contributor) commented Aug 10, 2025

This PR adds an offloading connector that delegates to the generic API introduced in #19848.
Implementations of this API are built through a factory, which is currently empty.
A small follow-up PR will register a CPU implementation based on #20075 (scheduler side) and #21448 (worker side).

Part of RFC #19854.
Depends on PRs #19728, #19848, #19737.
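
As an illustration of the factory arrangement described above, here is a minimal sketch of a registry that starts empty and lets a later change register an implementation by name. Every name in it (`OffloadingSpecSketch`, `OffloadingFactorySketch`, `register`, `create`, the `"cpu"` key) is hypothetical and is not taken from the PR's code.

```python
from typing import Callable, Dict


class OffloadingSpecSketch:
    """Hypothetical description of one offloading backend."""

    def __init__(self, name: str) -> None:
        self.name = name


class OffloadingFactorySketch:
    """Hypothetical registry mapping a backend name to a builder.

    The registry starts empty; backends are added by later code.
    """

    _registry: Dict[str, Callable[[], OffloadingSpecSketch]] = {}

    @classmethod
    def register(cls, name: str,
                 builder: Callable[[], OffloadingSpecSketch]) -> None:
        # Refuse silent overrides so two backends cannot claim one name.
        if name in cls._registry:
            raise ValueError(f"backend {name!r} is already registered")
        cls._registry[name] = builder

    @classmethod
    def create(cls, name: str) -> OffloadingSpecSketch:
        try:
            builder = cls._registry[name]
        except KeyError:
            raise ValueError(
                f"no offloading backend registered as {name!r}") from None
        return builder()
```

With an empty registry, `create("cpu")` raises `ValueError`; after a follow-up change calls `register("cpu", ...)`, the same call returns a backend object.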

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This PR introduces a new offloading connector. The implementation is extensive and adds a lot of new components. My review found several critical issues that need to be addressed. These include a race condition in the tests, a critical assertion that would crash workers on transfer failures, a resource leak due to unjoined threads, and an incorrect list slicing that would lead to errors. These issues affect both the correctness of the new feature and the reliability of its tests.

@orozery force-pushed the offloading-connector branch 2 times, most recently from 4b24d03 to 4fca175 (August 10, 2025 14:49)

@KuntaiDu (Collaborator) commented:

mark, will take a look and review after this PR gets stable.

@orozery force-pushed the offloading-connector branch from 4fca175 to 8d7a0d7 (August 11, 2025 13:43)
@mergify bot added the documentation label (Aug 11, 2025)
@orozery force-pushed the offloading-connector branch from 8d7a0d7 to 866a51c (August 11, 2025 18:54)

This commit adds a new offloading component, composed of:
1. A scheduler-side OffloadingManager (abstract) which kicks off KV data transfers and keeps track of offloaded data.
2. A worker-side OffloadingQueueManager which asynchronously manages KV transfers.

Signed-off-by: Or Ozeri <[email protected]>
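
The split described in this commit message can be sketched roughly as follows: an abstract scheduler-side tracker plus a worker-side queue that runs transfers on a background thread. This is a sketch under stated assumptions, not the PR's implementation; all class and method names (`OffloadingManagerSketch`, `InMemoryManager`, `TransferQueueSketch`, `lookup`, `mark_stored`, `submit`, `get_finished`, `shutdown`) are hypothetical.

```python
import queue
import threading
from abc import ABC, abstractmethod
from typing import Callable, List, Set, Tuple


class OffloadingManagerSketch(ABC):
    """Scheduler side (hypothetical): tracks which blocks are offloaded."""

    @abstractmethod
    def lookup(self, block_hashes: List[int]) -> int:
        """Return how many leading blocks are already offloaded."""

    @abstractmethod
    def mark_stored(self, block_hashes: List[int]) -> None:
        """Record that the given blocks finished offloading."""


class InMemoryManager(OffloadingManagerSketch):
    """Toy implementation that remembers stored hashes in a set."""

    def __init__(self) -> None:
        self._stored: Set[int] = set()

    def lookup(self, block_hashes: List[int]) -> int:
        hits = 0
        for block_hash in block_hashes:
            if block_hash not in self._stored:
                break
            hits += 1
        return hits

    def mark_stored(self, block_hashes: List[int]) -> None:
        self._stored.update(block_hashes)


class TransferQueueSketch:
    """Worker side (hypothetical): runs transfer jobs asynchronously."""

    def __init__(self, transfer_fn: Callable[[List[int]], bool]) -> None:
        self._transfer_fn = transfer_fn
        self._jobs: "queue.Queue" = queue.Queue()
        self._finished: "queue.Queue" = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, job_id: int, block_hashes: List[int]) -> None:
        self._jobs.put((job_id, block_hashes))

    def _run(self) -> None:
        while True:
            item = self._jobs.get()
            if item is None:  # sentinel: stop the worker thread
                return
            job_id, block_hashes = item
            try:
                ok = self._transfer_fn(block_hashes)
            except Exception:
                ok = False  # report the failure instead of crashing
            self._finished.put((job_id, ok))

    def get_finished(self) -> List[Tuple[int, bool]]:
        """Drain and return (job_id, success) pairs completed so far."""
        done: List[Tuple[int, bool]] = []
        while True:
            try:
                done.append(self._finished.get_nowait())
            except queue.Empty:
                return done

    def shutdown(self) -> None:
        self._jobs.put(None)
        self._thread.join()
```

In this sketch a failed transfer is reported as `(job_id, False)` rather than asserted on, and `shutdown` joins the worker thread so it is not left running.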
This commit moves the request block hashes from the KVCacheManager
to the Request object itself.
In particular, this will allow connectors to access the request block hashes.

Signed-off-by: Or Ozeri <[email protected]>
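
To make the idea of per-request block hashes concrete, here is a minimal sketch in which a request object owns its list of hashes and extends it as tokens arrive, so any component holding the request can read them. The hashing scheme (SHA-256 chained over full blocks) and all names (`RequestSketch`, `hash_block`, `update_block_hashes`) are assumptions made for illustration; the thread does not show how vLLM computes these hashes.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def hash_block(parent: bytes, tokens: List[int]) -> bytes:
    """Hash one block of tokens, chained to the previous block's hash."""
    digest = hashlib.sha256(parent)
    digest.update(",".join(map(str, tokens)).encode())
    return digest.digest()


@dataclass
class RequestSketch:
    """Hypothetical request that carries its own block hashes."""

    token_ids: List[int]
    block_size: int
    block_hashes: List[bytes] = field(default_factory=list)

    def update_block_hashes(self) -> None:
        """Hash every full block that has not been hashed yet."""
        start = len(self.block_hashes) * self.block_size
        while start + self.block_size <= len(self.token_ids):
            parent = self.block_hashes[-1] if self.block_hashes else b""
            block = self.token_ids[start:start + self.block_size]
            self.block_hashes.append(hash_block(parent, block))
            start += self.block_size
```

Because each hash is chained to its parent, two requests that share a token prefix share the hashes of the blocks in that prefix, which is what lets a connector match already-stored blocks by hash.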
This commit adds a new scheduler-side connector API
to collect KV cache events.
Additionally, we add a medium field to KV events, to allow
distinguishing KV events on different mediums
(e.g. blocks stored on cpu, disk, or gpu (default)).

Signed-off-by: Or Ozeri <[email protected]>
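
A medium field on KV events might look like the sketch below, where each event records where its blocks live and a consumer groups events by medium. The class name `BlockStoredSketch`, the helper `group_by_medium`, and the string values `"gpu"` and `"cpu"` are hypothetical; only the idea of a medium field with GPU as the default comes from the commit message.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class BlockStoredSketch:
    """Hypothetical KV event: blocks were stored on some medium."""

    block_hashes: Tuple[int, ...]
    medium: str = "gpu"  # assumed default, per the commit message


def group_by_medium(
        events: Iterable[BlockStoredSketch]) -> Dict[str, List[int]]:
    """Collect stored block hashes per medium, preserving event order."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for event in events:
        grouped[event.medium].extend(event.block_hashes)
    return dict(grouped)
```

A consumer that only cares about one tier can then read a single key, for example the `"cpu"` entry, and ignore the rest.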
This commit introduces a new OffloadingConnector for
offloading blocks of KV data via a generic interface.

Signed-off-by: Or Ozeri <[email protected]>
@orozery force-pushed the offloading-connector branch from 866a51c to 4872976 (August 11, 2025 19:11)
Labels: ci/build, documentation, v1
2 participants