[Misc] allow pulling vllm in Ray runtime environment #21143
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a reduced set of checks runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Code Review
This pull request enables the propagation of a Ray runtime environment to vLLM's distributed workers. This is a useful feature when vLLM is used as a component within a larger Ray application that defines a specific runtime environment.
The changes are well-targeted:
- `ParallelConfig` is extended to hold an optional `runtime_env`.
- When the engine configuration is created inside a Ray actor, the current `runtime_env` is fetched from the Ray runtime context and stored in the `ParallelConfig`.
- When the Ray executor initializes the Ray cluster, it now passes this `runtime_env` to `ray.init()`, ensuring that subsequently created workers inherit the correct environment.
I've reviewed the implementation, and the logic appears sound and correctly handles the cases where Ray is already initialized versus when vLLM needs to initialize it. The changes are constrained to the Ray execution path and should not affect other backends. Overall, this is a good addition to improve vLLM's integration with the Ray ecosystem.
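For illustration, here is a minimal sketch of the flow described above. It is not the PR's actual diff: `ParallelConfig` is reduced to a stand-in dataclass and the helper functions are hypothetical; only `ray.init()` and `ray.get_runtime_context()` are real Ray APIs.

```python
from dataclasses import dataclass
from typing import Any, Optional

import ray


@dataclass
class ParallelConfig:
    # Stand-in for vLLM's real ParallelConfig; other fields elided.
    ray_runtime_env: Optional[dict[str, Any]] = None


def build_parallel_config() -> ParallelConfig:
    config = ParallelConfig()
    # Inside a Ray actor, capture the current job's runtime environment.
    if ray.is_initialized():
        config.ray_runtime_env = dict(ray.get_runtime_context().runtime_env)
    return config


def initialize_ray_cluster(config: ParallelConfig) -> None:
    # If vLLM must initialize Ray itself (e.g. in a spawned subprocess),
    # re-attach the captured runtime env so new workers inherit it.
    if not ray.is_initialized():
        ray.init(runtime_env=config.ray_runtime_env)
```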
Signed-off-by: eric-higgins-ai <[email protected]>
Not sure if I fully understand the problem. @eric-higgins-ai Would you mind clarifying a bit more?
cc @lk-chen @kouroshHakha are you aware of this Ray Data LLM issue?
```python
# This call initializes Ray automatically if it is not initialized,
# but we should not do this here.
placement_group = ray.util.get_current_placement_group()
runtime_env = ray.get_runtime_context().runtime_env
```
What is passed in the runtime env?
@ruisearch42 sorry, I probably should have explained it more thoroughly in the description. We're pulling vllm in through the Ray runtime environment, so we launch the job with something like the snippet below. This issue was also reported a few days ago here (I intended to link this in the description but seem to have forgotten).
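The original snippet didn't survive this page's rendering; a hypothetical equivalent of such a job launch, with the dashboard address and entrypoint script made up for illustration:

```python
# vllm is not baked into the cluster image; it is pulled in through the
# job's runtime environment instead.
from ray.job_submission import JobSubmissionClient

client = JobSubmissionClient("http://127.0.0.1:8265")
client.submit_job(
    entrypoint="python run_vllm_engine.py",  # placeholder entrypoint
    runtime_env={"pip": ["vllm"]},
)
```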
@eric-higgins-ai thanks for the context. The PR generally looks good to me.
Can you clarify what runtime_env you saw in the Ray dashboard? Did you pass in pip dependencies?
@eric-higgins-ai, could you add a unit test? It would be useful to verify the fix works and prevent future regressions.
@ruisearch42 to answer your questions:
Thanks @eric-higgins-ai.
Thanks for the fix @ruisearch42! I'd rather not put my email here for fear of receiving spam 😅 I'm OK with not being credited as co-author.
FYI @eric-higgins-ai: if you go to https://github.com/settings/emails you can use a GitHub-provided noreply email for signing off commits.
Purpose
The engine is run in a spawned subprocess, which Ray interprets as a new job with its own runtime environment. This means vllm can't be pulled in through the Ray runtime environment, because the original job's runtime env is not passed through to the subprocess.
This issue was reported here.
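For illustration, a simplified reproduction of the failure mode (not vLLM's actual process-handling code): a subprocess that initializes Ray itself registers as a brand-new job, so the parent job's runtime env is not inherited.

```python
import multiprocessing


def engine_entrypoint():
    import ray

    # This registers a new Ray job: the parent job's runtime_env
    # (e.g. {"pip": ["vllm"]}) is not carried over.
    ray.init()
    print(ray.get_runtime_context().runtime_env)  # -> {}


if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")
    proc = ctx.Process(target=engine_entrypoint)
    proc.start()
    proc.join()
```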
Test Plan
Ran a Ray job with the following code
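The original code isn't preserved in this page; a hypothetical reconstruction of such a job follows, with the model name as a placeholder. The key point is that vllm is importable only through the job's runtime environment.

```python
import ray


@ray.remote(num_gpus=1)
def run_engine():
    # Importable only because the job was launched with
    # runtime_env={"pip": ["vllm"]}.
    from vllm import LLM

    llm = LLM(model="facebook/opt-125m", distributed_executor_backend="ray")
    return llm.generate("Hello, my name is")


ray.init(runtime_env={"pip": ["vllm"]})
print(ray.get(run_engine.remote()))
```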
Test Result
I checked in the Ray dashboard that the launched job has the runtime env provided in the `engine_kwargs`.
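A complementary programmatic check (hypothetical, not part of the PR's test plan): from inside any Ray task in the launched job, the inherited runtime environment can be inspected directly.

```python
import ray


@ray.remote
def show_runtime_env():
    # Returns the runtime env of the job this task runs under.
    return ray.get_runtime_context().runtime_env


# Inside a job submitted with runtime_env={"pip": ["vllm"]}, the printed
# dict should include that pip entry.
print(ray.get(show_runtime_env.remote()))
```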