
Conversation

@eric-higgins-ai commented Jul 17, 2025

Purpose

The engine core is run in a spawned subprocess, which Ray interprets as a new job with its own runtime environment. Because we don't pass the original job's runtime env through to that subprocess, anything provided via the Ray runtime environment (such as vllm itself as a pip dependency, or env vars) can't be pulled in by the workers it creates.
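
To illustrate the failure mode, here is a simplified stand-in (not vLLM's actual engine-core code): the spawned process calls ray.init() without a runtime_env, so Ray registers it as a fresh job that does not inherit the parent job's pip packages or env vars.

    # Simplified stand-in for the engine-core subprocess, not vLLM's real code.
    import multiprocessing

    def engine_core_main():
        import ray
        # No runtime_env is passed here, so Ray treats this process as a brand
        # new job; workers it spawns won't see the parent job's pip deps or env vars.
        ray.init()
        ray.shutdown()

    if __name__ == "__main__":
        proc = multiprocessing.Process(target=engine_core_main)
        proc.start()
        proc.join()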

This issue was reported here.

Test Plan

Ran a Ray job with the following code:

    import ray
    from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor

    vision_processor_config = vLLMEngineProcessorConfig(
        model="Qwen/Qwen2.5-VL-32B-Instruct",
        engine_kwargs=dict(
            tensor_parallel_size=1,
            pipeline_parallel_size=NUMBER_OF_GPUS,  # placeholder defined elsewhere in the job script
            max_model_len=4096,
            enable_chunked_prefill=True,
            max_num_batched_tokens=2048,
            distributed_executor_backend="ray",
            device="cuda",
        ),
        # Override Ray's runtime env to include the Hugging Face token.
        # Ray Data uses Ray under the hood to orchestrate the inference pipeline.
        runtime_env=dict(
            env_vars=dict(
                HF_TOKEN="<token>",
                VLLM_USE_V1="1",
            ),
        ),
        batch_size=1,
        concurrency=1,
        has_image=False,
    )

    # Build the processor.
    processor = build_llm_processor(
        vision_processor_config,
        preprocess=lambda row: dict(
            messages=[
                {"role": "system", "content": "You are a bot that responds with haikus."},
                {"role": "user", "content": row["item"]},
            ],
            sampling_params=dict(
                temperature=0.3,
                max_tokens=250,
            ),
        ),
        postprocess=lambda row: dict(
            answer=row["generated_text"],
            **row,  # This returns all the original columns in the dataset.
        ),
    )

    # Create the dataset and run the pipeline.
    ds = ray.data.from_items(["Start of the haiku is: Complete this for me..."])
    ds = processor(ds)
    ds.show(limit=1)

Test Result

I checked in the Ray dashboard that the launched job has the runtime env provided in the engine_kwargs.
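
As an aside, one way to inspect which runtime_env a Ray worker actually sees from code (an illustrative sketch using public Ray APIs, not part of the test plan above):

    import ray

    @ray.remote
    def report_runtime_env():
        # Returns the runtime_env dict visible to the worker running this task.
        return dict(ray.get_runtime_context().runtime_env)

    if __name__ == "__main__":
        ray.init()
        print(ray.get(report_runtime_env.remote()))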


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request enables the propagation of a Ray runtime environment to vLLM's distributed workers. This is a useful feature when vLLM is used as a component within a larger Ray application that defines a specific runtime environment.

The changes are well-targeted:

  1. The ParallelConfig is extended to hold an optional runtime_env.
  2. When creating the engine configuration inside a Ray actor, the current runtime_env is fetched from the Ray context and stored in the ParallelConfig.
  3. When the Ray executor initializes the Ray cluster, it now passes this runtime_env to ray.init(), ensuring that subsequently created workers inherit the correct environment.

I've reviewed the implementation, and the logic appears sound and correctly handles the cases where Ray is already initialized versus when vLLM needs to initialize it. The changes are constrained to the Ray execution path and should not affect other backends. Overall, this is a good addition to improve vLLM's integration with the Ray ecosystem.
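
To make the three steps above concrete, here is a rough sketch of that flow. The field and helper names are assumptions for illustration, not the exact vLLM diff:

    # Rough sketch of the propagation flow described above; names are illustrative.
    from dataclasses import dataclass, field
    from typing import Optional

    import ray

    @dataclass
    class ParallelConfig:
        # ...other parallelism settings elided...
        ray_runtime_env: Optional[dict] = field(default=None)  # assumed field name

    def capture_runtime_env(parallel_config: ParallelConfig) -> None:
        # Step 2: inside a Ray actor, remember the current job's runtime_env.
        if ray.is_initialized():
            parallel_config.ray_runtime_env = dict(
                ray.get_runtime_context().runtime_env
            )

    def initialize_ray_cluster(
        parallel_config: ParallelConfig, ray_address: Optional[str] = None
    ) -> None:
        # Step 3: if the executor has to initialize Ray itself, forward the
        # captured runtime_env so newly created workers inherit it.
        if not ray.is_initialized():
            ray.init(address=ray_address, runtime_env=parallel_config.ray_runtime_env)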

@ruisearch42 (Collaborator) left a comment


Not sure if I fully understand the problem. @eric-higgins-ai Would you mind clarifying a bit more?
cc @lk-chen @kouroshHakha are you aware of this Ray Data LLM issue?

# This call initializes Ray automatically if it is not initialized,
# but we should not do this here.
placement_group = ray.util.get_current_placement_group()
runtime_env = ray.get_runtime_context().runtime_env
Collaborator (commenting on the diff above): What is passed in the runtime env?

@eric-higgins-ai (Author) commented Jul 20, 2025

@ruisearch42 Sorry, I probably should have explained it more thoroughly in the description. We're pulling vllm in via the Ray runtime environment, so we launch the job with something like ray job submit --runtime-env-json '{"pip": ["vllm==0.9.2"]}' -- python3 script.py. vllm spawns a subprocess for the engine core and then runs ray.init in that subprocess; since this process is independent from the main one, Ray interprets it as a new job with its own runtime environment. The current code doesn't pass any runtime environment into that ray.init call, so any tasks spawned by this new process (like the workers here) won't have vllm installed.

This issue was also reported a few days ago here (I intended to link it in the description but seem to have forgotten).

@ruisearch42 (Collaborator) commented Jul 21, 2025

@eric-higgins-ai thanks for the context. The PR generally looks good to me.
Some quick questions: is the main goal passing pip dependencies in runtime_env? Is using an image with vLLM not an option?

I checked in the Ray dashboard that the launched job has the runtime env provided in the engine_kwargs.

Can you clarify what runtime_env you saw in the Ray dashboard? Did you pass in pip dependencies?

@ruisearch42 (Collaborator) commented

@eric-higgins-ai , could you add a unit test? It will be useful to verify it works and prevent future regressions.
After that I will approve and merge the PR.
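
For reference, a rough idea of what such a regression test could assert using only public Ray APIs (an illustrative sketch, not necessarily the test that was eventually added):

    import os

    import ray

    def test_runtime_env_visible_to_workers():
        # If the runtime_env passed to ray.init() propagates, a task should
        # observe the env var we set here.
        ray.init(runtime_env={"env_vars": {"RUNTIME_ENV_MARKER": "1"}})
        try:
            @ray.remote
            def read_marker():
                return os.environ.get("RUNTIME_ENV_MARKER")

            assert ray.get(read_marker.remote()) == "1"
        finally:
            ray.shutdown()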

@eric-higgins-ai (Author) commented Jul 23, 2025

@ruisearch42 to answer your questions:

  1. The main goal is indeed to pass pip dependencies in runtime_env. We could build a Docker image with vllm, but (a) it doesn't integrate that well with our infra, and (b) it seems to me like this should be supported by vllm anyway, and it doesn't seem that hard to add support for it (considering that this PR is quite small).
  2. I passed in {"pip": ["vllm==0.9.2"]} and saw the same thing in the Ray dashboard. We're also passing some env vars and I saw those too.
  3. I'm a little busy with other things right now, but I'll try to add a unit test in the next few days.

@ruisearch42 (Collaborator) commented

Thanks @eric-higgins-ai.
I'm fixing the issue in this PR: #22040, similar to what's done here and with an added unit test.
Could you let me know your email? I will add you as co-author.

@eric-higgins-ai (Author) commented

Thanks for the fix @ruisearch42! I would rather not post my email here for fear of receiving spam 😅 I'm OK with not being credited as co-author.

@hmellor (Member) commented Aug 8, 2025

FYI @eric-higgins-ai: if you go to https://github.com/settings/emails, you can use a GitHub-provided noreply email for signing off commits.
