
fix lmi/vllm virtual envs, update to vllm 0.7.1#2703

Merged
siddvenk merged 1 commit into deepjavalibrary:master from siddvenk:vllm-chat-params
Feb 3, 2025

Conversation

@siddvenk (Contributor) commented Feb 3, 2025

Description

This change updates to vLLM 0.7.1, which involves shuffling some dependencies around and being less strict with dependency versions.

Additionally, it updates the chat processing for vLLM to be functional. There is still a good amount we need to implement for chat processing, which I'll take up in a follow-up PR:

  • Use the sampling params provided directly by the vLLM chat object (via the to_sampling_params method); this ensures we use the correct sampling params for chat requests.
  • Validate function calling and tool usage with this update.
  • Allow users to specify an override chat template and a chat format.
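To illustrate the first follow-up item: the idea is to derive the sampling params from the chat request object itself, so chat requests can't drift out of sync with a separately built params dict. The sketch below is a minimal, self-contained illustration using hypothetical stand-in classes; it does not use vLLM's actual SamplingParams or chat request types, whose fields and signatures differ by version:

```python
from dataclasses import dataclass

# Stand-in for vLLM's SamplingParams (hypothetical, illustration only).
@dataclass
class SamplingParams:
    temperature: float = 1.0
    top_p: float = 1.0
    max_tokens: int = 16

# Stand-in for a chat completion request (hypothetical, illustration only).
@dataclass
class ChatRequest:
    temperature: float = 1.0
    top_p: float = 1.0
    max_tokens: int = 16

    def to_sampling_params(self) -> SamplingParams:
        # The request itself is the single source of truth for sampling
        # settings, so every field maps over in one place.
        return SamplingParams(temperature=self.temperature,
                              top_p=self.top_p,
                              max_tokens=self.max_tokens)

req = ChatRequest(temperature=0.2, max_tokens=64)
params = req.to_sampling_params()
print(params.temperature, params.max_tokens)
```

With this pattern, the handler never reconstructs sampling params from raw input fields; it just calls `to_sampling_params()` on the parsed chat request.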

I have tested this with (a single test for each):

  • HF non rolling batch
  • HF scheduler rolling batch
  • vLLM rolling batch
  • lmi-dist rolling batch

I also added a chat test for Mistral with vLLM.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • Please add the link of Integration Tests Executor run with related tests.
  • Have you manually built the docker image and verified the change?
  • Have you run related tests? Check how to set up the test environment here; one example would be pytest tests.py -k "TestCorrectnessLmiDist" -m "lmi_dist"
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Feature/Issue validation/testing

Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • Test A
    Logs for Test A

  • Test B
    Logs for Test B

@siddvenk siddvenk requested review from a team and zachgk as code owners February 3, 2025 02:20
resolve_chat_template_content_format)


def is_chat_completions_request(inputs: Dict) -> bool:
siddvenk (author): deleted because it's not used

"You must enable rolling batch to use the chat completions format."
)

if not is_mistral_tokenizer and not hasattr(tokenizer,
siddvenk (author): deleted because the vllm utils already do this validation for us

git reset --hard 4b2092c
$venv_pip install .
cd ..
rm -rf AutoFP8
Reviewer (Contributor): Do we not need FP8 installation?

siddvenk (author): Not anymore! We're using llm-compressor now (#2701).

@siddvenk siddvenk merged commit 1d05281 into deepjavalibrary:master Feb 3, 2025
9 checks passed
@siddvenk siddvenk deleted the vllm-chat-params branch April 18, 2025 16:32