⬆️✅ Support 0.6.5+ vllm #7
Conversation
-    assert type(request) == ErrorResponse
-    assert request.code == HTTPStatus.BAD_REQUEST.value
+    # As of vllm >= 0.6.5, extra fields are allowed
+    assert type(request) == ChatCompletionRequest
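For context, here is a minimal pydantic sketch (not taken from this repo) of the behavior behind this test change: with extra fields forbidden, an unknown key is rejected, while with extras allowed, as vllm >= 0.6.5 now does for its OpenAI-compatible request models, the key simply passes through. The model and field names are illustrative.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class StrictRequest(BaseModel):
    """Rejects unknown keys, roughly the pre-0.6.5 behavior."""
    model_config = ConfigDict(extra="forbid")
    model: str


class LenientRequest(BaseModel):
    """Keeps unknown keys, roughly the 0.6.5+ behavior."""
    model_config = ConfigDict(extra="allow")
    model: str


try:
    StrictRequest(model="m", my_detector_param=0.5)
except ValidationError as err:
    print("rejected:", err.errors()[0]["type"])  # "extra_forbidden"

lenient = LenientRequest(model="m", my_detector_param=0.5)
print("accepted:", lenient.model_dump())  # the extra key is preserved
```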
Hmm, this will change the general API behavior from our side. Does the orchestrator expect a bad request in such a scenario, or a passthrough?
From my testing, this will just cause a passthrough of the variable. My worry is that if we add additional validation where vllm and OpenAI allow passthrough, then we're even more tied to small API changes (like tracking all expected fields).
> we're even more tied to small API changes
Completely agree with that. My concern is not about validation for the vllm API, which I completely agree should be as lean as possible. But I am wondering if this will create inconsistency in the way we handle validation in the orchestrator across detectors. So do we expect such validation from detectors in general, or is this consistent with what the orchestrator expects from other detectors?
I think this is consistent with what the orchestrator expects from other detectors/detector server(s) today, in the sense that for "other" detectors currently, users can also pass in any detector_params, which will be passed through and validated by the individual detector server(s). It will then be up to the individual server implementations to work based on expected/unexpected params. The one parameter exception currently is the threshold parameter, which the orchestrator uses, but the orchestrator would not be passing that on to the detectors, including the ones here.
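To illustrate the handling described above, here is a hypothetical sketch (the function and field names are made up for illustration, not the orchestrator's actual API): the orchestrator consumes threshold itself and forwards every other detector_param untouched, leaving validation of unexpected params to each detector server.

```python
from typing import Any


def build_detector_payload(
    detector_params: dict[str, Any],
) -> tuple[float | None, dict[str, Any]]:
    """Split the orchestrator-owned param from those passed through to a detector."""
    params = dict(detector_params)
    threshold = params.pop("threshold", None)  # used by the orchestrator only
    return threshold, params  # remaining params are forwarded as-is


threshold, forwarded = build_detector_payload({"threshold": 0.8, "temperature": 0.2})
assert threshold == 0.8
assert forwarded == {"temperature": 0.2}  # each detector decides what to do with it
```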
gkumbhat left a comment:
Looks good to me!
vllm APIs, including the OpenAIServingChat class that the chat detection base class is built on, underwent some breaking changes. The decision was made in this PR to just update the lower bound of vllm instead of maintaining conditional support in the tests for 0.6.2 through pre-0.6.5, since vllm APIs move quickly. At the time of writing, there are at least two patch versions (0.6.5 and 0.6.6) with these supported changes. Some changes that are post-0.6.6 but already in the main branch have been noted as inline comments.

Key changes:
- https://github.com/vllm-project/vllm/pull/9919, building on https://github.com/vllm-project/vllm/pull/9358, added the non-optional chat_template_content_format field to OpenAIServingChat
- https://github.com/vllm-project/vllm/pull/10463 allows extra fields in the vllm API, since the OpenAI API now allows extra fields, impacting request/response fields like ChatCompletionRequest [used to make the request to chat completions]
- https://github.com/vllm-project/vllm/pull/11164 added get_diff_sampling_params to model configs

Closes: #6
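As a quick illustration of the extra-field change above, here is a minimal sketch, assuming vllm >= 0.6.5 is installed and that ChatCompletionRequest in vllm.entrypoints.openai.protocol is still the pydantic model behind /v1/chat/completions; the extra field name is hypothetical.

```python
from vllm.entrypoints.openai.protocol import ChatCompletionRequest

request = ChatCompletionRequest(
    model="dummy-model",
    messages=[{"role": "user", "content": "Hi there"}],
    my_detector_param=0.5,  # hypothetical extra field, not part of the OpenAI spec
)
# With vllm < 0.6.5 this construction raised a pydantic ValidationError;
# with the extra-field change it is accepted and the key is carried along.
print(request.model_dump(exclude_unset=True))
```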