@evaline-ju evaline-ju commented Jan 2, 2025

vllm APIs, including OpenAIServingChat, which the chat detection base class is built on, underwent some breaking changes. The decision was made in this PR to just update the lower bound of vllm instead of maintaining conditional support in the tests for 0.6.2 through pre-0.6.5, since vllm APIs move quickly. At the time of writing, there are at least two patch versions, 0.6.5 and 0.6.6, with these supported changes. Some changes that landed on the main branch after 0.6.6 are noted as inline comments.

Key changes

  • https://github.com/vllm-project/vllm/pull/9919, building on https://github.com/vllm-project/vllm/pull/9358, added the non-optional chat_template_content_format field to OpenAIServingChat
  • https://github.com/vllm-project/vllm/pull/10463 allows extra fields in the vllm API, since the OpenAI API now allows extra fields, impacting request/response models like ChatCompletionRequest [used to make the request to chat completions]
  • https://github.com/vllm-project/vllm/pull/11164 added get_diff_sampling_params to model configs
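The extra-fields change from https://github.com/vllm-project/vllm/pull/10463 can be illustrated with a small Pydantic sketch. The model names and fields below are hypothetical stand-ins, not vllm's actual classes; they only show the strict-vs-lenient parsing difference described above:

```python
from pydantic import BaseModel, ConfigDict

# Hypothetical request models illustrating the behavior change:
# older vllm versions rejected unknown fields, 0.6.5+ passes them through.

class StrictChatRequest(BaseModel):
    """Pre-0.6.5-style model: unknown fields raise a validation error."""
    model_config = ConfigDict(extra="forbid")
    model: str
    messages: list

class LenientChatRequest(BaseModel):
    """0.6.5+-style model: unknown fields are kept on the instance."""
    model_config = ConfigDict(extra="allow")
    model: str
    messages: list

payload = {"model": "m", "messages": [], "custom_field": 1}

try:
    StrictChatRequest(**payload)
    strict_ok = True
except Exception:
    strict_ok = False

lenient = LenientChatRequest(**payload)
print(strict_ok)             # False: the strict model rejects the extra field
print(lenient.custom_field)  # 1: the lenient model passes it through
```

This is why the test below flips from expecting an ErrorResponse to expecting a successfully parsed ChatCompletionRequest.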

Closes: #6

Signed-off-by: Evaline Ju <[email protected]>
@evaline-ju evaline-ju marked this pull request as ready for review January 8, 2025 21:21
# Previously (vllm < 0.6.5): unrecognized fields caused a validation error
assert type(request) == ErrorResponse
assert request.code == HTTPStatus.BAD_REQUEST.value
# As of vllm >= 0.6.5, extra fields are allowed
assert type(request) == ChatCompletionRequest
Collaborator
Hmm, this will change the general API behavior from our side. Does the orchestrator expect a bad request in such a scenario, or a passthrough?

Collaborator Author
@evaline-ju evaline-ju Jan 9, 2025

From my testing, this will just cause a passthrough of the variable. My worry is that if we add additional validation where vllm and OpenAI allow passthrough, we become even more tied to small API changes (like tracking all expected fields).

Collaborator
we're even more tied to small API changes

Completely agree with that. My concern is not around validation for the vllm API, which I completely agree should be as lean as possible. But I am wondering if this will create inconsistency in the way we handle validation in the orchestrator across detectors. That is, do we expect such validation from detectors in general, and is this consistent with what the orchestrator expects from other detectors?

Collaborator Author

I think this is consistent with what the orchestrator expects from other detectors/detector servers today, in the sense that for other detectors currently, users can also pass in any detector_params, which are passed through and validated by the individual detector servers. It is then up to the individual server implementations to work with expected/unexpected params. The one current exception is the threshold parameter, which the orchestrator uses itself and would not pass on to the detectors, including the ones here.
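The threshold-vs-passthrough split described above can be sketched as follows. The function name and exact parameter handling are hypothetical, not the orchestrator's actual code; the sketch only shows the idea that the orchestrator consumes "threshold" and forwards everything else untouched:

```python
def forward_detector_params(detector_params: dict) -> tuple:
    """Split off the orchestrator-level threshold; pass everything else through.

    Hypothetical sketch: the orchestrator uses "threshold" itself and does not
    forward it; all other detector_params go to the detector server unvalidated.
    """
    params = dict(detector_params)  # copy so the caller's dict is untouched
    threshold = params.pop("threshold", None)
    return threshold, params

threshold, passthrough = forward_detector_params(
    {"threshold": 0.8, "temperature": 0.2, "custom_flag": True}
)
print(threshold)    # 0.8
print(passthrough)  # {'temperature': 0.2, 'custom_flag': True}
```

Each detector server then decides for itself how to treat the forwarded params it does not recognize.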

Collaborator

@gkumbhat gkumbhat left a comment

Looks good to me!

@evaline-ju evaline-ju merged commit 930d7bb into foundation-model-stack:main Jan 13, 2025
3 checks passed
@evaline-ju evaline-ju deleted the vllm-latest branch January 13, 2025 17:07
Successfully merging this pull request may close: Support 0.6.5+ vllm