fix: some minor fixes #223
Conversation
    FancyLogger.get_logger().warning(
        "utilizing device mps with a `generate_from_raw` request; you may see issues when submitting batches of prompts to a huggingface backend; ensure all ModelOutputThunks have non-empty values."
    )
Ah, that's what this was! I was having issues when I was running the hf tests for this, but the problem disappeared when I stepped into it while debugging. Thanks for adding this!
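As a concrete illustration of what the warning asks callers to check, here is a hypothetical helper; the helper name and its harness are invented, and only the idea that a `ModelOutputThunk` carries a `.value` comes from the snippet above:

```python
from typing import Any, Sequence


def check_batch_outputs(outputs: Sequence[Any]) -> None:
    """Hypothetical helper: after a batched `generate_from_raw` call on an
    mps device, verify every returned ModelOutputThunk has a non-empty
    `.value`, as the warning above advises."""
    empty = [i for i, thunk in enumerate(outputs) if not getattr(thunk, "value", None)]
    if empty:
        raise RuntimeError(
            f"empty outputs at indices {empty}; known issue when batching "
            "prompts to a huggingface backend on mps"
        )
```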
avinash2692 left a comment
LGTM, except I'm not sure why we're moving response_format into extra_params. Let me know if there is a reason for it, and it can be documented here for posterity.
      extra_params: dict[str, Any] = {}
      if _format is not None:
-         response_format = {
+         extra_params["response_format"] = {
any reason for the additional abstraction?
I agree with avi; response_format = None is better if the old value causes errors
response_format = None sometimes causes issues as well with some backends (at least with the OpenAI backend, I believe it used to). It's best to just not pass a response_format parameter at all if possible.
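For posterity, a minimal sketch of the pattern this hunk adopts: the key is added to `extra_params` only when a format is requested, so backends that also choke on `response_format=None` never see the key at all. The `json_schema` payload shape below is illustrative (OpenAI-style structured output), not copied from this PR:

```python
from typing import Any

import pydantic


def build_extra_params(_format: type[pydantic.BaseModel] | None) -> dict[str, Any]:
    """Sketch: omit response_format entirely unless a format is requested."""
    extra_params: dict[str, Any] = {}
    if _format is not None:
        # Only now does the key exist; an unset format leaves the dict empty.
        extra_params["response_format"] = {
            "type": "json_schema",
            "json_schema": {
                "name": _format.__name__,
                "schema": _format.model_json_schema(),
            },
        }
    return extra_params
```

The caller can then splat the dict into the request (e.g. `client.chat.completions.create(..., **extra_params)`), and a request without a format carries no `response_format` key at all.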
Fixes issues related to vllm-project/vllm#26639
guicho271828 left a comment
Approved, but it would be nice to fix the minor requests.
    # if switching between async and sync calls.
    if el != self._event_loop:
        self._underlying_model.shutdown_background_loop()
        self._underlying_model.start_background_loop()
optional: assert el == self._underlying_model.background_loop ?
They call that a background_loop, but it's not an event loop; it's actually a Future. Even _background_loop_unshielded is a Task object.
I think it's fine to manage the reference to the event loop on our side. We only ever have one AsyncLLMEngine per LocalVLLMBackend, so there shouldn't be issues with tracking it this way. Happy to change it if it causes issues later on.
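A sketch of the bookkeeping being described, assuming the backend caches the caller's event loop and restarts the engine's background loop when it changes; the class skeleton is illustrative, while `shutdown_background_loop()` and `start_background_loop()` are the vllm `AsyncLLMEngine` methods from the hunk above:

```python
import asyncio


class LocalVLLMBackend:
    """Illustrative skeleton; only the event-loop tracking is shown."""

    def __init__(self, underlying_model) -> None:
        self._underlying_model = underlying_model  # e.g. a vllm AsyncLLMEngine
        self._event_loop: asyncio.AbstractEventLoop | None = None

    def _ensure_background_loop(self) -> None:
        el = asyncio.get_event_loop()
        # Restart the engine's background loop if the caller's event loop
        # changed, e.g. if switching between async and sync calls.
        if el != self._event_loop:
            self._underlying_model.shutdown_background_loop()
            self._underlying_model.start_background_loop()
            self._event_loop = el
```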
* fix: enforce minimum vllm version
* fix: remove tests that look for "β"
* fix: remove default response_format from litellm and openai backends
* fix: remove xdist from pytests
* fix: fix vllm tests
* fix: vllm async event loop
* feat: add contexts to validation results
* fix: add warning for mps with huggingface generate from raw
* fix: remove .py from folder name
* fix: remove pytest-xdist specific args
* fix: add exception with vllm backend when env var not set
I ran into enough issues that I decided to fix them all. All tests pass locally now.
Fixes: #222 (comment)
List of changes:

* Enforce a minimum vllm version: .[all] vs .[vllm] installs behaved differently; I could not figure out why, but forcing the version works.
* Remove the β tests that always fail.
* Remove .py from the folder that had it in its name.