feat: Adds a vllm backend #122
jakelorocco left a comment:
Looks mostly good to me; I left a few comments, and it looks like some pre-commit checks are failing as well.
jakelorocco left a comment:
I think this looks good. There will be conflicts and changes required based on my async PR (#137). I'm happy to make those commits, though, once the async stuff gets merged and this branch gets rebased.
Async support is already implemented.
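The asynchronous call support mentioned here and in the commit history is not shown in this thread. As background only, one common pattern for exposing a blocking generation call through an async interface (a sketch with a stubbed `blocking_generate`, not the actual backend code) is to offload the call to a worker thread:

```python
import asyncio


def blocking_generate(prompt: str) -> str:
    # Stand-in for a synchronous generate() call (hypothetical stub).
    return f"completion for: {prompt}"


async def agenerate(prompt: str) -> str:
    # Offload the blocking call to a worker thread so the event loop
    # remains free to serve other coroutines.
    return await asyncio.to_thread(blocking_generate, prompt)


if __name__ == "__main__":
    print(asyncio.run(agenerate("hello")))  # completion for: hello
```

Whether the PR uses this pattern or a native async engine API is not visible from the conversation.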
Signed-off-by: Masataro Asai <[email protected]>
I just merged main. If tests do not predictably pass in the CI/CD pipeline, we should mark them as non-cicd tests (ask @avinash2692 if you need a pointer on how to do this). Please run the full test suite locally after the main merge and ensure all tests pass. Other than that, LGTM.
* feat: added smaller qwen models for debugging
* feat(vllm): copied from huggingface
* fix(vllm): remove alora and cache
* fix(vllm): remove tool calls
* fix(vllm): finished the implementation with limited functionality: free-form and constrained generation
* fix(vllm): passing mypy and linter
* fix(vllm): added vllm optional dep in pyproject.toml
* feat(vllm test): copied from huggingface
* fix(vllm test): implemented the test
* test: require V0 in vllm test
* refactor: ctx to chat conversion function
* refactor: use_alora function
* refactor: moved _extract_model_tool_requests to mellea.backends.utils
* feat(vllm): added tool calls
* test(tools): run test with mistral
* fix(vllm): rename model_options -> engine_args
* fix(vllm): use FancyLogger
* fix(vllm): ignore type checking for vllm and msgspec
* fix(vllm): fixed the backend name in the log
* feat(vllm): asynchronous call support
* test(vllm): asynchronous call support
* fix(vllm): avoid unnecessary incremental processing in non-streaming mode
* fix(vllm): fix for the new return format
* fix(vllm): fixed vllm test for the new contexts
* fix(vllm): addressed minor comments
* fix(vllm): uv lock
* fix(vllm): mark V0 api test qualitative; will be removed in a future PR that migrates to V1

Signed-off-by: Masataro Asai <[email protected]>
Co-authored-by: MASATARO ASAI <[email protected]>
Co-authored-by: Nathan Fulton <[email protected]>
A basic vLLM backend, without tool and aLoRA support. Requires VLLM_V1=0 to be set when running.
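The V0 requirement can be satisfied by exporting the variable in the shell, or from Python before vLLM is imported. A minimal sketch (the variable name is taken from this PR's description; verify it against the backend's own docs):

```python
import os

# Force vLLM's legacy V0 engine before the vllm package is imported.
# The variable name here comes from this PR's description.
os.environ["VLLM_V1"] = "0"
```

Setting it in the process environment before any vLLM import is the safe ordering, since engine selection typically happens at import or engine-construction time.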