Use more recent version of vllm #150

Merged
vvolhejn merged 2 commits into kyutai-labs:main from tleyden:patch-2
Nov 3, 2025

Conversation

@tleyden (Contributor) commented Oct 30, 2025

Checklist

  • Read CONTRIBUTING.md, and accept the CLA by including the provided snippet. We will not accept PRs without this.

PR Description

Before this change, I was getting the error:

root@f6aeecd7dbf8:/workspace/unmute# ./dockerless/start_llm.sh
++ dirname ./dockerless/start_llm.sh
+ cd ./dockerless/..
+ uv tool run vllm@v0.9.0 serve --model=google/gemma-3-1b-it --max-model-len=8192 --dtype=bfloat16 --gpu-memory-utilization=0.3 --port=8091
Installed 149 packages in 638ms
INFO 10-30 20:45:24 [__init__.py:243] Automatically detected platform cuda.
Traceback (most recent call last):
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/bin/vllm", line 6, in <module>
    from vllm.entrypoints.cli.main import main
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/__init__.py", line 12, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 20, in <module>
    from vllm.config import (BlockSize, CacheConfig, CacheDType, CompilationConfig,
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/config.py", line 38, in <module>
    from vllm.transformers_utils.config import (
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/transformers_utils/config.py", line 31, in <module>
    from vllm.transformers_utils.configs import (ChatGLMConfig, Cohere2Config,
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/transformers_utils/configs/__init__.py", line 26, in <module>
    from vllm.transformers_utils.configs.ovis import OvisConfig
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/transformers_utils/configs/ovis.py", line 75, in <module>
    AutoConfig.register("aimv2", AIMv2Config)
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/transformers/models/auto/configuration_auto.py", line 1401, in register
    CONFIG_MAPPING.register(model_type, config, exist_ok=exist_ok)
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/transformers/models/auto/configuration_auto.py", line 1081, in register
    raise ValueError(f"'{key}' is already used by a Transformers config, pick another name.")
ValueError: 'aimv2' is already used by a Transformers config, pick another name.

According to vllm-project/vllm-ascend#2046, later versions of vllm fixed the issue.

I verified the fix; it now gets past that error:

(APIServer pid=21797) OSError: You are trying to access a gated repo.
(APIServer pid=21797) Make sure to have access to it at https://huggingface.co/google/gemma-3-1b-it.
(APIServer pid=21797) 401 Client Error. (Request ID: Root=1-6903cf4c-20f4c78f03d4b2ae361ee686;2b90f892-c001-4988-b44e-4375d64090e2)
(APIServer pid=21797)
(APIServer pid=21797) Cannot access gated repo for url https://huggingface.co/google/gemma-3-1b-it/resolve/main/config.json.
(APIServer pid=21797) Access to model google/gemma-3-1b-it is restricted. You must have access to it and be authenticated to access it. Please log in.
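The remaining 401 above is unrelated to the vllm version: google/gemma-3-1b-it is a gated Hugging Face repo. A minimal sketch of the usual fix, outside the scope of this PR and assuming you've already requested access to the model on huggingface.co (huggingface_hub reads the standard HF_TOKEN environment variable):

```shell
# Not a vllm bug: the 401 means the model repo is gated.
# After access is granted, export a Hugging Face read token so the
# model download can authenticate (the token value below is a placeholder).
export HF_TOKEN="hf_your_token_here"
```

With the token exported, rerunning ./dockerless/start_llm.sh should get past the gated-repo check.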

@vvolhejn (Collaborator) commented:

Thank you, could you update all occurrences? Try searching for "0.9.1".
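The reviewer's suggestion can be sketched as a repo-wide search. The --include patterns below are assumptions about where a pinned version string might live; they are not quoted from the PR:

```shell
# Sketch: list every file that still contains the old vllm version string.
# Run from the repo root; the || branch keeps the exit status clean
# when nothing is left to update.
grep -rn '0\.9\.1' --include='*.sh' --include='*.yml' --include='*.yaml' . \
  || echo "no remaining occurrences"
```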


@tleyden (Contributor, Author) commented Oct 31, 2025

Sure, no problem. I'll update the PR.

@tleyden (Contributor, Author) commented Oct 31, 2025

Done

@vvolhejn (Collaborator) commented Nov 3, 2025

Sorry, I think you're still missing start_llm.sh. Happy to merge afterwards!

@tleyden (Contributor, Author) commented Nov 3, 2025

I just double checked, and it does look like it was updated in my latest commit: https://github.com/kyutai-labs/unmute/pull/150/files#diff-7087b91bbe9bc682a0e2720f057132e100923f2d63274fe0e42085b130bef6afL5

Can you take another look?

@vvolhejn (Collaborator) commented Nov 3, 2025

Oh, weird, I must've been looking at an outdated version. Sorry!

(CLA accepted in #149.)

@vvolhejn vvolhejn merged commit 47a838e into kyutai-labs:main Nov 3, 2025
2 checks passed