Use more recent version of vllm #150

Merged
vvolhejn merged 2 commits into kyutai-labs:main from tleyden:patch-2
Nov 3, 2025

Conversation

@tleyden (Contributor) commented Oct 30, 2025

Checklist

  • Read CONTRIBUTING.md, and accept the CLA by including the provided snippet. We will not accept PRs without this.

PR Description

Before this change, I was getting the error:

root@f6aeecd7dbf8:/workspace/unmute# ./dockerless/start_llm.sh
++ dirname ./dockerless/start_llm.sh
+ cd ./dockerless/..
+ uv tool run vllm@v0.9.0 serve --model=google/gemma-3-1b-it --max-model-len=8192 --dtype=bfloat16 --gpu-memory-utilization=0.3 --port=8091
Installed 149 packages in 638ms
INFO 10-30 20:45:24 [__init__.py:243] Automatically detected platform cuda.
Traceback (most recent call last):
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/bin/vllm", line 6, in <module>
    from vllm.entrypoints.cli.main import main
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/__init__.py", line 12, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 20, in <module>
    from vllm.config import (BlockSize, CacheConfig, CacheDType, CompilationConfig,
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/config.py", line 38, in <module>
    from vllm.transformers_utils.config import (
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/transformers_utils/config.py", line 31, in <module>
    from vllm.transformers_utils.configs import (ChatGLMConfig, Cohere2Config,
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/transformers_utils/configs/__init__.py", line 26, in <module>
    from vllm.transformers_utils.configs.ovis import OvisConfig
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/vllm/transformers_utils/configs/ovis.py", line 75, in <module>
    AutoConfig.register("aimv2", AIMv2Config)
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/transformers/models/auto/configuration_auto.py", line 1401, in register
    CONFIG_MAPPING.register(model_type, config, exist_ok=exist_ok)
  File "/root/.cache/uv/archive-v0/I7XKAyY3qSMAu9bQ1RA5X/lib/python3.12/site-packages/transformers/models/auto/configuration_auto.py", line 1081, in register
    raise ValueError(f"'{key}' is already used by a Transformers config, pick another name.")
ValueError: 'aimv2' is already used by a Transformers config, pick another name.

According to vllm-project/vllm-ascend#2046, later versions of vllm fixed the issue.

I verified the fix; it now gets past that error:

(APIServer pid=21797) OSError: You are trying to access a gated repo.
(APIServer pid=21797) Make sure to have access to it at https://huggingface.co/google/gemma-3-1b-it.
(APIServer pid=21797) 401 Client Error. (Request ID: Root=1-6903cf4c-20f4c78f03d4b2ae361ee686;2b90f892-c001-4988-b44e-4375d64090e2)
(APIServer pid=21797)
(APIServer pid=21797) Cannot access gated repo for url https://huggingface.co/google/gemma-3-1b-it/resolve/main/config.json.
(APIServer pid=21797) Access to model google/gemma-3-1b-it is restricted. You must have access to it and be authenticated to access it. Please log in.
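The remaining 401 above is unrelated to the vllm version: google/gemma-3-1b-it is a gated Hugging Face repo. A minimal sketch of the usual fix, outside the scope of this PR and assuming you've already requested access to the model on huggingface.co (huggingface_hub reads the standard HF_TOKEN environment variable):

```shell
# Not a vllm bug: the 401 means the model repo is gated.
# After access is granted, export a Hugging Face read token so the
# model download can authenticate (the token value below is a placeholder).
export HF_TOKEN="hf_your_token_here"
```

With the token exported, rerunning ./dockerless/start_llm.sh should get past the gated-repo check.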

@vvolhejn (Collaborator) commented:

Thank you, could you update all occurrences? Try searching for "0.9.1".
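The reviewer's suggestion can be sketched as a repo-wide search. The --include patterns below are assumptions about where a pinned version string might live; they are not quoted from the PR:

```shell
# Sketch: list every file that still contains the old vllm version string.
# Run from the repo root; the || branch keeps the exit status clean
# when nothing is left to update.
grep -rn '0\.9\.1' --include='*.sh' --include='*.yml' --include='*.yaml' . \
  || echo "no remaining occurrences"
```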


@tleyden (Contributor, Author) commented Oct 31, 2025

Sure, no problem. I'll update the PR.

@tleyden (Contributor, Author) commented Oct 31, 2025

Done

@vvolhejn (Collaborator) commented Nov 3, 2025

Sorry, I think you're still missing start_llm.sh. Happy to merge afterwards!

@tleyden (Contributor, Author) commented Nov 3, 2025

I just double checked, and it does look like it was updated in my latest commit: https://github.com/kyutai-labs/unmute/pull/150/files#diff-7087b91bbe9bc682a0e2720f057132e100923f2d63274fe0e42085b130bef6afL5

Can you take another look?

@vvolhejn (Collaborator) commented Nov 3, 2025

Oh, weird, I must've been looking at an outdated version. Sorry!

(CLA accepted in #149.)

@vvolhejn vvolhejn merged commit 47a838e into kyutai-labs:main Nov 3, 2025
2 checks passed