Pull requests: vllm-project/vllm
Pull requests list
[CI/Build] Add terratorch for AMD
ci/build
rocm
Related to AMD ROCm
#29205
opened Nov 21, 2025 by
rjrock
3 of 5 tasks
[CI] Bug: Fix triton import issue
force-merge
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#29202
opened Nov 21, 2025 by
yewentao256
[CI/Build] Replace COPY scripts with bind mounts to reduce layers
ci/build
#29201
opened Nov 21, 2025 by
mirzaim
5 tasks
Display warning only when ROCm version is less than Pytorch required version
ci/build
rocm
Related to AMD ROCm
#29200
opened Nov 21, 2025 by
Inokinoki
5 tasks
[Model] Restore Gemma3 GGUF multimodal support with GGUF-only guards
v1
#29198
opened Nov 21, 2025 by
lucianommartins
9 tasks done
[Frontend] Implement robust video frame recovery for corrupted videos
documentation
Improvements or additions to documentation
multi-modality
Related to multi-modality (#4194)
performance
Performance-related issues
[not for land] online fp8 quant with streaming weight post-processing
#29196
opened Nov 21, 2025 by
vkuzo
5 tasks
[Build/CI][DP/EP] Add QWen/Qwen3-30B-A3B-FP8 + EPLB tests to Nightly H100 and B200
ci/build
qwen
Related to Qwen models
#29195
opened Nov 21, 2025 by
varun-sundar-rabindranath
[Model Runner V2] Change bookkeeping logic in preparation for spec decoding
nvidia
v1
#29194
opened Nov 21, 2025 by
WoosukKwon
•
Draft
[perf][cpu] Accelerate attention GEMMs (QK, PV) on Arm CPUs with NEON
aarch64-cpu
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#29193
opened Nov 21, 2025 by
fadara01
2 tasks
[Models] Lfm2-VL Architecture
documentation
Improvements or additions to documentation
new-model
Requests to new models
#29191
opened Nov 21, 2025 by
paulpak58
5 tasks
Fix: Resolve circular import in model_loader/utils.py
#29189
opened Nov 21, 2025 by
nandan2003
[Doc] Update more docs with respect to V1
documentation
Improvements or additions to documentation
#29188
opened Nov 21, 2025 by
DarkLight1337
5 tasks
[Misc] Further clean up chunked prefill and prefix caching init
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#29186
opened Nov 21, 2025 by
DarkLight1337
5 tasks
[Core] NGram GPU Implementation compatible with Async Scheduler
speculative-decoding
v1
#29184
opened Nov 21, 2025 by
PatchouliTIS
•
Draft
5 tasks
Add fused MoE config for H200 E160 N192 fp8
#29182
opened Nov 21, 2025 by
FlintyLemming
3 of 5 tasks
[Frontend][Responses API] Multi-turn (with type: "output_text") support for non-harmony requests
frontend
gpt-oss
Related to GPT-OSS models
#29175
opened Nov 21, 2025 by
madskildegaard
3 tasks done
docs: fixes distributed executor backend config for multi-node vllm
documentation
Improvements or additions to documentation
#29173
opened Nov 21, 2025 by
michaelact
5 tasks
[docs] Fix cudagraph mode config
documentation
Improvements or additions to documentation
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
#29170
opened Nov 21, 2025 by
angelayi
[BugFix] Call Base Layer Directly if LoRA A/B in Parallel Vocab are 0
#29167
opened Nov 21, 2025 by
alex-jw-brooks
5 tasks
[Core] Add xxHash as a high-performance hash option for accelerating prefix caching
ci/build
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
frontend
gpt-oss
Related to GPT-OSS models
kv-connector
llama
Related to Llama models
multi-modality
Related to multi-modality (#4194)
new-model
Requests to new models
nvidia
performance
Performance-related issues
qwen
Related to Qwen models
rocm
Related to AMD ROCm
speculative-decoding
structured-output
v1
#29163
opened Nov 21, 2025 by
LuminolT
4 of 5 tasks