Pull requests: vllm-project/vllm
Pull requests list
[Attention][3/n] Remove usage of deprecated seq_lens_cpu and num_computed_tokens_cpu CommonAttentionMetadata properties
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#31850
opened Jan 7, 2026 by
LucasWilkinson
[CI] Fix weight mapping test for transformers v5 tied weights
multi-modality
Related to multi-modality (#4194)
#31849
opened Jan 7, 2026 by
AndreasKaratzas
[Model] Add Grok-2
documentation
Improvements or additions to documentation
#31847
opened Jan 7, 2026 by
dangoldbj
5 tasks
[4/n] Migrate pos_encoding sampler and fused_qknorm_rope to libtorch stable ABI
ci/build
cpu
Related to CPU backends
nvidia
#31842
opened Jan 6, 2026 by
mikaylagawarecki
Draft
5 tasks
[Bugfix] Fix race condition in async-scheduling for vlm model
v1
#31841
opened Jan 6, 2026 by
tianshu-Michael-yu
3 of 5 tasks
[1/2][lmcache connector] clean up lmcache multi-process adapter
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
#31838
opened Jan 6, 2026 by
ApostaC
5 tasks
[Perf] Fuse stride preparation for NVFP4 cutlass_moe
nvidia
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
#31837
opened Jan 6, 2026 by
mgoin
5 tasks
[responsesAPI] fix incomplete_messages for simple/parsable context
frontend
#31836
opened Jan 6, 2026 by
qandrew
[CI/Build] Enable test_kv_cache_events_dp for AMD
rocm
Related to AMD ROCm
v1
#31834
opened Jan 6, 2026 by
rjrock
3 tasks done
[ROCm][CI] v1 cpu offloading attention backend fix
rocm
Related to AMD ROCm
v1
#31833
opened Jan 6, 2026 by
AndreasKaratzas
[Perf][Kernel] Fused SiLU+Mul+Quant kernel for NVFP4 cutlass_moe
nvidia
performance
Performance-related issues
#31832
opened Jan 6, 2026 by
mgoin
5 tasks
[Perf] Optimize cutlass moe problem size calculation, 5.3% E2E Throughput improvement, 2.2% TTFT improvement
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
#31830
opened Jan 6, 2026 by
yewentao256
[Perf] Slight improvement of ITL with multiple GPUs
#31826
opened Jan 6, 2026 by
access2rohit
2 of 5 tasks
[Model] Enable LoRA support for tower and connector in DotsOCR
documentation
Improvements or additions to documentation
#31825
opened Jan 6, 2026 by
ShaanveerS
[CI] Add CUDA 13 nightly containers
ci/build
nvidia
#31822
opened Jan 6, 2026 by
csahithi
5 tasks
[ROCm][AITER] bugfix accuracy regression in ROCM_AITER_TRITON_MLA backend
rocm
Related to AMD ROCm
v1
#31816
opened Jan 6, 2026 by
vllmellm
5 tasks
[Bugfix] Fix TorchAO quantization bugs and add --torchao-config CLI support
#31815
opened Jan 6, 2026 by
jwpark33
5 tasks
[Bugfix] Inject JSON schema descriptions into prompt for structured outputs
frontend
#31814
opened Jan 6, 2026 by
ricky-chaoju
Enable LoRA support for tower and connector in Mistral and Voxtral
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
frontend
qwen
Related to Qwen models
#31812
opened Jan 6, 2026 by
Anexdeus