Pull requests: waybarrios/vllm-mlx
Pull requests list
feat: repetition detector for degenerate token loops
#65 opened Feb 11, 2026 by janhilgard · 1 of 3 tasks
Add more aggressive nemotron XML tool call parsing, fixes #63
#64 opened Feb 11, 2026 by selimrecep
Add KV cache quantization for prefix cache memory reduction
#62 opened Feb 11, 2026 by waybarrios
fix: MLLM vision models hallucinate and ignore instructions in BatchedEngine
#54 opened Feb 9, 2026 by janhilgard · 8 tasks done
feat: GPT-OSS reasoning parser for channel-based token format
#53 opened Feb 9, 2026 by janhilgard · 5 tasks done
fix: route text-only requests through MLLM scheduler for vision models
#52 opened Feb 9, 2026 by janhilgard · 4 tasks done
feat: Add speculative decoding support with draft models
#45 opened Feb 5, 2026 by janhilgard · 7 tasks done