vllm-stack-0.1.8

github-actions released this 19 Nov 21:31
a2576d6

The stack deployment of vLLM

What's Changed

  • [Feat] Add GKE example for lmcache cpu ram + local disk offloading by @dannawang0221 in #678
  • [Feat] Use the lmcache 0.3.5 for kvaware routing by @zerofishnoodles in #673
  • [Feat]: add pull policy option to ray-cluster.yaml (helm chart) by @moriabs88 in #686
  • [Feat] Add support for scaling down to zero in KEDA by @Romero027 in #679
  • [bugfix] Small fix to observability tutorial by @Romero027 in #695
  • [Feat][Router]: add vision model type by @max-wittig in #603
  • Adding Support for Sleep Mode for vLLM Container without Command Args by @dumb0002 in #696
  • [Bugfix] Increase liveness failure threshold for crd by @zerofishnoodles in #688
  • [bugfix] Add close method for static discovery by @zerofishnoodles in #692
  • [Bugfix][Router]: loop through model_names by @max-wittig in #694
  • [Misc] bump up otel col version and use a simplified image by @JaredTan95 in #698
  • [vllm-router] fall back to remote tokenizer as 2nd path by @panpan0000 in #702
  • [Bugfix][Router]: do not filter by model label in transcription by @max-wittig in #712
  • [CI] move e2e machine to self hosted by @zerofishnoodles in #716
  • [Feat] Add Production-ready vLLM EKS terraform stack tutorial by @brokedba in #704
  • [bugfix] Add annotation to pod after loading the lora adapter to trigger the modify event by @zerofishnoodles in #703
  • [Feat] [Router] [Misc] [Doc] increased configurability of affinity and probes by @Garrukh in #715
  • [Bugfix] fix pd client initialization issue by @zerofishnoodles in #717
  • [Bugfix] Update aiohttp to resolve CVE-2024-23334 vulnerability by @ikaadil in #722
  • [Bugfix/Feature] Support extraPorts in service-vllm by @NargiT in #725
  • Update gateway-inference-extension.rst by @linsun in #728
  • feat(helm): Use emptyDir as pvcStorage by @Jimmy-Newtron in #616
  • [Bugfix] Support service discovery by service name: add missing role and rolebinding for #586 by @NargiT in #724
  • Update doc 04-GCP-GKE-lmcache-local-disk.md by @dannawang0221 in #727
  • [Feat] Enable MIG support for Ray Head Node using chart.resources helper by @shima8823 in #732
  • [feat] Enable session key in request body by @zerofishnoodles in #741
  • [Feat] Add basic integration path for semantic router by @zerofishnoodles in #740
  • [Bugfix] Pod rolebindings are required even with k8s_discovery_mode=service-name by @NargiT in #744
  • [Feat] allow annotation on router pod by @NargiT in #743
  • [Integration]: Add Intelligent Semantic Routing with vLLM-SR by @Xunzhuo in #750
  • [Integration]: Update Docs with vLLM-SR by @Xunzhuo in #752
  • [Bugfix] kv aware routing for lmcache 0.3.9 by @zerofishnoodles in #697
  • [Feat] Ability to add labels to model pvc by @NargiT in #754
  • [Bugfix] Helm: Add security context support, fix #756 by @aplufr in #757
  • [Bugfix] lmcache server points to wrong file in entrypoint by @Senne-Mennes in #730
  • [Feat] Add per-model runtimeClassName configuration support by @HanFa in #755
  • Bumping version to 0.1.8 by @YuhanLiu11 in #738

New Contributors

Full Changelog: vllm-stack-0.1.7...vllm-stack-0.1.8