The stack deployment of vLLM
What's changed
- [CI]: change the entrypoint of nightly docker images (#514) (by @sammshen )
- Add support for sleep and wake_up endpoints (#498) (by @dumb0002 )
- [Bugfix] add health probe for lmcache server (#520) (by @zerofishnoodles )
- [Doc, Feat] basic KEDA support and tutorials (#487) (by @Romero027 )
- [Misc] Delete Unnecessary file (#521) (by @zerofishnoodles )
- change keda name (#529) (by @zerofishnoodles )
- [CI/CD] Add roundrobin router e2e test (#525) (by @zerofishnoodles )
- [Doc] Add CRD deployment docs (#530) (by @kobe0938 )
- [Doc] Kubernetes in Docker (kind) tutorial (#534) (by @lucas-tucker )
- FEAT introduce ruff to project 1 - tests (#527) (by @BrianPark314 )
- [CI/CD] Add static e2e test for prefixaware (#532) (by @zerofishnoodles )
- fix(request): make sure to extend full_response (#536) (by @max-wittig )
- [CI/CD] Add prefix aware routing test (#523) (by @zerofishnoodles )
- [Bugfix][Helm] prevent duplicate securitycontext entry for containers (#544) (by @Hexoplon )
- feature/gateway-inference-extension (#537) (by @BrianPark314 )
- Add Artifact Hub metadata for verified publisher (#540) (by @kobe0938 )
- [CI/CD] Add multiple routing logic test (#547) (by @zerofishnoodles )
- [Doc] Adding security context for disaggregated prefill (#555) (by @YuhanLiu11 )
- [CI/CD] Add checkov security check for information (by @zerofishnoodles )
- fix(reconciler): trigger update when image or replicas are changed (#554) (by @googs1025 )
- [Feat] Terraform Quickstart Tutorials for MS Azure (#552) (by @falconlee236 )
- [Router] Expose /tokenize and /detokenize endpoints (#541) (by @Exchioz )
- feature/ruff-router (#553) (by @BrianPark314 )
- [Doc] Adding tutorial for Gateway Inference Extension support (#570) (by @YuhanLiu11 )
- fix: race condition in trie insert (by @zhouwfang )
- [Feature] Moving default vLLM version from v0 to v1 (#580) (by @YuhanLiu11 )
- feat(helm): make imagePullPolicy configurable & fix router service annotation for LoadBalancer (#573) (by @lonelygo )
- perf: minimize lock contention (#581) (by @zhouwfang )
- [BugFix] fix lora controller reconcile logic (#565) (by @zerofishnoodles )
- [FEAT] Add LoRA helm deployment (#563) (by @zerofishnoodles )