vllm-stack-0.1.7

github-actions released this 03 Sep 22:10
b6dd717

The stack deployment of vLLM

What's Changed

  • [Feat] Added option to specify priority class by @Fabhiahn in #557
  • [CI/Build] Change CI runner to L4 by @Shaoting-Feng in #595
  • [Bugfix] fix dynamic config by @zerofishnoodles in #598
  • [refactor] redesign RST documentation by @kobe0938 in #592
  • [Misc] revert uv.lock by @kobe0938 in #604
  • [CI/Build] Specify transformers version in router end to end test by @Shaoting-Feng in #607
  • [Feat] allow service discovery by service names by @learner0810 in #586
  • feat: add hpa for router by @BrianPark314 in #568
  • [Misc] add helm configuration values table by @zerofishnoodles in #599
  • [Feat] Use sidecar to download lora model for helm deployment by @zerofishnoodles in #618
  • [Router] Improve performance of round-robin router by @zhouwfang in #584
  • [Feat] Use sidecar to download lora model for operator deployment by @zerofishnoodles in #622
  • [Feature] Add method to check Pod termination status and update Pod readiness logic by @KevinCheung2259 in #602
  • [Feat] Add Sentry Continuous Profiling Support to vLLM Router by @ikaadil in #624
  • [Feat][Helm] Add HTTPRoute template for Gateway API support by @Hexoplon in #610
  • [Feat] Init Container "extraVolumeMount" by @cm-enfuse in #600
  • [Bugfix][Router]: simplify test payload by @max-wittig in #613
  • [Feat] Add Support HAMi resources variables by @andresd95 in #579
  • feature/KV-cache-aware-routing by @BrianPark314 in #550
  • [Router] Replace httpx with aiohttp in vllm_router for enhanced high-concurrency performance by @ikaadil in #589
  • feature/prefix-aware-routing by @BrianPark314 in #546
  • [Feat][Router]: add extra support for YAML config file by @antoineauger in #621
  • [CI] Add stress testing for router by @kobe0938 in #633
  • [Misc] Auto-size Minikube memory via calculate_safe_memory by @fulvius31 in #637
  • [Bugfix][Router]: reconfigure callbacks with dynamic config by @antoineauger in #642
  • [Doc] Add a missing word in the description by @JiangJiaWei1103 in #645
  • [Bugfix] Correct the routing logic for KV cache aware routing by @JiangJiaWei1103 in #648
  • [Router][CI/CD and misc.] Add RoundRobinRouter logic testing by @lucas-tucker in #639
  • [Docs] Modify the kvaware routing doc by @zerofishnoodles in #652
  • [Router] Optimize request parsing by removing duplicate await calls by @ikaadil in #629
  • [Feat][Router] Add configurable timeout_seconds for Kubernetes watchers by @ikaadil in #654
  • [Misc] Change community meeting time by @zerofishnoodles in #662
  • [Bugfix] Fix install script path prefix by @nicolasj92 in #665
  • [Feat] Env from secret by @redno2 in #641
  • [Bugfix] Fix routing to delete endpoint by @zerofishnoodles in #668
  • Bugfix(vllm-operator): add missing RBAC permissions for PVCs and Ingresses by @mahmoudk1000 in #647
  • [feat]: add transcription API endpoint using OpenAI Whisper-small by @davidgao7 in #469
  • Bump helm chart version by @philandstuff in #674

Full Changelog: vllm-stack-0.1.6...vllm-stack-0.1.7