vllm-stack-0.1.2

Released by @github-actions on 29 Apr 19:56 (commit 2404918)

The stack deployment of vLLM

What's Changed

  • [Feat] Adding support to turn on/off engine deployment by @dumb0002 in #311
  • [Feat] Add nodeSelectorTerms for router & cacher servers by @kinoute in #314
  • [Bugfix] Update logger handler to handle stdout/stderr properly by @corona10 in #320
  • [CI] Always upload logs of Helm functionality checks by @pwuersch in #321
  • [CI/Build] Remove sudo requirements in CI/CD by @Shaoting-Feng in #325
  • [Feat] Multiple service creation when multiple models specified by @lucas-tucker in #326
  • [CI] Add coverage tracking by @zhuohangu in #330
  • [CLI/Doc] Update on GKE deployment with GPU quota by @EaminC in #334
  • [Bugfix] Fix thread creation to pass parameters properly by @corona10 in #336
  • [Feat] OpenTelemetry support example by @lucas-tucker in #346
  • [Feat] Tool calling support for MCP client integration by @YuhanLiu11 in #352
  • [Benchmark] Add API key option by @Kimdongui in #354
  • [Bugfix] Fix init container PVC volume mount by @zerofishnoodles in #359
  • [Feat] Enabled latency monitor and added average latency computation logic by @insukim1994 in #362
  • [Feat] Added a tutorial document for deploying production stack on AMD GPUs by @insukim1994 in #364
  • [Bugfix] Deprecated least-loaded routing logic by @insukim1994 in #366
  • [Bugfix] Added model name to deployment selector by @TamKej in #367
  • [Feat] helm: add routerSpec.serviceType value by @marquiz in #368
  • [Feat] Support multi-model deployment with enhanced vLLM configurations by @haitwang-cloud in #371
  • [Bugfix] Fixing issues on the engine svc labels by @dumb0002 in #376
  • [Bugfix] Declare logger properly for protocols.py by @corona10 in #381
  • [Feat] Adding a tutorial for using vLLM v1 in production stack by @YuhanLiu11 in #390
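
Several of the changes above extend the chart's values schema (multi-model deployment in #326 and #371, `routerSpec.serviceType` in #368). A minimal, hypothetical `values.yaml` sketch is shown below; the key names are inferred from the PR titles and may not match the chart's actual schema, so consult the chart's own `values.yaml` for the authoritative layout.

```yaml
# Hypothetical values.yaml fragment for the vllm-stack Helm chart.
# Key names below (servingEngineSpec, modelSpec, routerSpec.serviceType)
# are assumptions based on the PR titles in this release, not a
# verified schema.
servingEngineSpec:
  modelSpec:
    # Multiple entries illustrate the multi-model deployment support
    # referenced in #326 and #371; each model gets its own service.
    - name: "llama3"
      modelURL: "meta-llama/Llama-3.1-8B-Instruct"
      replicaCount: 1
    - name: "mistral"
      modelURL: "mistralai/Mistral-7B-Instruct-v0.3"
      replicaCount: 1
routerSpec:
  # Service type for the router, configurable per #368.
  serviceType: ClusterIP
```

Overriding `serviceType` (e.g. to `LoadBalancer`) is the usual Kubernetes mechanism for exposing the router outside the cluster without editing templates.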