v0.3.0

@flesher0813 flesher0813 released this 30 Jan 08:47
· 44 commits to develop since this release
8dd98d1

Highlights

  • Refined the PipelineStore architecture and enhanced core capabilities. #653 #711
  • Added support for 3FS as a scalable and efficient storage backend. #622
  • Added the new GSAOnDevice sparse attention algorithm, enabling high-performance HBM utilization on both CUDA and Ascend platforms. #647 #638
  • Aligned CacheBlend with the new UCM storage and sparse engine updates to support vLLM 0.9.2. #664

Known Issues

  • Layerwise is not supported with vLLM 0.11.0.
    • Currently, an installation made with pip install uc-manager does not support vLLM 0.11.0.
    • To use UCM layerwise with vLLM 0.11.0+, see vllm-project/vllm#26675 for the required modifications.
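
The layerwise limitation above can be caught early with a version guard. The following is a minimal sketch, not part of UCM: the names parse_release and layerwise_supported are hypothetical, and the cutoff assumes that every vLLM release from 0.11.0 onward is affected while earlier releases are not, which the notes above do not state explicitly.

```python
import re

# Assumption: layerwise is unavailable from vLLM 0.11.0 onward unless the
# modifications referenced in vllm-project/vllm#26675 are applied.
FIRST_UNSUPPORTED = (0, 11, 0)


def parse_release(version: str) -> tuple:
    """Return the leading numeric release, e.g. '0.11.0+cu128' -> (0, 11, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions such as '0.11' so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def layerwise_supported(vllm_version: str) -> bool:
    """Hypothetical helper: True if the given vLLM version predates the cutoff."""
    return parse_release(vllm_version) < FIRST_UNSUPPORTED
```

In practice the version string could come from importlib.metadata.version("vllm"), so that a deployment script can fail fast instead of enabling layerwise on an unsupported vLLM build.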

What's Changed

New Contributors

Full Changelog: v0.2.0...v0.3.0