v0.2.2

@github-actions github-actions released this 19 Nov 05:58
· 9729 commits to main since this release
c5f7740

Major changes

  • Upgraded to PyTorch v2.1 + CUDA 12.1 (a vLLM build for CUDA 11.8 is also provided)
  • Extensive refactoring for better tensor parallelism & quantization support
  • New models: Yi, ChatGLM, Phi
  • Scheduler change: the input is now a 2D tensor instead of a 1D flattened tensor
  • AWQ support for all models
  • Added LogitsProcessor API
  • Preliminary support for SqueezeLLM
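
To illustrate the idea behind a logits-processor hook, here is a minimal sketch in plain Python. It assumes a processor is a callable that receives the token ids generated so far plus the logits for the next token, and returns adjusted logits. The names `LogitsProcessor`, `make_ban_tokens_processor`, and `apply_processors` are hypothetical and exist only for this example, and plain lists stand in for tensors; consult the vLLM source for the actual signature.

```python
from typing import Callable, List, Sequence

# Hypothetical alias for illustration: (generated token ids, logits) -> logits.
# Plain Python lists stand in for tensors so the sketch has no dependencies.
LogitsProcessor = Callable[[List[int], List[float]], List[float]]


def make_ban_tokens_processor(banned: Sequence[int]) -> LogitsProcessor:
    """Build a processor that makes the given token ids impossible to sample."""
    banned_set = set(banned)

    def processor(generated_token_ids: List[int], logits: List[float]) -> List[float]:
        out = list(logits)  # copy so the caller's logits are not mutated
        for token_id in banned_set:
            if 0 <= token_id < len(out):
                out[token_id] = float("-inf")
        return out

    return processor


def apply_processors(
    processors: Sequence[LogitsProcessor],
    generated_token_ids: List[int],
    logits: List[float],
) -> List[float]:
    """Apply each processor in order, feeding the output of one into the next."""
    for proc in processors:
        logits = proc(generated_token_ids, logits)
    return logits


if __name__ == "__main__":
    ban = make_ban_tokens_processor([1, 3])
    print(apply_processors([ban], [5, 7], [0.1, 0.2, 0.3, 0.4]))
```

Chaining processors as a list keeps each rule small and composable; the same pattern would carry over to tensor-valued logits.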

Full Changelog: v0.2.1...v0.2.2