Release 25.01 Alpha

Pre-release

@dreaming-panda dreaming-panda released this 14 Jan 01:15
· 67 commits to v0.1.0 since this release
  • Models supported: Llama series and their AWQ versions
  • Applications supported: CLI chatbot, API server/client, Gradio
  • Inference engine: static and dynamic tree speculative decoding (including GPU-only and offloading modes)