Skip to content

Releases: Infini-AI-Lab/UMbreLLa

Release 25.01 Alpha

14 Jan 01:15

Choose a tag to compare

Release 25.01 Alpha Pre-release
Pre-release
  • Model Supported: Llama series and AWQ version
  • Application Supported: CLI Chatbot, API Server/Client, Gradio
  • Inference Engine: Static and dynamic tree speculative decoding (including GPU-only and offloading)