Releases: Infini-AI-Lab/UMbreLLa
Release 25.01 Alpha
- Models supported: Llama series, including AWQ-quantized versions
- Applications supported: CLI chatbot, API server/client, Gradio
- Inference Engine: Static and dynamic tree speculative decoding (including GPU-only and offloading)