# Roadmap

## Functionality
- [ ] Fine-tune Medusa heads together with LM head from scratch
- [ ] Distill from any model without access to the original training data
- [ ] Batched inference
- [ ] Fine-grained KV cache management

## Integration
### Local Deployment
- [ ] [mlc-llm](https://github.com/mlc-ai/mlc-llm)
- [ ] [exllama](https://github.com/turboderp/exllama)
- [ ] [llama.cpp](https://github.com/ggerganov/llama.cpp)
### Serving
- [ ] [vllm](https://github.com/vllm-project/vllm)
- [ ] [TGI](https://github.com/huggingface/text-generation-inference)
- [ ] [lightllm](https://github.com/ModelTC/lightllm)

## Research
- [ ] Optimize the tree-based attention to reduce additional computation
- [ ] Improve the acceptance scheme to generate more diverse sequences
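For context on the tree-based attention item: in tree-style speculative decoding, candidate continuations form a tree, and each candidate token may attend only to its own ancestors so that sibling branches do not interfere. A minimal sketch of building such an ancestor-only mask (the `parents` encoding and function name here are illustrative, not this repo's actual API):

```python
import numpy as np

def tree_attention_mask(parents):
    """Build a boolean attention mask for a candidate token tree.

    parents[i] is the index of node i's parent, or -1 for the root.
    Row i marks the positions node i may attend to: itself and its
    ancestors, so sibling branches of the tree stay isolated.
    """
    n = len(parents)
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        j = i
        while j != -1:  # walk up to the root, enabling each ancestor
            mask[i, j] = True
            j = parents[j]
    return mask

# Tree: 0 is the root; 1 and 2 are children of 0; 3 is a child of 1.
mask = tree_attention_mask([-1, 0, 0, 1])
# Node 3 attends to {0, 1, 3} but not to node 2 on the sibling branch.
```

The roadmap item above concerns reducing the extra computation this masking implies, since masked positions are still materialized in a dense attention pass.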