@@ -22,14 +22,15 @@ Python 3 and C++ compiler required. The command will download the model and the
 | Model | Size | Command |
 | --------------------------------- | -------- | ---------------------------------------------------- |
 | Llama 3.1 8B Instruct Q40 | 6.32 GB | `python launch.py llama3_1_8b_instruct_q40` |
-| Llama 3.1 405B Instruct Q40. | 238 GB | `python launch.py llama3_1_405b_instruct_q40`. |
+| Llama 3.1 405B Instruct Q40 | 238 GB | `python launch.py llama3_1_405b_instruct_q40`. |
 | Llama 3.2 1B Instruct Q40 | 1.7 GB | `python launch.py llama3_2_1b_instruct_q40` |
 | Llama 3.2 3B Instruct Q40 | 3.4 GB | `python launch.py llama3_2_3b_instruct_q40` |
 | Llama 3.3 70B Instruct Q40 | 40 GB | `python launch.py llama3_3_70b_instruct_q40` |
 | DeepSeek R1 Distill Llama 8B Q40 | 6.32 GB | `python launch.py deepseek_r1_distill_llama_8b_q40` |
 | Qwen 3 0.6B Q40 | 0.9 GB | `python launch.py qwen3_0.6b_q40` |
 | Qwen 3 1.7B Q40 | 2.2 GB | `python launch.py qwen3_1.7b_q40` |
-| Qwen 3 8B Q40 | 6.7 GB | `python launch.py qwen3_8b_q40` |
+| Qwen 3 8B Q40 | 6.7 GB | `python launch.py qwen3_8b_q40` |
+| Qwen 3 14B Q40 | 10.9 GB | `python launch.py qwen3_14b_q40` |
 
 ### 🛠️ Convert Model Manually
 
@@ -59,6 +59,11 @@ def parts(length):
         'https://huggingface.co/b4rtaz/Qwen3-8B-Q40-Distributed-Llama/resolve/main/dllama_tokenizer_qwen3_8b.t?download=true',
         'q40', 'q80', 'chat', '--max-seq-len 4096'
     ],
+    'qwen3_14b_q40': [
+        list(map(lambda suffix: f'https://huggingface.co/b4rtaz/Qwen3-14B-Q40-Distributed-Llama/resolve/main/dllama_model_qwen3_14b_q40_{suffix}?download=true', parts(2))),
+        'https://huggingface.co/b4rtaz/Qwen3-14B-Q40-Distributed-Llama/resolve/main/dllama_tokenizer_qwen3_14b.t?download=true',
+        'q40', 'q80', 'chat', '--max-seq-len 4096'
+    ],
 }
 
 def confirm(message: str):
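
The new `qwen3_14b_q40` entry builds its model URL list by mapping over `parts(2)`, so the model is fetched as two files. The body of `parts` lies outside this diff, so the sketch below uses a hypothetical stand-in, `parts_sketch`, which assumes two-letter split-style suffixes (`aa`, `ab`, ...). Both that naming scheme and the `model_urls` helper are assumptions for illustration, not the script's confirmed behavior. Only the URL pattern is taken from the diff.

```python
# Hypothetical stand-in for launch.py's parts(); the real body is not shown in the diff.
def parts_sketch(length: int) -> list[str]:
    """Return `length` two-letter suffixes: aa, ab, ..., az, ba, ... (assumed scheme)."""
    return [chr(97 + i // 26) + chr(97 + i % 26) for i in range(length)]


# Base URL copied from the added 'qwen3_14b_q40' entry.
BASE = 'https://huggingface.co/b4rtaz/Qwen3-14B-Q40-Distributed-Llama/resolve/main'


def model_urls(n_parts: int) -> list[str]:
    """Expand the per-part model URLs the same way the list(map(lambda ...)) call does."""
    return [
        f'{BASE}/dllama_model_qwen3_14b_q40_{suffix}?download=true'
        for suffix in parts_sketch(n_parts)
    ]


if __name__ == '__main__':
    # One URL per model part; parts(2) in the diff implies two parts.
    for url in model_urls(2):
        print(url)
```

A list comprehension is used here instead of `list(map(lambda ...))`. Both produce one URL per suffix, in the same order.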