@@ -56,17 +56,17 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 
 ## Model Supports
 
-| Model Name | FP16 | Q8_0 | Q4_0 |
+| Model Name | FP16 | Q4_0 | Q8_0 |
 | :----------------------------| :-----:| :----:| :----:|
 | Llama-2 | √ | √ | √ |
 | Llama-3 | √ | √ | √ |
 | Mistral-7B | √ | √ | √ |
 | Mistral MOE | √ | √ | √ |
-| DBRX | ? | ? | ? |
+| DBRX | - | - | - |
 | Falcon | √ | √ | √ |
 | Chinese LLaMA/Alpaca | √ | √ | √ |
 | Vigogne(French) | √ | √ | √ |
-| BERT | √ | √ | √ |
+| BERT | x | x | x |
 | Koala | √ | √ | √ |
 | Baichuan | √ | √ | √ |
 | Aquila 1 & 2 | √ | √ | √ |
@@ -80,7 +80,7 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 | Qwen models | √ | √ | √ |
 | PLaMo-13B | √ | √ | √ |
 | Phi models | √ | √ | √ |
-| PhiMoE | ? | ? | ? |
+| PhiMoE | √ | √ | √ |
 | GPT-2 | √ | √ | √ |
 | Orion | √ | √ | √ |
 | InternlLM2 | √ | √ | √ |
@@ -89,45 +89,45 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 | Mamba | √ | √ | √ |
 | Xverse | √ | √ | √ |
 | command-r models | √ | √ | √ |
-| Grok-1 | ? | ? | ? |
+| Grok-1 | - | - | - |
 | SEA-LION | √ | √ | √ |
 | GritLM-7B | √ | √ | √ |
 | OLMo | √ | √ | √ |
 | OLMo 2 | √ | √ | √ |
-| OLMoE | ? | ? | ? |
+| OLMoE | √ | √ | √ |
 | Granite models | √ | √ | √ |
-| GPT-NeoX | ? | ? | ? |
+| GPT-NeoX | √ | √ | √ |
 | Pythia | √ | √ | √ |
-| Snowflake-Arctic MoE | ? | ? | ? |
+| Snowflake-Arctic MoE | - | - | - |
 | Smaug | √ | √ | √ |
 | Poro 34B | √ | √ | √ |
 | Bitnet b1.58 models | √ | x | x |
 | Flan-T5 | √ | √ | √ |
-| Open Elm models | x | x | x |
+| Open Elm models | x | √ | √ |
 | chatGLM3-6B + ChatGLM4-9b + GLMEdge-1.5b + GLMEdge-4b | √ | √ | √ |
 | GLM-4-0414 | √ | √ | √ |
 | SmolLM | √ | √ | √ |
 | EXAONE-3.0-7.8B-Instruct | √ | √ | √ |
 | FalconMamba Models | √ | √ | √ |
-| Jais Models | ? | ? | ? |
+| Jais Models | - | x | x |
 | Bielik-11B-v2.3 | √ | √ | √ |
-| RWKV-6 | √ | √ | √ |
+| RWKV-6 | - | √ | √ |
 | QRWKV-6 | √ | √ | √ |
 | GigaChat-20B-A3B | x | x | x |
 | Trillion-7B-preview | √ | √ | √ |
 | Ling models | √ | √ | √ |
 
 
 ** Multimodal**
-| LLaVA 1.5 models, LLaVA 1.6 models | ? | ? | ? |
-| BakLLaVA | ? | ? | ? |
-| Obsidian | ? | ? | ? |
-| ShareGPT4V | ? | ? | ? |
-| MobileVLM 1.7B/3B models | ? | ? | ? |
-| Yi-VL | ? | ? | ? |
+| LLaVA 1.5 models, LLaVA 1.6 models | x | x | x |
+| BakLLaVA | √ | √ | √ |
+| Obsidian | √ | - | - |
+| ShareGPT4V | x | - | - |
+| MobileVLM 1.7B/3B models | - | - | - |
+| Yi-VL | - | - | - |
 | Mini CPM | √ | √ | √ |
 | Moondream | √ | √ | √ |
-| Bunny | ? | ? | ? |
+| Bunny | √ | - | - |
 | GLM-EDGE | √ | √ | √ |
 | Qwen2-VL | √ | √ | √ |
 