
Commit 30d7835

mudler authored and github-actions[bot] committed
chore(model gallery): 🤖 add new models via gallery agent
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 820bd7d commit 30d7835

File tree

1 file changed: +47 −0 lines changed


gallery/index.yaml

Lines changed: 47 additions & 0 deletions
@@ -1,4 +1,51 @@
 ---
+- name: "todo-14b-i1"
+  url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
+  urls:
+    - https://huggingface.co/mradermacher/Todo-14B-i1-GGUF
+  description: |
+    The **Todo-14B** model is a quantized version of the base model **EcthelionLiu/Todo-14B**, optimized for efficiency and performance. It is distributed in GGUF format, offering a range of quantized variants tailored to different trade-offs between accuracy, size, and speed. Key details include:
+
+    ### **Base Model**
+    - **Name**: EcthelionLiu/Todo-14B
+    - **Language Support**: English and Chinese
+    - **Library**: Transformers (via HuggingFace)
+    - **License**: Apache-2.0
+
+    ### **Quantized Versions**
+    - **Recommended**: `Todo-14B.i1-Q4_K_M.gguf` (9.1 GB, fast)
+    - **Optimal Balance**: `Todo-14B.i1-Q4_K_S.gguf` (8.7 GB, balance of size, speed, and accuracy)
+    - **Lower-Quality Options**: various smaller IQ and K-quant variants (Q3_K, Q2_K, etc.) that trade accuracy for reduced size.
+
+    ### **Usage**
+    - Requires the GGUF file format.
+    - See [TheBloke's READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for details on using GGUF files, including how to concatenate multi-part files.
+    - Quantization resources provided by [nethype GmbH](https://www.nethype.de/).
+
+    ### **Notes**
+    - Quantized by mradermacher for efficiency.
+    - Higher-quality variants (e.g., Q4_K_M) are preferred for best performance.
+
+    This model is well suited to deployments that require fast inference with low latency and minimal memory usage.
+  overrides:
+    parameters:
+      model: llama-cpp/models/Todo-14B.i1-Q4_K_M.gguf
+    name: Todo-14B-i1-GGUF
+    backend: llama-cpp
+    template:
+      use_tokenizer_template: true
+    known_usecases:
+      - chat
+    function:
+      grammar:
+        disable: true
+    description: Imported from https://huggingface.co/mradermacher/Todo-14B-i1-GGUF
+    options:
+      - use_jinja:true
+    files:
+      - filename: llama-cpp/models/Todo-14B.i1-Q4_K_M.gguf
+        sha256: 0eac62c574f052145b6580c1b1d5f78f020171386c017ef7f57e24dc29e28654
+        uri: https://huggingface.co/mradermacher/Todo-14B-i1-GGUF/resolve/main/Todo-14B.i1-Q4_K_M.gguf
 - &nanbeige4
   name: "nanbeige4.1-3b-q8"
   url: "github:mudler/LocalAI/gallery/nanbeige4.1.yaml@master"
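The `files` stanza in the diff pairs the GGUF artifact with a SHA-256 digest, which a client can use to check entry consistency and verify a download before serving the model. A minimal sketch of such checks, using only the field values from the diff above (the helper functions are hypothetical, not LocalAI's actual validation code):

```python
import hashlib
import re

# Field values copied from the gallery entry in the diff above.
entry = {
    "name": "todo-14b-i1",
    "model": "llama-cpp/models/Todo-14B.i1-Q4_K_M.gguf",
    "files": [{
        "filename": "llama-cpp/models/Todo-14B.i1-Q4_K_M.gguf",
        "sha256": "0eac62c574f052145b6580c1b1d5f78f020171386c017ef7f57e24dc29e28654",
        "uri": "https://huggingface.co/mradermacher/Todo-14B-i1-GGUF/resolve/main/Todo-14B.i1-Q4_K_M.gguf",
    }],
}

def validate_entry(e):
    """Basic consistency checks for a gallery entry (illustrative only)."""
    f = e["files"][0]
    # The model path in overrides must point at a listed file.
    assert e["model"] == f["filename"], "model path does not match a file entry"
    # sha256 must be a 64-character lowercase hex digest.
    assert re.fullmatch(r"[0-9a-f]{64}", f["sha256"]), "malformed sha256"
    # The download URI should end with the same basename it claims to provide.
    assert f["uri"].endswith(f["filename"].split("/")[-1]), "uri/filename mismatch"
    return True

def verify_download(path, expected_sha256):
    """Hash a downloaded file in chunks and compare to the gallery digest."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256

print(validate_entry(entry))  # → True
```

After fetching the `uri`, `verify_download("Todo-14B.i1-Q4_K_M.gguf", entry["files"][0]["sha256"])` would confirm the 9.1 GB artifact arrived intact.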
