 ---
+- name: "todo-14b-i1"
+  url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
+  urls:
+  - https://huggingface.co/mradermacher/Todo-14B-i1-GGUF
+  description: |
+    The **Todo-14B** model is a quantized version of the base model **EcthelionLiu/Todo-14B**, optimized for efficiency and performance. It is distributed in the GGUF format, with a range of quantized variants offering different trade-offs between accuracy, size, and speed. Key details include:
+
+    ### **Base Model**
+    - **Name**: EcthelionLiu/Todo-14B
+    - **Language Support**: English and Chinese
+    - **Library**: Transformers (via HuggingFace)
+    - **License**: Apache-2.0
+
+    ### **Quantized Versions**
+    - **Recommended**: `Todo-14B.i1-Q4_K_M.gguf` (size: 9.1 GB, fast, recommended)
+    - **Optimal Balance**: `Todo-14B.i1-Q4_K_S.gguf` (size: 8.7 GB, optimal balance of size, speed, and accuracy)
+    - **Smaller Options**: various lower-bit variants (IQ2, IQ3, Q3_K, Q2_K, etc.), trading some accuracy for smaller sizes.
+
+    ### **Usage**
+    - Requires GGUF file format.
+    - Use [TheBloke's READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for detailed instructions on file concatenation.
+    - Available at [nethype GmbH](https://www.nethype.de/).
+
+    ### **Notes**
+    - Quantization by mradermacher for efficiency.
+    - Higher-quality variants (e.g., Q4_K_M) are preferred for best performance.
+
+    This model is well suited to deployments that need low latency, minimal memory usage, and fast inference.
+  overrides:
+    parameters:
+      model: llama-cpp/models/Todo-14B.i1-Q4_K_M.gguf
+    name: Todo-14B-i1-GGUF
+    backend: llama-cpp
+    template:
+      use_tokenizer_template: true
+    known_usecases:
+    - chat
+    function:
+      grammar:
+        disable: true
+    description: Imported from https://huggingface.co/mradermacher/Todo-14B-i1-GGUF
+    options:
+    - use_jinja:true
+  files:
+  - filename: llama-cpp/models/Todo-14B.i1-Q4_K_M.gguf
+    sha256: 0eac62c574f052145b6580c1b1d5f78f020171386c017ef7f57e24dc29e28654
+    uri: https://huggingface.co/mradermacher/Todo-14B-i1-GGUF/resolve/main/Todo-14B.i1-Q4_K_M.gguf
 - &nanbeige4
   name: "nanbeige4.1-3b-q8"
   url: "github:mudler/LocalAI/gallery/nanbeige4.1.yaml@master"
|
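The entry added in this diff ties three fields together that must stay consistent: the `parameters.model` path, the `files[].filename`, and the Hugging Face download `uri` (which for a GGUF repo follows the `<repo>/resolve/main/<basename>` convention). A minimal stdlib-only sketch of the kind of sanity check one can run before submitting such an entry — the values are copied from the diff above, and the check itself is an illustration, not part of LocalAI's tooling:

```python
# Field values copied verbatim from the gallery entry in the diff above.
repo_url = "https://huggingface.co/mradermacher/Todo-14B-i1-GGUF"
filename = "llama-cpp/models/Todo-14B.i1-Q4_K_M.gguf"
sha256 = "0eac62c574f052145b6580c1b1d5f78f020171386c017ef7f57e24dc29e28654"
uri = "https://huggingface.co/mradermacher/Todo-14B-i1-GGUF/resolve/main/Todo-14B.i1-Q4_K_M.gguf"

# The download URI should be the repo URL plus /resolve/main/ plus the file's basename.
basename = filename.split("/")[-1]
assert uri == f"{repo_url}/resolve/main/{basename}", "uri does not match the HF resolve convention"

# A SHA-256 digest is always 64 lowercase hex characters.
assert len(sha256) == 64 and all(c in "0123456789abcdef" for c in sha256), "malformed sha256"

print("entry fields consistent")
```

Running this before opening the PR catches the most common copy-paste mistakes in auto-imported entries, such as a `uri` pointing at a different quant than the `filename` declares.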