Commit 130b99d: Create README.md

## Release v1.0 — tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf

A compact, quantized chat model file in GGUF format.

### Contents
- `tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf` — the TinyLlama 1.1B-parameter chat model quantized to Q4_K_M for reduced size and faster inference.

### Usage example (CLI)

1. Download the GGUF file to your model directory.
2. Load it with a GGUF-compatible runtime, for example llama.cpp or other ggml-based runtimes:

```shell
./main -m ./models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf -p "Hello, how are you?"
```
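After downloading, you can sanity-check that the file really is in GGUF format before pointing a runtime at it: GGUF files begin with the ASCII magic bytes `GGUF` followed by a little-endian format version. A minimal Python sketch (the demonstration uses a synthetic stub header, so no real model file is needed):

```python
import os
import struct
import tempfile

def is_gguf(path):
    """Return True if the file starts with the GGUF magic bytes.

    GGUF files begin with the ASCII bytes b"GGUF" followed by a
    little-endian uint32 format version.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        return False
    magic, _version = struct.unpack("<4sI", header)
    return magic == b"GGUF"

# Demonstration with a synthetic 8-byte header (no real model needed):
with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as tmp:
    tmp.write(struct.pack("<4sI", b"GGUF", 3))  # stub: magic + version 3
    stub_path = tmp.name
print(is_gguf(stub_path))  # prints True
os.unlink(stub_path)
```

This only validates the header, not the model contents, but it catches the common case of a truncated or mislabeled download.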
### Notes

- The quantized format trades some precision for smaller size and speed — suitable for lightweight inference and experimentation.
- Ensure your inference tool supports GGUF and the Q4_K_M quantization type.
- No license or training-data details are included — check the upstream source for licensing and provenance.
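The size/precision tradeoff in the first note is easy to see in miniature. The sketch below is a deliberately simplified, hypothetical illustration of block-wise 4-bit quantization (one shared scale and offset per block of weights); it is not the actual Q4_K_M / ggml algorithm:

```python
def quantize_block(values):
    """Map a block of floats to 4-bit codes (0..15) plus scale/offset."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 15 or 1.0  # guard against an all-equal block
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo

def dequantize_block(codes, scale, lo):
    """Recover approximate floats from the 4-bit codes."""
    return [lo + c * scale for c in codes]

weights = [-0.8, -0.1, 0.05, 0.3, 0.72, 1.1]  # toy weight block
codes, scale, lo = quantize_block(weights)
approx = dequantize_block(codes, scale, lo)
# Each reconstructed value lands within half a quantization step
# (scale / 2) of the original: precision traded for 4-bit storage.
```

Real k-quant formats such as Q4_K_M use a more elaborate scheme (roughly, super-blocks with separate scales and minimums), but the tradeoff is the same: fewer bits per weight in exchange for a small loss of precision.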
### Contact

For issues or questions, open an issue on this repository.
