## Release v1.0 — tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf

A compact, quantized chat model file in GGUF format.

### Contents
- `tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf`: a 1.1B-parameter chat model quantized to Q4_K_M for reduced size and faster inference.

### Usage example (CLI)
1. Download the GGUF file to your model directory.
2. Load it with a compatible runtime, such as llama.cpp or another GGML-based runtime:
```sh
./main -m ./models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf -p "Hello, how are you?"
```
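After downloading, you can sanity-check that the file really is GGUF before pointing a runtime at it: every GGUF file begins with the 4-byte magic `GGUF` followed by a little-endian `uint32` version. A minimal sketch (the path in the usage comment is a placeholder for wherever you saved the file):

```python
import struct

def is_gguf(path):
    """Return True if the file starts with the GGUF magic and a plausible version."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        return False
    magic = header[:4]
    version = struct.unpack("<I", header[4:8])[0]  # little-endian uint32
    return magic == b"GGUF" and version >= 1

# Usage (placeholder path):
# is_gguf("./models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf")
```

This only validates the header, not the tensor data, but it catches the common failure mode of a truncated or HTML-error download.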

### Notes
- The quantized format trades some precision for a smaller file and faster inference, which makes it suitable for lightweight inference and experimentation.
- Ensure your inference tool supports GGUF and the Q4_K_M quantization type.
- No license or training data details are included here; check the upstream source for licensing and provenance.
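The size/precision trade-off can be put into rough numbers. As an assumption (not a figure from this release), Q4_K_M averages on the order of 4.5–5 bits per weight once quantization scales and mixed-precision blocks are included, so a 1.1B-parameter model lands well under 1 GB:

```python
# Back-of-the-envelope file-size estimate for a Q4_K_M quantized model.
# bits_per_weight is an assumed average for Q4_K_M, not a published spec.
params = 1.1e9          # 1.1B parameters
bits_per_weight = 4.85  # assumed effective average for Q4_K_M
size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.2f} GB")  # prints ~0.67 GB
```

Compare that with roughly 2.2 GB for the same model in 16-bit weights, which is why Q4_K_M is a common choice for CPU and edge inference.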

### Contact
For issues or questions, open an issue on this repository.