
Commit 481c63a

ngxson and Vaibhavs10 authored

add ollama docs (#1447)

* add ollama docs
* correct link for 1B example
* Update docs/hub/ollama.md

Co-authored-by: vb <[email protected]>

1 parent 33579c3 commit 481c63a

File tree

2 files changed: +74 -0 lines changed


docs/hub/_toctree.yml

Lines changed: 2 additions & 0 deletions

```diff
@@ -144,6 +144,8 @@
       title: GGUF usage with llama.cpp
     - local: gguf-gpt4all
       title: GGUF usage with GPT4All
+    - local: ollama
+      title: Use Ollama with GGUF Model
 - title: Datasets
   local: datasets
   isExpanded: true
```

docs/hub/ollama.md

Lines changed: 72 additions & 0 deletions
# Use Ollama with any GGUF Model on Hugging Face Hub

![cover](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/ollama/cover.png)

Ollama is a llama.cpp-based application for interacting with LLMs directly on your computer. You can use any GGUF quants created by the community ([bartowski](https://huggingface.co/bartowski), [MaziyarPanahi](https://huggingface.co/MaziyarPanahi) and many more) on Hugging Face directly with Ollama, without creating a new `Modelfile`. At the time of writing, there are 45K public GGUF checkpoints on the Hub, and you can run any of them with a single `ollama run` command. We also provide customisations like choosing the quantization type, system prompt, and more to improve your overall experience.
Getting started is as simple as:

```sh
ollama run hf.co/{username}/{repository}
```

Please note that you can use both `hf.co` and `huggingface.co` as the domain name.

Here are some other models that you can try:

```sh
ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
ollama run hf.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF
ollama run hf.co/arcee-ai/SuperNova-Medius-GGUF
ollama run hf.co/bartowski/Humanish-LLama3-8B-Instruct-GGUF
```
## Custom Quantization

By default, the `Q4_K_M` quantization scheme is used. To select a different scheme, simply add a tag:

```sh
ollama run hf.co/{username}/{repository}:{quantization}
```

![guide](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/ollama/guide.png)

For example:

```sh
ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:IQ3_M
ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q8_0

# the quantization name is case-insensitive, so this will also work
ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:iq3_m

# you can also select a specific file
ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Llama-3.2-3B-Instruct-IQ3_M.gguf
```
## Custom Chat Template and Parameters

By default, a template will be selected automatically from a list of commonly used templates, based on the built-in `tokenizer.chat_template` metadata stored inside the GGUF file.

If your GGUF file doesn't have a built-in template or uses a custom chat template, you can create a new file called `template` in the repository. The template must be a Go template, not a Jinja template. Here's an example:

```
{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>
```
To learn more about the Go template format, please refer to [this documentation](https://github.com/ollama/ollama/blob/main/docs/template.md).

You can optionally configure a system prompt by putting it into a new file named `system` in the repository.
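For instance, the `system` file is plain text containing only the prompt itself (the prompt below is a hypothetical example, not a recommendation):

```
You are a concise assistant. Answer in at most three sentences.
```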
To change sampling parameters, create a file named `params` in the repository. The file must be in JSON format. For the list of all available parameters, please refer to [this documentation](https://github.com/ollama/ollama/blob/main/docs/modelfile.md#parameter).
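As an illustration, a `params` file setting a few common options might look like the sketch below (the values are arbitrary examples, not recommendations; check the parameter list linked above for what each one does):

```json
{
  "temperature": 0.7,
  "top_p": 0.9,
  "num_ctx": 4096,
  "stop": ["<|end|>"]
}
```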
## References

- https://github.com/ollama/ollama/blob/main/docs/README.md
- https://huggingface.co/docs/hub/en/gguf
