Commit 73b89b6

Merge pull request docker#110 from ArthurFlag/ENGDOCS-2787-update-model-pull-docs
docs: note about quantized models
2 parents cf6c379 + 0ab9371 commit 73b89b6

File tree: 2 files changed, +10 -0 lines changed

cmd/cli/docs/reference/docker_model_pull.yaml

Lines changed: 5 additions & 0 deletions
````diff
@@ -16,6 +16,11 @@ examples: |-
 
 You can pull GGUF models directly from [Hugging Face](https://huggingface.co/models?library=gguf).
 
+**Note about quantization:** If no tag is specified, the command tries to pull the `Q4_K_M` version of the model.
+If `Q4_K_M` doesn't exist, the command pulls the first GGUF found in the **Files** view of the model on HuggingFace.
+To specify the quantization, provide it as a tag, for example:
+`docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_S`
+
 ```console
 docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
 ```
````

cmd/cli/docs/reference/model_pull.md

Lines changed: 5 additions & 0 deletions
````diff
@@ -22,6 +22,11 @@ docker model pull ai/smollm2
 
 You can pull GGUF models directly from [Hugging Face](https://huggingface.co/models?library=gguf).
 
+**Note about quantization:** If no tag is specified, the command tries to pull the `Q4_K_M` version of the model.
+If `Q4_K_M` doesn't exist, the command pulls the first GGUF found in the **Files** view of the model on HuggingFace.
+To specify the quantization, provide it as a tag, for example:
+`docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_S`
+
 ```console
 docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
 ```
````
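For context, a minimal sketch of the pull behavior the added note documents. The two commands come from the doc text above; the comments paraphrase the documented fallback. This block is illustrative and not part of the commit:

```console
# No tag: tries the Q4_K_M quantization; if that file doesn't exist,
# falls back to the first GGUF listed in the model's Files view on Hugging Face.
docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF

# Explicit tag: pull a specific quantization (Q4_K_S here).
docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_S
```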
