docs/backend/OPENCL.md (1 addition, 1 deletion)
@@ -50,7 +50,7 @@ The llama.cpp OpenCL backend is designed to enable llama.cpp on **Qualcomm Adren
## Model Preparation
- You can refer to the general [llama-quantize tool](tools/quantize/README.md) for steps to convert a model in Hugging Face safetensor format to GGUF with quantization.
+ You can refer to the general [llama-quantize tool](/tools/quantize/README.md) for steps to convert a model in Hugging Face safetensor format to GGUF with quantization.
Currently we support `Q4_0` quantization and have optimized for it. To achieve best performance on Adreno GPU, add `--pure` to `llama-quantize`. For example,
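The context line above is cut off at "For example,". A plausible invocation, sketched with hypothetical file names (the actual model names depend on what you converted), might look like:

```shell
# Hypothetical input/output names; substitute your own GGUF files.
# --pure quantizes all tensors to Q4_0, which the docs recommend
# for best performance on Adreno GPUs.
./llama-quantize --pure ggml-model-f16.gguf ggml-model-Q4_0.gguf Q4_0
```

This is a sketch based on the general `llama-quantize` usage pattern (`llama-quantize [options] input.gguf output.gguf type`), not the exact command from the truncated documentation.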