
Conversation

@ggerganov
Member

@ggerganov ggerganov commented Jan 20, 2025

Same as the -hf arg, but for the draft model.

Example

llama-server \
    -hf  ggml-org/DeepSeek-R1-Distill-Qwen-32B-Q8_0-GGUF \
    -hfd ggml-org/DeepSeek-R1-Distill-Qwen-1.5B-Q4_0-GGUF \
    --ctx-size 32768 -fa -t 1 --draft-max 16 --draft-min 2
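
For reference, the same invocation with long-form flags; the --hf-repo-draft spelling is assumed here to mirror --hf-repo as the long form of -hfd (check llama-server --help for the exact name):

llama-server \
    --hf-repo ggml-org/DeepSeek-R1-Distill-Qwen-32B-Q8_0-GGUF \
    --hf-repo-draft ggml-org/DeepSeek-R1-Distill-Qwen-1.5B-Q4_0-GGUF \
    --ctx-size 32768 -fa -t 1 --draft-max 16 --draft-min 2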

@ggerganov ggerganov requested a review from ngxson as a code owner January 20, 2025 20:16
@ggerganov ggerganov merged commit 80d0d6b into master Jan 20, 2025
45 checks passed
@ggerganov ggerganov deleted the gg/arg-add-hfd branch January 20, 2025 20:29
anagri pushed a commit to BodhiSearch/llama.cpp that referenced this pull request Jan 26, 2025
* common : add -hfd option for the draft model

* cont : fix env var

* cont : more fixes
@eng1n88r

eng1n88r commented Feb 4, 2025

How would you specify the draft model's local path in this case?

i.e.

llama-server \
    --hf-repo ggml-org/Qwen2.5-Coder-7B-Q8_0-GGUF \
    --hf-file qwen2.5-coder-7b-q8_0.gguf \
    --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
    --model /custom/model/path/Qwen2.5-Coder-7B-Q8_0.gguf \
    --ctx-size 0 --cache-reuse 256
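
One possible answer, sketched under the assumption that llama-server's existing -md/--model-draft flag accepts a local GGUF path for the draft model (the draft file path below is a hypothetical placeholder):

llama-server \
    --hf-repo ggml-org/Qwen2.5-Coder-7B-Q8_0-GGUF \
    --hf-file qwen2.5-coder-7b-q8_0.gguf \
    --model /custom/model/path/Qwen2.5-Coder-7B-Q8_0.gguf \
    --model-draft /custom/model/path/Qwen2.5-Coder-1.5B-Q8_0.gguf \
    --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
    --ctx-size 0 --cache-reuse 256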

tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
* common : add -hfd option for the draft model

* cont : fix env var

* cont : more fixes
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
* common : add -hfd option for the draft model

* cont : fix env var

* cont : more fixes
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
* common : add -hfd option for the draft model

* cont : fix env var

* cont : more fixes
