Commit 8d23bfc

common: add configuration presets for chat and reranking servers
Added two new configuration presets to simplify command-line usage:
1. --chat-llama3-8b-default for running a chat server with the Llama3 8B model,
2. --rerank-bge-default for running a reranking server with the BGE model.

These presets configure appropriate model paths, server ports, GPU settings, and other parameters.

Refs: #10932
1 parent 27aa259 commit 8d23bfc

1 file changed: +30 -0 lines changed

common/arg.cpp

Lines changed: 30 additions & 0 deletions
@@ -3325,5 +3325,35 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
         }
     ).set_examples({LLAMA_EXAMPLE_SERVER}));
 
+    add_opt(common_arg(
+        {"--chat-llama3-8b-default"},
+        string_format("use default Llama3 8B model for chat server (note: can download weights from the internet)"),
+        [](common_params & params) {
+            params.model.hf_repo = "ggml-org/Llama-3-8B-Q8_0-GGUF";
+            params.model.hf_file = "llama-3-8b-q8_0.gguf";
+            params.port = 8080;
+            params.n_gpu_layers = 99;
+            params.flash_attn = true;
+            params.n_ubatch = 512;
+            params.n_batch = 512;
+            params.n_ctx = 4096;
+            params.n_cache_reuse = 256;
+        }
+    ).set_examples({LLAMA_EXAMPLE_SERVER}));
+
+    add_opt(common_arg(
+        {"--rerank-bge-default"},
+        string_format("use default BGE reranker model for reranking server (note: can download weights from the internet)"),
+        [](common_params & params) {
+            params.model.hf_repo = "ggml-org/bge-reranker-base-Q8_0-GGUF";
+            params.model.hf_file = "bge-reranker-base-q8_0.gguf";
+            params.port = 8090;
+            params.n_gpu_layers = 99;
+            params.flash_attn = true;
+            params.n_ctx = 512;
+            params.reranking = true;
+        }
+    ).set_examples({LLAMA_EXAMPLE_SERVER}));
+
     return ctx_arg;
 }
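
The preset pattern in this diff can be illustrated outside the real parser with a minimal, self-contained sketch. Everything below is a simplified stand-in, not the project's API: `preset_params`, `preset_table`, and `apply_flags` are hypothetical names, the struct models only a subset of the fields the presets assign, and its default values are assumptions. Only the flag names and the assigned values are taken from the diff.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, trimmed-down stand-in for common_params. Only some of the
// fields touched by the two presets are modelled; defaults are assumptions.
struct preset_params {
    std::string hf_repo;
    std::string hf_file;
    int  port         = 8080;
    int  n_gpu_layers = -1;
    bool flash_attn   = false;
    int  n_ctx        = 0;
    bool reranking    = false;
};

using preset_fn = std::function<void(preset_params &)>;

// Flag -> handler table, loosely mirroring what each add_opt(common_arg(...))
// call registers. Values are copied from the diff above.
inline const std::map<std::string, preset_fn> & preset_table() {
    static const std::map<std::string, preset_fn> table = {
        {"--chat-llama3-8b-default", [](preset_params & p) {
            p.hf_repo      = "ggml-org/Llama-3-8B-Q8_0-GGUF";
            p.hf_file      = "llama-3-8b-q8_0.gguf";
            p.port         = 8080;
            p.n_gpu_layers = 99;
            p.flash_attn   = true;
            p.n_ctx        = 4096;
        }},
        {"--rerank-bge-default", [](preset_params & p) {
            p.hf_repo      = "ggml-org/bge-reranker-base-Q8_0-GGUF";
            p.hf_file      = "bge-reranker-base-q8_0.gguf";
            p.port         = 8090;
            p.n_gpu_layers = 99;
            p.flash_attn   = true;
            p.n_ctx        = 512;
            p.reranking    = true;
        }},
    };
    return table;
}

// Applies the handlers in the order the flags are given, so a later flag
// overwrites any field that an earlier one also set.
inline preset_params apply_flags(const std::vector<std::string> & flags) {
    preset_params p;
    for (const auto & flag : flags) {
        const auto it = preset_table().find(flag);
        if (it == preset_table().end()) {
            throw std::invalid_argument("unknown flag: " + flag);
        }
        it->second(p);
    }
    return p;
}
```

In this sketch, `apply_flags({"--rerank-bge-default"})` yields a struct with port 8090, a 512-token context, and `reranking` set. Because each preset is just a handler that assigns fields, passing both flags leaves the values of whichever came last. Whether the real parser behaves the same way for conflicting presets is not stated in the commit.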
