
Conversation


@p1-0tr p1-0tr commented Jun 12, 2025

llama.cpp can be built to support multiple variants of the CPU backend and choose the best one at runtime. This requires a dynamically linked build of llama-server, so modify the model-runner Dockerfile to work with such builds of llama-server.

Signed-off-by: Piotr Stankiewicz <[email protected]>
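
The diff itself isn't shown in this conversation, but as context: llama.cpp exposes this multi-variant CPU support through its `GGML_BACKEND_DL` and `GGML_CPU_ALL_VARIANTS` CMake options, which emit one ggml CPU backend shared library per instruction-set level and pick the best match at startup. A minimal sketch of the kind of Dockerfile accommodation the description refers to, with hypothetical stage names, paths, and base images (not the actual diff):

```dockerfile
# Illustrative sketch -- stage names, paths, and base images are assumptions.
FROM ubuntu:24.04 AS llama-build
# (build dependencies omitted for brevity)
WORKDIR /src
COPY . .
# Build llama-server with runtime-selectable CPU backends; this yields one
# libggml-cpu-<variant>.so per instruction-set level, chosen at load time.
RUN cmake -B build -DBUILD_SHARED_LIBS=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON \
 && cmake --build build --target llama-server -j"$(nproc)"

FROM ubuntu:24.04
# A dynamically linked llama-server needs its shared libraries shipped
# alongside the binary, and the dynamic loader must be able to find them.
COPY --from=llama-build /src/build/bin/llama-server /app/llama-server
COPY --from=llama-build /src/build/bin/*.so /app/lib/
ENV LD_LIBRARY_PATH=/app/lib
ENTRYPOINT ["/app/llama-server"]
```

The key point is that every variant library has to ship in the image even though only one is loaded per host, since the backend selection happens at runtime rather than at build time.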
@p1-0tr p1-0tr requested a review from xenoscopic June 12, 2025 13:32
@p1-0tr p1-0tr merged commit 0130eb6 into main Jun 12, 2025
4 checks passed
@p1-0tr p1-0tr deleted the ps-expand-supported-cpus branch June 12, 2025 15:24
ericcurtin referenced this pull request in ericcurtin/model-runner Sep 21, 2025
doringeman added a commit to doringeman/model-runner that referenced this pull request Oct 2, 2025
Configure inference backend via compose up
