
Conversation


@p1-0tr p1-0tr commented Jun 12, 2025

llama.cpp can be built to support multiple variants of the CPU backend and choose the best one at runtime. This requires a dynamically linked build of llama-server, so modify the model-runner Dockerfile to work with such builds of llama-server.

Signed-off-by: Piotr Stankiewicz <[email protected]>
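
The diff itself isn't shown in this conversation, but as context: llama.cpp exposes this multi-variant CPU support through its `GGML_BACKEND_DL` and `GGML_CPU_ALL_VARIANTS` CMake options, which emit one ggml CPU backend shared library per instruction-set level and pick the best match at startup. A minimal sketch of the kind of Dockerfile accommodation the description refers to, with hypothetical stage names, paths, and base images (not the actual diff):

```dockerfile
# Illustrative sketch -- stage names, paths, and base images are assumptions.
FROM ubuntu:24.04 AS llama-build
# (build dependencies omitted for brevity)
WORKDIR /src
COPY . .
# Build llama-server with runtime-selectable CPU backends; this yields one
# libggml-cpu-<variant>.so per instruction-set level, chosen at load time.
RUN cmake -B build -DBUILD_SHARED_LIBS=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON \
 && cmake --build build --target llama-server -j"$(nproc)"

FROM ubuntu:24.04
# A dynamically linked llama-server needs its shared libraries shipped
# alongside the binary, and the dynamic loader must be able to find them.
COPY --from=llama-build /src/build/bin/llama-server /app/llama-server
COPY --from=llama-build /src/build/bin/*.so /app/lib/
ENV LD_LIBRARY_PATH=/app/lib
ENTRYPOINT ["/app/llama-server"]
```

The key point is that every variant library has to ship in the image even though only one is loaded per host, since the backend selection happens at runtime rather than at build time.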
@p1-0tr p1-0tr requested a review from xenoscopic June 12, 2025 13:32
@p1-0tr p1-0tr merged commit 0130eb6 into main Jun 12, 2025
4 checks passed
@p1-0tr p1-0tr deleted the ps-expand-supported-cpus branch June 12, 2025 15:24
ericcurtin referenced this pull request in ericcurtin/model-runner Sep 21, 2025
doringeman added a commit to doringeman/model-runner that referenced this pull request Oct 2, 2025
Configure inference backend via compose up
