27 changes: 18 additions & 9 deletions llamacpp/native/CMakeLists.txt
@@ -8,22 +8,31 @@ project(
 
 option(DDLLAMA_BUILD_SERVER "Build the DD llama.cpp server executable" ON)
 option(DDLLAMA_BUILD_UTILS "Build utilities, e.g. nv-gpu-info" OFF)
 set(DDLLAMA_PATCH_COMMAND "patch" CACHE STRING "patch command")
 
 set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
 set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
 
 if (DDLLAMA_BUILD_SERVER)
-    set(LLAMA_BUILD_COMMON ON)
+    # Enable building the vanilla llama.cpp server
+    set(LLAMA_BUILD_COMMON ON CACHE BOOL "" FORCE)
+    set(LLAMA_BUILD_TOOLS ON CACHE BOOL "" FORCE)
+    set(LLAMA_BUILD_SERVER ON CACHE BOOL "" FORCE)
     add_subdirectory(vendor/llama.cpp)
-    # Get build info and set version for mtmd just like it's done in llama.cpp/CMakeLists.txt
-    include(vendor/llama.cpp/cmake/build-info.cmake)
-    if (NOT DEFINED LLAMA_BUILD_NUMBER)
-        set(LLAMA_BUILD_NUMBER ${BUILD_NUMBER})
-    endif()
-    set(LLAMA_INSTALL_VERSION 0.0.${LLAMA_BUILD_NUMBER})
-    add_subdirectory(vendor/llama.cpp/tools/mtmd)
-    add_subdirectory(src/server)
+    # Create a custom target that copies/renames the server binary after build
+    if (WIN32)
+        set(SERVER_OUTPUT_NAME com.docker.llama-server.exe)
+    else()
+        set(SERVER_OUTPUT_NAME com.docker.llama-server)
+    endif()
+
+    add_custom_target(docker-llama-server ALL
+        DEPENDS llama-server
+        COMMAND ${CMAKE_COMMAND} -E copy
+            $<TARGET_FILE:llama-server>
+            ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/${SERVER_OUTPUT_NAME}
+        COMMENT "Creating ${SERVER_OUTPUT_NAME} from llama-server"
+    )
 endif()
 
 if (WIN32 AND DDLLAMA_BUILD_UTILS)
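With the patch step gone, a local sanity check is just a configure-and-build plus a look in `bin/`. A minimal sketch, assuming an out-of-tree build run from `llamacpp/native` (the `Release` config only matters for multi-config generators such as Visual Studio):

```bash
# Configure and build; docker-llama-server is part of ALL, so no explicit target is needed
cmake -B build
cmake --build build --config Release

# The renamed copy should land next to the original under build/bin
ls build/bin/com.docker.llama-server*
```

Using `${CMAKE_COMMAND} -E copy` keeps the rename step portable, since it avoids shelling out to a platform-specific `cp` or `copy`.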
20 changes: 5 additions & 15 deletions llamacpp/native/README.md
@@ -1,5 +1,7 @@
 # Native llama-server
 
+This project builds the vanilla llama.cpp server and renames it to `com.docker.llama-server` for use with Docker model-runner.
+
 ## Building
 
     cmake -B build
@@ -15,7 +17,7 @@
 
 This project uses llama.cpp as a git submodule located at `vendor/llama.cpp`, which points to the official llama.cpp repository at https://github.com/ggml-org/llama.cpp.git.
 
-The project applies custom patches to llama.cpp's server implementation (`server.cpp` and `utils.hpp`) to integrate with the Docker model-runner architecture. These patches are maintained in `src/server/server.patch`.
+We use the vanilla llama.cpp server without any modifications. The build system simply builds the upstream `llama-server` and copies it to `com.docker.llama-server`.
 
 ### Prerequisites
 
@@ -45,19 +47,7 @@ If the submodule is already initialized, this command is safe to run and will en
    popd
    ```
 
-3. **Apply the custom llama-server patch:**
-
-   ```bash
-   make -C src/server clean
-   make -C src/server
-   ```
-
-   This will:
-   - Clean the previous patched files
-   - Copy the new `server.cpp` and `utils.hpp` from the updated llama.cpp
-   - Apply our custom patches from `src/server/server.patch`
-
-4. **Build and test:**
+3. **Build and test:**
 
    ```bash
    # Build from the native directory
@@ -70,7 +60,7 @@ If the submodule is already initialized, this command is safe to run and will en
 
 Make sure everything builds cleanly without errors.
 
-5. **Commit the submodule update:**
+4. **Commit the submodule update:**
 
    ```bash
    git add vendor/llama.cpp
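Since the shipped binary is now a byte-for-byte copy of upstream `llama-server`, any upstream invocation works unchanged. A hedged smoke test after the build step (the model path and port below are placeholders, not values from this repository):

```bash
# Verify the copied binary runs and reports the upstream build info
./build/bin/com.docker.llama-server --version

# Serve a local GGUF model; -m and --port are standard llama-server flags
./build/bin/com.docker.llama-server -m /path/to/model.gguf --port 8080
```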
35 changes: 0 additions & 35 deletions llamacpp/native/src/server/CMakeLists.txt

This file was deleted.

16 changes: 0 additions & 16 deletions llamacpp/native/src/server/Makefile

This file was deleted.

24 changes: 0 additions & 24 deletions llamacpp/native/src/server/README.md

This file was deleted.
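With the whole `src/server` directory removed, it is worth confirming that no remaining build files or docs still point at the old patch workflow. One way to check, assuming a git checkout of the repository:

```bash
# Should print no hits once the patch-based setup is fully retired
git grep -nE "src/server|server\.patch" -- llamacpp/native
```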
