
Commit ffe27e1

Merge remote-tracking branch 'ggml-org/master' into allozaur/svelte-webui

2 parents 3517660 + 710dfc4
154 files changed (+8,318 additions, -4,066 deletions)


.devops/vulkan.Dockerfile

Lines changed: 23 additions & 7 deletions
```diff
@@ -2,14 +2,30 @@ ARG UBUNTU_VERSION=24.04
 
 FROM ubuntu:$UBUNTU_VERSION AS build
 
-# Install build tools
-RUN apt update && apt install -y git build-essential cmake wget
+# Ref: https://vulkan.lunarg.com/doc/sdk/latest/linux/getting_started.html
 
-# Install Vulkan SDK and cURL
-RUN wget -qO - https://packages.lunarg.com/lunarg-signing-key-pub.asc | apt-key add - && \
-    wget -qO /etc/apt/sources.list.d/lunarg-vulkan-noble.list https://packages.lunarg.com/vulkan/lunarg-vulkan-noble.list && \
-    apt update -y && \
-    apt-get install -y vulkan-sdk libcurl4-openssl-dev curl
+# Install build tools
+RUN apt update && apt install -y git build-essential cmake wget xz-utils
+
+# Install Vulkan SDK
+ARG VULKAN_VERSION=1.4.321.1
+RUN ARCH=$(uname -m) && \
+    wget -qO /tmp/vulkan-sdk.tar.xz https://sdk.lunarg.com/sdk/download/${VULKAN_VERSION}/linux/vulkan-sdk-linux-${ARCH}-${VULKAN_VERSION}.tar.xz && \
+    mkdir -p /opt/vulkan && \
+    tar -xf /tmp/vulkan-sdk.tar.xz -C /tmp --strip-components=1 && \
+    mv /tmp/${ARCH}/* /opt/vulkan/ && \
+    rm -rf /tmp/*
+
+# Install cURL and Vulkan SDK dependencies
+RUN apt install -y libcurl4-openssl-dev curl \
+    libxcb-xinput0 libxcb-xinerama0 libxcb-cursor-dev
+
+# Set environment variables
+ENV VULKAN_SDK=/opt/vulkan
+ENV PATH=$VULKAN_SDK/bin:$PATH
+ENV LD_LIBRARY_PATH=$VULKAN_SDK/lib:$LD_LIBRARY_PATH
+ENV CMAKE_PREFIX_PATH=$VULKAN_SDK:$CMAKE_PREFIX_PATH
+ENV PKG_CONFIG_PATH=$VULKAN_SDK/lib/pkgconfig:$PKG_CONFIG_PATH
 
 # Build it
 WORKDIR /app
```
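
For local verification, the image can be built straight from the repository root; a minimal sketch, with an illustrative tag name:

```bash
# Build the Vulkan-enabled image using the updated Dockerfile (tag is illustrative)
docker build -t llama-cpp-vulkan -f .devops/vulkan.Dockerfile .
```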

.github/copilot-instructions.md

Lines changed: 262 additions & 0 deletions
# Copilot Instructions for llama.cpp

## Repository Overview

llama.cpp is a large-scale C/C++ project for efficient LLM (Large Language Model) inference with minimal setup and dependencies. The project enables running language models on diverse hardware with state-of-the-art performance.

**Key Facts:**
- **Primary language**: C/C++ with Python utility scripts
- **Size**: ~200k+ lines of code across 1000+ files
- **Architecture**: Modular design with main library (`libllama`) and 40+ executable tools/examples
- **Core dependency**: ggml tensor library (vendored in `ggml/` directory)
- **Backends supported**: CPU (AVX/NEON optimized), CUDA, Metal, Vulkan, SYCL, ROCm, MUSA
- **License**: MIT

## Build Instructions

### Prerequisites
- CMake 3.14+ (primary build system)
- C++17 compatible compiler (GCC 13.3+, Clang, MSVC)
- Optional: ccache for faster compilation

### Basic Build (CPU-only)
**ALWAYS run these commands in sequence:**
```bash
cmake -B build
cmake --build build --config Release -j $(nproc)
```

**Build time**: ~10 minutes on a 4-core system with ccache enabled, ~25 minutes without ccache.

**Important Notes:**
- The Makefile is deprecated - always use CMake
- ccache is automatically detected and used if available
- Built binaries are placed in `build/bin/`
- Parallel builds (`-j`) significantly reduce build time

### Backend-Specific Builds
For CUDA support:
```bash
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j $(nproc)
```

For Metal (macOS):
```bash
cmake -B build -DGGML_METAL=ON
cmake --build build --config Release -j $(nproc)
```

**Important Note**: While all backends can be built as long as the correct requirements for that backend are installed, you will not be able to run them without the correct hardware. The only backend that can be run for testing and validation is the CPU backend.

### Debug Builds
Single-config generators:
```bash
cmake -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build
```

Multi-config generators:
```bash
cmake -B build -G "Xcode"
cmake --build build --config Debug
```

### Common Build Issues
- **Issue**: Network tests fail in isolated environments
  **Solution**: Expected behavior - core functionality tests will still pass

## Testing

### Running Tests
```bash
ctest --test-dir build --output-on-failure -j $(nproc)
```

**Test suite**: 38 tests covering tokenizers, grammar parsing, sampling, backends, and integration
**Expected failures**: 2-3 tests may fail if network access is unavailable (they download models)
**Test time**: ~30 seconds for passing tests

### Server Unit Tests
Run server-specific unit tests after building the server:
```bash
# Build the server first
cmake --build build --target llama-server

# Navigate to server tests and run
cd tools/server/tests
source ../../../.venv/bin/activate
./tests.sh
```

**Server test dependencies**: The `.venv` environment includes the required dependencies for server unit tests (pytest, aiohttp, etc.). Tests can be run individually or with various options as documented in `tools/server/tests/README.md`.

### Test Categories
- Tokenizer tests: Various model tokenizers (BERT, GPT-2, LLaMA, etc.)
- Grammar tests: GBNF parsing and validation
- Backend tests: Core ggml operations across different backends
- Integration tests: End-to-end workflows (a name-filtering sketch follows this list)
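
To exercise one category at a time, the suite can be filtered by test name; a minimal sketch (the `-R` pattern is illustrative; `ctest -N` lists the actual test names):

```bash
# List available tests, then run only the ones matching a name pattern
ctest --test-dir build -N
ctest --test-dir build -R tokenizer --output-on-failure
```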
### Manual Testing Commands
```bash
# Test basic inference
./build/bin/llama-cli --version

# Test model loading (requires model file)
./build/bin/llama-cli -m path/to/model.gguf -p "Hello" -n 10
```

## Code Quality and Linting

### C++ Code Formatting
**ALWAYS format C++ code before committing:**
```bash
git clang-format
```

Configuration is in `.clang-format` with these key rules:
- 4-space indentation
- 120 column limit
- Braces on same line for functions
- Pointer alignment: `void * ptr` (middle)
- Reference alignment: `int & ref` (middle)
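
To preview what `git clang-format` would change without touching the working tree, it supports a `--diff` mode; a minimal sketch:

```bash
# Show the formatting diff for changed lines without applying it
git clang-format --diff
```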
### Python Code
**ALWAYS activate the Python environment in `.venv` and use tools from that environment:**
```bash
# Activate virtual environment
source .venv/bin/activate
```

Configuration files:
- `.flake8`: flake8 settings (max-line-length=125, excludes examples/tools)
- `pyrightconfig.json`: pyright type checking configuration
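
With `.venv` active, both tools pick up these configuration files from the repository root; a minimal sketch, assuming flake8 and pyright are installed in the environment:

```bash
# Run the linter and the type checker; both read their config from the repo root
flake8 .
pyright
```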
### Pre-commit Hooks
Run before committing:
```bash
pre-commit run --all-files
```

## Continuous Integration

### GitHub Actions Workflows
Key workflows that run on every PR:
- `.github/workflows/build.yml`: Multi-platform builds
- `.github/workflows/server.yml`: Server functionality tests
- `.github/workflows/python-lint.yml`: Python code quality
- `.github/workflows/python-type-check.yml`: Python type checking

### Local CI Validation
**Run full CI locally before submitting PRs:**
```bash
mkdir tmp

# CPU-only build
bash ./ci/run.sh ./tmp/results ./tmp/mnt
```

**CI Runtime**: 30-60 minutes depending on backend configuration

### Triggering CI
Add `ggml-ci` to the commit message to trigger heavy CI workloads on the custom CI infrastructure.
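
A minimal sketch of such a commit (the message text is illustrative; only the `ggml-ci` token is significant):

```bash
git commit -m "ggml : optimize repack kernels (ggml-ci)"
```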
## Project Layout and Architecture

### Core Directories
- **`src/`**: Main llama library implementation (`llama.cpp`, `llama-*.cpp`)
- **`include/`**: Public API headers, primarily `include/llama.h`
- **`ggml/`**: Core tensor library (submodule with custom GGML framework)
- **`examples/`**: 30+ example applications and tools
- **`tools/`**: Additional development and utility tools (server benchmarks, tests)
- **`tests/`**: Comprehensive test suite with CTest integration
- **`docs/`**: Detailed documentation (build guides, API docs, etc.)
- **`scripts/`**: Utility scripts for CI, data processing, and automation
- **`common/`**: Shared utility code used across examples

### Key Files
- **`CMakeLists.txt`**: Primary build configuration
- **`include/llama.h`**: Main C API header (~2000 lines)
- **`src/llama.cpp`**: Core library implementation (~8000 lines)
- **`CONTRIBUTING.md`**: Coding guidelines and PR requirements
- **`.clang-format`**: C++ formatting rules
- **`.pre-commit-config.yaml`**: Git hook configuration

### Built Executables (in `build/bin/`)
Primary tools:
- **`llama-cli`**: Main inference tool
- **`llama-server`**: OpenAI-compatible HTTP server
- **`llama-quantize`**: Model quantization utility
- **`llama-perplexity`**: Model evaluation tool
- **`llama-bench`**: Performance benchmarking
- **`llama-convert-llama2c-to-ggml`**: Model conversion utilities

### Configuration Files
- **CMake**: `CMakeLists.txt`, `cmake/` directory
- **Linting**: `.clang-format`, `.clang-tidy`, `.flake8`
- **CI**: `.github/workflows/`, `ci/run.sh`
- **Git**: `.gitignore` (includes build artifacts, models, cache)

### Dependencies
- **System**: OpenMP, libcurl (for model downloading)
- **Optional**: CUDA SDK, Metal framework, Vulkan SDK, Intel oneAPI
- **Bundled**: httplib, json (header-only libraries in vendored form)

## Common Validation Steps

### After Making Changes
1. **Format code**: `git clang-format`
2. **Build**: `cmake --build build --config Release`
3. **Test**: `ctest --test-dir build --output-on-failure`
4. **Server tests** (if modifying server): `cd tools/server/tests && source ../../../.venv/bin/activate && ./tests.sh`
5. **Manual validation**: Test relevant tools in `build/bin/`

### Performance Validation
```bash
# Benchmark inference performance
./build/bin/llama-bench -m model.gguf

# Evaluate model perplexity
./build/bin/llama-perplexity -m model.gguf -f dataset.txt
```

### Backend Validation
```bash
# Test backend operations
./build/bin/test-backend-ops
```

## Environment Setup

### Required Tools
- CMake 3.14+ (install via system package manager)
- Modern C++ compiler with C++17 support
- Git (for submodule management)
- Python 3.9+ with virtual environment (`.venv` is provided)

### Optional but Recommended
- ccache: `apt install ccache` or `brew install ccache`
- clang-format 15+: Usually included with LLVM/Clang installation
- pre-commit: `pip install pre-commit`

### Backend-Specific Requirements
- **CUDA**: NVIDIA CUDA Toolkit 11.2+
- **Metal**: Xcode command line tools (macOS only)
- **Vulkan**: Vulkan SDK
- **SYCL**: Intel oneAPI toolkit

## Important Guidelines

### Code Changes
- **Minimal dependencies**: Avoid adding new external dependencies
- **Cross-platform compatibility**: Test on Linux, macOS, Windows when possible
- **Performance focus**: This is a performance-critical inference library
- **API stability**: Changes to `include/llama.h` require careful consideration

### Git Workflow
- Always create feature branches from `master` (see the sketch after this list)
- **Never** commit build artifacts (`build/`, `.ccache/`, `*.o`, `*.gguf`)
- Use descriptive commit messages following project conventions
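
A minimal sketch of that branching workflow, with an illustrative branch name:

```bash
# Branch off an up-to-date master before starting work (branch name is illustrative)
git checkout master
git pull origin master
git checkout -b my-feature
```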
### Trust These Instructions
Only search for additional information if these instructions are incomplete or found to be incorrect. This document contains validated build and test procedures that work reliably across different environments.
Lines changed: 33 additions & 16 deletions

```diff
@@ -1,10 +1,11 @@
 name: Build on RISCV Linux Machine by Cloud-V
 on:
+  pull_request:
   workflow_dispatch:
   workflow_call:
 
 jobs:
-  bianbu-riscv64-native: # Bianbu 2.2
+  debian-13-riscv64-native: # Bianbu 2.2
     runs-on: self-hosted
 
     steps:
@@ -20,24 +21,40 @@ jobs:
           build-essential \
           gcc-14-riscv64-linux-gnu \
           g++-14-riscv64-linux-gnu \
+          ccache \
           cmake
 
+      - name: Setup ccache
+        run: |
+          mkdir -p $HOME/.ccache
+          ccache -M 5G -d $HOME/.ccache
+          export CCACHE_LOGFILE=/home/runneruser/ccache_debug/ccache.log
+          export CCACHE_DEBUGDIR="/home/runneruser/ccache_debug"
+          echo "$GITHUB_WORKSPACE"
+          echo "CCACHE_LOGFILE=$CCACHE_LOGFILE" >> $GITHUB_ENV
+          echo "CCACHE_DEBUGDIR=$CCACHE_DEBUGDIR" >> $GITHUB_ENV
+          echo "CCACHE_BASEDIR=$GITHUB_WORKSPACE" >> $GITHUB_ENV
+          echo "CCACHE_DIR=$HOME/.ccache" >> $GITHUB_ENV
+
       - name: Build
         run: |
-          cmake -B build -DLLAMA_CURL=OFF \
-            -DCMAKE_BUILD_TYPE=Release \
-            -DGGML_OPENMP=OFF \
-            -DLLAMA_BUILD_EXAMPLES=ON \
-            -DLLAMA_BUILD_TOOLS=ON \
-            -DLLAMA_BUILD_TESTS=OFF \
-            -DCMAKE_SYSTEM_NAME=Linux \
-            -DCMAKE_SYSTEM_PROCESSOR=riscv64 \
-            -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
-            -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 \
-            -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-            -DCMAKE_FIND_ROOT_PATH=/usr/lib/riscv64-linux-gnu \
-            -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
-            -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
-            -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH
+          cmake -B build \
+            -DLLAMA_CURL=OFF \
+            -DCMAKE_BUILD_TYPE=Release \
+            -DGGML_OPENMP=OFF \
+            -DLLAMA_BUILD_EXAMPLES=ON \
+            -DLLAMA_BUILD_TOOLS=ON \
+            -DLLAMA_BUILD_TESTS=OFF \
+            -DCMAKE_SYSTEM_NAME=Linux \
+            -DCMAKE_SYSTEM_PROCESSOR=riscv64 \
+            -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
+            -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 \
+            -DCMAKE_C_COMPILER_LAUNCHER=ccache \
+            -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
+            -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
+            -DCMAKE_FIND_ROOT_PATH=/usr/lib/riscv64-linux-gnu \
+            -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
+            -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
+            -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH
 
           cmake --build build --config Release -j $(nproc)
```
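
Because the build now routes both compilers through ccache (`-DCMAKE_C_COMPILER_LAUNCHER` / `-DCMAKE_CXX_COMPILER_LAUNCHER`), cache effectiveness can be verified after a run; a minimal sketch (statistics output varies by ccache version):

```bash
# Print hit/miss statistics for the cache directory configured in the workflow
CCACHE_DIR=$HOME/.ccache ccache -s
```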

.github/workflows/copilot-setup-steps.yml

Lines changed: 5 additions & 1 deletion

```diff
@@ -39,6 +39,10 @@ jobs:
         run: |
           sudo apt-get update
           sudo apt-get install build-essential libcurl4-openssl-dev
+          # Install git-clang-format script for formatting only changed code
+          wget -O /tmp/git-clang-format https://raw.githubusercontent.com/llvm/llvm-project/release/18.x/clang/tools/clang-format/git-clang-format
+          sudo cp /tmp/git-clang-format /usr/local/bin/git-clang-format
+          sudo chmod +x /usr/local/bin/git-clang-format
 
       - name: Set up Python
         uses: actions/setup-python@v5
@@ -50,4 +54,4 @@ jobs:
           python3 -m venv .venv
           .venv/bin/activate
           pip install -r requirements/requirements-all.txt -r tools/server/tests/requirements.txt
-          pip install flake8 pyright
+          pip install flake8 pyright pre-commit
```

.gitignore

Lines changed: 1 addition & 0 deletions
```diff
@@ -147,3 +147,4 @@ poetry.toml
 # Local scripts
 /run-vim.sh
 /run-chat.sh
+.ccache/
```
