
Commit 245be73

Copilot and CISC authored
ci : add copilot-instructions.md (ggml-org#15286)
* Initial plan
* Initialize copilot instructions exploration
* Add comprehensive .github/copilot-instructions.md file
* Update Python environment and tools directory documentation
  - Add instructions for using .venv Python environment
  - Include flake8 and pyright linting tools from virtual environment
  - Add tools/ as core directory in project layout
  - Reference existing configuration files (.flake8, pyrightconfig.json)
* add more python dependencies to .venv
* Update copilot instructions: add backend hardware note and server testing
* Apply suggestions from code review
* Apply suggestions from code review
* Replace clang-format with git clang-format to format only changed code
* Minor formatting improvements: remove extra blank line and add trailing newline
* try installing git-clang-format
* try just clang-format
* Remove --binary flag from git clang-format and add git-clang-format installation to CI
* download 18.x release
* typo--
* remove --binary flag

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>
1 parent b2caf67 commit 245be73

File tree

3 files changed: +268, -1 lines


.github/copilot-instructions.md

Lines changed: 262 additions & 0 deletions
# Copilot Instructions for llama.cpp

## Repository Overview

llama.cpp is a large-scale C/C++ project for efficient LLM (Large Language Model) inference with minimal setup and dependencies. The project enables running language models on diverse hardware with state-of-the-art performance.

**Key Facts:**
- **Primary language**: C/C++ with Python utility scripts
- **Size**: ~200k+ lines of code across 1000+ files
- **Architecture**: Modular design with main library (`libllama`) and 40+ executable tools/examples
- **Core dependency**: ggml tensor library (vendored in `ggml/` directory)
- **Backends supported**: CPU (AVX/NEON optimized), CUDA, Metal, Vulkan, SYCL, ROCm, MUSA
- **License**: MIT

## Build Instructions

### Prerequisites
- CMake 3.14+ (primary build system)
- C++17 compatible compiler (GCC 13.3+, Clang, MSVC)
- Optional: ccache for faster compilation

### Basic Build (CPU-only)
**ALWAYS run these commands in sequence:**
```bash
cmake -B build
cmake --build build --config Release -j $(nproc)
```

**Build time**: ~10 minutes on a 4-core system with ccache enabled, ~25 minutes without ccache.

**Important Notes:**
- The Makefile is deprecated; always use CMake
- ccache is automatically detected and used if available
- Built binaries are placed in `build/bin/` (see the quick check below)
- Parallel builds (`-j`) significantly reduce build time
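
A quick way to confirm the build produced what you expect (a minimal sanity check, assuming the CPU build above completed):

```bash
# Confirm the binaries landed in build/bin/
ls build/bin/

# llama-cli should print version/build info without needing a model
./build/bin/llama-cli --version
```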

### Backend-Specific Builds
For CUDA support:
```bash
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j $(nproc)
```

For Metal (macOS):
```bash
cmake -B build -DGGML_METAL=ON
cmake --build build --config Release -j $(nproc)
```

**Important Note**: Any backend can be built as long as that backend's requirements are installed, but it cannot be run without the matching hardware. The only backend that can be exercised for testing and validation is the CPU backend.

### Debug Builds
Single-config generators:
```bash
cmake -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build
```

Multi-config generators:
```bash
cmake -B build -G "Xcode"
cmake --build build --config Debug
```

### Common Build Issues
- **Issue**: Network tests fail in isolated environments
  **Solution**: Expected behavior; the core functionality tests will still pass

## Testing

### Running Tests
```bash
ctest --test-dir build --output-on-failure -j $(nproc)
```

**Test suite**: 38 tests covering tokenizers, grammar parsing, sampling, backends, and integration
**Expected failures**: 2-3 tests may fail if network access is unavailable (they download models; see below for skipping them)
**Test time**: ~30 seconds for passing tests
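
If network access is unavailable, the download-dependent tests can be excluded up front instead of left to fail. This is a sketch: the `-E` regex is illustrative, so match it against the names printed by `ctest -N`:

```bash
# List the available test names first
ctest --test-dir build -N

# Exclude network-dependent tests by name (regex is illustrative)
ctest --test-dir build --output-on-failure -E "download"
```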

### Server Unit Tests
Run server-specific unit tests after building the server:
```bash
# Build the server first
cmake --build build --target llama-server

# Navigate to server tests and run
cd tools/server/tests
source ../../../.venv/bin/activate
./tests.sh
```
**Server test dependencies**: The `.venv` environment includes the required dependencies for server unit tests (pytest, aiohttp, etc.). Tests can be run individually or with various options as documented in `tools/server/tests/README.md` (see the sketch below).
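
As a sketch of running a single test file or test case through pytest (the file and test names below are placeholders; `tools/server/tests/README.md` documents the real options):

```bash
# From tools/server/tests, with .venv activated
# (file and test names are placeholders)
pytest unit/test_basic.py -v
pytest unit/test_basic.py::test_server_starts -v
```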

### Test Categories
- Tokenizer tests: Various model tokenizers (BERT, GPT-2, LLaMA, etc.)
- Grammar tests: GBNF parsing and validation
- Backend tests: Core ggml operations across different backends
- Integration tests: End-to-end workflows (a single category can be selected as shown below)
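
To run just one category, ctest's name filter can be used; the regex below is illustrative and should be checked against `ctest -N` output:

```bash
# Run only tests whose names match a pattern (regex is illustrative)
ctest --test-dir build --output-on-failure -R "tokenizer"
```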

### Manual Testing Commands
```bash
# Test basic inference
./build/bin/llama-cli --version

# Test model loading (requires model file)
./build/bin/llama-cli -m path/to/model.gguf -p "Hello" -n 10
```

## Code Quality and Linting

### C++ Code Formatting
**ALWAYS format C++ code before committing:**
```bash
git clang-format
```

Configuration is in `.clang-format` with these key rules (a usage sketch follows the list):
- 4-space indentation
- 120 column limit
- Braces on same line for functions
- Pointer alignment: `void * ptr` (middle)
- Reference alignment: `int & ref` (middle)
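
Unlike plain `clang-format`, `git clang-format` reformats only the lines your changes touch. A typical flow, as a sketch (the `master` base is an assumption about where the branch forked):

```bash
# Format the changed lines relative to HEAD
git clang-format

# Or format everything changed since branching from master (assumed base)
git clang-format master

# Inspect what the formatter changed before committing
git diff
```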

### Python Code
**ALWAYS activate the Python environment in `.venv` and use tools from that environment:**
```bash
# Activate virtual environment
source .venv/bin/activate
```

Configuration files (invoked as shown below):
- `.flake8`: flake8 settings (max-line-length=125, excludes examples/tools)
- `pyrightconfig.json`: pyright type checking configuration
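
With the environment active, both linters can be run directly against changed files (the target path is a placeholder):

```bash
# Run flake8 and pyright from the virtual environment
# (the file path below is a placeholder)
flake8 path/to/changed_script.py
pyright path/to/changed_script.py
```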

### Pre-commit Hooks
Run before committing:
```bash
pre-commit run --all-files
```

## Continuous Integration

### GitHub Actions Workflows
Key workflows that run on every PR:
- `.github/workflows/build.yml`: Multi-platform builds
- `.github/workflows/server.yml`: Server functionality tests
- `.github/workflows/python-lint.yml`: Python code quality
- `.github/workflows/python-type-check.yml`: Python type checking

### Local CI Validation
**Run full CI locally before submitting PRs:**
```bash
mkdir tmp

# CPU-only build
bash ./ci/run.sh ./tmp/results ./tmp/mnt
```

**CI Runtime**: 30-60 minutes depending on backend configuration

### Triggering CI
Add `ggml-ci` to the commit message to trigger heavy CI workloads on the custom CI infrastructure.
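
For example (the commit message text is illustrative; only the `ggml-ci` token matters):

```bash
# The ggml-ci token in the message triggers the heavy workloads
git commit -m "ggml : fix CUDA build (ggml-ci)"
```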

## Project Layout and Architecture

### Core Directories
- **`src/`**: Main llama library implementation (`llama.cpp`, `llama-*.cpp`)
- **`include/`**: Public API headers, primarily `include/llama.h`
- **`ggml/`**: Core tensor library (vendored copy of the ggml framework)
- **`examples/`**: 30+ example applications and tools
- **`tools/`**: Additional development and utility tools (server benchmarks, tests)
- **`tests/`**: Comprehensive test suite with CTest integration
- **`docs/`**: Detailed documentation (build guides, API docs, etc.)
- **`scripts/`**: Utility scripts for CI, data processing, and automation
- **`common/`**: Shared utility code used across examples

### Key Files
- **`CMakeLists.txt`**: Primary build configuration
- **`include/llama.h`**: Main C API header (~2000 lines)
- **`src/llama.cpp`**: Core library implementation (~8000 lines)
- **`CONTRIBUTING.md`**: Coding guidelines and PR requirements
- **`.clang-format`**: C++ formatting rules
- **`.pre-commit-config.yaml`**: Git hook configuration

### Built Executables (in `build/bin/`)
Primary tools:
- **`llama-cli`**: Main inference tool
- **`llama-server`**: OpenAI-compatible HTTP server
- **`llama-quantize`**: Model quantization utility (see the usage sketch below)
- **`llama-perplexity`**: Model evaluation tool
- **`llama-bench`**: Performance benchmarking
- **`llama-convert-llama2c-to-ggml`**: Model conversion utilities
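
As a sketch of typical `llama-quantize` usage (the file names are placeholders; `Q4_K_M` is one commonly used quantization type):

```bash
# Quantize a higher-precision GGUF model down to 4-bit
# (input/output file names are placeholders)
./build/bin/llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M
```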

### Configuration Files
- **CMake**: `CMakeLists.txt`, `cmake/` directory
- **Linting**: `.clang-format`, `.clang-tidy`, `.flake8`
- **CI**: `.github/workflows/`, `ci/run.sh`
- **Git**: `.gitignore` (includes build artifacts, models, cache)

### Dependencies
- **System**: OpenMP, libcurl (for model downloading)
- **Optional**: CUDA SDK, Metal framework, Vulkan SDK, Intel oneAPI
- **Bundled**: httplib, json (vendored header-only libraries)

## Common Validation Steps

### After Making Changes
1. **Format code**: `git clang-format`
2. **Build**: `cmake --build build --config Release`
3. **Test**: `ctest --test-dir build --output-on-failure`
4. **Server tests** (if modifying server): `cd tools/server/tests && source ../../../.venv/bin/activate && ./tests.sh`
5. **Manual validation**: Test relevant tools in `build/bin/`

### Performance Validation
```bash
# Benchmark inference performance
./build/bin/llama-bench -m model.gguf

# Evaluate model perplexity
./build/bin/llama-perplexity -m model.gguf -f dataset.txt
```

### Backend Validation
```bash
# Test backend operations
./build/bin/test-backend-ops
```

## Environment Setup

### Required Tools
- CMake 3.14+ (install via system package manager)
- Modern C++ compiler with C++17 support
- Git (for submodule management)
- Python 3.9+ with virtual environment (`.venv` is provided; version checks are sketched below)
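
A quick way to confirm the toolchain meets these minimums (a simple sketch; substitute `clang++` for `g++` as appropriate):

```bash
cmake --version     # expect 3.14 or newer
g++ --version       # any compiler with C++17 support works
python3 --version   # expect 3.9 or newer
git --version
```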

### Optional but Recommended
- ccache: `apt install ccache` or `brew install ccache`
- clang-format 15+: Usually included with LLVM/Clang installation
- pre-commit: `pip install pre-commit`

### Backend-Specific Requirements
- **CUDA**: NVIDIA CUDA Toolkit 11.2+
- **Metal**: Xcode command line tools (macOS only)
- **Vulkan**: Vulkan SDK
- **SYCL**: Intel oneAPI toolkit

## Important Guidelines

### Code Changes
- **Minimal dependencies**: Avoid adding new external dependencies
- **Cross-platform compatibility**: Test on Linux, macOS, Windows when possible
- **Performance focus**: This is a performance-critical inference library
- **API stability**: Changes to `include/llama.h` require careful consideration

### Git Workflow
- Always create feature branches from `master` (see the sketch below)
- **Never** commit build artifacts (`build/`, `.ccache/`, `*.o`, `*.gguf`)
- Use descriptive commit messages following project conventions
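
A minimal sketch of that branching flow (the branch name is a placeholder):

```bash
git checkout master
git pull origin master
git checkout -b my-feature-branch   # branch name is a placeholder
```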

### Trust These Instructions
Only search for additional information if these instructions are incomplete or found to be incorrect. This document contains validated build and test procedures that work reliably across different environments.

.github/workflows/copilot-setup-steps.yml

Lines changed: 5 additions & 1 deletion

@@ -39,6 +39,10 @@ jobs:
        run: |
          sudo apt-get update
          sudo apt-get install build-essential libcurl4-openssl-dev
+         # Install git-clang-format script for formatting only changed code
+         wget -O /tmp/git-clang-format https://raw.githubusercontent.com/llvm/llvm-project/release/18.x/clang/tools/clang-format/git-clang-format
+         sudo cp /tmp/git-clang-format /usr/local/bin/git-clang-format
+         sudo chmod +x /usr/local/bin/git-clang-format

      - name: Set up Python
        uses: actions/setup-python@v5

@@ -50,4 +54,4 @@ jobs:
          python3 -m venv .venv
          .venv/bin/activate
          pip install -r requirements/requirements-all.txt -r tools/server/tests/requirements.txt
-         pip install flake8 pyright
+         pip install flake8 pyright pre-commit

.gitignore

Lines changed: 1 addition & 0 deletions

@@ -147,3 +147,4 @@ poetry.toml
  # Local scripts
  /run-vim.sh
  /run-chat.sh
+ .ccache/
