
Commit 672544e

Merge remote-tracking branch 'upstream/master'

2 parents: 7c26442 + 1d0125b

118 files changed: +4855 −4239 lines

.clang-tidy

Lines changed: 1 addition & 0 deletions

```diff
@@ -17,6 +17,7 @@ Checks: >
     clang-analyzer-*,
     -clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling,
     performance-*,
+    -performance-enum-size,
     portability-*,
     -portability-simd-intrinsics,
     misc-*,
```
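In a clang-tidy `Checks` list, globs enable whole groups and a leading `-` subtracts individual checks, so after this change the `performance-*` group runs without `performance-enum-size`. As a hedged sketch (not part of this commit), one way to exercise the configured checks locally, assuming clang-tidy and `run-clang-tidy` are installed and a CMake build exports `compile_commands.json`:

```sh
# Sketch: run the repository's .clang-tidy configuration locally (paths illustrative).
cmake -B build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
run-clang-tidy -p build   # picks up .clang-tidy from the source tree
```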

.github/workflows/build-riscv-native.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -6,7 +6,7 @@ on:
 
 jobs:
   debian-13-riscv64-native: # Bianbu 2.2
-    runs-on: self-hosted
+    runs-on: [self-hosted, RISCV64]
 
     steps:
       - name: Install prerequisites
```
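With multiple labels in `runs-on`, GitHub assigns the job only to a self-hosted runner registered with every listed label, so this job can no longer land on an arbitrary `self-hosted` machine. A hedged sketch of the matching rule (the job and runner here are illustrative, not part of this commit):

```yaml
# Sketch: self-hosted label matching.
# Assume a runner registered with the labels: self-hosted, Linux, RISCV64
jobs:
  example:
    # matches the runner above: every requested label is present
    runs-on: [self-hosted, RISCV64]
    # would NOT match it: the runner lacks an X64 label
    # runs-on: [self-hosted, X64]
    steps:
      - run: uname -m   # expected to print riscv64 on such a runner
```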

.github/workflows/build.yml

Lines changed: 190 additions & 0 deletions

```diff
@@ -1247,3 +1247,193 @@ jobs:
             -DGGML_CANN=on \
             -DSOC_TYPE=${{ matrix.device }}
           cmake --build build -j $(nproc)
+
+  # TODO: simplify the following workflows using a matrix
+  # TODO: run lighter CI on PRs and the full CI only on master (if needed)
+  ggml-ci-x64-cpu-low-perf:
+    runs-on: [self-hosted, Linux, X64, CPU, low-perf]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-arm64-cpu-low-perf:
+    runs-on: [self-hosted, Linux, ARM64, CPU, low-perf]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-cpu-high-perf:
+    runs-on: [self-hosted, Linux, X64, CPU, high-perf]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-arm64-cpu-high-perf:
+    runs-on: [self-hosted, Linux, ARM64, CPU, high-perf]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          GG_BUILD_NO_BF16=1 GG_BUILD_EXTRA_TESTS_0=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-nvidia-v100-cuda:
+    runs-on: [self-hosted, Linux, X64, NVIDIA, V100]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          nvidia-smi
+          GG_BUILD_CUDA=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-nvidia-v100-vulkan:
+    runs-on: [self-hosted, Linux, X64, NVIDIA, V100]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          vulkaninfo
+          GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-nvidia-t4-cuda:
+    runs-on: [self-hosted, Linux, X64, NVIDIA, T4]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          nvidia-smi
+          GG_BUILD_CUDA=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-nvidia-t4-vulkan:
+    runs-on: [self-hosted, Linux, X64, NVIDIA, T4]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          vulkaninfo
+          GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-nvidia-t4-vulkan-coopmat1:
+    runs-on: [self-hosted, Linux, X64, NVIDIA, T4]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          vulkaninfo
+          GG_BUILD_VULKAN=1 GGML_VK_DISABLE_COOPMAT2=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-cpu-amx:
+    runs-on: [self-hosted, Linux, X64, CPU, AMX]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-amd-v710-vulkan:
+    runs-on: [self-hosted, Linux, X64, AMD, V710]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-x64-amd-v710-rocm:
+    runs-on: [self-hosted, Linux, X64, AMD, V710]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          GG_BUILD_ROCM=1 GG_BUILD_AMDGPU_TARGETS="gfx1101" bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
+  ggml-ci-mac-metal:
+    runs-on: [self-hosted, macOS, ARM64]
+
+    steps:
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v4
+
+      - name: Test
+        id: ggml-ci
+        run: |
+          GG_BUILD_METAL=1 bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp
+
+  # TODO: install vulkan drivers
+  # ggml-ci-mac-vulkan:
+  #   runs-on: [self-hosted, macOS, ARM64]
+  #
+  #   steps:
+  #     - name: Clone
+  #       id: checkout
+  #       uses: actions/checkout@v4
+  #
+  #     - name: Test
+  #       id: ggml-ci
+  #       run: |
+  #         GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp
```
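The two TODO comments at the top of this hunk already note the obvious follow-up: the thirteen `ggml-ci-*` jobs differ only in their runner labels and environment variables. A hedged sketch of the matrix consolidation the TODO suggests (illustrative only, not part of this commit; the job name and the `labels`/`env` matrix keys are invented here):

```yaml
# Sketch only: one matrix job covering the repeated ggml-ci-* jobs above.
# The label sets and env prefixes are copied from the jobs in this commit.
ggml-ci:
  strategy:
    fail-fast: false   # one failing backend should not cancel the others
    matrix:
      include:
        - { labels: [self-hosted, Linux, X64,   CPU, low-perf], env: "" }
        - { labels: [self-hosted, Linux, ARM64, CPU, low-perf], env: "" }
        - { labels: [self-hosted, Linux, X64, NVIDIA, T4],      env: "GG_BUILD_CUDA=1" }
        - { labels: [self-hosted, macOS, ARM64],                env: "GG_BUILD_METAL=1" }
        # ... remaining runner/backend combinations
  runs-on: ${{ matrix.labels }}
  steps:
    - uses: actions/checkout@v4
    - name: Test
      run: |
        ${{ matrix.env }} bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
```

A matrix like this would also make the second TODO (lighter CI on PRs, full CI on master) a matter of filtering matrix entries rather than duplicating jobs.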

CODEOWNERS

Lines changed: 103 additions & 10 deletions

```diff
@@ -1,12 +1,105 @@
 # collaborators can optionally add themselves here to indicate their availability for reviewing related PRs
+# multiple collaborators per item can be specified
 
-/ci/ @ggerganov
-/.devops/*.Dockerfile @ngxson
-/tools/server/ @ngxson
-/ggml/src/ggml-cuda/fattn* @JohannesGaessler
-/ggml/src/ggml-cuda/mmq.* @JohannesGaessler
-/ggml/src/ggml-cuda/mmvq.* @JohannesGaessler
-/ggml/src/ggml-opt.cpp @JohannesGaessler
-/ggml/src/gguf.cpp @JohannesGaessler
-/ggml/src/ggml-vulkan/ @0cc4m
-/ggml/src/ggml-zdnn/ @taronaeo
+/.devops/*.Dockerfile @ngxson
+/.github/actions/ @slaren
+/.github/workflows/ @CISC
+/.github/workflows/release.yml @slaren
+/.github/workflows/winget.yml @slaren
+/ci/ @ggerganov
+/cmake/ @ggerganov
+/common/CMakeLists.txt @ggerganov
+/common/arg.* @ggerganov @ericcurtin
+/common/base64.hpp.* @ggerganov
+/common/build-info.* @ggerganov
+/common/common.* @ggerganov
+/common/console.* @ggerganov
+/common/llguidance.* @ggerganov
+/common/log.* @ggerganov
+/common/sampling.* @ggerganov
+/common/speculative.* @ggerganov
+/convert_*.py @CISC
+/examples/batched.swift/ @ggerganov
+/examples/batched/ @ggerganov
+/examples/convert-llama2c-to-ggml/ @ggerganov
+/examples/deprecation-warning/ @ggerganov
+/examples/diffusion/ @am17an
+/examples/embedding/ @ggerganov
+/examples/eval-callback/ @ggerganov
+/examples/export-docs/ @ggerganov
+/examples/gen-docs/ @ggerganov
+/examples/gguf/ @ggerganov
+/examples/llama.android/ @ggerganov
+/examples/llama.swiftui/ @ggerganov
+/examples/llama.vim @ggerganov
+/examples/lookahead/ @ggerganov
+/examples/lookup/ @JohannesGaessler
+/examples/parallel/ @ggerganov
+/examples/passkey/ @ggerganov
+/examples/retrieval/ @ggerganov
+/examples/save-load-state/ @ggerganov
+/examples/simple-chat/ @slaren
+/examples/simple/ @slaren
+/examples/speculative-simple/ @ggerganov
+/examples/speculative/ @ggerganov
+/ggml/cmake/ @ggerganov
+/ggml/include/ @ggerganov @slaren
+/ggml/src/ggml-alloc.c @slaren
+/ggml/src/ggml-backend* @slaren
+/ggml/src/ggml-blas/ @slaren
+/ggml/src/ggml-common.h @ggerganov @slaren
+/ggml/src/ggml-cpu/ @ggerganov @slaren
+/ggml/src/ggml-cuda/common.cuh @slaren
+/ggml/src/ggml-cuda/fattn* @JohannesGaessler
+/ggml/src/ggml-cuda/ggml-cuda.cu @slaren
+/ggml/src/ggml-cuda/mmf.* @JohannesGaessler
+/ggml/src/ggml-cuda/mmq.* @JohannesGaessler
+/ggml/src/ggml-cuda/mmvf.* @JohannesGaessler
+/ggml/src/ggml-cuda/mmvq.* @JohannesGaessler
+/ggml/src/ggml-impl.h @ggerganov @slaren
+/ggml/src/ggml-metal/ @ggerganov
+/ggml/src/ggml-opt.cpp @JohannesGaessler
+/ggml/src/ggml-quants.* @ggerganov
+/ggml/src/ggml-threading.* @ggerganov @slaren
+/ggml/src/ggml-vulkan/ @0cc4m
+/ggml/src/ggml-zdnn/ @taronaeo
+/ggml/src/ggml.c @ggerganov @slaren
+/ggml/src/ggml.cpp @ggerganov @slaren
+/ggml/src/gguf.cpp @JohannesGaessler @Green-Sky
+/gguf-py/ @CISC
+/media/ @ggerganov
+/scripts/gen* @ggerganov
+/scripts/get* @ggerganov
+/scripts/sync* @ggerganov
+/src/ @ggerganov
+/src/llama-adapter.* @CISC
+/src/llama-arch.* @CISC
+/src/llama-chat.* @ngxson
+/src/llama-graph.* @CISC
+/src/llama-model-loader.* @slaren
+/src/llama-model.* @CISC
+/src/llama-vocab.* @CISC
+/tests/ @ggerganov
+/tests/test-backend-ops.cpp @slaren
+/tests/test-thread-safety.cpp @slaren
+/tools/batched-bench/ @ggerganov
+/tools/llama-bench/ @slaren
+/tools/main/ @ggerganov
+/tools/mtmd/ @ngxson
+/tools/perplexity/ @ggerganov
+/tools/quantize/ @ggerganov
+/tools/run/ @ericcurtin
+/tools/server/* @ngxson @ggerganov @ericcurtin # no subdir
+/tools/server/webui/ @allozaur
+/tools/tokenize/ @ggerganov
+/tools/tts/ @ggerganov
+/vendor/ @ggerganov
+.clang-format @slaren
+.clang-tidy @slaren
+AUTHORS @ggerganov
+CMakeLists.txt @ggerganov
+CONTRIBUTING.md @ggerganov
+LICENSE @ggerganov
+README.md @ggerganov
+SECURITY.md @ggerganov
+requirements*.txt @CISC
```
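Two standard CODEOWNERS rules are doing quiet work in this file (this is documented GitHub behavior, not something the commit changes): the last matching pattern takes precedence, and a trailing `/*` matches files directly inside a directory but not its subdirectories, which is what the `# no subdir` comment on the `/tools/server/*` entry relies on. In excerpt form:

```
# Last match wins, and "/*" does not descend into subdirectories:
/tools/server/*      @ngxson @ggerganov @ericcurtin   # files directly in /tools/server/
/tools/server/webui/ @allozaur                        # the webui subdir has its own entry
```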

CONTRIBUTING.md

Lines changed: 29 additions & 5 deletions

````diff
@@ -1,4 +1,12 @@
-# Pull requests (for contributors)
+# Contributors
+
+The project differentiates between 3 levels of contributors:
+
+- Contributors: people who have contributed before (no special privileges)
+- Collaborators (Triage): people with significant contributions, who may be responsible for some parts of the code, and are expected to maintain and review contributions for the code they own
+- Maintainers: responsible for reviewing and merging PRs, after approval from the code owners
+
+# Pull requests (for contributors & collaborators)
 
 - llama.cpp uses the ggml tensor library for model evaluation. If you are unfamiliar with ggml, consider taking a look at the [examples in the ggml repository](https://github.com/ggml-org/ggml/tree/master/examples/). [simple](https://github.com/ggml-org/ggml/tree/master/examples/simple) shows the bare minimum for using ggml. [gpt-2](https://github.com/ggml-org/ggml/tree/master/examples/gpt-2) has minimal implementations for language model inference using GPT-2. [mnist](https://github.com/ggml-org/ggml/tree/master/examples/mnist) demonstrates how to train and evaluate a simple image classifier
 - Test your changes:
@@ -9,15 +17,16 @@
 - Create separate PRs for each feature or fix. Avoid combining unrelated changes in a single PR
 - Consider allowing write access to your branch for faster reviews, as reviewers can push commits directly
 - If your PR becomes stale, don't hesitate to ping the maintainers in the comments
+- Maintainers will rely on your insights and approval when making a final decision to approve and merge a PR
+- Consider adding yourself to [CODEOWNERS](CODEOWNERS) to indicate your availability for reviewing related PRs
 
-# Pull requests (for collaborators)
+# Pull requests (for maintainers)
 
 - Squash-merge PRs
 - Use the following format for the squashed commit title: `<module> : <commit title> (#<issue_number>)`. For example: `utils : fix typo in utils.py (#1234)`
 - Optionally pick a `<module>` from here: https://github.com/ggml-org/llama.cpp/wiki/Modules
-- Consider adding yourself to [CODEOWNERS](CODEOWNERS)
-- Let authors, who are also collaborators, merge their own PRs
-- When merging a PR by a contributor, make sure you have a good understanding of the changes
+- Let other maintainers merge their own PRs
+- When merging a PR, make sure you have a good understanding of the changes
 - Be mindful of maintenance: most of the work going into a feature happens after the PR is merged. If the PR author is not committed to contribute long-term, someone else needs to take responsibility (you)
 
 # Coding guidelines
@@ -117,6 +126,21 @@
 #endif // FOO
 ```
 
+# Code maintenance
+
+- Existing code should have designated collaborators and/or maintainers specified in the [CODEOWNERS](CODEOWNERS) file, responsible for:
+  - Reviewing and merging related PRs
+  - Fixing related bugs
+  - Providing developer guidance/support
+
+- When adding or modifying a large piece of code:
+  - If you are a collaborator, make sure to add yourself to [CODEOWNERS](CODEOWNERS) to indicate your availability for reviewing related PRs
+  - If you are a contributor, find an existing collaborator who is willing to review and maintain your code long-term
+  - Provide the necessary CI workflow (and hardware) to test your changes (see [ci/README.md](https://github.com/ggml-org/llama.cpp/tree/master/ci))
+
+- New code should follow the guidelines (coding, naming, etc.) outlined in this document. Exceptions are allowed in isolated, backend-specific parts of the code that do not interface directly with the `ggml` interfaces.
+  _(NOTE: for legacy reasons, existing code is not required to follow this guideline)_
+
 # Documentation
 
 - Documentation is a community effort
````
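The new "Code maintenance" section asks contributors to provide a CI workflow for their changes. As a hedged sketch of what that entails, the harness is driven entirely by `ci/run.sh`; the invocation below follows the pattern of the `ggml-ci-*` jobs added to build.yml in this commit, with placeholder directories, and the exact per-backend flags should be confirmed against [ci/README.md](https://github.com/ggml-org/llama.cpp/tree/master/ci):

```sh
# Sketch: run the ggml CI harness locally before wiring it into a workflow.
# The results and mount directories are placeholders.
mkdir -p tmp/results tmp/mnt
bash ./ci/run.sh ./tmp/results ./tmp/mnt

# Backend-specific runs select a build via GG_BUILD_* variables,
# exactly as the ggml-ci-* jobs above do:
GG_BUILD_CUDA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt
```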

README.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -520,8 +520,8 @@ To learn more about model quantization, [read this documentation](tools/quantize
 ## Contributing
 
 - Contributors can open PRs
-- Collaborators can push to branches in the `llama.cpp` repo and merge PRs into the `master` branch
 - Collaborators will be invited based on contributions
+- Maintainers can push to branches in the `llama.cpp` repo and merge PRs into the `master` branch
 - Any help with managing issues, PRs and projects is very appreciated!
 - See [good first issues](https://github.com/ggml-org/llama.cpp/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) for tasks suitable for first contributions
 - Read the [CONTRIBUTING.md](CONTRIBUTING.md) for more information
```
