# Pull requests (for contributors)
- Test your changes:
    - Execute [the full CI locally on your machine](ci/README.md) before publishing
    - Verify that the perplexity and the performance are not affected negatively by your changes (use `llama-perplexity` and `llama-bench`)
    - If you modified the `ggml` source, run the `test-backend-ops` tool to check whether different backend implementations of the `ggml` operators produce consistent results (this requires access to at least two different `ggml` backends)
    - If you modified a `ggml` operator or added a new one, add the corresponding test cases to `test-backend-ops`
- Consider allowing write access to your branch for faster reviews, as reviewers can push commits directly
- If your PR becomes stale, don't hesitate to ping the maintainers in the comments

# Coding guidelines

- Avoid adding third-party dependencies, extra files, extra headers, etc.
- Always consider cross-compatibility with other operating systems and architectures
- Avoid fancy-looking modern STL constructs, use basic `for` loops, avoid templates, keep it simple
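
    For instance, an illustrative contrast (`x` and `n` are made-up names for this sketch):

    ```cpp
    // preferred: a basic loop
    float sum = 0.0f;
    for (int64_t i = 0; i < n; i++) {
        sum += x[i];
    }

    // avoided: fancy STL constructs, e.g.
    // float sum = std::reduce(std::execution::par, x, x + n, 0.0f);
    ```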
- Vertical alignment makes things more readable and easier to batch edit
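
    For example (hypothetical fields, chosen only to illustrate the alignment):

    ```cpp
    int32_t n_vocab = 0;
    int32_t n_ctx   = 0;
    int32_t n_embd  = 0;
    ```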
- Clean up any trailing whitespace, use 4 spaces for indentation, brackets on the same line, `void * ptr`, `int & a`
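
    A minimal sketch of these conventions (a hypothetical function, not from the codebase):

    ```cpp
    void example(void * ptr, int & a) {
        if (ptr == nullptr) {
            a = 0;
        }
    }
    ```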
- Use sized integer types such as `int32_t` in the public API; `size_t` may also be appropriate for allocation sizes or byte offsets
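
    For example (a hypothetical API, used only to illustrate the type choices):

    ```cpp
    // sized type for a count exposed in the public API
    int32_t llama_foo_n_items(const struct llama_foo * foo);

    // size_t for an allocation size in bytes
    size_t llama_foo_buf_size(const struct llama_foo * foo);
    ```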
- Declare structs with `struct foo {}` instead of `typedef struct foo {} foo`
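
    For example (the member is illustrative):

    ```cpp
    // OK
    struct foo {
        int32_t x;
    };

    // not OK
    typedef struct foo {
        int32_t x;
    } foo;
    ```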
- In C++ code omit optional `struct` and `enum` keyword whenever they are not necessary
    ```cpp
    // OK
    llama_context * ctx;
    const llama_rope_type rope_type;

    // not OK
    struct llama_context * ctx;
    const enum llama_rope_type rope_type;
    ```

    _(NOTE: this guideline is yet to be applied to the `llama.cpp` codebase. New code should follow this guideline.)_
- Try to follow the existing patterns in the code (indentation, spaces, etc.). In case of doubt use `clang-format` to format the added code
- For anything not covered in the current guidelines, refer to the [C++ Core Guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines)
- Tensors store data in row-major order. We refer to dimension 0 as columns, 1 as rows, 2 as matrices
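
    For example, a sketch using the `ggml` API (`ctx` is assumed to be an initialized `ggml_context`):

    ```cpp
    // 2D tensor with 4 columns (dimension 0) and 3 rows (dimension 1)
    struct ggml_tensor * t = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 3);
    // t->ne[0] == 4 (columns), t->ne[1] == 3 (rows)
    ```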
- Matrix multiplication is unconventional: [`C = ggml_mul_mat(ctx, A, B)`](https://github.com/ggerganov/llama.cpp/blob/880e352277fc017df4d5794f0c21c44e1eae2b84/ggml.h#L1058-L1064) means $C^T = A B^T \Leftrightarrow C = B A^T.$
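
    A sketch of the shapes involved (`ctx`, `K`, `M` and `N` are assumptions for this illustration):

    ```cpp
    // A: M rows of length K, B: N rows of length K (ne[0] is the column count)
    struct ggml_tensor * A = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, K, M);
    struct ggml_tensor * B = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, K, N);

    // C = B A^T, stored with ne[0] == M (columns) and ne[1] == N (rows)
    struct ggml_tensor * C = ggml_mul_mat(ctx, A, B);
    ```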

# Naming guidelines

- Use `snake_case` for function, variable and type names
- Naming usually optimizes for longest common prefix (see https://github.com/ggerganov/ggml/pull/302#discussion_r1243240963)

    ```cpp
    // not OK
    int small_number;
    int big_number;

    // OK
    int number_small;
    int number_big;
    ```
- Enum values are always in upper case and prefixed with the enum name

    ```cpp
    enum llama_vocab_type {
        LLAMA_VOCAB_TYPE_NONE = 0,
        LLAMA_VOCAB_TYPE_SPM  = 1,
        LLAMA_VOCAB_TYPE_BPE  = 2,
        LLAMA_VOCAB_TYPE_WPM  = 3,
        LLAMA_VOCAB_TYPE_UGM  = 4,
        LLAMA_VOCAB_TYPE_RWKV = 5,
    };
    ```
- The general naming pattern is `<class>_<method>`, with `<method>` being `<action>_<noun>`
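
    For example (a hypothetical function, used only to illustrate the pattern):

    ```cpp
    // class: "llama_foo", method: "get_size" (<action> = get, <noun> = size)
    int32_t llama_foo_get_size(const struct llama_foo * foo);
    ```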
    _(NOTE: this guideline is yet to be applied to the `llama.cpp` codebase. New code should follow this guideline)_
- C/C++ filenames are all lowercase with dashes. Headers use the `.h` extension. Source files use the `.c` or `.cpp` extension
- Python filenames are all lowercase with underscores
- _(TODO: abbreviations usage)_

# Preprocessor directives

- _(TODO: add guidelines with examples and apply them to the codebase)_

    ```cpp
    #ifdef FOO
    #endif // FOO
    ```

# Documentation

- Documentation is a community effort
- When you need to look into the source code to figure out how to use an API consider adding a short summary to the header file for future reference
- When you notice incorrect or outdated documentation, please update it

# Resources

The Github issues, PRs and discussions contain a lot of information that can be useful to get familiar with the codebase. For convenience, some of the more important information is referenced from Github projects:
"Hugging Face model repository; quant is optional, case-insensitive, default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.\n"
* Allow getting the HF file from the HF repo with tag (like ollama), for example:
1464
+
* - bartowski/Llama-3.2-3B-Instruct-GGUF:q4
1465
+
* - bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_M
1466
+
* - bartowski/Llama-3.2-3B-Instruct-GGUF:q5_k_s
1467
+
* Tag is optional, default to "latest" (meaning it checks for Q4_K_M first, then Q4, then if not found, return the first GGUF file in repo)
1468
+
*
1469
+
* Return pair of <repo, file> (with "repo" already having tag removed)
1470
+
*
1471
+
* Note: we use the Ollama-compatible HF API, but not using the blobId. Instead, we use the special "ggufFile" field which returns the value for "hf_file". This is done to be backward-compatible with existing cache files.
0 commit comments