Commit 9d32c3a

Merge pull request #22 from kpouget/rebase-b7356
Rebase on top of b7356

2 parents e39502e + e55205d

159 files changed: +28072 -5208 lines changed

.github/workflows/build.yml

Lines changed: 3 additions & 3 deletions

```diff
@@ -243,7 +243,7 @@ jobs:
           echo "Fetch llama2c model"
           wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K/stories260K.bin
           ./bin/llama-convert-llama2c-to-ggml --copy-vocab-from-model ./tok512.bin --llama2c-model stories260K.bin --llama2c-output-model stories260K.gguf
-          ./bin/llama-cli -m stories260K.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256
+          ./bin/llama-completion -m stories260K.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256
 
       - name: Test llama2c (s390x)
         id: llama2c_test_s390x
@@ -252,7 +252,7 @@ jobs:
           cd build
           echo "Fetch llama2c big-endian model"
           wget https://huggingface.co/ggml-org/models/resolve/main/tinyllamas/stories260K-be.gguf
-          ./bin/llama-cli -m stories260K-be.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256
+          ./bin/llama-completion -m stories260K-be.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256
 
   ubuntu-latest-cmake-sanitizer:
     runs-on: ubuntu-latest
@@ -1770,7 +1770,7 @@ jobs:
           echo "Fetch llama2c model"
           wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K/stories260K.bin
           ./bin/llama-convert-llama2c-to-ggml --copy-vocab-from-model ./tok512.bin --llama2c-model stories260K.bin --llama2c-output-model stories260K.gguf
-          ./bin/llama-cli -m stories260K.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256
+          ./bin/llama-completion -m stories260K.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256
 
   ubuntu-cmake-sanitizer-riscv64-native:
     runs-on: RISCV64
```

CMakePresets.json

Lines changed: 2 additions & 0 deletions

```diff
@@ -30,6 +30,8 @@
     { "name": "static", "hidden": true, "cacheVariables": { "GGML_STATIC": "ON" } },
     { "name": "sycl_f16", "hidden": true, "cacheVariables": { "GGML_SYCL_F16": "ON" } },
     { "name": "vulkan", "hidden": true, "cacheVariables": { "GGML_VULKAN": "ON" } },
+    { "name": "remoting_frontend", "hidden": true, "cacheVariables": { "GGML_REMOTING_FRONTEND": "ON" } },
+    { "name": "remoting_backend", "hidden": true, "cacheVariables": { "GGML_REMOTING_BACKEND": "ON" } },
 
     {
         "name": "x64-windows-llvm", "hidden": true,
```
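Both new entries are marked `"hidden": true`, so they are building-block presets meant to be composed into a concrete configure preset rather than invoked directly. The equivalent direct configuration is a one-flag CMake invocation; this is a sketch, with the out-of-tree build directory names borrowed from the build scripts elsewhere in this commit:

```shell
# Enable the remoting frontend via the cache variable the preset declares
# (the build directory names are assumptions, not mandated by the preset):
cmake -B ../build.remoting-frontend -DGGML_REMOTING_FRONTEND=ON

# ... or the backend side:
cmake -B ../build.remoting-backend -DGGML_REMOTING_BACKEND=ON
```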

CONTRIBUTING.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -15,6 +15,7 @@ The project differentiates between 3 levels of contributors:
 - If you modified the `ggml` source, run the `test-backend-ops` tool to check whether different backend implementations of the `ggml` operators produce consistent results (this requires access to at least two different `ggml` backends)
 - If you modified a `ggml` operator or added a new one, add the corresponding test cases to `test-backend-ops`
 - Create separate PRs for each feature or fix. Avoid combining unrelated changes in a single PR
+- When adding support for a new model or feature, focus on **CPU support only** in the initial PR unless you have a good reason not to. Add support for other backends like CUDA in follow-up PRs
 - Consider allowing write access to your branch for faster reviews, as reviewers can push commits directly
 - If your PR becomes stale, rebase it on top of latest `master` to get maintainers attention
 - Maintainers will rely on your insights and approval when making a final decision to approve and merge a PR
```

OWNERS

Lines changed: 13 additions & 0 deletions

```diff
@@ -0,0 +1,13 @@
+approvers:
+  - kpouget
+  - cfergeau
+  - praveenkumar
+  - vyasgun
+  - gbraad
+options: {}
+reviewers:
+  - kpouget
+  - cfergeau
+  - praveenkumar
+  - vyasgun
+  - gbraad
```

README.md

Lines changed: 0 additions & 13 deletions

````diff
@@ -347,19 +347,6 @@ To learn more about model quantization, [read this documentation](tools/quantize
 
   </details>
 
-- <details>
-    <summary>Run simple text completion</summary>
-
-    To disable conversation mode explicitly, use `-no-cnv`
-
-    ```bash
-    llama-cli -m model.gguf -p "I believe the meaning of life is" -n 128 -no-cnv
-
-    # I believe the meaning of life is to find your own truth and to live in accordance with it. For me, this means being true to myself and following my passions, even if they don't align with societal expectations. I think that's what I love about yoga – it's not just a physical practice, but a spiritual one too. It's about connecting with yourself, listening to your inner voice, and honoring your own unique journey.
-    ```
-
-  </details>
-
 - <details>
     <summary>Constrain the output with a custom grammar</summary>
````

build.backend.sh

Lines changed: 37 additions & 0 deletions

```diff
@@ -0,0 +1,37 @@
+# force isatty-->true, so that $0 |& head -50 has colors ...
+rm -f READY_backend FAILED_backend
+
+echo "int isatty(int fd) { return 1; }" | gcc -O2 -fpic -shared -ldl -o /tmp/isatty.so -xc -
+export LD_PRELOAD=/tmp/isatty.so
+
+if [[ "${PERF_MODE:-}" ]]; then
+    FLAVOR="-prod"
+else
+    FLAVOR=""
+fi
+
+export SDKROOT=$(xcrun --sdk macosx --show-sdk-path)
+
+if [[ "$FLAVOR" == "-prod" ]]; then
+    cat <<EOF
+###
+### Building the prod flavor
+###
+EOF
+fi
+
+TARGETS="llama-run"
+if [[ "${BENCH_MODE:-}" == "bench" ]]; then
+    TARGETS="$TARGETS llama-bench"
+elif [[ "${BENCH_MODE:-}" == "perf" ]]; then
+    TARGETS="$TARGETS test-backend-ops"
+fi
+
+cmake --build ../build.remoting-backend$FLAVOR --target $TARGETS "$@" --parallel 8
+
+if [[ $? == 0 ]]; then
+    touch READY_backend
+else
+    touch FAILED_backend
+    exit 1
+fi
```

build.linux.sh

Lines changed: 10 additions & 0 deletions

```diff
@@ -0,0 +1,10 @@
+rm -f READY FAILED
+
+cmake --build ../build.vulkan-linux --parallel 8 --target llama-run llama-server
+
+if [[ $? == 0 ]]; then
+    touch READY
+else
+    touch FAILED
+    exit 1
+fi
```

build.remoting.sh

Lines changed: 26 additions & 0 deletions

Note: as committed, line 9 read `TARGETS="$BUILD_TARGET llama-run"`, but `BUILD_TARGET` is never set, which silently drops the `ggml-remotingfrontend` target assigned on line 7; it should append to `$TARGETS`:

```diff
@@ -0,0 +1,26 @@
+# force isatty-->true, so that $0 |& head -50 has colors ...
+rm -f READY FAILED
+
+echo "int isatty(int fd) { return 1; }" | gcc -O2 -fpic -shared -ldl -o /tmp/isatty.so -xc -
+export LD_PRELOAD=/tmp/isatty.so
+
+TARGETS="ggml-remotingfrontend"
+
+TARGETS="$TARGETS llama-run"
+set -x
+if [[ "${BENCH_MODE:-}" == "bench" ]]; then
+    TARGETS="$TARGETS llama-bench"
+elif [[ "${BENCH_MODE:-}" == "server" ]]; then
+    TARGETS="$TARGETS llama-server"
+elif [[ "${BENCH_MODE:-}" == "perf" ]]; then
+    TARGETS="$TARGETS test-backend-ops"
+fi
+
+cmake --build ../build.remoting-frontend$FLAVOR --parallel 8 --target $TARGETS "$@"
+
+if [[ $? == 0 ]]; then
+    touch READY
+else
+    touch FAILED
+    exit 1
+fi
```

build.sh

Lines changed: 1 addition & 0 deletions

```diff
@@ -0,0 +1 @@
+cmake --build ./build/ --parallel 8
```

build.vulkan.sh

Lines changed: 10 additions & 0 deletions

```diff
@@ -0,0 +1,10 @@
+rm -f READY FAILED
+
+cmake --build ../build.vulkan --parallel 8 --target llama-run
+
+if [[ $? == 0 ]]; then
+    touch READY
+else
+    touch FAILED
+    exit 1
+fi
```
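All four build scripts in this commit share the same sentinel protocol: delete the READY/FAILED marker files, run the build, then touch whichever marker matches the outcome. That pattern could be consolidated into a helper; this is a sketch with an invented function name (`run_build`) and demo paths, not part of the commit:

```shell
# run_build READY_FILE FAILED_FILE command [args...]
# Removes both sentinel files, runs the command, then touches the sentinel
# matching the outcome, mirroring the build scripts above.
run_build() {
    local ready="$1" failed="$2"
    shift 2
    rm -f "$ready" "$failed"
    if "$@"; then
        touch "$ready"
    else
        touch "$failed"
        return 1
    fi
}

# Usage sketch (mirroring build.vulkan.sh):
# run_build READY FAILED cmake --build ../build.vulkan --parallel 8 --target llama-run
run_build /tmp/demo_READY /tmp/demo_FAILED true
```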
