Merged — changes from all commits (182 commits)
ea1431b
docs : add "Quick start" section for new users (#13862)
ngxson Jun 3, 2025
7e00e60
vulkan: fix warnings in perf logger querypool code (#13937)
jeffbolznv Jun 3, 2025
e0e806f
kv-cache : fix unified::seq_rm to work with seq_id < 0 (#13985)
ggerganov Jun 4, 2025
0b4be4c
CUDA: fix FTZ in FA for Gemma 3 (#13991)
JohannesGaessler Jun 4, 2025
3ac6753
llama-graph : use ggml_repeat_4d (#13998)
ngxson Jun 4, 2025
4825487
releases : use dl backend for linux release, remove arm64 linux relea…
slaren Jun 4, 2025
2589ad3
ci : remove cuda 11.7 releases, switch runner to windows 2022 (#13997)
slaren Jun 4, 2025
3e63a58
kv-cache : refactor the update/defrag mechanism (#13988)
ggerganov Jun 4, 2025
0d39844
ggml-vulkan: adds support for op CONV_TRANSPOSE_1D (#13813)
etasnadi Jun 4, 2025
5a8ae30
vulkan: automatically deduce size of push constants (#13936)
jeffbolznv Jun 5, 2025
9e31bec
context : fix pos_min initialization upon error decode (#14008)
ggerganov Jun 5, 2025
9f47fa5
vocab : warn about missing mask token (#14022)
CISC Jun 5, 2025
d01d112
readme : add badge (#13938)
Olexandr88 Jun 5, 2025
3a07714
llama : allow using mmap without PrefetchVirtualMemory, apply GGML_WI…
slaren Jun 5, 2025
7f37b6c
memory : migrate from llama_kv_cache to more generic llama_memory (#1…
ggerganov Jun 5, 2025
146b88e
ci: fix CUDA build failure on autodl cloud machines (#14005)
pockers21 Jun 5, 2025
669c13e
vulkan: Enable VK_KHR_cooperative_matrix extension for Intel Xe2 GPUs…
rillomas Jun 5, 2025
1caae7f
gguf-py : add add_classifier_output_labels method to writer (#14031)
CISC Jun 5, 2025
d17a809
llama : support multiple classifier outputs and labels (#13940)
CISC Jun 6, 2025
487a5e0
context : fix SWA-related warning for multiple sequences (#14045)
ggerganov Jun 6, 2025
745aa53
llama : deprecate llama_kv_self_ API (#14030)
ggerganov Jun 6, 2025
0974ad7
llama : fix llama_model_chat_template with template name (LLM_KV with…
CISC Jun 7, 2025
228f34c
SYCL: Implement few same quantized type copy kernels (#13739)
qnixsynapse Jun 7, 2025
5787b5d
ci: add LoongArch cross-compile build (#13944)
wojiushixiaobai Jun 7, 2025
247e5c6
cuda : fix buffer type check with integrated GPUs (#14069)
slaren Jun 8, 2025
056eb74
CANN: Enable labeler for Ascend NPU (#13914)
shink Jun 9, 2025
91a8ee6
add geglu activation function (#14074)
huydt84 Jun 9, 2025
b460d16
sycl: Add reorder to Q6_K mmvq implementation (#13885)
s-Nick Jun 9, 2025
87d34b3
server : fix LRU check (#14079)
ggerganov Jun 9, 2025
dc0623f
webui: fix sidebar being covered by main content (#14082)
yeahdongcn Jun 9, 2025
e21d2d4
CANN: Simplify the environment variable setting (#13104)
bachelor-dou Jun 9, 2025
201b31d
graph : fix geglu (#14077)
ggerganov Jun 9, 2025
8f47e25
cuda : fix device sync on buffer clear (#14033)
slaren Jun 9, 2025
f470bc3
ggml-cpu : split arch-specific implementations (#13892)
xctan Jun 9, 2025
7f4fbe5
llama : allow building all tests on windows when not using shared lib…
slaren Jun 9, 2025
40cbf57
kv-cache : fix shift and defrag logic (#14081)
ggerganov Jun 9, 2025
1f63e75
metal : use less stack memory in FA kernel (#14088)
ggerganov Jun 9, 2025
1a3b5e8
Add in-build ggml::ggml ALIAS library (ggml/1260)
dg0yt Jun 3, 2025
b8e2194
sync : ggml
ggerganov Jun 10, 2025
2bb0467
rpc : nicer error messages for RPC server crash (#14076)
isaac-mcfadyen Jun 10, 2025
97340b4
Vulkan: Don't default to CPU device (like llvmpipe), even if no other…
0cc4m Jun 10, 2025
b7ce1ad
ggml : fix weak alias win32 (whisper/0)
ggerganov Jun 10, 2025
ae92c18
sync : ggml
ggerganov Jun 10, 2025
3a12db2
Fixed spec timings to: accepted/tested instead of accepted/drafted (#…
jukofyork Jun 10, 2025
652b70e
vulkan: force device 0 in CI (#14106)
jeffbolznv Jun 10, 2025
3678b83
llama : support GEGLU for jina-bert-v2 (#14090)
CISC Jun 10, 2025
55f6b9f
convert : fix duplicate key DeepSeek-R1 conversion error (#14103)
CISC Jun 10, 2025
dad5c44
kv-cache : avoid modifying recurrent cells when setting inputs (#13834)
compilade Jun 10, 2025
4c763c8
opencl: add `mul_mv_id_q4_0_f32_8x_flat` (#14003)
lhez Jun 10, 2025
1f7d50b
vulkan: Track descriptor pools/sets per-context (#14109)
jeffbolznv Jun 11, 2025
7ae2932
kv-cache : add LLAMA_KV_CACHE_DEBUG environment variable (#14121)
ggerganov Jun 11, 2025
2baf077
server : pass default --keep argument (#14120)
MightyAlex200 Jun 11, 2025
89a184f
kv-cache : relax SWA masking condition (#14119)
ggerganov Jun 11, 2025
7781e5f
webui: Wrap long numbers instead of infinite horizontal scroll (#14062)
am17an Jun 11, 2025
bd248d4
vulkan: Better thread-safety for command pools/buffers (#14116)
jeffbolznv Jun 11, 2025
cc66a7f
tests : add test-tokenizers-repo (#14017)
CISC Jun 11, 2025
d4e0d95
chore : clean up relative source dir paths (#14128)
CISC Jun 11, 2025
532802f
Implement GGML_CPU_ALL_VARIANTS for ARM (#14080)
ckastner Jun 11, 2025
2e89f76
common: fix issue with regex_escape routine on windows (#14133)
bandoti Jun 11, 2025
a20b2b0
context : round n_tokens to next multiple of n_seqs when reserving (#…
compilade Jun 12, 2025
9596506
kv-cache : fix split_equal handling in unified implementation (#14130)
ggerganov Jun 12, 2025
e2c0b6e
cmake : handle whitespaces in path during metal build (#14126)
ggerganov Jun 12, 2025
c3ee46f
batch : remove logits_all flag (#14141)
ggerganov Jun 12, 2025
f6e1a7a
context : simplify output counting logic during decode (#14142)
ggerganov Jun 12, 2025
7d51644
server : re-enable SWA speculative decoding (#14131)
ggerganov Jun 12, 2025
a681b4b
readme : remove project status link (#14149)
ggerganov Jun 12, 2025
ed52f36
sycl: Remove not needed copy f16->f32 for dnnl mul mat (#14125)
ShanoToni Jun 12, 2025
c33fe8b
vocab : prevent heap overflow when vocab is too small (#14145)
ggerganov Jun 13, 2025
09cf2c7
cmake : Improve build-info.cpp generation (#14156)
ckastner Jun 13, 2025
c61285e
SYCL: Bump oneMath commit (#14152)
Jun 13, 2025
0889eba
sycl: Adding additional cpy dbg print output (#14034)
ShanoToni Jun 13, 2025
ffad043
server : fix SWA condition for full context reprocess (#14163)
ggerganov Jun 13, 2025
d714dad
pooling : make cls_b and cls_out_b optional (#14165)
huydt84 Jun 13, 2025
cc8d081
cmake: Add ability to pass in LLAMA_BUILD_NUMBER/COMMIT (#14167)
ckastner Jun 13, 2025
b7cc774
readme : remove survey link (#14168)
ggerganov Jun 13, 2025
60c6663
batch : rework llama_batch_allocr (#14153)
ggerganov Jun 13, 2025
26ff368
docs : Update multimodal.md (#14122)
ddpasa Jun 13, 2025
80709b7
batch : add LLAMA_BATCH_DEBUG environment variable (#14172)
ggerganov Jun 13, 2025
3cfbbdb
Merge commit from fork
GuyGoldenberg Jun 13, 2025
40643ed
sycl: fix docker image (#14144)
sgeor255 Jun 13, 2025
fb85a28
vocab : fix build (#14175)
ggerganov Jun 13, 2025
2e42be4
compare-llama-bench: add option to plot (#14169)
am17an Jun 14, 2025
3cb203c
llama-chat : Do not throw when tool parsing fails (#14012)
p1-0tr Jun 14, 2025
00ba772
docs : remove WIP since PR has been merged (#13912)
pepijndevos Jun 15, 2025
b9912ac
batch : auto-gen positions + verify multi-sequence input (#14177)
ggerganov Jun 15, 2025
c311ac6
cparams : rename LLAMA_MAX_PARALLEL_SEQUENCES to LLAMA_MAX_SEQ (#14188)
ggerganov Jun 15, 2025
9ae4143
model : add dots.llm1 architecture support (#14044) (#14118)
Noeda Jun 15, 2025
5fce5f9
kv-cache : fix use-after-move of defrag info (#14189)
ggerganov Jun 15, 2025
2c2caa4
HIP: Replace usage of deprecated preprocessor macro __AMDGCN_WAVEFRON…
IMbackK Jun 15, 2025
e54b394
CUDA/HIP: fix ssm_scan on devices where warp size is not 32 (#14196)
IMbackK Jun 15, 2025
30e5b01
quantize : change int to unsigned int for KV overrides (#14197)
EAddario Jun 15, 2025
cd355ed
server : When listening on a unix domain socket don't print http:// a…
ericcurtin Jun 15, 2025
d7da8dc
model : Add support for Arcee AI's upcoming AFM model (#14185)
bartowski1182 Jun 15, 2025
3555b30
ggml-cpu : rework weak alias on apple targets (#14146)
xctan Jun 16, 2025
c89c2d1
vulkan: mutex around vkQueueSubmit (#14127)
jeffbolznv Jun 16, 2025
4ad2436
gguf-py : allow key override when adding value to GGUFWriter (#14194)
huydt84 Jun 16, 2025
0bf49eb
convert : remove arcee change in convert_hf_to_gguf_update.py (#14207)
bartowski1182 Jun 16, 2025
3ba0d84
ggml: Add Android support for GGML_CPU_ALL_VARIANTS (#14206)
chaxu01 Jun 16, 2025
d3e64b9
llama : rework embeddings logic (#14208)
ggerganov Jun 16, 2025
7d6d91b
HIP: disable rocwmma on gfx12 by default until rocm 7.0 (#14202)
IMbackK Jun 16, 2025
ad590be
model : add NeoBERT (#14164)
huydt84 Jun 16, 2025
0dbcabd
cmake: clean up external project logic for vulkan-shaders-gen (#14179)
bandoti Jun 16, 2025
6adc3c3
llama : add thread safety test (#14035)
slaren Jun 16, 2025
89fea80
server : fix incorrect usage of llama_get_embeddings() (#14225)
ggerganov Jun 16, 2025
e434e69
common : suggest --jinja when autodetection fails (#14222)
CISC Jun 16, 2025
fe9d60e
musa: fix build warning (unused variable) (#14231)
yeahdongcn Jun 17, 2025
860a9e4
ggml-cpu : remove the weak alias trick (#14221)
xctan Jun 17, 2025
c465030
cmake: remove shader-gen step-targets from ggml-vulkan (#14226)
bandoti Jun 17, 2025
c2056ed
examples : include examples in msvc disable warn (ggml/1270)
danbev Jun 12, 2025
bbe98d2
ggml : remove unused ggml_context_container (ggml/1272)
danbev Jun 13, 2025
dd8e59f
ggml : disable warnings for tests when using MSVC (ggml/1273)
danbev Jun 13, 2025
d03172c
sync : ggml
ggerganov Jun 18, 2025
3865cff
convert : fix null head_dim AutoConfig regression (#14248)
CISC Jun 18, 2025
9540255
llama-chat : fix multiple system message for gemma, orion (#14246)
ngxson Jun 18, 2025
413977d
mtmd : refactor llava-uhd preprocessing logic (#14247)
ngxson Jun 18, 2025
ef03580
ggml: Add Apple support for GGML_CPU_ALL_VARIANTS (#14258)
chaxu01 Jun 18, 2025
6231c5c
ggml-cpu: fix uncaught underscore terminators (#14023)
taronaeo Jun 18, 2025
50d2227
ggml-cpu: reduce asm calls for hsum (#14037)
taronaeo Jun 18, 2025
8d94713
docs: add s390x build documentation (#14264)
taronaeo Jun 18, 2025
ed3290a
metal : add mean kernel (#14267)
ggerganov Jun 19, 2025
edc4a29
memory : Hybrid recurrent cache (#13979)
gabe-l-hart Jun 19, 2025
10bb545
Vulkan: Set device max size for host memory to avoid OOM warning and …
0cc4m Jun 19, 2025
faed5a5
llamafile : support s390x SIMD instruction set (#14273)
taronaeo Jun 19, 2025
5fc7856
convert : fix remote option in Windows (#14100)
pqnet Jun 19, 2025
fffcce5
llama-bench : add --no-warmup flag (#14224) (#14270)
s2010 Jun 19, 2025
600e3e9
sycl: Cleanup codepaths in Get Rows in sycl backend (#14215)
ShanoToni Jun 19, 2025
456af35
build : suppress gcc15 compile warnings (#14261)
fanyang89 Jun 19, 2025
d67341d
server : add server parameters for draft model cache type (#13782)
aa956 Jun 19, 2025
381174b
gguf-py : make sentencepiece optional (#14200)
Ahajha Jun 19, 2025
8f71d0f
ggml-cpu : remove unnecessary arm feature detection (#14281)
slaren Jun 19, 2025
9eaa51e
CUDA: add conv_2d_dw (#14265)
am17an Jun 20, 2025
4c9fdfb
ubatch : new splitting logic (#14217)
ggerganov Jun 20, 2025
812939a
model : more uniform output id handling (#14275)
ggerganov Jun 20, 2025
9230dbe
ggml: Update KleidiAI to v1.9.0 (#14277)
chaxu01 Jun 20, 2025
d27b3ca
ggml : fix repack work size for mul_mat_id (#14292)
ggerganov Jun 20, 2025
e28c1b9
cuda : synchronize graph capture and cublas handle destruction (#14288)
slaren Jun 20, 2025
88fc854
llama : improve sep token handling (#14272)
CISC Jun 20, 2025
6369be0
Implement GGML_CPU_ALL_VARIANTS for PowerPC (#14286)
ckastner Jun 20, 2025
8308f98
sycl: add usage of enqueue_functions extension (#14244)
s-Nick Jun 20, 2025
dd6e6d0
vocab : prevent tokenizer overflow (#14301)
retr0reg Jun 20, 2025
22015b2
lint : remove trailing whitespace (#14304)
CISC Jun 20, 2025
c959f46
CUDA: add conv_2d_transpose (#14287)
am17an Jun 20, 2025
d860dd9
docs : fix the link to llama.h (#14293)
david20571015 Jun 20, 2025
b714767
Add `ggml_roll` (ggml/1274)
Acly Jun 18, 2025
06cbedf
sync : ggml
ggerganov Jun 20, 2025
b23fa0b
convert : fix Llama 4 conversion (#14311)
danielhanchen Jun 21, 2025
692e3cd
memory : rename interface to llama_memory_context_i (#14296)
ggerganov Jun 21, 2025
67ae531
metal : fix thread-safety (#14300)
ggerganov Jun 21, 2025
58cba76
gguf-py : fix TemplateProcessing pair when bos/eos is missing (#14312)
CISC Jun 21, 2025
bb16041
Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (…
mtavenrath Jun 21, 2025
aa0ef5c
gguf-py : fix Qwen3-Embedding eos token (#14314)
CISC Jun 21, 2025
aa064b2
CUDA: add mean operation (#14313)
am17an Jun 22, 2025
40bfa04
common : use std::string_view now that we target c++17 (#14319)
CISC Jun 22, 2025
5d5c066
mtmd : fix Pixtral OOM with large images by capping image_size to 102…
yuiseki Jun 22, 2025
af3373f
HIP: enable vec fattn on RDNA4 (#14323)
IMbackK Jun 22, 2025
f1f5e82
examples : fix is_first logic for tokenization (#14329)
ggerganov Jun 22, 2025
66aba7a
run : avoid double tokenization (#14327)
retr0reg Jun 22, 2025
238005c
gguf-py : fix SpecialVocab parsing when post_processor is null (#14330)
CISC Jun 22, 2025
fa4a9f2
quantize : handle user-defined pruning of whole layers (blocks) (#13037)
EAddario Jun 22, 2025
3a9457d
vulkan: update windows SDK in CI (#14334)
jeffbolznv Jun 23, 2025
7b50d58
kv-cells : fix tracking of seq_pos (#14339)
ggerganov Jun 23, 2025
defe215
CUDA: mul_mat_v support for batch sizes > 1 (#14262)
JohannesGaessler Jun 23, 2025
72c6bc3
llama : better rwkv chat template and add missing `inputs.use_jinja` …
MollySophia Jun 23, 2025
bf2a99e
vulkan: update windows SDK in release.yml (#14344)
jeffbolznv Jun 23, 2025
ce82bd0
ci: add workflow for relocatable cmake package (#14346)
bandoti Jun 23, 2025
0142961
CUDA/HIP: optimize mmv paths taken for HIP devices (#14324)
IMbackK Jun 23, 2025
901e20b
jinja : Add Mistral-Small-3.2-24B-Instruct-2506.jinja (#14349)
bartowski1182 Jun 24, 2025
abf2410
main : honor --verbose-prompt on interactive prompts (#14350)
CISC Jun 24, 2025
1b809ce
server : move no API key doc to /health (#14352)
pnb Jun 24, 2025
c148cf1
cmake : use LLAMA_BUILD_NUMBER when defining LLAMA_INSTALL_VERSION (#…
mbaudier Jun 24, 2025
62af464
batch : fix check for empty sequences in memory (#14364)
ggerganov Jun 24, 2025
73e53dc
opencl: ref count `ggml_backend_opencl_context` and refactor profilin…
lhez Jun 24, 2025
2bf9d53
sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices (#13973)
ShanoToni Jun 25, 2025
b193d53
ggml : do not output unprintable characters on GGUF load failure (#14…
CISC Jun 25, 2025
60ef23d
ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317)
taronaeo Jun 25, 2025
716301d
musa: enable fp16 mma (all) and cublas on qy2 (#13842)
yeahdongcn Jun 26, 2025
bf5bcd0
docs: update s390x documentation + add faq (#14389)
taronaeo Jun 26, 2025
5783ae4
metal : batch rows copy in a single threadgroup (#14384)
ggerganov Jun 26, 2025
e8215db
metal : add special-case mat-vec mul for ne00 == 4 (#14385)
ggerganov Jun 26, 2025
b253462
llama : return mistral-v7-tekken as default template only (#14390)
CISC Jun 26, 2025
a01047b
cmake: regen vulkan shaders when shaders-gen sources change (#14398)
bandoti Jun 26, 2025
8846aac
model : gemma3n text-only (#14400)
ngxson Jun 26, 2025
30 changes: 17 additions & 13 deletions .devops/intel.Dockerfile
@@ -49,19 +49,23 @@ COPY --from=build /app/full /app

WORKDIR /app

RUN apt-get update \
&& apt-get install -y \
git \
python3 \
python3-pip \
&& pip install --upgrade pip setuptools wheel \
&& pip install -r requirements.txt \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
&& find /var/cache/apt/archives /var/lib/apt/lists -not -name lock -type f -delete \
&& find /var/cache -type f -delete

RUN apt-get update && \
apt-get install -y \
git \
python3 \
python3-pip \
python3-venv && \
python3 -m venv /opt/venv && \
. /opt/venv/bin/activate && \
pip install --upgrade pip setuptools wheel && \
pip install -r requirements.txt && \
apt autoremove -y && \
apt clean -y && \
rm -rf /tmp/* /var/tmp/* && \
find /var/cache/apt/archives /var/lib/apt/lists -not -name lock -type f -delete && \
find /var/cache -type f -delete

ENV PATH="/opt/venv/bin:$PATH"

ENTRYPOINT ["/app/tools.sh"]

7 changes: 7 additions & 0 deletions .github/labeler.yml
@@ -86,3 +86,10 @@ nix:
embedding:
- changed-files:
- any-glob-to-any-file: examples/embedding/

Ascend NPU:
- changed-files:
- any-glob-to-any-file:
- ggml/include/ggml-cann.h
- ggml/src/ggml-cann/**
- docs/backend/CANN.md
51 changes: 51 additions & 0 deletions .github/workflows/build-cmake-pkg.yml
@@ -0,0 +1,51 @@
name: Build relocatable cmake package
on:
workflow_dispatch:
workflow_call:

jobs:
linux:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0

- name: Install dependencies
run: |
sudo apt update
sudo apt install -y build-essential tcl
- name: Build
run: |
PREFIX="$(pwd)"/inst
cmake -S . -B build -DCMAKE_PREFIX_PATH="$PREFIX" \
-DLLAMA_CURL=OFF -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_TOOLS=OFF \
-DLLAMA_BUILD_EXAMPLES=OFF -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
cmake --install build --prefix "$PREFIX" --config Release
export LLAMA_CONFIG="$PREFIX"/lib/cmake/llama/llama-config.cmake
tclsh <<'EOF'
set build(commit) [string trim [exec git rev-parse --short HEAD]]
set build(number) [string trim [exec git rev-list --count HEAD]]
set build(version) "0.0.$build(number)"
set llamaconfig [read [open "$env(LLAMA_CONFIG)" r]]
set checks [list "set\\(LLAMA_VERSION \\s+$build(version)\\)" \
"set\\(LLAMA_BUILD_COMMIT\\s+$build(commit)\\)" \
"set\\(LLAMA_BUILD_NUMBER\\s+$build(number)\\)"]
puts -nonewline "Checking llama-config.cmake version... "
foreach check $checks {
if {![regexp -expanded -- $check $llamaconfig]} {
puts "\"$check\" failed!"
exit 1
}
}
puts "success."
EOF
cd examples/simple-cmake-pkg
cmake -S . -B build -DCMAKE_PREFIX_PATH="$PREFIX"/lib/cmake
cmake --build build
113 changes: 113 additions & 0 deletions .github/workflows/build-linux-cross.yml
@@ -231,3 +231,116 @@ jobs:
-DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH
cmake --build build --config Release -j $(nproc)
debian-13-loongarch64-cpu-cross:
runs-on: ubuntu-24.04
container: debian@sha256:653dfb9f86c3782e8369d5f7d29bb8faba1f4bff9025db46e807fa4c22903671

steps:
- uses: actions/checkout@v4
- name: Setup LoongArch
run: |
rm -f /etc/apt/sources.list.d/*
cat << EOF | tee /etc/apt/sources.list.d/debian-ports.list
deb http://snapshot.debian.org/archive/debian/20250515T202920Z/ trixie main
EOF
( echo 'quiet "true";'; \
echo 'APT::Get::Assume-Yes "true";'; \
echo 'APT::Install-Recommends "false";'; \
echo 'Acquire::Check-Valid-Until "false";'; \
echo 'Acquire::Retries "5";'; \
) > /etc/apt/apt.conf.d/99snapshot-repos
apt-get update
apt-get install -y ca-certificates debian-ports-archive-keyring cmake git zip
dpkg --add-architecture loong64
# Add arch-specific repositories for non-amd64 architectures
cat << EOF | tee /etc/apt/sources.list.d/loong64-ports.list
deb [arch=loong64] http://snapshot.debian.org/archive/debian-ports/20250515T194251Z/ sid main
EOF
apt-get update || true ;# Prevent failure due to missing URLs.
apt-get install -y --no-install-recommends \
build-essential \
gcc-14-loongarch64-linux-gnu \
g++-14-loongarch64-linux-gnu
- name: Build
run: |
cmake -B build -DLLAMA_CURL=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_OPENMP=OFF \
-DLLAMA_BUILD_EXAMPLES=ON \
-DLLAMA_BUILD_TOOLS=ON \
-DLLAMA_BUILD_TESTS=OFF \
-DCMAKE_SYSTEM_NAME=Linux \
-DCMAKE_SYSTEM_PROCESSOR=loongarch64 \
-DCMAKE_C_COMPILER=loongarch64-linux-gnu-gcc-14 \
-DCMAKE_CXX_COMPILER=loongarch64-linux-gnu-g++-14 \
-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-DCMAKE_FIND_ROOT_PATH=/usr/lib/loongarch64-linux-gnu \
-DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
-DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
-DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH
cmake --build build --config Release -j $(nproc)
debian-13-loongarch64-vulkan-cross:
runs-on: ubuntu-24.04
container: debian@sha256:653dfb9f86c3782e8369d5f7d29bb8faba1f4bff9025db46e807fa4c22903671

steps:
- uses: actions/checkout@v4
- name: Setup LoongArch
run: |
rm -f /etc/apt/sources.list.d/*
cat << EOF | tee /etc/apt/sources.list.d/debian-ports.list
deb http://snapshot.debian.org/archive/debian/20250515T202920Z/ trixie main
EOF
( echo 'quiet "true";'; \
echo 'APT::Get::Assume-Yes "true";'; \
echo 'APT::Install-Recommends "false";'; \
echo 'Acquire::Check-Valid-Until "false";'; \
echo 'Acquire::Retries "5";'; \
) > /etc/apt/apt.conf.d/99snapshot-repos
apt-get update
apt-get install -y ca-certificates debian-ports-archive-keyring cmake git zip
dpkg --add-architecture loong64
# Add arch-specific repositories for non-amd64 architectures
cat << EOF | tee /etc/apt/sources.list.d/loong64-ports.list
deb [arch=loong64] http://snapshot.debian.org/archive/debian-ports/20250515T194251Z/ sid main
EOF
apt-get update || true ;# Prevent failure due to missing URLs.
apt-get install -y --no-install-recommends \
build-essential \
glslc \
gcc-14-loongarch64-linux-gnu \
g++-14-loongarch64-linux-gnu \
libvulkan-dev:loong64
- name: Build
run: |
cmake -B build -DLLAMA_CURL=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_VULKAN=ON \
-DGGML_OPENMP=OFF \
-DLLAMA_BUILD_EXAMPLES=ON \
-DLLAMA_BUILD_TOOLS=ON \
-DLLAMA_BUILD_TESTS=OFF \
-DCMAKE_SYSTEM_NAME=Linux \
-DCMAKE_SYSTEM_PROCESSOR=loongarch64 \
-DCMAKE_C_COMPILER=loongarch64-linux-gnu-gcc-14 \
-DCMAKE_CXX_COMPILER=loongarch64-linux-gnu-g++-14 \
-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-DCMAKE_FIND_ROOT_PATH=/usr/lib/loongarch64-linux-gnu \
-DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
-DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
-DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH
cmake --build build --config Release -j $(nproc)
60 changes: 49 additions & 11 deletions .github/workflows/build.yml
@@ -5,10 +5,43 @@ on:
push:
branches:
- master
paths: ['.github/workflows/build.yml', '.github/workflows/build-linux-cross.yml', '**/CMakeLists.txt', '**/.cmake', '**/*.h', '**/*.hpp', '**/*.c', '**/*.cpp', '**/*.cu', '**/*.cuh', '**/*.swift', '**/*.m', '**/*.metal', '**/*.comp']
paths: [
'.github/workflows/build.yml',
'.github/workflows/build-linux-cross.yml',
'.github/workflows/build-cmake-pkg.yml',
'**/CMakeLists.txt',
'**/.cmake',
'**/*.h',
'**/*.hpp',
'**/*.c',
'**/*.cpp',
'**/*.cu',
'**/*.cuh',
'**/*.swift',
'**/*.m',
'**/*.metal',
'**/*.comp'
]

pull_request:
types: [opened, synchronize, reopened]
paths: ['.github/workflows/build.yml', '.github/workflows/build-linux-cross.yml', '**/CMakeLists.txt', '**/.cmake', '**/*.h', '**/*.hpp', '**/*.c', '**/*.cpp', '**/*.cu', '**/*.cuh', '**/*.swift', '**/*.m', '**/*.metal', '**/*.comp']
paths: [
'.github/workflows/build.yml',
'.github/workflows/build-linux-cross.yml',
'.github/workflows/build-cmake-pkg.yml',
'**/CMakeLists.txt',
'**/.cmake',
'**/*.h',
'**/*.hpp',
'**/*.c',
'**/*.cpp',
'**/*.cu',
'**/*.cuh',
'**/*.swift',
'**/*.m',
'**/*.metal',
'**/*.comp'
]

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
@@ -306,6 +339,7 @@ jobs:
id: cmake_test
run: |
cd build
export GGML_VK_VISIBLE_DEVICES=0
# This is using llvmpipe and runs slower than other backends
ctest -L main --verbose --timeout 3600
@@ -477,6 +511,9 @@ jobs:
build-linux-cross:
uses: ./.github/workflows/build-linux-cross.yml

build-cmake-pkg:
uses: ./.github/workflows/build-cmake-pkg.yml

macOS-latest-cmake-ios:
runs-on: macos-latest

@@ -682,17 +719,17 @@ jobs:
env:
OPENBLAS_VERSION: 0.3.23
SDE_VERSION: 9.33.0-2024-01-07
VULKAN_VERSION: 1.4.309.0
VULKAN_VERSION: 1.4.313.2

strategy:
matrix:
include:
- build: 'cpu-x64'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/x64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON -DGGML_OPENMP=OFF'
- build: 'cpu-x64 (static)'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/x64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DBUILD_SHARED_LIBS=OFF'
- build: 'openblas-x64'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/x64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON -DGGML_OPENMP=OFF -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS -DBLAS_INCLUDE_DIRS="$env:RUNNER_TEMP/openblas/include" -DBLAS_LIBRARIES="$env:RUNNER_TEMP/openblas/lib/openblas.lib"'
- build: 'vulkan-x64'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON -DGGML_VULKAN=ON'
defines: '-DCMAKE_BUILD_TYPE=Release -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON -DGGML_VULKAN=ON'
- build: 'llvm-arm64'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON'
- build: 'llvm-arm64-opencl-adreno'
@@ -735,7 +772,7 @@ jobs:
id: get_vulkan
if: ${{ matrix.build == 'kompute-x64' || matrix.build == 'vulkan-x64' }}
run: |
curl.exe -o $env:RUNNER_TEMP/VulkanSDK-Installer.exe -L "https://sdk.lunarg.com/sdk/download/${env:VULKAN_VERSION}/windows/VulkanSDK-${env:VULKAN_VERSION}-Installer.exe"
curl.exe -o $env:RUNNER_TEMP/VulkanSDK-Installer.exe -L "https://sdk.lunarg.com/sdk/download/${env:VULKAN_VERSION}/windows/vulkansdk-windows-X64-${env:VULKAN_VERSION}.exe"
& "$env:RUNNER_TEMP\VulkanSDK-Installer.exe" --accept-licenses --default-answer --confirm-command install
Add-Content $env:GITHUB_ENV "VULKAN_SDK=C:\VulkanSDK\${env:VULKAN_VERSION}"
Add-Content $env:GITHUB_PATH "C:\VulkanSDK\${env:VULKAN_VERSION}\bin"
@@ -777,6 +814,7 @@ jobs:
cmake -S . -B build ${{ matrix.defines }} `
-DCURL_LIBRARY="$env:CURL_PATH/lib/libcurl.dll.a" -DCURL_INCLUDE_DIR="$env:CURL_PATH/include"
cmake --build build --config Release -j ${env:NUMBER_OF_PROCESSORS}
cp $env:CURL_PATH/bin/libcurl-*.dll build/bin/Release
- name: Add libopenblas.dll
id: add_libopenblas_dll
@@ -839,12 +877,12 @@ jobs:
-DGGML_CUDA=ON
cmake --build build
windows-2019-cmake-cuda:
runs-on: windows-2019
windows-2022-cmake-cuda:
runs-on: windows-2022

strategy:
matrix:
cuda: ['12.4', '11.7']
cuda: ['12.4']

steps:
- name: Clone
@@ -878,7 +916,7 @@ jobs:
env:
CURL_PATH: ${{ steps.get_libcurl.outputs.curl_path }}
run: |
call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Auxiliary\Build\vcvars64.bat"
call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
cmake -S . -B build -G "Ninja Multi-Config" ^
-DLLAMA_BUILD_SERVER=ON ^
-DGGML_NATIVE=OFF ^