Closed
87 commits
d968e35
Added perfetto support (CPU + Vulkan) backends
walidbr Sep 7, 2025
7efc6d0
Added perfetto support (CPU + Vulkan) backends
walidbr Sep 7, 2025
b0a9c06
Merge branch 'master' into vulkan_perfetto
walidbr Sep 7, 2025
14ff930
server : removed obsolete doc (#15670)
l29ah Aug 29, 2025
0bcbeaa
CANN: FIx compiler warnings (#15661)
noemotiovon Aug 30, 2025
a5d75e1
vulkan: Skip syncing for prealloc_y when it is reused (#15544)
jeffbolznv Aug 30, 2025
22506f6
CUDA: use FP32 arithmetic for conv2d (#15683)
JohannesGaessler Aug 30, 2025
878fe00
llama: use FA + max. GPU layers by default (#15434)
JohannesGaessler Aug 30, 2025
d68c62c
Update build.md to remove MSVC arm64 notes (#15684)
slaren Aug 30, 2025
42e986f
ggml: update kleidiai to v1.13.0 (#15663)
chaxu01 Aug 30, 2025
26f086c
vulkan: clamp matmul and FA results to the max finite value (#15652)
jeffbolznv Aug 31, 2025
e35560d
vulkan: Allow fallback to sysmem memory when vidmem is full (#15649)
jeffbolznv Aug 31, 2025
6164f2a
vulkan : remove unused portability_enumeration_ext variable (#15679)
danbev Aug 31, 2025
f580b2a
vulkan: mul_mat_id coopmat2 optimizations (#15546)
jeffbolznv Aug 31, 2025
e16026b
vulkan: handle large sizes for get_rows (#15686)
jeffbolznv Aug 31, 2025
c369ca1
ci : explicitly set fa off or on (#15692)
CISC Aug 31, 2025
44da51d
llama : separate compute buffer reserve from fattn check (#15696)
slaren Aug 31, 2025
c86af26
llama : fix fattn reserve call n_seqs parameter (#15699)
slaren Aug 31, 2025
7c540fc
metal : fix checks for available FA kernels (#15700)
ggerganov Aug 31, 2025
47bd99d
server : enable /slots by default and make it secure (#15630)
ggerganov Aug 31, 2025
6988691
sampling : optimize samplers by reusing bucket sort (#15665)
ggerganov Aug 31, 2025
696f13f
CANN: fix RoPE cache issue on multi-device (#15629)
hipudding Sep 1, 2025
fad54d2
CANN: Optimize MUL_MAT_ID (#15658)
hipudding Sep 1, 2025
a31a3c7
CUDA: fix build error from ambiguous __half conversions in conv2d (#1…
qnixsynapse Sep 1, 2025
883ad6d
docs : add Hunyuan to models section (#15707)
DamonFool Sep 1, 2025
7c21ea6
ggml : WebGPU add TRANSPOSE and RESHAPE to supported ops (#15695)
danbev Sep 1, 2025
02ae1ea
Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants …
0cc4m Sep 1, 2025
31ee6d0
convert : remove redundant code (#15708)
DamonFool Sep 1, 2025
f6b06ef
ggml: aarch64: Implement SVE F16 kernels for vector functions (#15115)
Vithulep Sep 1, 2025
a7c1974
ggml: SVE support for exponential functions (#15145)
s-goto-11 Sep 1, 2025
bf7dec6
vulkan: disable large mmv subgroups on older Nvidia GPUs (#15717)
0cc4m Sep 1, 2025
c18fc47
vulkan: add missing clamps in new mul_mat_id paths (#15702)
jeffbolznv Sep 1, 2025
b720649
vulkan: use memory budget extension to read memory usage (#15545)
giladgd Sep 1, 2025
51bf289
ggml-backend: raise GGML_MAX_SPLIT_INPUTS (#15722)
JohannesGaessler Sep 1, 2025
14a48ac
CANN: Support ext_factor in rope (#15710)
hipudding Sep 2, 2025
8aa1f0a
CANN: Support eager execution mode under ACL graph compilation (#15712)
noemotiovon Sep 2, 2025
6300a87
opencl: add attn sinks support for FA kernels (#15706)
rmatif Sep 2, 2025
723f773
vulkan: Fix macro parameter order for f32 matmul shaders (#15716)
jeffbolznv Sep 2, 2025
c2e8b13
CANN: Resolve soft_max precision issue (#15730)
hipudding Sep 2, 2025
436e671
vulkan: fix shaders gen when no integer dot is available (#15740)
0cc4m Sep 2, 2025
0cf27b7
llama: -fa 1/0/-1 aliases for -fa on/off/auto (#15746)
JohannesGaessler Sep 2, 2025
6779e5d
chore: Update `.clang-format` to use `BinPackArguments=true` (#15744)
ORippler Sep 2, 2025
6e59834
fix: resolve unsigned int initialization warning for n_dims/size in g…
skrandy Sep 2, 2025
15010d6
CANN: Fix type float_t to float (#15736)
noemotiovon Sep 3, 2025
83a41a3
CANN: Mask unsupported TRANSPOSE_1D operator (#15733)
hipudding Sep 3, 2025
8f18965
model-conversion : add missing curl script [no ci] (#15761)
danbev Sep 3, 2025
2cb9b8d
ggml-cpu : optimize RVV kernels (#15720)
xctan Sep 3, 2025
9afd4ed
CANN: Add RoPE contiguous check for 310I DUP device (#15735)
hipudding Sep 3, 2025
d0f62c7
model-conversion : remove hardcoded /bin/bash shebangs [no ci] (#15765)
danbev Sep 3, 2025
d4617f9
llama : fix incorrect model type for Gemma 270M (#15764)
danbev Sep 3, 2025
5569dc3
sampling : optimize dist sampler (#15704)
ggerganov Sep 3, 2025
375e61d
model-conversion : fix pyright errors (#15770)
danbev Sep 3, 2025
6545c87
CUDA: Optimize `rms_norm_f32` kernel and its fused variants, giving 1…
ORippler Sep 3, 2025
a62ab54
ggml vulkan: add hardsigmoid and hardswish operations (#15762)
relent95 Sep 3, 2025
31850c8
vulkan : update ggml_vk_instance_validation_ext_available (#15666)
danbev Sep 3, 2025
30238ee
vulkan: don't use std::string in load_shaders, to improve compile tim…
jeffbolznv Sep 3, 2025
e614a10
vulkan: fix mmv subgroup16 selection (#15775)
0cc4m Sep 3, 2025
ff764bf
CANN: fix acl_rstd allocation size in ggml_cann_rms_norm (#15760)
noemotiovon Sep 4, 2025
34e736d
opencl: add hs=40 to FA (#15758)
rmatif Sep 4, 2025
491d7b5
CANN: Fix precision issue on 310I DUO multi-devices (#15784)
hipudding Sep 4, 2025
ed6f54c
ggml: add ops for WAN video model (cuda && cpu) (#15669)
leejet Sep 4, 2025
6efefd3
Document the new max GPU layers default in help (#15771)
ericcurtin Sep 4, 2025
8c1c4ea
server: add exceed_context_size_error type (#15780)
ngxson Sep 4, 2025
f02d06b
CANN: Refactor ND to NZ workspace to be per-device (#15763)
noemotiovon Sep 4, 2025
abbac9c
llama : set n_outputs to 1 to avoid 0 outputs mean-pooling (#15791)
danbev Sep 4, 2025
881904a
metal : Add template specialization for mul_mm_id w/ ne20 == 10 (#15799)
gabe-l-hart Sep 4, 2025
56e20f4
llama : add support for EmbeddingGemma 300m (#15798)
danbev Sep 4, 2025
1bf2162
scripts : add Jinja tester PySide6 simple app (#15756)
pwilkin Sep 4, 2025
ebbb81a
chat : nemotron thinking & toolcalling support (#15676)
pwilkin Sep 4, 2025
94920be
chat : fixed crash when Hermes 2 <tool_call> had a newline before it …
ExtReMLapin Sep 4, 2025
f192a0a
model-conversion : add --embeddings flag to modelcard.template [no ci…
danbev Sep 5, 2025
b883c5e
kv-cache : fix SWA checks + disable cacheless iSWA (#15811)
ggerganov Sep 5, 2025
a99f4af
gguf: gguf_writer refactor (#15691)
Green-Sky Sep 5, 2025
64bd608
tests : add --list-ops and --show-coverage options (#15745)
danbev Sep 5, 2025
a6444b4
CUDA: fastdiv, launch bounds for mmvq + q8_1 quant (#15802)
JohannesGaessler Sep 5, 2025
3e927bc
Implement --log-colors with always/never/auto (#15792)
ericcurtin Sep 5, 2025
add7911
Thinking model disabled assistant prefill (#15404)
gabe-l-hart Sep 5, 2025
077fdac
ci : exempt correct research label (#15825)
CISC Sep 5, 2025
b6e1f91
aLoRA Support (#15327)
gabe-l-hart Sep 5, 2025
f18fede
ggml-cpu: drop support for nnpa intrinsics (#15821)
taronaeo Sep 6, 2025
2ed2179
ggml-cpu: document use of "free" memory [no ci] (#15834)
JohannesGaessler Sep 6, 2025
d730284
server : implement prompt processing progress report in stream mode (…
ngxson Sep 6, 2025
8084099
server : speed up tests (#15836)
ngxson Sep 6, 2025
958f133
kleidiai: generalize compute_forward_kv_cache to compute_forward_fp16…
chaxu01 Sep 6, 2025
6b4a425
CUDA: faster tile FA (Pascal/AMD), headsize 256 (#15769)
JohannesGaessler Sep 6, 2025
53c9cbe
Added perfetto support (CPU + Vulkan) backends
walidbr Sep 7, 2025
43d92b5
Added vulkan build workflow, x86
walidbr Sep 7, 2025
2 changes: 1 addition & 1 deletion .clang-format
@@ -22,7 +22,7 @@ AllowShortIfStatementsOnASingleLine: Never
 AllowShortLambdasOnASingleLine: Inline
 AllowShortLoopsOnASingleLine: false
 AlwaysBreakBeforeMultilineStrings: true
-BinPackArguments: false
+BinPackArguments: true
 BinPackParameters: false # OnePerLine
 BitFieldColonSpacing: Both
 BreakBeforeBraces: Custom # Attach
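For readers unfamiliar with the option, the sketch below illustrates what the `BinPackArguments` change in `.clang-format` typically means for call sites that do not fit on one line. With `false`, clang-format puts each argument on its own line; with `true`, it packs as many arguments per line as fit. The function and constant names are hypothetical and are not taken from this repository. The exact wrapping also depends on `ColumnLimit` and other style options, so treat the layouts as approximate.

```cpp
#include <cassert>

// Hypothetical constants with long names, chosen only so that the call
// below would not fit on a single line under a typical column limit.
constexpr int first_illustrative_argument  = 1;
constexpr int second_illustrative_argument = 2;
constexpr int third_illustrative_argument  = 3;
constexpr int fourth_illustrative_argument = 4;
constexpr int fifth_illustrative_argument  = 5;
constexpr int sixth_illustrative_argument  = 6;

// Hypothetical helper; only its call sites matter for this illustration.
constexpr int sum_of_six(int a, int b, int c, int d, int e, int f) {
    return a + b + c + d + e + f;
}

// Approximate layout with BinPackArguments: false (one argument per line).
constexpr int call_one_per_line() {
    return sum_of_six(first_illustrative_argument,
                      second_illustrative_argument,
                      third_illustrative_argument,
                      fourth_illustrative_argument,
                      fifth_illustrative_argument,
                      sixth_illustrative_argument);
}

// Approximate layout with BinPackArguments: true (arguments packed per line).
constexpr int call_packed() {
    return sum_of_six(first_illustrative_argument, second_illustrative_argument,
                      third_illustrative_argument, fourth_illustrative_argument,
                      fifth_illustrative_argument, sixth_illustrative_argument);
}

// The option changes layout only; both call sites compute the same value.
static_assert(call_one_per_line() == call_packed(), "formatting must not change behavior");
```

Since `BinPackParameters` stays `false` in this hunk, function declarations would presumably keep the one-parameter-per-line layout; only call-site arguments are affected.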
20 changes: 0 additions & 20 deletions .dockerignore

This file was deleted.

87 changes: 0 additions & 87 deletions .github/ISSUE_TEMPLATE/010-bug-compilation.yml

This file was deleted.

101 changes: 0 additions & 101 deletions .github/ISSUE_TEMPLATE/011-bug-results.yml

This file was deleted.

91 changes: 0 additions & 91 deletions .github/ISSUE_TEMPLATE/019-bug-misc.yml

This file was deleted.

51 changes: 0 additions & 51 deletions .github/ISSUE_TEMPLATE/020-enhancement.yml

This file was deleted.
