@jan-service-account
Updates dev branch with latest release (b5170) from ggml-org/llama.cpp

ngxson and others added 5 commits April 22, 2025 10:37
* llava : update documentation
  * fix typo
* metal : add memory pool for temp allocs (wip) [no ci]
  * cont : free buffers from the heap
  * cont : resize heap [no ci]
  * cont : refactor heap [no ci]
  * cont : heap for each cmd buffer [no ci]
  * cont : fix free
  * wip
  * cont : fix alignment [no ci]
  * cont : not working .. [no ci]
  * cont : heap allocation now works [no ci]
  * cont : use MTLHeapTypePlacement
  * metal : use dynamic MTLHeap allocations
  * metal : add comments
  * metal : disable softmax use of mem_pool
  * metal : final touches
* security : add note about RPC functionality
  * security : add note about llama-server
* mtmd : support SmolVLM (version 1 and 2)
  * correct chat template
  * fix n_patches
  * scale_factor is an int
  * add more models to test
* CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID
  * fix logic for RoPE support, CUDA graphs
@jan-service-account merged commit 7b1524b into dev on Apr 23, 2025
15 checks passed
@jan-service-account deleted the update-dev-from-master-2025-04-23-00-08 branch on April 23, 2025 at 00:18