Labels: bug (Something isn't working)
Description
Name and Version
```shell
$ ./build/bin/llama-cli --version
version: 4242 (642330a)
built with Homebrew clang version 18.1.5 for arm64-apple-darwin23.3.0
```
Operating systems
Linux, Mac, Windows
Which llama.cpp modules do you know to be affected?
llama-server
Problem description & steps to reproduce
In the destructor of `server_context`:

```cpp
~server_context() {
    if (ctx) {
        llama_free(ctx);
        ctx = nullptr;
    }
    if (model) {
        llama_free_model(model);
        model = nullptr;
    }
    if (model_dft) {
        llama_free_model(model_dft);
        model_dft = nullptr;
    }
    // Clear any sampling context
    for (server_slot & slot : slots) {
        common_sampler_free(slot.smpl);
        slot.smpl = nullptr;
        llama_free(slot.ctx_dft);
        slot.ctx_dft = nullptr;
        common_speculative_free(slot.spec);
        slot.spec = nullptr;
        llama_batch_free(slot.batch_spec);
    }
    llama_batch_free(batch);
}
```

- if no draft model (`model_dft`) is selected, `slot.spec` is never allocated, so `common_speculative_free` is called with a nullptr (introduced in commit 9ca2e67)
- if no draft model is selected, `slot.batch_spec` keeps its default value, and calling `llama_batch_free` on that default value causes memory corruption (introduced in commit 10bce04)
Suggested fix:
- check `slot.spec` before calling `common_speculative_free`
- convert `slot.batch_spec` to a pointer, check that it was allocated, and only then call `llama_batch_free`
kind attn: @ggerganov @slaren
First Bad Commit
first bad commit: 9ca2e67
second bad commit: 10bce04
Relevant log output
This is an observation from reading the code; I was not able to reproduce it with a built binary and capture system memory logs on deallocation by sending a signal.