
Add llama_memory_breakdown_print support #963

Open
kusaanko wants to merge 2 commits into utilityai:main from kusaanko:feat/memory-breakdown

Conversation

@kusaanko (Contributor) commented Mar 17, 2026

We can estimate how much memory a model will allocate by setting the model parameter no_alloc to true and then printing the memory breakdown, without actually allocating the memory. With a normally loaded model, the same call instead shows detailed memory usage.

Because the breakdown is emitted through the logger, users can hook it and capture the figures somewhere other than stdout.

You can estimate using this code:

let param = LlamaModelParams::default().with_no_alloc(true);
// ...create LlamaContext
ctx.print_memory_breakdown();

which logs output like:

2026-03-17T08:56:23.843326Z  INFO llama-cpp-2: | memory breakdown [MiB] | total   free    self   model   context   compute    unaccounted | module="llama.cpp::llama_memory_breakdown_print"
2026-03-17T08:56:23.843385Z  INFO llama-cpp-2: |   - Host               |                 1426 =  1067 +      56 +     302                | module="llama.cpp::llama_memory_breakdown_print"
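The breakdown is plain text routed through the logger, so if you want the MiB figures programmatically you would have to parse the logged lines yourself. A minimal sketch, assuming the table layout shown above (the `parse_breakdown_mib` helper is hypothetical and not part of this PR or the crate):

```rust
// Hypothetical helper: pull the MiB figures out of one logged
// `llama_memory_breakdown_print` table row. This is a sketch that
// assumes the row contains only the numeric MiB columns, as in the
// "Host" line above; it is not part of llama-cpp-2.
fn parse_breakdown_mib(line: &str) -> Vec<u64> {
    line.split(|c: char| !c.is_ascii_digit()) // split on every non-digit run
        .filter(|s| !s.is_empty())            // drop the empty fragments
        .filter_map(|s| s.parse().ok())       // keep the numbers that parse
        .collect()
}

fn main() {
    // The "Host" row from the sample output (log prefix stripped).
    let row = "|   - Host               |                 1426 =  1067 +      56 +     302                |";
    let mib = parse_breakdown_mib(row);
    assert_eq!(mib, vec![1426, 1067, 56, 302]);
    println!("total = {} MiB, parts = {:?}", mib[0], &mib[1..]);
}
```

Note that the log prefix (timestamp, level) also contains digits, so a real consumer would strip it or match on the `module` field before parsing the row.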
