
Conversation

@rgerganov (Collaborator) commented Oct 16, 2025

Start tracking the free memory on every device and report it appropriately.

Start reporting the free memory on every device instead of using fixed values. Now llama-cli users can get a nice memory breakdown when using RPC devices.

@rgerganov rgerganov requested a review from slaren October 16, 2025 13:45
@github-actions github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Oct 16, 2025
@rgerganov rgerganov requested a review from ggerganov as a code owner October 17, 2025 07:35
@rgerganov rgerganov changed the title from "rpc : track free memory" to "rpc : report actual free memory" Oct 17, 2025
@rgerganov (Collaborator, Author) commented:

@slaren I am thinking of dropping the -m option of rpc-server, which overrides the reported total_mem. It is confusing because it is not enforced, and users can achieve the same effect with --tensor-split on the client side. What do you think?

@slaren (Member) commented Oct 17, 2025

> @slaren I am thinking of dropping the -m option of rpc-server, which overrides the reported total_mem. It is confusing because it is not enforced, and users can achieve the same effect with --tensor-split on the client side. What do you think?

Yes, I think that would be a good change.
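For context, here is a hedged sketch of what the suggestion above amounts to in practice. The host names, port, and split ratios are made up for illustration; `-m` on rpc-server and `--rpc`/`--tensor-split` on llama-cli are the flags discussed in this thread, but exact spellings should be checked against the llama.cpp docs for your version.

```shell
# Previously, one could override the memory reported by a server:
#   rpc-server -p 50052 -m 4096
# but the limit was only advisory (not enforced on allocation).

# The equivalent control lives on the client side instead: list the RPC
# devices and split tensors across them explicitly, e.g. placing twice as
# much on the second device as on the first:
#   llama-cli -m model.gguf --rpc host1:50052,host2:50052 --tensor-split 1,2
```

Since the server now reports actual free memory, the default (automatic) split also works against real numbers, so most users should not need either knob.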

@rgerganov rgerganov merged commit 41386cf into ggml-org:master Oct 17, 2025
70 checks passed

Labels: examples, ggml (changes relating to the ggml tensor library for machine learning)


2 participants