
Conversation

@Djip007 (Contributor) commented Aug 13, 2024

Add the two changes needed for llamafile:

  • remove the UMA build option
  • always fall back to UMA when hipMalloc fails because there is not enough memory (see the sketch below)
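
In code, the second change amounts to a runtime fallback roughly like the sketch below. This is a minimal illustration assuming HIP's C API; `alloc_vram_or_uma` is a hypothetical name, not the actual llamafile helper, and the real patch is structured differently.

```cpp
#include <hip/hip_runtime.h>

// Try a normal device allocation first; if the GPU reports it is out of
// memory, fall back to HIP managed (UMA) memory instead of failing.
static hipError_t alloc_vram_or_uma(void ** ptr, size_t size) {
    hipError_t err = hipMalloc(ptr, size);      // dedicated "VRAM" first
    if (err == hipErrorOutOfMemory) {
        err = hipMallocManaged(ptr, size);      // UMA fallback on APUs
        if (err == hipSuccess) {
            // Ask for coarse-grained coherence, which is faster on APUs.
            (void) hipMemAdvise(*ptr, size, hipMemAdviseSetCoarseGrain, 0);
        }
    }
    return err;
}
```

With this shape there is no build-time switch: a dGPU with enough VRAM never takes the managed path, while an APU whose BIOS VRAM carve-out is too small degrades gracefully to UMA.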

Note: with Linux kernel 6.10+, hipMalloc can use GTT memory on some APUs (Ryzen 7940HS ...), so the size limit is easily configurable on Linux with a boot parameter. There is no more need to change the VRAM reservation in the BIOS.

```
# Kernel boot parameter to get 16 GiB of GTT memory (the default is 1/2 of free RAM).
amdgpu.gttsize=16384
```

Note 2: with this patch and UMA, we can use all of the RAM without any configuration.

(#439 (comment) update)

@Mushoz commented Nov 20, 2024

@Djip007 Can this be merged now that it has been approved?

@Djip007 (Contributor, Author) commented Nov 20, 2024

For me, with Linux kernel > 6.10 it is less needed, but yes, it can be merged.

@cjpais (Collaborator) commented Mar 14, 2025

@Djip007 does this need any additional changes? I see you were the one who contributed it to llama.cpp originally.

If we are tracking the upstream changes, I would like to have all of the same ones if possible. Were there any notable regressions as a result of these changes? I am happy to pull once you can confirm; I don't have a UMA device to test with.

@Djip007 (Contributor, Author) commented Apr 3, 2025

Sorry, I did not see your question.

In llama.cpp this is activated when GGML_HIP_UMA is defined at build time... and when it is defined, it is always used. That is not good for a dGPU, only for an iGPU.

This patch changes the condition for using UMA memory: it is used only when there is not enough "VRAM" ...

Now, as I said, with Linux kernel > 6.10 AMD changed the HIP memory allocation so that it can use all of the GTT as VRAM on an APU, so it is no longer limited by the BIOS VRAM reservation (only by the GTT size on Linux). This patch is therefore much less useful with recent kernels.
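
As a quick sanity check of how much memory HIP exposes on a given kernel, a tiny standalone program like this can help (a sketch assuming a HIP toolchain; it is not part of the patch):

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    size_t free_b = 0, total_b = 0;
    if (hipMemGetInfo(&free_b, &total_b) != hipSuccess) {
        std::fprintf(stderr, "hipMemGetInfo failed\n");
        return 1;
    }
    // On an APU with kernel 6.10+, "total" should track amdgpu.gttsize
    // rather than the BIOS VRAM reservation.
    std::printf("free: %zu MiB, total: %zu MiB\n",
                free_b >> 20, total_b >> 20);
    return 0;
}
```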

When I did the patch for llama.cpp, I made it a build option because that was the simplest way to enable it there, but the patch I created for llamafile uses this modified runtime behavior instead.

So if you think it is important to make AMD APUs work with "old" kernels, you can merge it; if not, we can wait until someone asks for it 😎

@cjpais (Collaborator) commented Apr 11, 2025

I am interested in pulling this in and will think on it a little more before the next release. I would love to be able to test it, but I cannot reproduce the out-of-memory case outside of a dGPU scenario.
