Conversation

@ggerganov
Member

fix #12481
fix #12490
ref #12481 (comment)

Disable repacking (i.e. extra buffer types) if a GPU device is going to be available.

@ggerganov ggerganov requested a review from slaren March 21, 2025 13:39
@ggerganov ggerganov merged commit af04481 into master Mar 21, 2025
56 of 58 checks passed
@ggerganov ggerganov deleted the gg/repack-skip-if-gpu branch March 21, 2025 14:14
Ivy233 pushed a commit to Ivy233/llama.cpp that referenced this pull request Mar 23, 2025
@Djip007
Contributor

Djip007 commented Mar 24, 2025

Why disable it rather than simply change the order:
// CPU: ACCEL -> CPU extra -> GPU host -> CPU
to
// CPU: ACCEL -> GPU host -> CPU extra -> CPU
?

@ggerganov
Member Author

I think keeping the non-offloaded layers without repacking would allow dynamically moving the ops to the GPU, which would generally end up more efficient than using the optimized repacked CPU implementations.

@Djip007
Contributor

Djip007 commented Mar 26, 2025

Yes, I understood that you prefer to be able to dynamically move the ops to the GPU.
But wouldn't changing the order of addition (priority) of the buffer types have the same effect in that case, without completely disabling them?

@ggerganov
Member Author

Hm, I guess you might be right. Do you want to give this a try, or should I go ahead and create a PR?

@Djip007
Contributor

Djip007 commented Mar 28, 2025

OK. I'll have a try!

=> #12632 🤞

Djip007 pushed a commit to Djip007/llama.cpp that referenced this pull request Mar 28, 2025
This allows using GPU host buffers over CPU repack when possible.
It has the same effect of resolving this issue (ggml-org#12498) without
completely disabling the CPU extra buffers.
slaren pushed a commit that referenced this pull request Mar 29, 2025
… CPU (#12632)

This allows using GPU host buffers over CPU repack when possible.
It has the same effect of resolving this issue (#12498) without
completely disabling the CPU extra buffers.

Co-authored-by: philou <philou@framework>
