Replies: 1 comment 1 reply
-
Partial offloading is tracked in #1562. This being a feature request, it doesn't belong in "Discussions" anyway. I am close to disabling this tab altogether because it seems like nobody understands what it's for.
-
Most CPUs these days come with an iGPU. The question is: can it load your model partially or entirely onto the iGPU, regardless of whether the GPU has enough VRAM? For an iGPU, system RAM is the VRAM, so you could get GPU acceleration on a laptop. Just provide an option to load a selected number of layers onto the GPU, like llama.cpp does on the command line.
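For reference, llama.cpp exposes this as `--n-gpu-layers` (`-ngl`). The underlying idea the comment describes — pick how many layers fit in the available memory budget, where an iGPU's budget is just a slice of system RAM — can be sketched roughly like this (the function name and all sizes are hypothetical, not from any actual implementation):

```python
def layers_to_offload(total_layers: int, per_layer_bytes: int, budget_bytes: int) -> int:
    """Return how many model layers fit in the given memory budget.

    Hypothetical helper: for an iGPU the 'VRAM' budget is carved out
    of system RAM, so the same arithmetic applies either way.
    """
    if per_layer_bytes <= 0:
        raise ValueError("per_layer_bytes must be positive")
    return min(total_layers, budget_bytes // per_layer_bytes)

# e.g. a 32-layer model, ~400 MiB per layer, 8 GiB of shareable memory
print(layers_to_offload(32, 400 * 2**20, 8 * 2**30))  # → 20
```

The remaining layers would stay on the CPU, which is exactly the partial-offload behavior being requested.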