Commit b83fadf
Fix DMA buffer allocation to respect contiguous flag
Remove debug code that forced all DMA allocations to be physically
contiguous. This was causing large memory allocations to fail with
NV_ERR_NO_MEMORY (0x51) because finding large contiguous physical
memory blocks is difficult, especially on systems with fragmented
memory.
The fix allows non-contiguous DMA buffers when the caller does not
require contiguous memory, enabling large model loading in llama.cpp
and other applications that need significant GPU memory.
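
The behavior change described above can be sketched as follows. This is a minimal illustration, not the driver's code: the changed line is not shown here, and every name in the sketch (`dma_request_t`, `require_contiguous`, `choose_layout`, `choose_layout_forced`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; they are not taken from the driver. */
typedef enum {
    DMA_LAYOUT_CONTIGUOUS, /* one physically contiguous block */
    DMA_LAYOUT_SCATTERED   /* independent pages, no adjacency required */
} dma_layout_t;

typedef struct {
    size_t size_bytes;
    bool   require_contiguous; /* set by the caller of the allocator */
} dma_request_t;

/* Behavior attributed to the debug code: the caller's flag is ignored
 * and every request is treated as contiguous. */
dma_layout_t choose_layout_forced(const dma_request_t *req)
{
    assert(req != NULL);
    return DMA_LAYOUT_CONTIGUOUS;
}

/* Behavior after the fix: the layout follows the caller's flag, so a
 * large request only needs contiguous memory when it asks for it. */
dma_layout_t choose_layout(const dma_request_t *req)
{
    assert(req != NULL);
    return req->require_contiguous ? DMA_LAYOUT_CONTIGUOUS
                                   : DMA_LAYOUT_SCATTERED;
}
```

Under this sketch, a large request that leaves `require_contiguous` unset is assigned the scattered layout, so it no longer depends on finding one large contiguous physical block. A caller that sets the flag still gets a contiguous layout.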
Tested with Llama 3.1 8B (4.5GB) model which previously failed to load.

1 parent 0093e7f, commit b83fadf
1 file changed: +1, -1
Line 421 replaced (lines 418-420 and 422-424 are unchanged context).