Compute buffer and KV-cache aware layer distribution for multi-GPU inference #14484

Open
borebot wants to merge 1 commit into ggml-org:master from borebot:kv-compute-buffer-cache-aware-allocation

Commits