Compute buffer and KV-cache aware layer distribution for multi-GPU inference #14484
Open
borebot wants to merge 1 commit into ggml-org:master from