Commit 9a43994

ggml: Disable unused pipeline parallelism
We're not currently using it, even in cases where we could. Disabling it improves generation performance by 10-30% with multiple GPUs.
1 parent f8a6e88 commit 9a43994

1 file changed: +1 -1 lines changed

ml/backend/ggml/ggml.go

Lines changed: 1 addition & 1 deletion
@@ -418,7 +418,7 @@ func New(modelPath string, params ml.BackendParams) (ml.Backend, error) {
     (*C.ggml_backend_buffer_type_t)(unsafe.Pointer(&schedBufts[0])),
     C.int(len(schedBackends)),
     C.size_t(maxGraphNodes),
-    C._Bool(len(gpus) > 1 && slices.Contains(gpus, output.d)),
+    C._Bool(false),
     C._Bool(false),
 ),
 schedBackends: schedBackends,
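
To make the behavioral change in the hunk above easier to see, here is a minimal standalone sketch of the removed expression next to its replacement. It is not code from the repository: `oldParallelFlag`, `newParallelFlag`, and the use of plain strings for devices are hypothetical stand-ins for the real `gpus` slice and `output.d` handle in `ggml.go`.

```go
package main

import (
	"fmt"
	"slices"
)

// oldParallelFlag mirrors the expression removed by this commit:
// the flag was set only when more than one GPU was in use and the
// output layer's device was one of those GPUs.
// Devices are modeled as strings here purely for illustration.
func oldParallelFlag(gpus []string, outputDev string) bool {
	return len(gpus) > 1 && slices.Contains(gpus, outputDev)
}

// newParallelFlag mirrors the replacement: the flag is always false,
// regardless of how many GPUs are present.
func newParallelFlag(gpus []string, outputDev string) bool {
	return false
}

func main() {
	cases := []struct {
		gpus      []string
		outputDev string
	}{
		{[]string{"gpu0"}, "gpu0"},         // single GPU
		{[]string{"gpu0", "gpu1"}, "gpu1"}, // multi-GPU, output on a GPU
		{[]string{"gpu0", "gpu1"}, "cpu"},  // multi-GPU, output on CPU
	}
	for _, c := range cases {
		fmt.Printf("gpus=%v output=%s old=%v new=%v\n",
			c.gpus, c.outputDev,
			oldParallelFlag(c.gpus, c.outputDev),
			newParallelFlag(c.gpus, c.outputDev))
	}
}
```

Under these assumptions, the only configuration whose flag value changes is the multi-GPU case with the output layer on one of the GPUs, which is the setup the commit message's 10-30% figure refers to.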