commit 027d97e (1 parent: ea3cab5)
src/llama-context.cpp
@@ -221,7 +221,7 @@ llama_context::llama_context(
     bool pipeline_parallel =
         model.n_devices() > 1 &&
         model.params.n_gpu_layers > (int) model.hparams.n_layer &&
-        model.params.split_mode == LLAMA_SPLIT_MODE_LAYER &&
+        (model.params.split_mode == LLAMA_SPLIT_MODE_LAYER || model.params.split_mode == LLAMA_SPLIT_MODE_ROW) &&
         cparams.offload_kqv &&
         !model.has_tensor_overrides();