Commit 6ab397e

CISC and ggerganov authored
graph : support non-contiguous Q in build_attn_mha (ggml-org#15908)
* support non-contiguous Q in build_attn_mha

* Update src/llama-graph.cpp

ggml-ci

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>
1 parent 9de447d commit 6ab397e

1 file changed: +1 -1 lines changed

src/llama-graph.cpp

Lines changed: 1 addition & 1 deletion
@@ -1273,7 +1273,7 @@ ggml_tensor * llm_graph_context::build_attn_mha(
     // split the batch into streams if needed
     const auto n_stream = k->ne[3];
 
-    q = ggml_reshape_4d(ctx0, q, q->ne[0], q->ne[1], q->ne[2]/n_stream, n_stream);
+    q = ggml_view_4d(ctx0, q, q->ne[0], q->ne[1], q->ne[2]/n_stream, n_stream, q->nb[1], q->nb[2], q->nb[3]/n_stream, 0);
 
     q = ggml_permute(ctx0, q, 0, 2, 1, 3);
     k = ggml_permute(ctx0, k, 0, 2, 1, 3);
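
The stride arithmetic in the new line can be checked in isolation. ggml_reshape_4d expects a contiguous source, while ggml_view_4d takes explicit byte strides, so the split of dimension 2 into (ne[2]/n_stream, n_stream) has to be expressed through nb[3]/n_stream. The sketch below does not use ggml: `Tensor4`, `at`, `is_contiguous`, `split_streams` and `demo_split_matches` are hypothetical stand-ins, and it assumes q has ne[3] == 1 with nb[3] == nb[2]*ne[2], which is what makes nb[3]/n_stream the right stream stride.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a ggml tensor: ne = element counts, nb = byte strides.
struct Tensor4 {
    std::array<int64_t, 4> ne;
    std::array<size_t, 4>  nb;
    const float * data;
};

// Read element (i0, i1, i2, i3) by following the byte strides.
float at(const Tensor4 & t, int64_t i0, int64_t i1, int64_t i2, int64_t i3) {
    const char * p = reinterpret_cast<const char *>(t.data)
        + i0*t.nb[0] + i1*t.nb[1] + i2*t.nb[2] + i3*t.nb[3];
    return *reinterpret_cast<const float *>(p);
}

// True when the strides describe a densely packed float tensor.
bool is_contiguous(const Tensor4 & t) {
    return t.nb[0] == sizeof(float)
        && t.nb[1] == t.nb[0]*static_cast<size_t>(t.ne[0])
        && t.nb[2] == t.nb[1]*static_cast<size_t>(t.ne[1])
        && t.nb[3] == t.nb[2]*static_cast<size_t>(t.ne[2]);
}

// Same shape and stride arithmetic as the patched line:
// ne = (ne0, ne1, ne2/n_stream, n_stream), nb = (nb0, nb1, nb2, nb3/n_stream).
Tensor4 split_streams(const Tensor4 & q, int64_t n_stream) {
    return Tensor4{
        { q.ne[0], q.ne[1], q.ne[2]/n_stream, n_stream },
        { q.nb[0], q.nb[1], q.nb[2], q.nb[3]/static_cast<size_t>(n_stream) },
        q.data,
    };
}

// Build a padded (non-contiguous) q and check that the split view addresses
// the same elements as indexing the original with i2 = s*(ne2/n_stream) + j.
bool demo_split_matches() {
    std::vector<float> buf(48);
    for (size_t i = 0; i < buf.size(); ++i) buf[i] = static_cast<float>(i);

    // 2 values per row, but rows are 4 floats apart, so q is not contiguous.
    const size_t f = sizeof(float);
    Tensor4 q{ {2, 3, 4, 1}, {f, 4*f, 12*f, 48*f}, buf.data() };
    if (is_contiguous(q)) return false;

    const int64_t n_stream = 2;
    Tensor4 v = split_streams(q, n_stream);
    for (int64_t s = 0; s < v.ne[3]; ++s)
    for (int64_t j = 0; j < v.ne[2]; ++j)
    for (int64_t h = 0; h < v.ne[1]; ++h)
    for (int64_t d = 0; d < v.ne[0]; ++d) {
        if (at(v, d, h, j, s) != at(q, d, h, s*v.ne[2] + j, 0)) return false;
    }
    return true;
}
```

Calling `demo_split_matches()` walks every element of the split view and compares it with the corresponding element of the padded source. If q carried a real fourth dimension or padding between dimensions 2 and 3, the nb[3]/n_stream shortcut would no longer hold in this model.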