ggerganov
Member

@ggerganov ggerganov commented Aug 29, 2025

ref #15602 (comment)

  • Avoid Vcur = ggml_cont_3d(..) when the QKV weights are merged in a single tensor
  • Make llama_kv_cache::cpy_k and cpy_v more readable
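
To illustrate the first bullet, here is a minimal sketch of the general idea. It is not the code from this PR and does not use ggml. It assumes a fused layout in which each token's row stores Q, K and V back to back. `View3D`, `FusedQKV`, `v_view` and `v_cont` are hypothetical names made up for this example. The point is that V can be addressed in place through an offset and strides, so a contiguous copy (roughly what a "cont" step materializes) is only needed if a consumer cannot handle strided input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical strided 3D view (illustrative only, not a ggml type).
// Element (i0, i1, i2) is read from data[i0 + i1*stride1 + i2*stride2].
struct View3D {
    const float * data;
    std::size_t ne0, ne1, ne2;     // head_dim, n_head_kv, n_tokens
    std::size_t stride1, stride2;  // strides in elements, not bytes

    float at(std::size_t i0, std::size_t i1, std::size_t i2) const {
        assert(i0 < ne0 && i1 < ne1 && i2 < ne2);
        return data[i0 + i1*stride1 + i2*stride2];
    }
};

// Assumed fused layout: one row per token, [ Q | K | V ] packed back to back.
struct FusedQKV {
    std::vector<float> buf;
    std::size_t head_dim, n_head, n_head_kv, n_tokens;

    std::size_t row_size() const { return head_dim*(n_head + 2*n_head_kv); }
};

// V as a view: no data is moved, only an offset and strides are computed.
View3D v_view(const FusedQKV & t) {
    const std::size_t offset = t.head_dim*(t.n_head + t.n_head_kv); // skip Q and K
    return View3D{ t.buf.data() + offset,
                   t.head_dim, t.n_head_kv, t.n_tokens,
                   t.head_dim, t.row_size() };
}

// V as a contiguous copy: the extra buffer that the view makes unnecessary.
std::vector<float> v_cont(const FusedQKV & t) {
    const View3D v = v_view(t);
    std::vector<float> out;
    out.reserve(v.ne0*v.ne1*v.ne2);
    for (std::size_t i2 = 0; i2 < v.ne2; ++i2) {
        for (std::size_t i1 = 0; i1 < v.ne1; ++i1) {
            for (std::size_t i0 = 0; i0 < v.ne0; ++i0) {
                out.push_back(v.at(i0, i1, i2));
            }
        }
    }
    return out;
}
```

With head_dim = 2, n_head = 2 and n_head_kv = 1, each row holds 8 elements and the V view starts 6 elements into it. Both paths read the same values; only `v_cont` allocates and copies.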

@CISC CISC mentioned this pull request Aug 29, 2025
4 tasks
@ggerganov ggerganov marked this pull request as ready for review September 7, 2025 17:24
@ggerganov ggerganov force-pushed the gg/model-avoid-cont3d branch from f15d515 to 3dec397 Compare September 8, 2025 06:47
Collaborator

@CISC CISC left a comment

Tested with CodeQwen1.5, Phi2, jina-embeddings-v3 and PLaMo2.

@CISC
Collaborator

CISC commented Sep 8, 2025

@ggerganov
Member Author

@CISC
Collaborator

CISC commented Sep 8, 2025

Hmmm, https://github.com/ggml-org/ci/blob/results/llama.cpp/60/d6e7c6fd8bacac0892b8722f5d5c585139cb43/ggml-4-x86-cuda-v100/stdall#L1957

This is due to #15687

Ah, I get a segfault locally though at the first REPEAT test after ARGMAX.

@ggerganov
Member Author

On my end, all tests except GET_ROWS and the new IM2COL_3D are passing.

@CISC
Collaborator

CISC commented Sep 8, 2025

> On my end, all tests except GET_ROWS and the new IM2COL_3D are passing.

Nvm, must have been some other issue pre-rebase, I pulled latest changes and applied #15868 and everything is fine now.

Edit: Eh, almost. I got GGML_ASSERT(ggml_is_contiguous(src0)) on PAD, but that's surely not related. It's the pad_ext test with v == true. Fixed in #15869

@ggerganov ggerganov merged commit cf0e3ba into master Sep 8, 2025
52 of 55 checks passed
@ggerganov ggerganov deleted the gg/model-avoid-cont3d branch September 8, 2025 07:25