@rmatif rmatif commented Aug 8, 2025

This PR adds conv3d support for the CPU backend. This is required for tasks such as running inference on text-to-video models.

Since we are limited by the number of tensor dimensions, my idea was to unroll the depth dimension. The implementation flattens each 3D input patch into a column, then performs a single matrix multiplication against the flattened kernel weights.
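To illustrate the patch-flattening idea, here is a minimal pure-Python sketch. It is not the code from this PR. It assumes stride 1, no padding, no dilation, and no batch dimension, and the function names (`im2col_3d`, `conv3d_im2col`, `conv3d_direct`) are hypothetical. Like PyTorch's conv3d, it computes a cross-correlation (the kernel is not flipped).

```python
from itertools import product

def im2col_3d(x, kd, kh, kw):
    """x is [C][D][H][W] nested lists. Returns (cols, (od, oh, ow)).
    Each entry of cols is one flattened input patch of length C*kd*kh*kw."""
    C, D, H, W = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    od, oh, ow = D - kd + 1, H - kh + 1, W - kw + 1
    cols = []
    for z, y, xx in product(range(od), range(oh), range(ow)):
        cols.append([x[c][z + dz][y + dy][xx + dx]
                     for c in range(C)
                     for dz in range(kd)
                     for dy in range(kh)
                     for dx in range(kw)])
    return cols, (od, oh, ow)

def conv3d_im2col(x, w):
    """w is [OC][C][kd][kh][kw]. Returns [OC][od][oh][ow]."""
    kd, kh, kw = len(w[0][0]), len(w[0][0][0]), len(w[0][0][0][0])
    cols, (od, oh, ow) = im2col_3d(x, kd, kh, kw)
    # Flatten each kernel in the same (c, dz, dy, dx) order as the patches.
    wflat = [[v for ch in k for plane in ch for row in plane for v in row]
             for k in w]
    # One matrix multiplication: (OC x K) times (K x N), N = od*oh*ow.
    flat = [[sum(a * b for a, b in zip(wrow, col)) for col in cols]
            for wrow in wflat]
    # Reshape the flat N axis back into (od, oh, ow).
    return [[[[flat[o][(z * oh + y) * ow + xx] for xx in range(ow)]
              for y in range(oh)]
             for z in range(od)]
            for o in range(len(w))]

def conv3d_direct(x, w):
    """Naive nested-loop reference used to check the im2col version."""
    C, D, H, W = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    kd, kh, kw = len(w[0][0]), len(w[0][0][0]), len(w[0][0][0][0])
    od, oh, ow = D - kd + 1, H - kh + 1, W - kw + 1
    return [[[[sum(x[c][z + dz][y + dy][xx + dx] * w[o][c][dz][dy][dx]
                   for c in range(C) for dz in range(kd)
                   for dy in range(kh) for dx in range(kw))
               for xx in range(ow)]
              for y in range(oh)]
             for z in range(od)]
            for o in range(len(w))]

# Small integer example: 2 input channels, 3x3x3 volume, 2 kernels of 2x2x2.
x = [[[[c * 27 + d * 9 + h * 3 + w_ for w_ in range(3)]
       for h in range(3)] for d in range(3)] for c in range(2)]
w = [[[[[(o + 1) * (1 if (a + b + e) % 2 == 0 else -1) for e in range(2)]
        for b in range(2)] for a in range(2)] for _ in range(2)]
     for o in range(2)]
out = conv3d_im2col(x, w)
```

Comparing `conv3d_im2col(x, w)` with `conv3d_direct(x, w)` on small integer inputs is a quick sanity check before comparing against a framework implementation.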

I did not pursue optimizations, and I'm not sure they would be worth it (unless there are obvious ones), since this implementation will essentially serve as a ground truth to test against when implementing this op on other backends.

Correctness was checked against PyTorch's native conv3d.

@github-actions github-actions bot added the testing (Everything test related) and ggml (changes relating to the ggml tensor library for machine learning) labels Aug 8, 2025
@rmatif rmatif requested review from ggerganov and slaren August 16, 2025 13:55
@rmatif rmatif merged commit 92f7f0a into ggml-org:master Aug 22, 2025
48 checks passed
qnixsynapse pushed a commit to menloresearch/llama.cpp that referenced this pull request Aug 25, 2025
* add conv3d

* bump GGML_OP_COUNT
@rujialiu rujialiu mentioned this pull request Oct 6, 2025