vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) #1204
Conversation
Also tagging @jeffbolznv in case you are interested in taking a look.
```glsl
FLOAT_TYPE sum = 0.0;
for (uint knl_y = 0; knl_y < p.knl_h; ++knl_y) {
    uint src_y = dst_y * p.stride_y + knl_y * p.dilation_y - p.pad_y;
    if (src_y < 0 || src_y >= p.src_h) {
```
Since src_x and src_y are unsigned, the < 0 conditions can be removed.
Is it possible src_y underflows if pad_y is large enough?
Yes, if there is padding the expression can become negative and wrap around to a very large unsigned int, which will then be caught by the `>=` check (for typical values). So in the end it does what's intended, and the `src_y < 0` check can be omitted.
The alternative is to use signed int and keep the check (a bit cleaner, but more instructions). Keeping both the unsigned type and the check, as it is now, makes no sense; I'll fix that.
Removed the check and added a comment to indicate wrapping is intentional.
```glsl
for (uint knl_x = 0; knl_x < p.knl_w; ++knl_x) {
    uint src_x = dst_x * p.stride_x + knl_x * p.dilation_x - p.pad_x;
    if (src_x < 0 || src_x >= p.src_w) {
        continue;
```
I guess padding is always assumed to have a value of zero, never replicating the border?
Yes, there is no way to specify padding modes other than zero for convolutions so far.
```cpp
test_cases.emplace_back(new test_conv_2d_dw({17, 34, 9, 1}, {3, 3, 1, 9}, 1, 0, 1, true));
test_cases.emplace_back(new test_conv_2d_dw({32, 8, 64, 1}, {3, 3, 1, 64}, 2, 1, 1, false));
test_cases.emplace_back(new test_conv_2d_dw({32, 8, 64, 1}, {3, 3, 1, 64}, 2, 1, 1, true));
```
Please add perf tests that correspond to the examples you gave in the PR.
Done
This is very cool. I didn't realize this op had been added. The change LGTM, I just had one minor suggestion.
This implements support for `GGML_OP_CONV_2D_DW` in the Vulkan backend. Motivation is the same as for CPU (#1152): while depthwise convolution can be implemented via im2col -> mul_mat, this is quite wasteful, and a direct kernel performs much better.
Timings (W=512, H=512, C=256): `ggml_conv_2d_dw` (im2col) vs. `ggml_conv_2d_dw_direct` (default layout) vs. `ggml_conv_2d_dw_direct` (CWHN). Measured on RTX 4070. Larger batch sizes run into max allocation issues when using im2col.
Regarding the separate kernel for CWHN (channels most contiguous, a.k.a. NHWC): it's actually slightly slower than the default memory layout here. I kept it because regular (and transposed) Conv2D generally prefers it, and it can avoid extra permute/copy steps.