[CPU]Add Support For GatedDeltaNet by zhangYiIntel · Pull Request #34447 · openvinotoolkit/openvino

zhangYiIntel · 2026-03-03T02:30:26Z

Details:

Add Internal Op GatedDeltaNet
Add GatedDeltaNet Fusion
Add GatedDeltaNet CPU kernel

Tickets:

AI Assistance:

AI assistance used: no
If yes, summarize how AI was used and what human validation was performed (build/tests/manual checks).

Copilot

Pull request overview

This PR introduces CPU support for the GatedDeltaNet operation, a recurrent linear attention variant. It adds:

An internal OpenVINO GatedDeltaNet op (core) with a new ov::op::GatedDeltaNet class, shape inference, and Python bindings.
A pattern-matching fusion transformation (GatedDeltaNetFusion) that replaces a Loop-based subgraph with the fused op.
A CPU kernel (recurrent_linear_attn) with AVX512F/AVX2/scalar code paths, a CPU node wrapper, and functional tests.
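
For readers unfamiliar with the op, here is a minimal scalar sketch of one time step of the gated delta rule for a single (batch, head) slice. It is not the kernel from this PR. It follows the commonly published form of the recurrence (decay the state, read the memory for the current key, apply a delta-rule correction, then query the updated state). The function name `gated_delta_step`, its parameter names, and the log-space gate `g` are assumptions made for illustration. The row-major `[V x K]` state mirrors the "state layout of B, H, V, K" commit in this PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-head reference for one recurrent step (not the PR's kernel).
// state: row-major [V x K] matrix, updated in place.
// q, k:  K elements each; v: V elements.
// g:     log-space decay gate (assumed), beta: write strength, scale: query scale.
// Returns the V-element output for this step.
std::vector<float> gated_delta_step(std::vector<float>& state,
                                    const std::vector<float>& q,
                                    const std::vector<float>& k,
                                    const std::vector<float>& v,
                                    float g,
                                    float beta,
                                    float scale) {
    const std::size_t K = k.size();
    const std::size_t V = v.size();
    assert(q.size() == K && state.size() == V * K);

    const float decay = std::exp(g);
    std::vector<float> out(V, 0.0f);
    for (std::size_t i = 0; i < V; ++i) {
        float* row = state.data() + i * K;
        // Decay the old memory and read what it currently predicts for key k.
        float kv_mem = 0.0f;
        for (std::size_t j = 0; j < K; ++j) {
            row[j] *= decay;
            kv_mem += row[j] * k[j];
        }
        // Delta-rule correction: write only the part of v not yet stored.
        const float delta = (v[i] - kv_mem) * beta;
        // Rank-1 update of the state, then read it with the scaled query.
        float acc = 0.0f;
        for (std::size_t j = 0; j < K; ++j) {
            row[j] += k[j] * delta;
            acc += row[j] * (q[j] * scale);
        }
        out[i] = acc;
    }
    return out;
}
```

Calling this once per token, carrying `state` across calls, gives the sequential behaviour that the Loop-based subgraph expresses. A vectorized kernel would typically process the inner `K` loops with SIMD lanes.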

Changes:

New internal op GatedDeltaNet with validation, shape inference, config struct, and Python binding
New GatedDeltaNetFusion transformation that fuses the Loop-based GDN subgraph into the internal op
New CPU node and recurrent_linear_attn kernel (AVX512F + fallback), along with functional smoke tests

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 8 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `src/core/dev_api/openvino/op/gated_delta_net.hpp` | New internal op declaration with `Config` struct |
| `src/core/src/op/gated_delta_net.cpp` | Op implementation: validation, shape inference, `clone_with_new_inputs` |
| `src/common/transformations/include/transformations/common_optimizations/fuse_gated_delta_net.hpp` | Fusion pass declaration |
| `src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp` | Fusion pass implementation (Loop → GatedDeltaNet) |
| `src/plugins/intel_cpu/src/nodes/gated_delta_net.h` | CPU node header |
| `src/plugins/intel_cpu/src/nodes/gated_delta_net.cpp` | CPU node execute logic |
| `src/plugins/intel_cpu/src/nodes/kernels/linear_attn/recurrent_linear_attn.hpp` | Kernel function declaration |
| `src/plugins/intel_cpu/src/nodes/kernels/linear_attn/recurrent_linear_attn.cpp` | AVX512F/AVX2/scalar kernel implementation |
| `src/plugins/intel_cpu/src/transformations/transformation_pipeline.cpp` | Registers fusion pass in PreLpt pipeline |
| `src/plugins/intel_cpu/src/nodes_factory.cpp` | Registers GatedDeltaNet CPU node (x64 only) |
| `src/plugins/intel_cpu/src/graph_optimizer.cpp` | Excludes GatedDeltaNet from tail precision optimization |
| `src/plugins/intel_cpu/src/cpu_types.h` | Adds `GatedDeltaNet` to `Type` enum |
| `src/plugins/intel_cpu/src/cpu_types.cpp` | Adds type name mapping for GatedDeltaNet |
| `src/plugins/intel_cpu/src/extension.cpp` | Registers op extension for serialization |
| `src/plugins/intel_cpu/CMakeLists.txt` | Adds cross-compiled build entry for kernel |
| `src/bindings/python/src/pyopenvino/graph/ops/gated_delta_net.hpp` | Python binding header |
| `src/bindings/python/src/pyopenvino/graph/ops/gated_detla_net.cpp` | Python binding implementation (typo in filename) |
| `src/bindings/python/src/pyopenvino/pyopenvino.cpp` | Registers Python binding class |
| `src/bindings/python/src/openvino/op/__init__.pyi` | Exports `_GatedDeltaNet` to Python stubs |
| `src/bindings/python/src/openvino/_pyopenvino/op/__init__.pyi` | Adds Python stub for `_GatedDeltaNet` |
| `src/tests/functional/plugin/shared/include/shared_test_classes/subgraph/gated_delta_net.hpp` | Test class declaration |
| `src/tests/functional/plugin/shared/include/subgraph_tests/gated_delta_net.hpp` | Test body with CompareWithRefs |
| `src/tests/functional/plugin/shared/src/subgraph/gated_delta_net.cpp` | Test setup (reference Loop model + fused model) |
| `src/plugins/intel_cpu/tests/functional/shared_tests_instances/subgraph_tests/gated_delta_net.cpp` | CPU-specific test instantiation |

mryzhov · 2026-03-10T16:08:54Z

src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp

+            return scale_node.get_node_shared_ptr();
+        }
+    } else {
+        return ov::op::v0::Constant::create(default_scale_type, ov::Shape{}, {1.0});


When is this possible? The scale_node pattern is any_input().
Maybe it would be better to retrieve the scale node outside the helper function and check that it is not nullptr? For example, in that case we would not need to pass the default_scale_type matcher to the function.

mryzhov · 2026-03-10T16:36:13Z

src/common/transformations/tests/common_optimizations/fuse_gated_delta_net.cpp

+
+}  // namespace
+
+TEST_F(TransformationTestsF, GatedDeltaNetFusion_BuildLoopedGDNMode) {


I would suggest adding tests without Convert layers, with different Transposes on inputs and outputs, and with different input and output shapes (cases where the input rank or output rank can change).

The loop body is coupled with the transposes, and the possible combinations are unbounded. Unlike the SDPA case, which has a fixed order in the QK matmul whether or not q/k/v are transposed, GDN's loop body is very diverse. I think for this release it's better to focus on the current Qwen3Next pattern.

Copilot

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

Copilot · 2026-03-11T09:09:00Z

src/core/src/op/gated_delta_net.cpp

+                          "The head size in key and query should be the same, but got ",
+                          k_head_size,
+                          " and ",
+                          v_head_size,


The head-size mismatch validation compares k_head_size vs q_head_size, but the error message prints v_head_size as the second value (line 107). This will mislead users when debugging invalid models.

Please update the message to report q_head_size as the second operand.

Suggested change

-                          v_head_size,
+                          q_head_size,
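
To illustrate the point of the suggestion, here is a small standalone sketch of a check whose message reports the same two operands that the condition compares. `check_qk_head_size` is a hypothetical stand-in written with standard exceptions. It is not the validation macro used in `gated_delta_net.cpp`, and only the message text is taken from the snippet above.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the op's validation step: the message prints
// k_head_size and q_head_size, i.e. exactly the values being compared.
void check_qk_head_size(std::int64_t q_head_size, std::int64_t k_head_size) {
    if (k_head_size != q_head_size) {
        std::ostringstream msg;
        msg << "The head size in key and query should be the same, but got "
            << k_head_size << " and " << q_head_size;
        throw std::invalid_argument(msg.str());
    }
}
```

With this shape, a model whose key head size is 64 and query head size is 128 produces a message naming 64 and 128, rather than an unrelated value head size.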

zhangYiIntel added 4 commits March 2, 2026 13:06

add op linear

5382618

Add GatedDeltaNet cpu impl

5025b41

add shared test

418f33e

Add GDN Fusion

ad86b8d

zhangYiIntel added the do_not_merge label Mar 3, 2026

yuxu42 requested a review from Copilot March 4, 2026 03:28

Copilot started reviewing on behalf of yuxu42 March 4, 2026 03:28

Copilot AI reviewed Mar 4, 2026

[CPU]use state layout of B, H, V, K

b8641b9

zhangYiIntel force-pushed the yi3/support_gdn branch 2 times, most recently from bd6fe37 to b8641b9 on March 4, 2026 08:39

zhangYiIntel added 2 commits March 5, 2026 15:29

[CPU]Update transformation for gdn

a3d1eff

fix test and op spec

9a550e8

zhangYiIntel force-pushed the yi3/support_gdn branch 2 times, most recently from 60a0acf to 9a550e8 on March 6, 2026 01:50

zhangYiIntel marked this pull request as ready for review March 6, 2026 02:38

zhangYiIntel requested review from a team as code owners March 6, 2026 02:38

mlukasze requested a review from Copilot March 6, 2026 05:50

mryzhov reviewed Mar 10, 2026

src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp

mryzhov reviewed Mar 10, 2026

src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp (outdated)

mryzhov reviewed Mar 10, 2026

src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp (outdated)

CuriousPanCake reviewed Mar 10, 2026

...common/transformations/include/transformations/common_optimizations/fuse_gated_delta_net.hpp (outdated)

CuriousPanCake reviewed Mar 10, 2026

...common/transformations/include/transformations/common_optimizations/fuse_gated_delta_net.hpp (outdated)

mitruska reviewed Mar 10, 2026

src/core/src/op/gated_delta_net.cpp

src/core/src/op/gated_delta_net.cpp

src/core/src/op/gated_delta_net.cpp

apply review comments

a0e8bd0

zhangYiIntel force-pushed the yi3/support_gdn branch from c05ae61 to a0e8bd0 on March 11, 2026 06:34

mryzhov requested a review from Copilot March 11, 2026 09:02

Copilot started reviewing on behalf of mryzhov March 11, 2026 09:03

Copilot AI reviewed Mar 11, 2026

CuriousPanCake reviewed Mar 11, 2026

src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp (outdated)

CuriousPanCake reviewed Mar 11, 2026

src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp (outdated)

CuriousPanCake reviewed Mar 11, 2026

src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp (outdated)

CuriousPanCake reviewed Mar 11, 2026

src/common/transformations/src/transformations/common_optimizations/fuse_gated_delta_net.cpp (outdated)

maxnick mentioned this pull request Mar 11, 2026

[CPU][ARM] Constrain MHA single-token dot_product templates #34616

Open

remove debug

fe9157d

mryzhov reviewed Mar 11, 2026

...common/transformations/include/transformations/common_optimizations/fuse_gated_delta_net.hpp

zhangYiIntel force-pushed the yi3/support_gdn branch 2 times, most recently from fca11d0 to de40b3b on March 12, 2026 01:15

fix clang & apply review comments

2d55899

zhangYiIntel force-pushed the yi3/support_gdn branch from de40b3b to 2d55899 on March 12, 2026 02:13

Merge branch 'master' into yi3/support_gdn

8fa64d1

mitruska approved these changes Mar 12, 2026

src/core/src/op/gated_delta_net.cpp

zhangYiIntel force-pushed the yi3/support_gdn branch from 745088c to 50993ae on March 13, 2026 05:07

remove tranpose in gdn loop pattern

0f3d64d

zhangYiIntel force-pushed the yi3/support_gdn branch from 50993ae to 0f3d64d on March 13, 2026 07:26

fix typo in l2norm fusion

cae6427


9 participants