[CPU] FullyConnected acceleration with u2 weights decompression #31467

xuchen-intel · 2025-07-25T08:38:23Z

Details:

FullyConnected acceleration with u2 weights decompression.
OneDNN PR: [FORK FEATURE] InnerProduct primitive: u2 weights decompression oneDNN#289

Tickets:

CVS-169357

xuchen-intel · 2025-08-26T03:06:34Z

@maxnick Hi Maksim, could you please take a look?

src/plugins/intel_cpu/src/nodes/common/cpu_convert.cpp

Copilot

Pull Request Overview

This PR introduces FullyConnected acceleration with u2 (2-bit unsigned) weights decompression, adding support for a new precision type to improve performance in weight-compressed neural networks.

Added u2 element type support across the CPU plugin infrastructure
Extended FullyConnected operations to handle u2 weights with decompression
Added comprehensive test coverage for u2 precision conversion and matrix multiplication
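
For readers new to weight decompression, the idea can be sketched as a scalar reference loop. This is not the oneDNN kernel from the PR. The function name `decompress_u2`, the bit order (four values per byte, lowest-order bits first), and the single per-buffer scale and zero point are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper, not the plugin's or oneDNN's implementation.
// Assumptions: four u2 values per byte, lowest-order bits first, and one
// scale / zero point shared by the whole buffer.
inline std::vector<float> decompress_u2(const std::vector<uint8_t>& packed,
                                        std::size_t count,
                                        float scale,
                                        uint8_t zero_point) {
    std::vector<float> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const uint8_t byte = packed[i / 4];                       // byte holding element i
        const unsigned shift = static_cast<unsigned>(i % 4) * 2;  // bit offset inside that byte
        const uint8_t q = static_cast<uint8_t>((byte >> shift) & 0x3u);
        // Dequantize: subtract the zero point, then apply the scale.
        out.push_back((static_cast<float>(q) - static_cast<float>(zero_point)) * scale);
    }
    return out;
}
```

Under these assumptions, the byte `0xE4` holds the values 0, 1, 2, 3, and with `scale = 0.5f` and `zero_point = 1` they decompress to -0.5, 0, 0.5, 1.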

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.


File	Description
src/plugins/intel_cpu/thirdparty/onednn	Updated oneDNN submodule to support u2 operations
src/plugins/intel_cpu/src/plugin.cpp	Added u2 to supported precision types
src/plugins/intel_cpu/src/nodes/fullyconnected.cpp	Extended FullyConnected to support u2 compressed weights
src/plugins/intel_cpu/src/nodes/executors/type_mask.hpp	Added u2 type mask definition
src/plugins/intel_cpu/src/nodes/executors/fullyconnected_implementations.cpp	Updated type mappings to include u2 support
src/plugins/intel_cpu/src/nodes/executors/dnnl/dnnl_fullyconnected_primitive.cpp	Enhanced DNNL primitive to handle u2 weights decompression
src/plugins/intel_cpu/src/utils/plain_tensor.hpp	Added u2 pointer handling with 4x sub-byte multiplier
src/plugins/intel_cpu/src/nodes/common/cpu_convert.cpp	Implemented u2 to other types conversion functions
src/plugins/intel_cpu/src/dnnl_extension_utils.cpp	Added u2 data type mapping utilities
src/tests/functional/plugin/shared/src/subgraph/weights_decompression_builders.cpp	Updated test builders to handle u2 precision ranges
src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/x64/matmul_weights_decompression.cpp	Added u2-specific test cases for matrix multiplication
src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/classes/matmul_weights_decompression.cpp	Enhanced test case naming to include fusion flag
src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/common/conversion.cpp	Added u2 conversion test instances
src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/classes/conversion.cpp	Extended ARM64 precision checks to include u2


Copilot · 2025-09-08T07:21:57Z

src/plugins/intel_cpu/src/utils/plain_tensor.hpp

        if (any_of(m_dt, ov::element::i4, ov::element::u4)) {
            return 2;
        }
+        if (m_dt == ov::element::u2) {
+            return 4;
+        }
        return 1;


The sub-byte multiplier logic should be consolidated into a single conditional statement to avoid multiple if statements checking related precision types. Consider combining these checks or using a switch statement for better maintainability.

Suggested change

-        if (any_of(m_dt, ov::element::i4, ov::element::u4)) {
-            return 2;
-        }
-        if (m_dt == ov::element::u2) {
-            return 4;
-        }
-        return 1;
+        switch (m_dt) {
+        case ov::element::i4:
+        case ov::element::u4:
+            return 2;
+        case ov::element::u2:
+            return 4;
+        default:
+            return 1;
+        }
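
One way to read the multiplier is as the number of elements packed into a single byte. The sketch below uses a hypothetical `ElemType` enum as a stand-in for `ov::element::Type`, and a hypothetical `packed_bytes` helper that is not part of the PR. It shows how such a multiplier could convert an element count into a byte count.

```cpp
#include <cstddef>

// Hypothetical stand-in for ov::element::Type; only the types relevant here.
enum class ElemType { u8, i4, u4, u2 };

// Elements stored per byte, mirroring the switch suggested above.
constexpr std::size_t sub_byte_multiplier(ElemType t) {
    switch (t) {
    case ElemType::i4:
    case ElemType::u4:
        return 2;
    case ElemType::u2:
        return 4;
    default:
        return 1;
    }
}

// Hypothetical helper: bytes needed to store `count` packed elements,
// rounding up when the count is not a multiple of the multiplier.
constexpr std::size_t packed_bytes(ElemType t, std::size_t count) {
    return (count + sub_byte_multiplier(t) - 1) / sub_byte_multiplier(t);
}
```

With these definitions, ten u2 elements occupy three bytes, ten u4 elements occupy five, and ten u8 elements occupy ten.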

Copilot · 2025-09-08T07:21:57Z

src/plugins/intel_cpu/src/nodes/executors/dnnl/dnnl_fullyconnected_primitive.cpp

+    if (zpPtr && none_of(zpPtr->getDesc().getPrecision(),
+                         ov::element::u8,
+                         ov::element::u4,
+                         ov::element::u2,
+                         ov::element::dynamic)) {


The precision check list is becoming long and scattered across multiple lines. Consider extracting this into a helper function or constant to improve readability and maintainability.
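
A helper of the kind suggested here might look like the sketch below. The `Precision` enum is a hypothetical stand-in for `ov::element::Type`, and the name `is_supported_zp_precision` is illustrative only; neither comes from the PR.

```cpp
// Hypothetical stand-in for ov::element::Type, reduced to a few values.
enum class Precision { u8, u4, u2, i8, f32, dynamic };

// Illustrative helper: true for the zero-point precisions listed in the
// check above (u8, u4, u2, dynamic), false for everything else.
constexpr bool is_supported_zp_precision(Precision p) {
    switch (p) {
    case Precision::u8:
    case Precision::u4:
    case Precision::u2:
    case Precision::dynamic:
        return true;
    default:
        return false;
    }
}
```

The call site would then reduce to a single condition of the form `zpPtr && !is_supported_zp_precision(...)`, with the list of accepted precisions kept in one place.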

Copilot · 2025-09-08T07:21:57Z

src/plugins/intel_cpu/src/nodes/common/cpu_convert.cpp

+[[maybe_unused]] static uint8_t get_u2(uint8_t val, uint8_t shift) {
+    return static_cast<uint8_t>((val & (0x3 << shift)) >> shift);
+}


The magic number 0x3 should be replaced with a named constant or documented comment explaining it represents the 2-bit mask for extracting u2 values.
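
As a minimal sketch of this suggestion, the mask can be given a name. The constant name `kU2Mask` is an assumption and does not appear in the PR; the function body is otherwise the same as in the diff above.

```cpp
#include <cstdint>

// Hypothetical named constant: the two low bits set (0b11), i.e. the mask
// that selects a single 2-bit (u2) value.
constexpr uint8_t kU2Mask = 0x3;

// Same logic as get_u2 in the diff: isolate the two bits starting at
// `shift`, then move them down to the low end of the byte.
[[maybe_unused]] static uint8_t get_u2(uint8_t val, uint8_t shift) {
    return static_cast<uint8_t>((val & (kU2Mask << shift)) >> shift);
}
```

For the byte `0xE4` (binary `11100100`), shifts of 0, 2, 4 and 6 yield 0, 1, 2 and 3 respectively.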

maxnick

In general LGTM.

Update initMatMulDecompressionSubgraph and modify weight decompression kernels
Apply clang format
Fix issues revealed by ci
Fix issue on avx2 platform
Revert common transformation related changes
Align subgraph tests with the new model pattern
Fix issue on loading u2 zero points
Support dynamic quantization for u2
Extend subgraph test cases and fix issues
Apply review comments
Update sub_byte_data_type_multiplier

### Details:
 - *FullyConnected acceleration with u2 weights decompression.*
 - *OneDNN PR: openvinotoolkit/oneDNN#289*

### Tickets:
 - *[CVS-169357](https://jira.devtools.intel.com/browse/CVS-169357)*

xuchen-intel added category: CPU OpenVINO CPU plugin do_not_merge labels Jul 25, 2025

xuchen-intel requested review from a team as code owners July 25, 2025 08:38

xuchen-intel requested review from itikhono and removed request for a team July 25, 2025 08:38

github-actions bot added the category: transformations OpenVINO Runtime library - Transformations label Jul 25, 2025

xuchen-intel force-pushed the feature/u2_weights_decompression branch from c123076 to adf6623 Compare August 1, 2025 10:45

xuchen-intel requested review from a team as code owners August 1, 2025 10:45

github-actions bot added the category: IE Tests OpenVINO Test: plugins and common label Aug 1, 2025

xuchen-intel force-pushed the feature/u2_weights_decompression branch 8 times, most recently from a328b79 to d785baa Compare August 7, 2025 03:09

xuchen-intel force-pushed the feature/u2_weights_decompression branch from d785baa to 734a8f0 Compare August 11, 2025 07:50

github-actions bot removed the category: transformations OpenVINO Runtime library - Transformations label Aug 11, 2025

xuchen-intel force-pushed the feature/u2_weights_decompression branch 5 times, most recently from c256504 to 00fe735 Compare August 22, 2025 05:12

xuchen-intel changed the title ~~[Draft] [CPU] FullyConnected acceleration with u2 weights decompression~~ [CPU] FullyConnected acceleration with u2 weights decompression Aug 22, 2025

maxnick self-assigned this Aug 26, 2025

xuchen-intel force-pushed the feature/u2_weights_decompression branch from 37c631e to f64db7c Compare August 27, 2025 08:50

maxnick reviewed Aug 27, 2025


maxnick added this to the 2025.4 milestone Aug 27, 2025

xuchen-intel force-pushed the feature/u2_weights_decompression branch 2 times, most recently from 493ae47 to 6b8f102 Compare August 29, 2025 05:41

xuchen-intel requested a review from maxnick August 29, 2025 05:58

xuchen-intel removed the do_not_merge label Aug 29, 2025

xuchen-intel force-pushed the feature/u2_weights_decompression branch 2 times, most recently from 851da03 to fb9389a Compare September 5, 2025 06:12

yuxu42 requested review from Copilot and removed request for itikhono September 8, 2025 07:20

Copilot AI reviewed Sep 8, 2025


maxnick approved these changes Sep 8, 2025


xuchen-intel force-pushed the feature/u2_weights_decompression branch 4 times, most recently from c677bec to 1ec8a2d Compare September 28, 2025 01:54

xuchen-intel added 2 commits September 28, 2025 04:14

Add test cases about repack to bf16 precision

3144983

xuchen-intel force-pushed the feature/u2_weights_decompression branch from 1ec8a2d to 3144983 Compare September 28, 2025 02:15

mvafin mentioned this pull request Sep 29, 2025

[PT FE] Fix repacking bitnet weights in frontend #32244

Merged

maxnick added this pull request to the merge queue Sep 29, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 29, 2025

maxnick added this pull request to the merge queue Sep 29, 2025

Merged via the queue into openvinotoolkit:master with commit abe0d47 Sep 29, 2025
268 of 289 checks passed

ekurniaw mentioned this pull request Oct 30, 2025

Feature Request: Add BitNet Notebook openvinotoolkit/openvino_notebooks#3102

Closed

ljaljushkin mentioned this pull request Nov 14, 2025

[Feature Request]: Do we have a method for 2-bit quantization? #32716

Open

1 task

ruskaruma mentioned this pull request Dec 14, 2025

[GPU] Add u2 weight quantization backend support #33243

Open
