Conversation

@xuchen-intel
Contributor

@xuchen-intel xuchen-intel commented Jul 25, 2025

Details:

Tickets:

@xuchen-intel xuchen-intel requested review from a team as code owners July 25, 2025 08:38
@xuchen-intel xuchen-intel requested review from itikhono and removed request for a team July 25, 2025 08:38
@github-actions github-actions bot added the category: transformations OpenVINO Runtime library - Transformations label Jul 25, 2025
@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch from c123076 to adf6623 Compare August 1, 2025 10:45
@xuchen-intel xuchen-intel requested review from a team as code owners August 1, 2025 10:45
@github-actions github-actions bot added the category: IE Tests OpenVINO Test: plugins and common label Aug 1, 2025
@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch 8 times, most recently from a328b79 to d785baa Compare August 7, 2025 03:09
@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch from d785baa to 734a8f0 Compare August 11, 2025 07:50
@github-actions github-actions bot removed the category: transformations OpenVINO Runtime library - Transformations label Aug 11, 2025
@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch 5 times, most recently from c256504 to 00fe735 Compare August 22, 2025 05:12
@xuchen-intel xuchen-intel changed the title [Draft] [CPU] FullyConnected acceleration with u2 weights decompression [CPU] FullyConnected acceleration with u2 weights decompression Aug 22, 2025
@xuchen-intel
Contributor Author

@maxnick Hi Maksim, could you please take a look?

@maxnick maxnick self-assigned this Aug 26, 2025
@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch from 37c631e to f64db7c Compare August 27, 2025 08:50
@maxnick maxnick added this to the 2025.4 milestone Aug 27, 2025
@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch 2 times, most recently from 493ae47 to 6b8f102 Compare August 29, 2025 05:41
@xuchen-intel xuchen-intel requested a review from maxnick August 29, 2025 05:58
@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch 2 times, most recently from 851da03 to fb9389a Compare September 5, 2025 06:12
@yuxu42 yuxu42 requested review from Copilot and removed request for itikhono September 8, 2025 07:20
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces FullyConnected acceleration with u2 (2-bit unsigned) weights decompression, adding support for a new precision type to improve performance in weight-compressed neural networks.

  • Added u2 element type support across the CPU plugin infrastructure
  • Extended FullyConnected operations to handle u2 weights with decompression
  • Added comprehensive test coverage for u2 precision conversion and matrix multiplication
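
For readers unfamiliar with the term, the sketch below illustrates what "u2 weights decompression" usually means: each byte stores four 2-bit unsigned values, and each value is turned back into a float by subtracting a zero point and multiplying by a scale. This is a scalar reference under stated assumptions, not the kernel from this PR: the function name `decompress_u2`, the single per-tensor scale and zero point, and the low-bits-first packing order are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference: w[i] = (q[i] - zero_point) * scale.
// Assumption: element i lives in bits [2*(i%4), 2*(i%4)+1] of byte i/4.
// The actual packing order used by the plugin may differ.
std::vector<float> decompress_u2(const std::vector<std::uint8_t>& packed,
                                 std::size_t count,
                                 float scale,
                                 std::uint8_t zero_point) {
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint8_t byte = packed[i / 4];           // four elements per byte
        const unsigned shift = 2U * static_cast<unsigned>(i % 4);
        const std::uint8_t q = static_cast<std::uint8_t>((byte >> shift) & 0x3);
        out[i] = (static_cast<float>(q) - static_cast<float>(zero_point)) * scale;
    }
    return out;
}
```

With `packed = {0xE4}` (bit pattern `11 10 01 00`), `scale = 0.5f` and `zero_point = 1`, the four decoded values under this packing assumption are -0.5, 0.0, 0.5 and 1.0.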

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/plugins/intel_cpu/thirdparty/onednn | Updated oneDNN submodule to support u2 operations |
| src/plugins/intel_cpu/src/plugin.cpp | Added u2 to supported precision types |
| src/plugins/intel_cpu/src/nodes/fullyconnected.cpp | Extended FullyConnected to support u2 compressed weights |
| src/plugins/intel_cpu/src/nodes/executors/type_mask.hpp | Added u2 type mask definition |
| src/plugins/intel_cpu/src/nodes/executors/fullyconnected_implementations.cpp | Updated type mappings to include u2 support |
| src/plugins/intel_cpu/src/nodes/executors/dnnl/dnnl_fullyconnected_primitive.cpp | Enhanced DNNL primitive to handle u2 weights decompression |
| src/plugins/intel_cpu/src/utils/plain_tensor.hpp | Added u2 pointer handling with 4x sub-byte multiplier |
| src/plugins/intel_cpu/src/nodes/common/cpu_convert.cpp | Implemented u2 to other types conversion functions |
| src/plugins/intel_cpu/src/dnnl_extension_utils.cpp | Added u2 data type mapping utilities |
| src/tests/functional/plugin/shared/src/subgraph/weights_decompression_builders.cpp | Updated test builders to handle u2 precision ranges |
| src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/x64/matmul_weights_decompression.cpp | Added u2-specific test cases for matrix multiplication |
| src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/classes/matmul_weights_decompression.cpp | Enhanced test case naming to include fusion flag |
| src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/common/conversion.cpp | Added u2 conversion test instances |
| src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/classes/conversion.cpp | Extended ARM64 precision checks to include u2 |

Comment on lines 405 to 411
if (any_of(m_dt, ov::element::i4, ov::element::u4)) {
    return 2;
}
if (m_dt == ov::element::u2) {
    return 4;
}
return 1;

Copilot AI Sep 8, 2025

The sub-byte multiplier logic should be consolidated into a single conditional statement to avoid multiple if statements checking related precision types. Consider combining these checks or using a switch statement for better maintainability.

Suggested change
- if (any_of(m_dt, ov::element::i4, ov::element::u4)) {
-     return 2;
- }
- if (m_dt == ov::element::u2) {
-     return 4;
- }
- return 1;
+ switch (m_dt) {
+ case ov::element::i4:
+ case ov::element::u4:
+     return 2;
+ case ov::element::u2:
+     return 4;
+ default:
+     return 1;
+ }
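
As a standalone illustration of what the multiplier is for, the sketch below maps an element type to the number of elements stored per byte and uses that to size a packed buffer. It is only a sketch: `ElemType`, `sub_byte_multiplier` and `packed_bytes` are hypothetical names, and the enum is a stand-in for the real `ov::element` types, limited to types of at most 8 bits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the element types discussed in this thread.
enum class ElemType : std::uint8_t { u8, i4, u4, u2 };

// Number of elements packed into a single byte of storage.
constexpr std::size_t sub_byte_multiplier(ElemType t) {
    switch (t) {
    case ElemType::i4:
    case ElemType::u4:
        return 2;  // two 4-bit values per byte
    case ElemType::u2:
        return 4;  // four 2-bit values per byte
    default:
        return 1;  // one 8-bit value per byte
    }
}

// Bytes needed to store `count` elements, rounded up to a whole byte.
constexpr std::size_t packed_bytes(ElemType t, std::size_t count) {
    return (count + sub_byte_multiplier(t) - 1) / sub_byte_multiplier(t);
}
```

For example, ten u2 elements need `packed_bytes(ElemType::u2, 10) == 3` bytes, with the last byte only partially used.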

Copilot uses AI. Check for mistakes.
Comment on lines +183 to +187
if (zpPtr && none_of(zpPtr->getDesc().getPrecision(),
                     ov::element::u8,
                     ov::element::u4,
                     ov::element::u2,
                     ov::element::dynamic)) {

Copilot AI Sep 8, 2025

The precision check list is becoming long and scattered across multiple lines. Consider extracting this into a helper function or constant to improve readability and maintainability.
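
One possible shape for such a helper is sketched below. It is not code from the PR: `Precision` is a hypothetical stand-in for `ov::element::Type`, and `is_supported_zp_precision` is an invented name. The list of accepted values mirrors the four precisions in the quoted check.

```cpp
#include <cstdint>

// Hypothetical stand-in for the precision type; only a few values are modelled.
enum class Precision : std::uint8_t { u8, u4, u2, i8, f32, dynamic };

// True when a zero-point tensor of this precision passes the check quoted above
// (u8, u4, u2, or dynamic, i.e. not yet known).
constexpr bool is_supported_zp_precision(Precision p) {
    switch (p) {
    case Precision::u8:
    case Precision::u4:
    case Precision::u2:
    case Precision::dynamic:
        return true;
    default:
        return false;
    }
}
```

The call site would then read `if (zpPtr && !is_supported_zp_precision(...))`, keeping the list of accepted precisions in one place.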

Comment on lines +862 to +864
[[maybe_unused]] static uint8_t get_u2(uint8_t val, uint8_t shift) {
    return static_cast<uint8_t>((val & (0x3 << shift)) >> shift);
}

Copilot AI Sep 8, 2025

The magic number 0x3 should be replaced with a named constant, or documented with a comment explaining that it is the 2-bit mask used to extract u2 values.
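
A minimal sketch of the named-constant variant follows. The names `kU2Mask`, `kU2Bits` and `unpack_u2` are hypothetical, and the unpacking order (element `i` at bit offset `2*i`) is an assumption made for illustration; `get_u2` itself computes the same value as the quoted function, shifting first and masking second.

```cpp
#include <array>
#include <cstdint>

constexpr std::uint8_t kU2Mask = 0x3;  // two low bits set: selects one u2 value
constexpr unsigned kU2Bits = 2;        // width of one u2 element in bits

// Extracts the 2-bit value that starts at bit position `shift` of `val`.
inline std::uint8_t get_u2(std::uint8_t val, std::uint8_t shift) {
    return static_cast<std::uint8_t>((val >> shift) & kU2Mask);
}

// Unpacks all four u2 values of one byte.
// Assumption: element i is stored at bit offset i * kU2Bits.
inline std::array<std::uint8_t, 4> unpack_u2(std::uint8_t packed) {
    std::array<std::uint8_t, 4> out{};
    for (unsigned i = 0; i < 4; ++i) {
        out[i] = get_u2(packed, static_cast<std::uint8_t>(i * kU2Bits));
    }
    return out;
}
```

Under that ordering, the byte `0xE4` (bits `11 10 01 00`) unpacks to the values 0, 1, 2, 3.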

Contributor

@maxnick maxnick left a comment

In general LGTM.

@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch 4 times, most recently from c677bec to 1ec8a2d Compare September 28, 2025 01:54
Update initMatMulDecompressionSubgraph and modify weight decompression kernels

Apply clang format

Fix issues revealed by ci

Fix issue on avx2 platform

Revert common transformation related changes

Align subgraph tests with the new model pattern

Fix issue on loading u2 zero points

Support dynamic quantization for u2

Extend subgraph test cases and fix issues

Apply review comments

Update sub_byte_data_type_multiplier
@xuchen-intel xuchen-intel force-pushed the feature/u2_weights_decompression branch from 1ec8a2d to 3144983 Compare September 28, 2025 02:15
@maxnick maxnick added this pull request to the merge queue Sep 29, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 29, 2025
@maxnick maxnick added this pull request to the merge queue Sep 29, 2025
github-merge-queue bot pushed a commit that referenced this pull request Sep 29, 2025
### Details:
 - *FullyConnected acceleration with u2 weights decompression.*
 - *OneDNN PR: openvinotoolkit/oneDNN#289*

### Tickets:
 - *[CVS-169357](https://jira.devtools.intel.com/browse/CVS-169357)*
Merged via the queue into openvinotoolkit:master with commit abe0d47 Sep 29, 2025
268 of 289 checks passed

Labels

category: CPU (OpenVINO CPU plugin)
category: IE Tests (OpenVINO Test: plugins and common)

2 participants