[hipBLASlt]: Add integer_exact initialization option for matrix data by tony-davis · Pull Request #5535 · ROCm/rocm-libraries

tony-davis · 2026-03-17T20:54:36Z

Summary of changes

Introduced a new initialization option, integer_exact, which fills matrices with small integers so that GEMM results can be compared exactly.
Updated hipblaslt_initialization enum to include integer_exact with a corresponding value.
Enhanced the unit_print_first_mismatch function to support diagnostics for integer_exact initialization.
Modified the hipblaslt_init_device function to handle the new initialization type, ensuring proper data generation.
Added multiple test cases in YAML files to validate the behavior of the integer_exact initialization across various matrix sizes and configurations.

No functional or build impact outside of the new feature implementation.

Motivation

Implements integer-exact GEMM testing for hipBLASlt: input data are chosen so dot products stay in a small integer range, enabling GPU vs CPU comparison with exact match (zero tolerance) instead of a numeric tolerance. That improves defect detection and avoids tolerance masking issues, especially with 16-bit types. The pattern used is A and C in {0, 1, 2}, B in ±{0, 1, 2} with a checkerboard sign pattern, so results remain exactly representable under defined K limits.
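As a rough illustration of the pattern described above, the following host-side sketch fills B with magnitudes in {0, 1, 2} and a checkerboard sign. It is not the PR's device code: `mix`, `small_int_magnitude`, `checkerboard_b`, and `fill_b` are hypothetical names, the hash is a stand-in for whatever PRNG the client uses, and the choice of which parity is negated is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in integer hash (assumption): any deterministic mixer would do here.
inline std::uint32_t mix(std::uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Magnitude in {0, 1, 2}, the value set described for A and C.
inline int small_int_magnitude(std::size_t i, std::size_t j, std::size_t ld)
{
    return static_cast<int>(mix(static_cast<std::uint32_t>(i + j * ld)) % 3U);
}

// B entry: same magnitudes, negated where (i ^ j) & 1 is set.
// Which parity is negated is an assumption of this sketch.
inline int checkerboard_b(std::size_t i, std::size_t j, std::size_t ld)
{
    const int mag = small_int_magnitude(i, j, ld);
    return ((i ^ j) & 1U) ? -mag : mag;
}

// Column-major fill of a rows x cols matrix with leading dimension = rows.
inline std::vector<float> fill_b(std::size_t rows, std::size_t cols)
{
    assert(rows > 0 && cols > 0);
    std::vector<float> b(rows * cols);
    for(std::size_t j = 0; j < cols; ++j)
        for(std::size_t i = 0; i < rows; ++i)
            b[i + j * rows] = static_cast<float>(checkerboard_b(i, j, rows));
    return b;
}
```

Because every entry is an integer of magnitude at most 2, each product in a dot product is an integer of magnitude at most 4, which is what keeps the sums small.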

Technical Details

Initialization: New hipblaslt_initialization::integer_exact. In hipblaslt_init_device, A and C use small_int_positive (values 0, 1, 2); B uses the same values with the sign chosen by (i^j)&1, so signs alternate along every row and column (checkerboard).
Verification: When initialization == integer_exact, testing_matmul sets the tolerance to 0 (exact check) and calls unit_print_first_mismatch for D before the check to improve failure diagnostics.
16-bit K limit: For integer_exact with 16-bit types (fp16/bf16), tests are skipped when any K > 512 so that dot products stay exactly representable; 32-bit and 64-bit types use the full size ranges.
B matrix sync: The device→host sync for B uses the storage dimensions (K, N, ldb) instead of the logical ones (B_row, B_col, ldb), so the transB and swizzle paths do not hit an invalid leading dimension or copy the wrong region.
Tests: Eight new matmul_integer_exact_* tests in matmul_gtest.yaml mirror the existing matmul suite (one, small, medium, batch_medium, chunk, grid_limit_real, grid_limit_double, deepbench), with the same size ranges and filters (including gpu_arch where applicable).
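The PR does not spell out how the K > 512 limit was derived. One possible reading for fp16, sketched below, is a worst-case bound: with |a| ≤ 2 and |b| ≤ 2, a K-term dot product has magnitude at most 4K, and a binary format with p significand bits (implicit bit included) represents every integer up to 2^p exactly. The helper names are hypothetical, and the sketch ignores alpha, beta, and the C term.

```cpp
#include <cstdint>

// Every integer in [-2^p, 2^p] is exactly representable in a binary float
// with p significand bits (implicit bit included), exponent range permitting.
constexpr std::int64_t exact_integer_limit(int significand_bits)
{
    return std::int64_t{1} << significand_bits;
}

// Worst-case |dot product| for K terms with |a| <= 2 and |b| <= 2.
constexpr std::int64_t worst_case_dot(std::int64_t k)
{
    return 4 * k;
}

// Hypothetical helper: true if the worst case still fits the exact range.
constexpr bool k_is_safe(std::int64_t k, int significand_bits)
{
    return worst_case_dot(k) <= exact_integer_limit(significand_bits);
}

// fp16 has 11 significand bits: 4 * 512 = 2048 = 2^11 fits, 4 * 513 does not.
static_assert(k_is_safe(512, 11), "K = 512 fits the fp16 exact-integer range");
static_assert(!k_is_safe(513, 11), "K = 513 exceeds the fp16 worst-case bound");
```

Under this worst-case bound the limit would be tighter for bf16, which has 8 significand bits, so the sketch on its own does not account for applying 512 to both 16-bit types.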

Test Plan

Build hipBLASlt and run the new integer_exact matmul tests (quick, pre_checkin, and nightly as defined in YAML).
Confirm 16-bit cases with K > 512 are skipped (message printed) and that 32-bit and smaller-K 16-bit cases run and pass with exact match.

Related Tickets

ROCM-19939 motivated the creation of these tests.

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Copilot

Pull request overview

Adds a new integer_exact matrix initialization mode intended to enable exact (zero-tolerance) GEMM result comparisons by constraining generated operands to small integers with a checkerboard sign pattern.

Changes:

Extends hipblaslt_initialization (YAML + C++) to include integer_exact and exposes it via the bench CLI.
Implements device-side data generation for integer_exact and adds a host diagnostic helper (unit_print_first_mismatch) for exact-check failures.
Adds a new integer-exact matmul test suite in YAML, plus runtime skipping for 16-bit types when K > 512.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| projects/hipblaslt/clients/tests/data/matmul_gtest.yaml | Adds new `matmul_integer_exact_*` YAML test entries mirroring existing matmul suites. |
| projects/hipblaslt/clients/tests/data/hipblaslt_common.yaml | Registers `integer_exact` as an initialization enum value for test YAML. |
| projects/hipblaslt/clients/common/src/hipblaslt_init_device.cpp | Implements `integer_exact` data generation (small ints + checkerboard sign for B). |
| projects/hipblaslt/clients/common/include/unit.hpp | Adds `unit_print_first_mismatch` helper for improved mismatch diagnostics. |
| projects/hipblaslt/clients/common/include/testing_matmul.hpp | Hooks diagnostics + adds 16-bit `K` skip and adjusts B host sync dims; sets tolerance to 0 for integer_exact in bias paths. |
| projects/hipblaslt/clients/common/include/hipblaslt_init.hpp | Updates comment describing alternating-sign initialization. |
| projects/hipblaslt/clients/common/include/hipblaslt_datatype2string.hpp | Extends init enum and string conversions to include `integer_exact`. |
| projects/hipblaslt/clients/bench/src/client.cpp | Updates CLI help text to list `integer_exact`. |

projects/hipblaslt/clients/common/src/hipblaslt_init_device.cpp

projects/hipblaslt/clients/common/include/testing_matmul.hpp

projects/hipblaslt/clients/common/include/unit.hpp

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

…_general The "Apply suggestions from code review" commit removed the call to unit_print_first_mismatch when initialization == integer_exact. Restore it so integer_exact test failures still get the first-mismatch diagnostic (i, j, batch, CPU/GPU values, count) before the generic unit_check_general. Made-with: Cursor
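The first-mismatch diagnostic described in this commit message (i, j, batch, CPU/GPU values, count) could look roughly like the sketch below. This is not the signature of `unit_print_first_mismatch` in unit.hpp; `MismatchReport`, `find_first_mismatch`, and `print_mismatch` are hypothetical names, and a column-major strided-batched layout is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical report type: location and values of the first mismatch,
// plus the total number of mismatching elements.
struct MismatchReport
{
    bool        found = false;
    std::size_t i     = 0;
    std::size_t j     = 0;
    std::size_t batch = 0;
    double      cpu   = 0.0;
    double      gpu   = 0.0;
    std::size_t count = 0;
};

// Scans column-major, strided-batched matrices for elements that differ.
template <typename T>
MismatchReport find_first_mismatch(const std::vector<T>& cpu,
                                   const std::vector<T>& gpu,
                                   std::size_t           m,
                                   std::size_t           n,
                                   std::size_t           ld,
                                   std::size_t           stride,
                                   std::size_t           batch_count)
{
    assert(cpu.size() == gpu.size());
    assert(ld >= m);
    MismatchReport r;
    for(std::size_t b = 0; b < batch_count; ++b)
        for(std::size_t j = 0; j < n; ++j)
            for(std::size_t i = 0; i < m; ++i)
            {
                const std::size_t idx = i + j * ld + b * stride;
                assert(idx < cpu.size());
                if(cpu[idx] == gpu[idx])
                    continue;
                if(!r.found)
                {
                    r.found = true;
                    r.i     = i;
                    r.j     = j;
                    r.batch = b;
                    r.cpu   = static_cast<double>(cpu[idx]);
                    r.gpu   = static_cast<double>(gpu[idx]);
                }
                ++r.count;
            }
    return r;
}

// Prints one line describing the first mismatch, if any.
inline void print_mismatch(const MismatchReport& r)
{
    if(!r.found)
        return;
    std::printf("first mismatch at i=%zu j=%zu batch=%zu: cpu=%g gpu=%g (%zu mismatches)\n",
                r.i, r.j, r.batch, r.cpu, r.gpu, r.count);
}
```

Running a scan like this before the generic check means a failing exact comparison reports where the results diverge and how widespread the divergence is, rather than only that they differ.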

## Summary of changes

- Introduced a new precision configuration for TF32x1, allowing for f32 A+B operations with bf16 compute input.
- Updated YAML test files to include a new test case for integer-exact initialization using TF32x1 on gfx950 architecture.
- Enhanced documentation within the YAML files to clarify the purpose and constraints of the new TF32x1 configuration.

No functional or build impact outside of the new test case implementation.

codecov-commenter · 2026-03-17T23:45:25Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (77.21%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #5535      +/-   ##
===========================================
- Coverage    67.27%   66.58%   -0.68%     
===========================================
  Files         1842     1839       -3     
  Lines       283628   282922     -706     
  Branches     39803    39696     -107     
===========================================
- Hits        190791   188383    -2408     
- Misses       76380    78193    +1813     
+ Partials     16457    16346     -111

| Flag | Coverage Δ | | *Carryforward flag |
| --- | --- | --- | --- |
| hipBLAS | `90.67% <ø> (ø)` | | Carriedforward from f186f40 |
| hipBLASLt | `43.57% <ø> (+0.03%)` | ⬆️ | |
| hipCUB | `82.38% <ø> (+0.18%)` | ⬆️ | Carriedforward from f186f40 |
| hipDNN | `85.11% <ø> (-0.21%)` | ⬇️ | Carriedforward from f186f40 |
| hipFFT | `56.36% <ø> (+0.77%)` | ⬆️ | Carriedforward from f186f40 |
| hipRAND | `76.12% <ø> (ø)` | | Carriedforward from f186f40 |
| hipSOLVER | `68.81% <ø> (ø)` | | Carriedforward from f186f40 |
| hipSPARSE | `84.70% <ø> (ø)` | | Carriedforward from f186f40 |
| rocBLAS | `47.97% <ø> (ø)` | | Carriedforward from f186f40 |
| rocFFT | `47.38% <ø> (-5.86%)` | ⬇️ | Carriedforward from f186f40 |
| rocRAND | `57.07% <ø> (ø)` | | Carriedforward from f186f40 |
| rocSOLVER | `77.21% <ø> (ø)` | | Carriedforward from f186f40 |
| rocSPARSE | `71.48% <ø> (ø)` | | Carriedforward from f186f40 |

*This pull request uses carry forward flags.
see 92 files with indirect coverage changes


…32_fast_bf16 Restrict matmul_integer_exact_tf32x1_gfx950 to 128/131/1024/1031 cubes instead of one_matrix_size_range to avoid NO solution and lda sync failures on edge cases (e.g. 1x1x1 TT). Made-with: Cursor

TF32x1 (f32_bf16_r) returns no solution for both-transposed on gfx950 in CI; restrict matmul_integer_exact_tf32x1_gfx950 to NN, NT, TN. Made-with: Cursor

GPU vs CPU exact match fails on gfx11 while passing on other families; skip at runtime with concise log. Note in matmul_gtest.yaml. Made-with: Cursor

…agnitudes Added a constant offset to the PRNG index for B in the fill_batch function to differentiate the magnitudes from A, preventing potential correlation issues. This change enhances the randomness of the generated values during device initialization.
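The idea of offsetting B's PRNG index can be illustrated with a counter-based generator: if A and B derive their magnitudes from the same index with the same hash, the magnitudes coincide element for element, and a constant offset on B's index breaks that tie. This is only a sketch; the mixer (the splitmix64 finalizer), the offset value `kBOffset`, and all function names are assumptions, not the PR's fill_batch code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// splitmix64 finalizer, used here as a stand-in counter-based PRNG.
inline std::uint64_t mix64(std::uint64_t x)
{
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// Hypothetical constant; the value used in the PR is not shown on this page.
constexpr std::uint64_t kBOffset = 0x1234567ULL;

// Magnitudes in {0, 1, 2}; B reads the stream at a shifted index.
inline int magnitude_a(std::uint64_t idx)
{
    return static_cast<int>(mix64(idx) % 3U);
}

inline int magnitude_b(std::uint64_t idx)
{
    return static_cast<int>(mix64(idx + kBOffset) % 3U);
}

// Counts indices in [0, n) where A and B magnitudes differ.
inline std::size_t count_differing(std::size_t n)
{
    assert(n > 0);
    std::size_t differing = 0;
    for(std::size_t idx = 0; idx < n; ++idx)
        if(magnitude_a(idx) != magnitude_b(idx))
            ++differing;
    return differing;
}
```

Without the offset, `magnitude_a(idx)` and `magnitude_b(idx)` would be the same function, so every product a·b would be a perfect square before the sign is applied.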

github-actions bot added the project: hipblaslt label Mar 17, 2026

tony-davis marked this pull request as ready for review March 17, 2026 21:01

tony-davis requested a review from a team as a code owner March 17, 2026 21:01

Copilot AI review requested due to automatic review settings March 17, 2026 21:01

Copilot started reviewing on behalf of tony-davis March 17, 2026 21:06

Copilot AI reviewed Mar 17, 2026

assistant-librarian bot added the organization: ROCm label Mar 17, 2026

tony-davis and others added 3 commits March 17, 2026 16:35

Apply suggestions from code review

3388cbc

tony-davis added 5 commits March 17, 2026 18:54

Merge branch 'develop' into users/todavis/hipblaslt-exact-int-tests

487a9f9

hipblaslt: align integer_exact TF32x1 gfx950 sizes with matmul_gemm_f…

40d629a

hipblaslt: omit TT for integer_exact TF32x1 gfx950

8d25ec0

hipBLASlt: skip integer_exact matmul on gfx11 (Navi)

167955f

tony-davis wants to merge 9 commits into develop from users/todavis/hipblaslt-exact-int-tests

3 participants
