[hipBLASlt]: Add integer_exact initialization option for matrix data #5535
tony-davis wants to merge 9 commits into develop
## Summary of changes
- Introduced a new initialization option `integer_exact` for matrix data, allowing for specific integer initialization patterns.
- Updated `hipblaslt_initialization` enum to include `integer_exact` with a corresponding value.
- Enhanced the `unit_print_first_mismatch` function to support diagnostics for `integer_exact` initialization.
- Modified the `hipblaslt_init_device` function to handle the new initialization type, ensuring proper data generation.
- Added multiple test cases in YAML files to validate the behavior of the `integer_exact` initialization across various matrix sizes and configurations.

No functional or build impact outside of the new feature implementation.
Pull request overview
Adds a new integer_exact matrix initialization mode intended to enable exact (zero-tolerance) GEMM result comparisons by constraining generated operands to small integers with a checkerboard sign pattern.
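The pattern described above can be sketched on the host as follows. This is a minimal illustration, not the PR's device code: `fill_integer_exact` and `small_magnitude` are hypothetical names, and a simple modulus stands in for the PRNG that picks magnitudes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the magnitude source: maps an index to 0, 1, or 2.
// The PR draws magnitudes from a PRNG; a modulus keeps this sketch deterministic.
inline int small_magnitude(std::size_t idx)
{
    return static_cast<int>(idx % 3);
}

// Fill a column-major rows x cols matrix with leading dimension ld.
// With checkerboard == true the sign flips wherever (i ^ j) & 1 is set,
// which is the sign rule the PR describes for B.
inline std::vector<float> fill_integer_exact(std::size_t rows,
                                             std::size_t cols,
                                             std::size_t ld,
                                             bool        checkerboard)
{
    assert(ld >= rows);
    std::vector<float> m(ld * cols, 0.0f);
    for(std::size_t j = 0; j < cols; ++j)
    {
        for(std::size_t i = 0; i < rows; ++i)
        {
            int v = small_magnitude(i + j * ld);
            if(checkerboard && ((i ^ j) & 1))
                v = -v;
            m[i + j * ld] = static_cast<float>(v);
        }
    }
    return m;
}
```

Calling `fill_integer_exact(m, k, lda, false)` would give an A-style operand and `fill_integer_exact(k, n, ldb, true)` a B-style operand with alternating signs.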
Changes:
- Extends `hipblaslt_initialization` (YAML + C++) to include `integer_exact` and exposes it via the bench CLI.
- Implements device-side data generation for `integer_exact` and adds a host diagnostic helper (`unit_print_first_mismatch`) for exact-check failures.
- Adds a new integer-exact matmul test suite in YAML, plus runtime skipping for 16-bit types when `K > 512`.
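The 16-bit skip rule can be sketched as a small predicate. The names `skip_integer_exact` and `worst_case_dot` are hypothetical, and the PR does not state how the 512 limit was derived. One observation consistent with it: with operand magnitudes of at most 2, a length-512 dot product is bounded by 4 × 512 = 2048 = 2^11, and fp16 (11-bit significand) represents every integer up to that magnitude exactly. bf16 has fewer significand bits, so this bound alone would not explain the bf16 case.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Largest operand magnitude produced by the integer_exact pattern (values 0, 1, 2).
constexpr std::int64_t kMaxMagnitude = 2;
// K limit quoted in the review for 16-bit types.
constexpr std::int64_t kMaxK16Bit = 512;

// Worst-case |sum_k a_k * b_k| when every |a_k| and |b_k| is <= kMaxMagnitude.
constexpr std::int64_t worst_case_dot(std::int64_t k)
{
    return kMaxMagnitude * kMaxMagnitude * k;
}

// Hypothetical skip predicate: true when a 16-bit integer_exact test
// should be skipped because some problem in the test has K > 512.
inline bool skip_integer_exact(bool is_16bit_type, const std::vector<std::int64_t>& ks)
{
    assert(!ks.empty());
    if(!is_16bit_type)
        return false;
    for(std::int64_t k : ks)
    {
        if(k > kMaxK16Bit)
            return true;
    }
    return false;
}
```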
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| projects/hipblaslt/clients/tests/data/matmul_gtest.yaml | Adds new matmul_integer_exact_* YAML test entries mirroring existing matmul suites. |
| projects/hipblaslt/clients/tests/data/hipblaslt_common.yaml | Registers integer_exact as an initialization enum value for test YAML. |
| projects/hipblaslt/clients/common/src/hipblaslt_init_device.cpp | Implements integer_exact data generation (small ints + checkerboard sign for B). |
| projects/hipblaslt/clients/common/include/unit.hpp | Adds unit_print_first_mismatch helper for improved mismatch diagnostics. |
| projects/hipblaslt/clients/common/include/testing_matmul.hpp | Hooks diagnostics + adds 16-bit K skip and adjusts B host sync dims; sets tolerance to 0 for integer_exact in bias paths. |
| projects/hipblaslt/clients/common/include/hipblaslt_init.hpp | Updates comment describing alternating-sign initialization. |
| projects/hipblaslt/clients/common/include/hipblaslt_datatype2string.hpp | Extends init enum and string conversions to include integer_exact. |
| projects/hipblaslt/clients/bench/src/client.cpp | Updates CLI help text to list integer_exact. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…_general
The "Apply suggestions from code review" commit removed the call to `unit_print_first_mismatch` when `initialization == integer_exact`. Restore it so that `integer_exact` test failures still get the first-mismatch diagnostic (i, j, batch, CPU/GPU values, count) before the generic `unit_check_general`.
Made-with: Cursor
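A first-mismatch diagnostic of the kind this commit restores might look like the sketch below. It is an assumption-laden illustration: `first_mismatch` and `MismatchReport` are hypothetical names, the layout is assumed to be column-major with a batch stride, and the actual signature of `unit_print_first_mismatch` is not shown in this conversation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical result type: location and values of the first mismatch, plus a total count.
struct MismatchReport
{
    bool        found;
    std::size_t i, j, batch;
    float       cpu, gpu;
    std::size_t count;
};

// Compare CPU and GPU results element by element with zero tolerance.
// Assumes column-major storage: element (i, j) of batch b is at i + j * ld + b * stride.
inline MismatchReport first_mismatch(std::size_t rows, std::size_t cols, std::size_t ld,
                                     std::size_t stride, std::size_t batches,
                                     const std::vector<float>& cpu,
                                     const std::vector<float>& gpu)
{
    assert(cpu.size() == gpu.size());
    assert(ld >= rows);
    MismatchReport r{};
    for(std::size_t b = 0; b < batches; ++b)
        for(std::size_t j = 0; j < cols; ++j)
            for(std::size_t i = 0; i < rows; ++i)
            {
                const std::size_t idx = i + j * ld + b * stride;
                assert(idx < cpu.size());
                if(cpu[idx] != gpu[idx])
                {
                    if(r.count == 0)
                        r = {true, i, j, b, cpu[idx], gpu[idx], 0};
                    ++r.count;
                }
            }
    if(r.found)
        std::printf("first mismatch at (i=%zu, j=%zu, batch=%zu): cpu=%g gpu=%g, %zu mismatches\n",
                    r.i, r.j, r.batch, r.cpu, r.gpu, r.count);
    return r;
}
```

Reporting the first location together with the total count helps distinguish an isolated wrong element from a systematic failure before the generic check aborts the test.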
## Summary of changes
- Introduced a new precision configuration for TF32x1, allowing for f32 A+B operations with bf16 compute input.
- Updated YAML test files to include a new test case for integer-exact initialization using TF32x1 on gfx950 architecture.
- Enhanced documentation within the YAML files to clarify the purpose and constraints of the new TF32x1 configuration.

No functional or build impact outside of the new test case implementation.
Codecov Report
✅ All modified and coverable lines are covered by tests.
❌ Your project status has failed because the head coverage (77.21%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Coverage Diff

| Metric | develop | #5535 | +/- |
|---|---|---|---|
| Coverage | 67.27% | 66.58% | -0.68% |
| Files | 1842 | 1839 | -3 |
| Lines | 283628 | 282922 | -706 |
| Branches | 39803 | 39696 | -107 |
| Hits | 190791 | 188383 | -2408 |
| Misses | 76380 | 78193 | +1813 |
| Partials | 16457 | 16346 | -111 |

This pull request uses carry forward flags.
…32_fast_bf16
Restrict `matmul_integer_exact_tf32x1_gfx950` to 128/131/1024/1031 cubes instead of `one_matrix_size_range` to avoid "no solution" and lda sync failures on edge cases (e.g. 1x1x1 TT).
Made-with: Cursor

TF32x1 (`f32_bf16_r`) returns no solution for both-transposed on gfx950 in CI; restrict `matmul_integer_exact_tf32x1_gfx950` to NN, NT, TN.
Made-with: Cursor

GPU vs CPU exact match fails on gfx11 while passing on other families; skip at runtime with a concise log message. Noted in `matmul_gtest.yaml`.
Made-with: Cursor

…agnitudes
Added a constant offset to the PRNG index for B in the `fill_batch` function so that B's magnitudes differ from A's, preventing potential correlation between the two operands during device initialization.
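The effect of the index offset can be illustrated with a small sketch. Everything here is hypothetical: the PR's PRNG and offset value are not shown, so `mix` is a generic 32-bit integer mixer and `kBIndexOffset` is an arbitrary constant chosen for illustration. Without the offset, A and B would draw the same magnitude at the same index.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit integer mixer standing in for the PR's PRNG.
inline std::uint32_t mix(std::uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Hypothetical constant; the value used in the PR is not shown.
constexpr std::uint32_t kBIndexOffset = 0x9e3779b9U;

// Magnitude in {0, 1, 2} for element idx of A.
inline int magnitude_a(std::uint32_t idx)
{
    return static_cast<int>(mix(idx) % 3U);
}

// Magnitude for element idx of B: same generator, shifted index
// (unsigned addition wraps), so B does not simply replay A's sequence.
inline int magnitude_b(std::uint32_t idx)
{
    const int m = static_cast<int>(mix(idx + kBIndexOffset) % 3U);
    assert(m >= 0 && m <= 2);
    return m;
}
```

A constant offset keeps generation stateless and reproducible per index, which suits a device kernel where each thread computes its own element.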
Motivation
Implements integer-exact GEMM testing for hipBLASlt: input data are chosen so that dot products stay in a small integer range, enabling GPU vs CPU comparison with an exact match (zero tolerance) instead of a numeric tolerance. That improves defect detection and avoids tolerance masking issues, especially with 16-bit types. The pattern used is A and C in `{0, 1, 2}` and B in `±{0, 1, 2}` with a checkerboard sign pattern, so results remain exactly representable under defined K limits.
Technical Details
- Adds `hipblaslt_initialization::integer_exact`. In `hipblaslt_init_device`, A and C use `small_int_positive` (values 0, 1, 2); B uses the same values with the sign chosen by `(i^j)&1`, so the sign of the first element of each row and column alternates (checkerboard).
- When `initialization == integer_exact`, `testing_matmul` sets the tolerance to 0 (exact check) and calls `unit_print_first_mismatch` for D before the check to improve failure diagnostics.
- For `integer_exact` with 16-bit types (fp16/bf16), tests are skipped when any K > 512 so that dot products stay exactly representable; 32-bit and 64-bit types use the full size ranges.
- The host sync of B uses `(K, N, ldb)` instead of the logical `(B_row, B_col, ldb)` so that transB and swizzle paths do not hit an invalid leading dimension or a wrong copy.
- The `matmul_integer_exact_*` tests in `matmul_gtest.yaml` mirror the existing matmul suite (one, small, medium, batch_medium, chunk, grid_limit_real, grid_limit_double, deepbench), with the same size ranges and filters (including `gpu_arch` where applicable).
Test Plan
Related Tickets
ROCM-19939 motivated the creation of these tests.
Submission Checklist