Skip to content

[rocsparse] Fix csric0 dispatch when rows have many nonzeros#5360

Open
jsandham wants to merge 6 commits intoROCm:developfrom
jsandham:missing_fence
Open

[rocsparse] Fix csric0 dispatch when rows have many nonzeros#5360
jsandham wants to merge 6 commits intoROCm:developfrom
jsandham:missing_fence

Conversation

@jsandham
Copy link
Contributor

@jsandham jsandham commented Mar 11, 2026

Motivation

Removes unneeded WF_SIZE template parameter from csrsm kernels. Fixes bug in csric0 where we could return rocsparse_status_invalid_value when the maximum number of non-zeros was between 513 and 1024.

Technical Details

The hash algorithm is only usable up to 512 but the dispatch assumed it could be used up to 1024. This means that if there was a matrix with 512 < max_nnz <= 1024 we would dispatch to the hash table algorithm and then in the hash table algorithm we would throw an error as not supported. This is why I added an additional test to check 512 < max_nnz <= 1024 as without my change this test fails.

@jsandham jsandham requested a review from a team as a code owner March 11, 2026 22:19
@jsandham jsandham changed the title Missing fence [rocsparse] Add threadence to csric0 and csrsv kernels Mar 11, 2026
@codecov-commenter
Copy link

codecov-commenter commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 45.45455% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
.../library/src/level3/rocsparse_csrsm_solve_impl.cpp 50.00% 5 Missing ⚠️
.../precond/csric0/rocsparse_csric0_kernel_launch.cpp 0.00% 0 Missing and 1 partial ⚠️

❌ Your project status has failed because the head coverage (77.21%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #5360      +/-   ##
===========================================
+ Coverage    66.43%   66.45%   +0.02%     
===========================================
  Files         1793     1814      +21     
  Lines       277419   279234    +1815     
  Branches     38806    39217     +411     
===========================================
+ Hits        184301   185562    +1261     
- Misses       77117    77441     +324     
- Partials     16001    16231     +230     
Flag Coverage Δ *Carryforward flag
hipBLAS 90.67% <ø> (ø) Carriedforward from d36db75
hipBLASLt 43.99% <ø> (ø) Carriedforward from d36db75
hipCUB 82.38% <ø> (ø) Carriedforward from d36db75
hipDNN 83.94% <ø> (ø) Carriedforward from d36db75
hipFFT 56.38% <ø> (ø) Carriedforward from d36db75
hipRAND 76.12% <ø> (ø) Carriedforward from d36db75
hipSOLVER 68.81% <ø> (ø) Carriedforward from d36db75
hipSPARSE 84.70% <ø> (ø) Carriedforward from d36db75
rocBLAS 47.97% <ø> (ø) Carriedforward from d36db75
rocFFT 49.77% <ø> (ø) Carriedforward from d36db75
rocRAND 57.07% <ø> (ø) Carriedforward from d36db75
rocSOLVER 77.21% <ø> (ø) Carriedforward from d36db75
rocSPARSE 71.48% <45.45%> (-0.05%) ⬇️

*This pull request uses carry forward flags. Click here to find out more.

Files with missing lines Coverage Δ
.../precond/csric0/rocsparse_csric0_kernel_launch.cpp 75.00% <0.00%> (ø)
.../library/src/level3/rocsparse_csrsm_solve_impl.cpp 74.09% <50.00%> (ø)

... and 86 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@YvanMokwinski
Copy link
Contributor

There is change about the use of binsearch, you lowered the size.
Please clarify.

@jsandham
Copy link
Contributor Author

There is change about the use of binsearch, you lowered the size. Please clarify.

The hash algorithm is only usable up to 512 but the dispatch assumed it could be used up to 1024. This means that if there was a matrix with 512 < max_nnz <= 1024 we would dispatch to the hash table algorithm and then in the hash table algorithm we would throw an error as not supported. This is why I added an additional test to check 512 < max_nnz <= 1024 as without my change this test fails.

@YvanMokwinski
Copy link
Contributor

There is change about the use of binsearch, you lowered the size. Please clarify.

The hash algorithm is only usable up to 512 but the dispatch assumed it could be used up to 1024. This means that if there was a matrix with 512 < max_nnz <= 1024 we would dispatch to the hash table algorithm and then in the hash table algorithm we would throw an error as not supported. This is why I added an additional test to check 512 < max_nnz <= 1024 as without my change this test fails.

Thanks for the clarification, @jsandham

@jsandham jsandham requested a review from a team as a code owner March 18, 2026 14:54
@jsandham jsandham changed the title [rocsparse] Add threadence to csric0 and csrsv kernels [rocsparse] Fix csric0 dispatch when rows have many nonzeros Mar 18, 2026
Copy link
Contributor

@amd-jnovotny amd-jnovotny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog OK.

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants