-
Notifications
You must be signed in to change notification settings - Fork 2
Implement sparse + sparse matrix addition with transpose/adjoint support for CSC, CSR, and COO formats #27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement sparse + sparse matrix addition with transpose/adjoint support for CSC, CSR, and COO formats #27
Conversation
albertomercurio
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot Overall it seems a good job. I'm wondering if we can also implement the sum among transposed/adjointed matrices, like +(A::DeviceSparseMatrixCSC, B::Transpose{T, <:DeviceSparseMatrixCSC}), and in case how to do it. Should one convert the transposed matrix to CSR? If so, can the sum be efficiently computed between a CSC and CSR?
If not, are there other direct ways to do it, like the methods done for the mul! or dot functions?
Good question! Adding support for transposed/adjointed matrices would be valuable. Here are the options: Option 1: Convert CSC transpose to CSR (Most efficient)
Option 2: Use existing pattern from mul!/dot (Also efficient)
Option 3: Convert to same format (Simpler but less efficient)
Recommendation: Option 1 (CSC+CSR merge) or Option 2 (direct transpose handling) would be most efficient. Option 2 follows the existing mul! pattern more closely. Should I implement one of these approaches? Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #27 +/- ##
==========================================
+ Coverage 82.40% 85.87% +3.46%
==========================================
Files 13 13
Lines 773 1069 +296
==========================================
+ Hits 637 918 +281
- Misses 136 151 +15 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Benchmark Results'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.30.
| Benchmark suite | Current: 7e33d91 | Previous: 131b789 | Ratio |
|---|---|---|---|
Sparse Vector/Array/Sparse-Dense dot |
125.11646136618141 ns |
92.47643979057591 ns |
1.35 |
This comment was automatically generated by workflow using github-action-benchmark.
|
@copilot let's do Option 2: Use existing pattern from mul!/dot |
Implemented in commit 4caae95. Added transpose/adjoint support following the mul!/dot pattern: Implementation:
Supports:
Tests added for all formats. Verified with CPU and JLArray backends. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
src/matrix_csc/matrix_csc.jl
Outdated
| A_csr = DeviceSparseMatrixCSR( | ||
| m, | ||
| n, | ||
| getcolptr(_A), | ||
| getrowval(_A), | ||
| $conja ? conj.(nonzeros(_A)) : nonzeros(_A), | ||
| ) | ||
| B_csr = DeviceSparseMatrixCSR( | ||
| m, | ||
| n, | ||
| getcolptr(_B), | ||
| getrowval(_B), | ||
| $conjb ? conj.(nonzeros(_B)) : nonzeros(_B), | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot do we really need to do it manually? Can we simply do A_csr = DeviceSparseMatrixCSR(A)? I mean, A Is already of type Transpose, so we should have already the right method. Also, do we have the method DeviceSparseMatrixCSR(A::Hermitian{T, <:DeviceSparseMatrixCSC}? Because in case we also need it. This might simplify significantly the code.
Do it for all the other cases and formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot Also, I was thinking, all these methods are defined only when at least one of the two is transposed or adjoined. Can we simply do DeviceSparseMatrixCSR(A) + DeviceSparseMatrixCSR(B)? They should be handled automatically right>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in commit 4eb9c60. Simplified to use existing conversion methods:
DeviceSparseMatrixCSR(A)whereAisTranspose{T, <:DeviceSparseMatrixCSC}already existsDeviceSparseMatrixCSC(A)whereAisAdjoint{T, <:DeviceSparseMatrixCSR}already exists- These conversions handle conjugation automatically
- Removed manual construction and conditionals - now just converts both to same format and uses existing addition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly! Fixed in commit 4eb9c60. Now the transpose/adjoint methods simply do:
A_csr = DeviceSparseMatrixCSR(A)
B_csr = DeviceSparseMatrixCSR(B)
result_csr = A_csr + B_csr
return DeviceSparseMatrixCSC(result_csr)The conversion methods handle everything automatically.
| Val{false}(), | ||
| Val{false}(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot I'm not getting why we are using these arguments? here they are fixed to false. Are they fixed to something else elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Val{false}(), Val{false}() parameters are passed to the kernel which supports optional conjugation via Val{CONJA} and Val{CONJB}. For the basic (non-transpose/adjoint) addition, they're always false. The kernel signature requires them for compile-time specialization. This follows the same pattern as kernel_spmatmul_csc_N! which uses Val{CONJA}, Val{CONJB}, Val{TRANSB} parameters.
test/shared/matrix_coo.jl
Outdated
| @testset "Sparse + Sparse Transpose/Adjoint Addition" begin | ||
| for T in (float_types..., complex_types...) | ||
| m, n = 30, 30 # Square matrices for transpose tests | ||
| A = sprand(T, m, n, 0.1) | ||
| B = sprand(T, n, m, 0.1) # Transpose size | ||
|
|
||
| dA = adapt(op, DeviceSparseMatrixCOO(A)) | ||
| dB = adapt(op, DeviceSparseMatrixCOO(B)) | ||
|
|
||
| # Test sparse + transpose(sparse) | ||
| result_trans = dA + transpose(dB) | ||
| expected_trans = A + transpose(B) | ||
| @test collect(result_trans) ≈ Matrix(expected_trans) | ||
|
|
||
| # Test transpose(sparse) + sparse | ||
| result_trans2 = transpose(dB) + dA | ||
| @test collect(result_trans2) ≈ Matrix(expected_trans) | ||
|
|
||
| # Test sparse + adjoint(sparse) for complex types | ||
| if T <: Complex | ||
| result_adj = dA + adjoint(dB) | ||
| expected_adj = A + adjoint(B) | ||
| @test collect(result_adj) ≈ Matrix(expected_adj) | ||
|
|
||
| # Test adjoint(sparse) + sparse | ||
| result_adj2 = adjoint(dB) + dA | ||
| @test collect(result_adj2) ≈ Matrix(expected_adj) | ||
| end | ||
|
|
||
| # Test transpose + transpose | ||
| result_trans_trans = transpose(dA) + transpose(dA) | ||
| expected_trans_trans = transpose(A) + transpose(A) | ||
| @test collect(result_trans_trans) ≈ Matrix(expected_trans_trans) | ||
| end | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot can we just extend the previous @testset "Sparse + Sparse Matrix Addition" including all possible cases? I mean, there is no need to make this code repetition here. We can iterate with
for (op_A, op_B) in Iterators.product(
(identity, transpose, adjoint),
(identity, transpose, adjoint),
)As in the other code above.
Do it also for the other formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in commit 5c36d6c. Consolidated all transpose/adjoint tests into the main "Sparse + Sparse Matrix Addition" testset using the Iterators.product((identity, transpose, adjoint), (identity, transpose, adjoint)) pattern. This eliminates code duplication across all three formats (CSC, CSR, COO) while maintaining comprehensive test coverage for all operation combinations.
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Fixed in commit d8c2ecb. Replaced all Changes made in:
All tests pass without errors. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
test/shared/matrix_coo.jl
Outdated
| if op_A === identity && op_B === identity | ||
| # Basic addition tests | ||
| m, n = 50, 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot Why do we need to write two separate codes depending on op_A and op_B. What are the reasons?
I agree that the two arrays should have the same size. If this is the only reason, one could merge them and do something like m, n = (op_A === identity && op_B === identity) ? (50, 40) : (30, 30).
We should keep the code as compact as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in commit 093aef3. Merged the conditionals as suggested:
m, n = (op_A === identity && op_B === identity) ? (50, 40) : (30, 30)The test structure is now more compact:
- Skip conditions moved to the top
- Single unified code path for all operation combinations
- Additional tests (overlapping, dimension mismatch) only run for identity+identity at the end
- Reduced from ~60 lines to ~50 lines per format while maintaining identical coverage
@copilot you should use |
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Co-authored-by: albertomercurio <61953577+albertomercurio@users.noreply.github.com>
Co-authored-by: albertomercurio <61953577+albertomercurio@users.noreply.github.com>
…ndency Co-authored-by: albertomercurio <61953577+albertomercurio@users.noreply.github.com>
Co-authored-by: albertomercurio <61953577+albertomercurio@users.noreply.github.com>
Co-authored-by: albertomercurio <61953577+albertomercurio@users.noreply.github.com>
…iterator pattern Co-authored-by: albertomercurio <61953577+albertomercurio@users.noreply.github.com>
Co-authored-by: albertomercurio <61953577+albertomercurio@users.noreply.github.com>
Co-authored-by: albertomercurio <61953577+albertomercurio@users.noreply.github.com>
093aef3 to
7e33d91
Compare
Implementation Complete: Sparse Matrix Addition with Transpose/Adjoint Support
Successfully implemented the
+(A::AbstractDeviceSparseMatrix, B::AbstractDeviceSparseMatrix)operation for CSC-CSC, CSR-CSR, and COO-COO formats, including full support for transposed and adjointed matrices.Completed Tasks:
kernel_count_nnz_per_col_csc!insrc/matrix_csc/matrix_csc_kernels.jlkernel_merge_csc!with conjugation support insrc/matrix_csc/matrix_csc_kernels.jl+(A::DeviceSparseMatrixCSC, B::DeviceSparseMatrixCSC)insrc/matrix_csc/matrix_csc.jltrans_adj_wrapperspatternkernel_count_nnz_per_row_csr!insrc/matrix_csr/matrix_csr_kernels.jlkernel_merge_csr!with conjugation support insrc/matrix_csr/matrix_csr_kernels.jl+(A::DeviceSparseMatrixCSR, B::DeviceSparseMatrixCSR)insrc/matrix_csr/matrix_csr.jltrans_adj_wrapperspatternkernel_mark_unique_coo!insrc/matrix_coo/matrix_coo_kernels.jlkernel_compact_coo!insrc/matrix_coo/matrix_coo_kernels.jl+(A::DeviceSparseMatrixCOO, B::DeviceSparseMatrixCOO)insrc/matrix_coo/matrix_coo.jltest/shared/matrix_csc.jltest/shared/matrix_csr.jltest/shared/matrix_coo.jlbenchmark_sparse_sparse_add!inbenchmarks/matrix_benchmarks.jlbenchmarks/runbenchmarks.jlIterators.productpatternallowed_getindexwithonly()for compatibilityImplementation Details:
CSC/CSR Formats: Two-pass merge algorithm
Val{CONJA}andVal{CONJB}parameters (following mul!/dot pattern)COO Format: Sort + compact algorithm
Transpose/Adjoint Support (simplified using existing conversions):
trans_adj_wrappersto generate methods for all transpose/adjoint combinationsGPU Compatibility:
colptr_C[1:1] .= one(Ti))allowed_getindex(arr, idx)withonly(arr[idx:idx])for compatibility with updated ArrayInterfaceTest Structure: Consolidated and simplified
Iterators.product((identity, transpose, adjoint), (identity, transpose, adjoint))to test all combinationsm, n = (op_A === identity && op_B === identity) ? (50, 40) : (30, 30)Supported Operations:
A + B- Same format addition (CSC+CSC, CSR+CSR, COO+COO)A + transpose(B)- Addition with transposed matrixtranspose(A) + B- Transposed matrix additiontranspose(A) + transpose(B)- Both matrices transposedA + adjoint(B)- Addition with adjointed matrix (complex types)adjoint(A) + B- Adjointed matrix addition (complex types)Test Results:
✅ All sparse + sparse addition tests pass
✅ Transpose/adjoint addition works for CSC, CSR, and COO
✅ Proper conjugation for adjoint operations
✅ Works correctly with overlapping and non-overlapping entries
✅ Proper dimension checking and error handling
✅ GPU-compatible (no scalar indexing issues)
✅ COO format properly merges duplicates
✅ Simplified implementation using existing conversion methods
✅ Tests consolidated using iterator pattern (no code duplication)
✅ Compatible with updated ArrayInterface (PR #28)
✅ Compact test structure with merged conditionals
Original prompt
AbstractDeviceSparseMatrix#16💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.