@jishnub (Member) commented Apr 9, 2025

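Writing the base case out as an explicit loop helps with TTFX (time to first execution). Nearly all of the time in the first call below is compilation, and this PR reduces it: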

julia> using LinearAlgebra

julia> T = Bidiagonal(ones(4), ones(3), :U); AT = Matrix(T); x = UpperTriangular(AT);

julia> @time T * x;
  0.195728 seconds (522.75 k allocations: 25.519 MiB, 8.07% gc time, 99.96% compilation time) # master
  0.126475 seconds (404.02 k allocations: 19.361 MiB, 99.96% compilation time) # this PR
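
For anyone reproducing the measurement above, a minimal sketch of separating first-call latency from steady-state runtime, assuming a fresh Julia session (the variable names `t_first` and `t_second` are illustrative and not from the PR):

```julia
using LinearAlgebra

T = Bidiagonal(ones(4), ones(3), :U)
x = UpperTriangular(Matrix(T))

# First call in a fresh session: the time is dominated by compilation (TTFX).
t_first = @elapsed T * x

# Second call: the method is already compiled, so this is close to pure runtime.
t_second = @elapsed T * x

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

The absolute numbers depend on the machine and Julia version, so only the gap between the two calls is meaningful.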

@codecov (bot) commented Apr 9, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.00%. Comparing base (243efdb) to head (2f98c8b).
Report is 6 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1278      +/-   ##
==========================================
- Coverage   92.03%   92.00%   -0.04%     
==========================================
  Files          34       34              
  Lines       15488    15488              
==========================================
- Hits        14254    14249       -5     
- Misses       1234     1239       +5     

@dkarrasch (Member) commented

So the 2x2 and 3x3 cases are handled just as well by the loops in _mul_bitrisym in terms of runtime performance?

@jishnub (Member, Author) commented Apr 10, 2025

Yes, in fact, it should be faster, as there are fewer operations to carry out if we use the structure of the matrix.

julia> using LinearAlgebra, BenchmarkTools

julia> T = Bidiagonal(ones(2), ones(1), :U); TM = Matrix(T); C = similar(TM);

julia> @btime mul!($C, $T, $TM);
  56.527 ns (0 allocations: 0 bytes) # master
  23.549 ns (0 allocations: 0 bytes) # this PR

julia> T = Bidiagonal(ones(3), ones(2), :U); TM = Matrix(T); C = similar(TM);

julia> @btime mul!($C, $T, $TM);
  116.016 ns (0 allocations: 0 bytes) # master
  37.616 ns (0 allocations: 0 bytes) # this PR
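
To illustrate why using the structure means fewer operations, here is a minimal sketch of a bidiagonal-times-dense product. It is not the `_mul_bitrisym` code from this PR; `bidiag_mul_sketch!` is a hypothetical helper that assumes 1-based dense arrays and that `C` does not alias `A`. Each output entry needs at most two multiplications, instead of the `n` a generic dense product would use.

```julia
using LinearAlgebra

# Hypothetical helper, not part of LinearAlgebra: computes C = B * A
# touching only the main diagonal and the single off-diagonal of B.
function bidiag_mul_sketch!(C::AbstractMatrix, B::Bidiagonal, A::AbstractMatrix)
    n = size(B, 1)
    size(A, 1) == n || throw(DimensionMismatch("A must have $n rows"))
    size(C) == size(A) || throw(DimensionMismatch("C must have the same size as A"))
    dv, ev = B.dv, B.ev          # diagonal and off-diagonal of B
    upper = B.uplo == 'U'        # uplo is stored as a Char, 'U' or 'L'
    for j in axes(A, 2), i in 1:n
        acc = dv[i] * A[i, j]
        if upper && i < n
            acc += ev[i] * A[i + 1, j]       # superdiagonal entry B[i, i+1]
        elseif !upper && i > 1
            acc += ev[i - 1] * A[i - 1, j]   # subdiagonal entry B[i, i-1]
        end
        C[i, j] = acc
    end
    return C
end

T = Bidiagonal([1.0, 2.0, 3.0], [4.0, 5.0], :U)
A = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]
C = bidiag_mul_sketch!(similar(A), T, A)
```

The result can be checked against the dense product with `C ≈ Matrix(T) * A`.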

@jishnub merged commit 0671a7b into master Apr 10, 2025
3 of 4 checks passed
@jishnub deleted the jishnub/bidiag_mul_bitrisym branch April 10, 2025 18:25