
Conversation

@mtfishman
Collaborator

@mtfishman mtfishman commented May 15, 2025

As the PR title says, this adds support for truncating unsorted spectra in TruncationKeepAbove/TruncationKeepBelow.

It also more systematically handles sorted vs. unsorted spectra where relevant by splitting between codepaths findtruncated_sorted and findtruncated_unsorted, where the sorted version outputs unit ranges.

This has the downside that those cases of findtruncated become mildly type unstable (i.e. Union{Vector{Int},UnitRange{Int}}), but I think that is worth it since slicing a spectrum/matrix with a unit range is generally faster than slicing with an arbitrary vector.

I'm open to hearing other design ideas as well, an alternative or complement to this could be that certain factorizations indicate to the truncation code that the spectrum is sorted already so it can avoid the check and maybe become type stable.

One motivation for this is that even though the spectrum for an SVD is generally (reverse) sorted, if you do a blockwise SVD of a block diagonal matrix the spectra are only sorted within blocks, this PR helps with handling that case.
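A minimal, standalone sketch of the split described above for a single "keep values above a cutoff" rule (the function names mirror the PR, but the bodies here are illustrative only and cover just this one strategy):

```julia
# Sorted path: assumes `values` is reverse sorted, so the kept indices form
# a contiguous UnitRange starting at 1.
findtruncated_sorted(values, atol) = 1:searchsortedlast(values, atol; rev=true)

# Unsorted path: no assumptions, returns an arbitrary Vector{Int}.
findtruncated_unsorted(values, atol) = findall(≥(atol), values)

# General entry point: check sortedness and dispatch, at the cost of a small
# Union{UnitRange{Int},Vector{Int}} type instability.
function findtruncated(values, atol)
    return issorted(values; rev=true) ? findtruncated_sorted(values, atol) :
           findtruncated_unsorted(values, atol)
end
```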

@mtfishman mtfishman requested review from Jutho and lkdvos May 15, 2025 14:49
@mtfishman mtfishman changed the title Support unsorted spectrum in TruncationKeepAbove/TruncationKeepBelow Support unsorted spectrum in TruncationKeepAbove/Below May 15, 2025
@mtfishman mtfishman changed the title Support unsorted spectrum in TruncationKeepAbove/Below Support unsorted spectra in TruncationKeepAbove/Below May 15, 2025
@codecov

codecov bot commented May 15, 2025

Codecov Report

Attention: Patch coverage is 86.36364% with 3 lines in your changes missing coverage. Please review.

Files with missing lines            Patch %   Lines
src/implementations/truncation.jl   86.36%    3 Missing ⚠️

Files with missing lines            Coverage Δ
src/MatrixAlgebraKit.jl             100.00% <ø> (ø)
src/implementations/truncation.jl   83.51% <86.36%> (-2.56%) ⬇️

@Jutho
Member

Jutho commented May 16, 2025

In the grand scheme of things, I do wonder what the performance impact is of replacing slicing with a range by slicing with a general vector representing the same range.

julia> using BenchmarkTools

julia> a = randn(1000);

julia> r = 97:197;

julia> v = collect(r);

julia> @benchmark getindex(a, r)
BenchmarkTools.Trial: 10000 samples with 991 evaluations per sample.
 Range (min … max):  42.970 ns … 659.687 ns  ┊ GC (min … max):  0.00% … 87.48%
 Time  (median):     48.016 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   55.161 ns ±  33.581 ns  ┊ GC (mean ± σ):  12.19% ± 14.92%

  ▄█▅                                                       ▁▂ ▁
  ████▇▆▆▅▄▄▁▁▁▁▁▁▁▁▁▅▄▁▃▁▁▁▁▁▁▁▁▇█▇▆▄▄▄▁▃▁▃▁▄▃▁▃▄▁▄▁▁▅▄▄▃▆▇██ █
  43 ns         Histogram: log(frequency) by time       216 ns <

 Memory estimate: 928 bytes, allocs estimate: 2.

julia> @benchmark getindex(a, v)
BenchmarkTools.Trial: 10000 samples with 960 evaluations per sample.
 Range (min … max):   93.185 ns … 793.706 ns  ┊ GC (min … max): 0.00% … 82.07%
 Time  (median):      96.181 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   105.955 ns ±  45.747 ns  ┊ GC (mean ± σ):  8.11% ± 12.89%

  █▆▂▁                                                       ▂▂ ▁
  ████▆▆▄▁▃▁▆▅▅▅▄▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅███ █
  93.2 ns       Histogram: log(frequency) by time        319 ns <

 Memory estimate: 928 bytes, allocs estimate: 2.

So it is twice as slow for this example, but we are talking about nanosecond operations. That is probably completely negligible in the kind of computations this is going to be used in. Maybe the type instability has a larger performance impact.

@mtfishman
Collaborator Author

I wasn't so concerned about the conversion from range to vector, rather that afterward you use the range/vector to slice the U, S, and V matrices, and there it seems better to have a range in general.

Member

@lkdvos lkdvos left a comment


I'm not entirely sure if I agree with this implementation. The type instability is definitely a problem, and in this case these strategies are really all just a filter with different values of the truncation and should probably be mapped to that.

I don't think it is really necessary to have the general cases handle all exceptions though, non-sorted singular or eigenvalues are really the exception, and I would say that it is actually better to document which methods assume sorted rather than make them work for arbitrary inputs.

On top of that, for the case in the referenced pull request, actually we are immediately going to map it to a logical vector, so we should really immediately implement this as values .> atol and then this generalization here is obsolete again. Given the code design here, it's actually very easy to specialize for the relevant special cases, so unless we can handle more cases without hurting performance I don't think that is a great deal.

@Jutho
Member

Jutho commented May 16, 2025

I wasn't so concerned about the conversion from range to vector, rather that afterward you use the range/vector to slice the U, S, and V matrices, and there it seems better to have a range in general.

What I am testing with my benchmark is the actual slicing, not the conversion itself. That was slicing the vector of values, though; maybe it makes a bigger difference for the matrices, so let me try:

julia> r = 97:197;

julia> v = collect(r);

julia> U = randn(1000,1000);

julia> V = randn(1000, 1000);

julia> S = randn(1000);

julia> @benchmark getindex($S, $r)
BenchmarkTools.Trial: 10000 samples with 991 evaluations per sample.
 Range (min … max):   55.667 ns … 712.619 μs  ┊ GC (min … max):  0.00% … 99.96%
 Time  (median):      92.205 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   184.328 ns ±   7.389 μs  ┊ GC (mean ± σ):  49.27% ±  1.41%

                          ▁▃▅▅▄▅▆█▄▁                             
  ▂▂▂▁▁▁▁▁▁▂▂▁▂▁▁▁▁▂▂▂▂▂▂▄██████████▇▅▄▄▃▃▃▃▃▃▃▂▂▃▂▃▂▂▂▂▂▂▂▂▂▂▂ ▃
  55.7 ns          Histogram: frequency by time          129 ns <

 Memory estimate: 928 bytes, allocs estimate: 2.

julia> @benchmark getindex($S, $v)
BenchmarkTools.Trial: 10000 samples with 950 evaluations per sample.
 Range (min … max):  104.957 ns … 784.318 μs  ┊ GC (min … max):  0.00% … 99.98%
 Time  (median):     143.947 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   226.135 ns ±   7.842 μs  ┊ GC (mean ± σ):  34.68% ±  1.00%

                             ▁██                                 
  ▂▂▂▂▂▂▂▁▁▁▂▂▂▂▁▁▁▁▁▁▁▁▁▁▂▂▄████▆▆▅▄▄▄▄▃▃▃▃▃▃▃▂▂▃▃▃▂▂▂▂▂▂▂▂▂▂▂ ▃
  105 ns           Histogram: frequency by time          184 ns <

 Memory estimate: 928 bytes, allocs estimate: 2.

julia> @benchmark getindex($U, :, $r)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):   35.667 μs … 108.709 ms  ┊ GC (min … max):  0.00% … 99.88%
 Time  (median):      86.167 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   101.417 μs ±   1.180 ms  ┊ GC (mean ± σ):  15.26% ±  1.41%

                                    ▁█▇█   ▃▃▂                   
  ▅▂▂▂▂▁▁▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃████▇▆▇████▇▇▇▆▅▄▄▄▄▃▃▃▂▃▂▂ ▃
  35.7 μs          Histogram: frequency by time          111 μs <

 Memory estimate: 800.08 KiB, allocs estimate: 3.

julia> @benchmark getindex($U, :, $v)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):   38.958 μs … 129.800 ms  ┊ GC (min … max):  0.00% … 99.90%
 Time  (median):      89.333 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   107.037 μs ±   1.374 ms  ┊ GC (mean ± σ):  16.35% ±  1.41%

                                    ▅▅█   ▁▂▁                    
  ▄▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▂▇███▆▅▇████▆▇▇▆▅▅▄▅▄▃▃▃▃▃▃▃▂ ▃
  39 μs            Histogram: frequency by time          114 μs <

 Memory estimate: 800.08 KiB, allocs estimate: 3.

julia> @benchmark getindex($V, $r, :)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):   37.375 μs … 146.763 ms  ┊ GC (min … max):  0.00% … 99.82%
 Time  (median):      89.708 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   108.304 μs ±   1.527 ms  ┊ GC (mean ± σ):  17.45% ±  1.41%

                                       █     ▁                   
  ▂▃▂▂▂▂▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▃▅██▇▄▄▅██▇▇▆▆▆▅▄▄▃▃▃▃▃▂▂▂▂ ▃
  37.4 μs          Histogram: frequency by time          112 μs <

 Memory estimate: 800.08 KiB, allocs estimate: 3.

julia> @benchmark getindex($V, $v, :)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):   87.250 μs … 120.687 ms  ┊ GC (min … max):  0.00% … 99.76%
 Time  (median):     134.458 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   150.218 μs ±   1.261 ms  ┊ GC (mean ± σ):  10.47% ±  1.41%

                                  ▁█                             
  ▅▂▂▂▂▁▂▂▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▅███▅▄▄▄▆▆▆▅▅▅▅▅▄▄▃▃▃▃▃▂▂▂▂▂▂▂ ▃
  87.2 μs          Histogram: frequency by time          163 μs <

 Memory estimate: 800.08 KiB, allocs estimate: 3.

So the only case where this seems to make a significant difference is for slicing the rows of a matrix, which in hindsight is not so surprising. Even then the question is how significant this is within the larger simulation.

@Jutho
Member

Jutho commented May 16, 2025

non-sorted singular or eigenvalues are really the exception

I guess methods that know the values are sorted could immediately call findtruncated_sorted, so I am not sure I agree with Lukas's objections. For eigenvalues, the order is also not obvious: I guess you typically have eigenvalues of a density matrix in mind, so you know that they are positive and you would like to see them sorted in decreasing order. But the default order of the Hermitian eigensolver will be increasing order, I believe. For the non-Hermitian case, I don't even think there is any fixed order that can be expected.

@mtfishman
Collaborator Author

mtfishman commented May 16, 2025

Sorry @Jutho, I misread your post. Thanks for analyzing that. I wonder if there is more significant difference for GPU arrays. Also, is there some advantage that you can more easily take views and preserve that they are strided, or even do certain truncations in-place if they are ranges (for example with resizing)? I also doubt the overhead of type instability is significant here since it is just a small union that should get handled with union splitting.
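The strided-view question can be checked directly in the REPL: a range-indexed view keeps a strided memory layout, while a vector-indexed view does not.

```julia
U = randn(4, 4)

# A view sliced with a range has a regular stride pattern, so it is a
# StridedArray and can be passed directly to BLAS/LAPACK routines.
view(U, :, 1:2) isa StridedArray   # true

# A view sliced with a general index vector has no fixed stride, so it
# falls outside the StridedArray union.
view(U, :, [1, 2]) isa StridedArray   # false
```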

I think the broader strategy I'm trying to take with this PR is being more agnostic about where the spectrum came from and then what will be done with it after, since findtruncated has very little context for that right now.

I'm open to considering alternatives to this design. One of my objections to the previous design was that some truncation strategies handled unsorted values while others didn't. Documenting that helps to some extent, but I think we should be more systematic about that distinction in some way. Some alternatives to the current design that I can think of are:

  1. Introduce a naming convention for truncation strategies to indicate whether or not they assume the spectrum is sorted, and/or introduce an abstract type SortedTruncationStrategy <: TruncationStrategy that truncation strategies that assume the spectrum is sorted can subtype.
  2. Have findtruncated accept more arguments about the factorization, for example findtruncated(values, strategy, f, A, alg) (obviously we can discuss which arguments would be accepted and what the argument order would be). Then, based on the factorization, matrix type, algorithm, etc. we can assume the spectrum will or will not be sorted and customize findtruncated appropriately.
  3. Instead of extracting the spectrum with diagview, having an interface spectrum that can fall back to diagview, but could add a wrapper indicating extra information about the spectrum, for example indicating that it is sorted.

These approaches could complement each other and complement the current PR, but I would still lean towards having something like the current PR, since I think findtruncated(values, strategy) can be considered a general fallback implementation and therefore should assume as little as possible about the factorization being performed.
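As a standalone illustration of option 1 (the concrete strategy name `KeepLargestSorted` and the helper `assumes_sorted` are hypothetical, not part of the package):

```julia
# Sketch of option 1: an abstract subtype marking strategies that assume a
# (reverse-)sorted spectrum. These definitions are simplified stand-ins for
# the package's actual types.
abstract type TruncationStrategy end
abstract type SortedTruncationStrategy <: TruncationStrategy end

# A strategy that assumes descending input can opt in by subtyping:
struct KeepLargestSorted <: SortedTruncationStrategy
    maxrank::Int
end

# Generic code can then branch on the supertype, trait-style:
assumes_sorted(::SortedTruncationStrategy) = true
assumes_sorted(::TruncationStrategy) = false
```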

@Jutho
Member

Jutho commented May 16, 2025

The current truncate! implementation, i.e.

function truncate!(::typeof(svd_trunc!), (U, S, Vᴴ), strategy::TruncationStrategy)
ind = findtruncated(diagview(S), strategy)
return U[:, ind], Diagonal(diagview(S)[ind]), Vᴴ[ind, :]
end
function truncate!(::typeof(eig_trunc!), (D, V), strategy::TruncationStrategy)
ind = findtruncated(diagview(D), strategy)
return Diagonal(diagview(D)[ind]), V[:, ind]
end
function truncate!(::typeof(eigh_trunc!), (D, V), strategy::TruncationStrategy)
ind = findtruncated(diagview(D), strategy)
return Diagonal(diagview(D)[ind]), V[:, ind]
end
function truncate!(::typeof(left_null!), (U, S), strategy::TruncationStrategy)
# TODO: avoid allocation?
extended_S = vcat(diagview(S), zeros(eltype(S), max(0, size(S, 1) - size(S, 2))))
ind = findtruncated(extended_S, strategy)
return U[:, ind]
end
function truncate!(::typeof(right_null!), (S, Vᴴ), strategy::TruncationStrategy)
# TODO: avoid allocation?
extended_S = vcat(diagview(S), zeros(eltype(S), max(0, size(S, 2) - size(S, 1))))
ind = findtruncated(extended_S, strategy)
return Vᴴ[ind, :]
end

does not specialize on the factorisation arguments, i.e. (U, S, V), (V, D), etc. Does this implementation make sense for anything beyond the matrix case? It certainly wouldn't work for TensorKit tensors, but I don't know about ITensors. If so, can we specialize on e.g. USV::Tuple{AbstractMatrix,AbstractVector,AbstractMatrix} and then directly call findtruncated_sorted?

We could also have findtruncated_sorted return a range, and then when findtruncated_sorted is called from the general findtruncated, the latter collects the range into a general vector to make it completely type stable.

Regarding your other questions: yes, a view with a range would be strided, whereas a view with a general vector would not be, but I don't know if this is very relevant here. And whereas you could in principle interpret throwing away the final columns of a matrix as a resize! operation, Julia only allows this since 1.11. This is what you would get in older versions:


julia> a = randn(100);

julia> m = reshape(a, (10, 10));

julia> m2 = reshape(resize!(a, 90), (10, 9))
ERROR: cannot resize array with shared data
Stacktrace:
 [1] _deleteend!
   @ ./array.jl:1081 [inlined]
 [2] resize!(a::Vector{Float64}, nl::Int64)
   @ Base ./array.jl:1320
 [3] top-level scope
   @ REPL[18]:1

@mtfishman
Collaborator Author

The current truncate! implementation, i.e.

function truncate!(::typeof(svd_trunc!), (U, S, Vᴴ), strategy::TruncationStrategy)
ind = findtruncated(diagview(S), strategy)
return U[:, ind], Diagonal(diagview(S)[ind]), Vᴴ[ind, :]
end
function truncate!(::typeof(eig_trunc!), (D, V), strategy::TruncationStrategy)
ind = findtruncated(diagview(D), strategy)
return Diagonal(diagview(D)[ind]), V[:, ind]
end
function truncate!(::typeof(eigh_trunc!), (D, V), strategy::TruncationStrategy)
ind = findtruncated(diagview(D), strategy)
return Diagonal(diagview(D)[ind]), V[:, ind]
end
function truncate!(::typeof(left_null!), (U, S), strategy::TruncationStrategy)
# TODO: avoid allocation?
extended_S = vcat(diagview(S), zeros(eltype(S), max(0, size(S, 1) - size(S, 2))))
ind = findtruncated(extended_S, strategy)
return U[:, ind]
end
function truncate!(::typeof(right_null!), (S, Vᴴ), strategy::TruncationStrategy)
# TODO: avoid allocation?
extended_S = vcat(diagview(S), zeros(eltype(S), max(0, size(S, 2) - size(S, 1))))
ind = findtruncated(extended_S, strategy)
return Vᴴ[ind, :]
end

does not specialize on the factorisation arguments, i.e. (U, S, V), (V, D), etc. Does this implementation make sense for anything beyond the matrix case? It certainly wouldn't work for TensorKit tensors, but I don't know about ITensors. If so, can we specialize on e.g. USV::Tuple{AbstractMatrix,AbstractVector,AbstractMatrix} and then directly call findtruncated_sorted?

That's a good point, customizing at the level of truncate! may be the best way to go and should cover the cases I have in mind. A use case I have in mind is truncating a block sparse matrix, and indeed truncate! would need to be customized for that case anyway to take into account the block structure.

We could also have findtruncated_sorted return a range, and then when findtruncated_sorted is called from the general findtruncated, the latter collects the range into a general vector to make it completely type stable.

I like that proposal.

Regarding your other questions: yes, a view with a range would be strided, whereas a view with a general vector would not be, but I don't know if this is very relevant here. And whereas you could in principle interpret throwing away the final columns of a matrix as a resize! operation, Julia only allows this since 1.11. This is what you would get in older versions:


julia> a = randn(100);

julia> m = reshape(a, (10, 10));

julia> m2 = reshape(resize!(a, 90), (10, 9))
ERROR: cannot resize array with shared data
Stacktrace:
 [1] _deleteend!
   @ ./array.jl:1081 [inlined]
 [2] resize!(a::Vector{Float64}, nl::Int64)
   @ Base ./array.jl:1320
 [3] top-level scope
   @ REPL[18]:1

Those are fair points. My broader point was that I would like to be relatively agnostic about what users might want to do with the output of findtruncate, I think your suggestion above of having findtruncated_sorted return a range while findtruncated returns a vector is a good compromise so that types can take advantage of a range when that is beneficial.

@mtfishman
Collaborator Author

mtfishman commented May 16, 2025

Though on a related note, @lkdvos and I were discussing that it would be helpful for truncate! to accept the algorithm type as an argument to allow for more customization, since depending on the algorithm it could change how the spectrum gets output (for example, if you are taking the SVD of a block diagonal matrix, you could imagine that one algorithm preserves the block structure and the spectrum is only sorted locally within the blocks while another algorithm doesn't preserve the block structure and therefore the spectrum is sorted globally). EDIT: I've raised an issue about that here: #27.

@mtfishman
Collaborator Author

I'm definitely fine trying to avoid the type instability in this case, but for posterity:

julia> f(x, n) = x > 0.5 ? g(n) : h(n)
f (generic function with 1 method)

julia> g(n) = 1:n
g (generic function with 1 method)

julia> h(n) = collect(1:n)
h (generic function with 1 method)

julia> @btime g(20);
  1.716 ns (0 allocations: 0 bytes)

julia> @btime f(1, 20);
  1.716 ns (0 allocations: 0 bytes)

julia> @btime h(20);
  31.349 ns (2 allocations: 224 bytes)

julia> @btime f(0.2, 20);
  32.018 ns (2 allocations: 224 bytes)

julia> v = randn(40);

julia> @btime $v[g(20)];
  32.557 ns (2 allocations: 224 bytes)

julia> @btime $v[f(1, 20)];
  35.749 ns (2 allocations: 224 bytes)

julia> @btime $v[h(20)];
  79.180 ns (4 allocations: 448 bytes)

julia> @btime $v[f(0.2, 20)];
  76.093 ns (4 allocations: 448 bytes)

I have been under the impression that when there are type instabilities that involve small unions the compiler can take advantage of union splitting to compile those cases into if-statements, so there is very little overhead.

@Jutho
Member

Jutho commented May 20, 2025

Is there anything that needs to happen here? I am happy with the approach in the current state of the PR. Should I approve so that this can be merged?

@lkdvos
Member

lkdvos commented May 20, 2025

Can I take a look at this tomorrow when I'm back in the office?

@mtfishman
Collaborator Author

Is there anything that needs to happen here? I am happy with the approach in the current state of the PR. Should I approve so that this can be merged?

I'm good with merging this, I don't have any more to do here. Lukas is traveling so maybe we want to wait for his feedback, it seemed like there was still some disagreement about the premise of this PR.

Member

@lkdvos lkdvos left a comment


I left some small comments about the implementations, and given the discussions below I think here is my revised opinion/suggestions:

I do like splitting the implementations between findtruncated_sorted and findtruncated_unsorted, to clearly distinguish what is being assumed.
However, there might still be some questions about the sorting here: when we say sorted, I think currently we are assuming real entries sorted in descending (reverse) order, which is what we are using now. This is really only the case for the singular values, so it might make sense to add a by and rev argument to findtruncated_sorted as well, to indicate that this is what we are assuming.

Additionally, I don't think the current implementations are being very consistent with handling complex values, and abs is only sometimes added. Maybe we can be slightly more careful about this, but this ties back a bit to my previous comment about what we are assuming for findtruncated_sorted.

Given the benchmarks above and the general implications of this, I don't think I really like the if issorted branch in the findtruncated implementation. Using the findtruncated_sorted implementation and converting to a Vector{Int} seems like a very hard-coded fix for the type instability, making assumptions about the output type of findtruncated_unsorted without making many restrictions on the input types.
It makes me wonder if it's not both easier, cleaner, more generic, and better to maintain in the future if we simply use findtruncated = findtruncated_unsorted, i.e. we assume unsorted by default. If the implementation sorts an input which is already sorted, this is very cheap anyway, so I don't think there are many drawbacks to this.


In general, when I first wrote this part of the code I wasn't necessarily thinking that findtruncated would be the specialization point, and this was more a way to avoid code duplication than the best point for customization.
As @mtfishman already mentioned, at this point in the algorithms we really are missing some information about what kind of spectrum is being provided, so writing very generic implementations/specializations is not that straightforward.
Given the comments in this PR, I would maybe even argue to get rid of findtruncated altogether, and simply keep findtruncated_sorted and findtruncated_unsorted instead.

@mtfishman
Collaborator Author

mtfishman commented May 21, 2025

I do like splitting the implementations between findtruncated_sorted and findtruncated_unsorted, to clearly distinguish what is being assumed. However, there might still be some questions about the sorting here: when we say sorted, I think currently we are assuming real entries sorted in descending (reverse) order, which is what we are using now. This is really only the case for the singular values, so it might make sense to add a by and rev argument to findtruncated_sorted as well, to indicate that this is what we are assuming.

Yes, that's a good point. I can add by and rev to make that more general.

Additionally, I don't think the current implementations are being very consistent with handling complex values, and abs is only sometimes added. Maybe we can be slightly more careful about this, but this ties back a bit to my previous comment about what we are assuming for findtruncated_sorted.

I can go through and make the default of by=abs and rev=true more consistent, and also make those customizable.

Given the benchmarks above and the general implications of this, I don't think I really like the if issorted branch in the findtruncated implementation. Using the findtruncated_sorted implementation and converting to a Vector{Int} seems like a very hard-coded fix for the type instability, making assumptions about the output type of findtruncated_unsorted without making many restrictions on the input types. It makes me wonder if it's not both easier, cleaner, more generic and better to maintain in the future if we simply use findtruncated = findtruncated_unsorted, ie we assume unsorted by default. If the implementation would try and sort an input which is already sorted, this is very cheap anyways, so I don't think there are that many drawbacks to this.

I was also wondering about the issorted branch, I'm happy to remove that and then define findtruncated = findtruncated_unsorted.

In general, when I first wrote this part of the code I wasn't necessarily thinking that findtruncated would be the specialization point, and this was more a way to avoid code duplication than the best point for customization. As @mtfishman already mentioned, at this point in the algorithms we really are missing some information about what kind of spectrum is being provided, so writing very generic implementations/specializations is not that straightforward. Given the comments in this PR, I would maybe even argue to get rid of findtruncated altogether, and simply keep findtruncated_sorted and findtruncated_unsorted instead.

I would bias towards keeping findtruncated just for the sake of having a shorter name, so maybe we could have findtruncated and findtruncated_sorted with the presumption that findtruncated doesn't assume the input is sorted unless there is more information, for example the spectrum outputs an object that indicates it is sorted? However, I might still bias towards having all three of findtruncated, findtruncated_unsorted, and findtruncated_sorted, and defaulting to findtruncated = findtruncated_unsorted in general. That leaves open the option that someone could define a SortedVector object which could overload findtruncated(values::SortedVector, strategy) = findtruncated_sorted(values, strategy). In other words, findtruncated could be thought of as the interface and findtruncated_unsorted/findtruncated_sorted can be thought of as implementations.
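As a standalone illustration of that last idea (the `SortedVector` wrapper and the simplified bodies below are hypothetical, not the package's actual definitions), a thin wrapper type could carry the sortedness information into the `findtruncated` interface:

```julia
# Hypothetical wrapper asserting its contents are reverse sorted.
struct SortedVector{T,V<:AbstractVector{T}} <: AbstractVector{T}
    data::V
end
Base.size(v::SortedVector) = size(v.data)
Base.getindex(v::SortedVector, i::Int) = v.data[i]

# Illustrative implementations for a single "keep above atol" rule:
findtruncated_sorted(values, atol) = 1:searchsortedlast(values, atol; rev=true)
findtruncated_unsorted(values, atol) = findall(≥(atol), values)

# The interface function defaults to the unsorted implementation, while the
# wrapper opts into the faster sorted path returning a range:
findtruncated(values, atol) = findtruncated_unsorted(values, atol)
findtruncated(values::SortedVector, atol) = findtruncated_sorted(values, atol)
```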

@Jutho
Member

Jutho commented May 21, 2025

I agree with the remarks of Lukas, but if the reasoning is that sorting an already sorted array is very cheap, then we can maybe ditch the findtruncated_sorted altogether and only have findtruncated which always first sorts. The problem appears further down the chain where we slice matrices with a vector instead of a range, but even that effect might be negligible in the context of a realistic use case.

Yet another strategy is to always compute the p=sortperm(...) in truncate!, and only use findtruncated_sorted on view(values, p), though that probably eliminates the possibility of using specialized methods like partialsortperm.
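That sortperm-based strategy could be sketched as follows (the helper names are hypothetical, and `findtruncated_sorted` here stands in for a rule that assumes descending input):

```julia
# Sorted rule for a "keep above atol" strategy, assuming descending values.
findtruncated_sorted(values, atol) = 1:searchsortedlast(values, atol; rev=true)

# Sort once up front, apply the sorted rule, then map the kept positions
# back to indices of the original (possibly unsorted) vector.
function findtruncated_via_sortperm(values, atol)
    p = sortperm(values; rev=true)           # permutation into descending order
    ind = findtruncated_sorted(values[p], atol)
    return p[ind]                            # indices into the original vector
end
```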

@mtfishman
Collaborator Author

For the sake of this PR, I would be fine with just having findtruncated that doesn't assume the spectrum is sorted, and we can leave open the possibility of reintroducing findtruncated_sorted if there is a compelling use case (i.e. where it is beneficial for performance, either in the logic of finding which values to keep or outputting a range which can be used for more efficient slicing).

@lkdvos
Member

lkdvos commented May 21, 2025

I'm still not entirely sure this is the right place for all of this, given how annoying it is to get all of this to line up properly.
It is starting to sound more and more like trying to reuse the code for singular values, eigenvalues, nullspaces, etc. is ending up hurting more than it helps, as the amount of complexity we are introducing to "generalize" is becoming as large as the implementations themselves.

truncate!(::typeof(f), out, strategy) might just be the more correct place for actually altering these implementations, since I still think that for f === svd_.. it should be allowed to assume that the values are real, positive and sorted in a descending manner, and whenever the type breaks these assumptions it should just specialize that function.
It might just be me, but it really feels like we are making this needlessly complicated by trying to handle every single edge case, which I think should just not be the goal here.

At the end of the day, to me the following seems like a very reasonable default implementation, and I am a bit hesitant to think that the infrastructure with sorted or unsorted find_truncated implementations is better, easier to maintain, or more clear in the long run.

function truncate!(::typeof(svd_trunc!), (U, S, Vᴴ), strategy::TruncationKeepBelow)
    values = diagview(S)
    atol = max(strategy.atol, strategy.rtol * norm(values))
    # values are descending, so the kept (below-tolerance) tail starts at the
    # first value ≤ atol
    ind = searchsortedfirst(values, atol; rev=true):length(values)
    return U[:, ind], Diagonal(values[ind]), Vᴴ[ind, :]
end

On a separate note, it seems like we are using the 1-norm for determining relative tolerances, which seems odd in the context of singular values, since I would argue 2-norms are more relevant there.

@mtfishman
Collaborator Author

Just to be clear, are you basically proposing getting rid of findtruncated entirely? I personally like that abstraction, and I don't really think the implementation is all that complicated (I think it is the discussion that is getting complicated, which is fine since we should get the design to be as good and general as possible). I personally wouldn't want to have to repeat the logic of what values should be kept for the TruncationKeepBelow strategy for each factorization that might want to use it, I think it is nice to have one layer that just analyzes the spectrum and decides which values to keep and another that performs the truncation given the results of that. Of course, in some cases those steps will need to be merged/customized at which point you can directly overload truncate! and circumvent findtruncated as you show.

@mtfishman
Collaborator Author

mtfishman commented May 21, 2025

On a separate note, it seems like we are using the 1-norm determining relative tolerances, which seems odd in the context of singular values since I would argue 2-norms are more relevant there.

Agreed. EDIT: Previously it was actually using the infinity norm (the maximum value) but I changed it to use the 2-norm by default, with the option to specify the norm.

@mtfishman
Collaborator Author

As a concrete step, I'll try to collate all of the feedback above into this PR and we'll see how general and simple we can make findtruncated, and decide from there.

I'll add that a nontrivial usage of the findtruncated logic is truncation strategies that are composed or wrapped around each other. I think that logic is both easier to do at the findtruncated level and more efficient, since you can just work at the level of the spectrum rather than perform multiple truncations of the actual matrices (i.e. you could compose two truncation strategies at the truncate! level but that doesn't seem like the right level to do that).
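The composition point can be made concrete with a small standalone sketch (all names hypothetical): composing at the level of index sets means the matrices are sliced only once, with the final index set.

```julia
# Two simple truncation rules expressed at the level of the spectrum:
keep_above(values, atol) = findall(≥(atol), values)
keep_largest(values, k) =
    partialsortperm(values, 1:min(k, length(values)); rev=true)

# Composed rule: keep at most k values, all of them above atol, by
# intersecting the index sets; U, S, Vᴴ would then be sliced once with this.
function keep_above_and_largest(values, atol, k)
    return sort(intersect(keep_above(values, atol), keep_largest(values, k)))
end
```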

@mtfishman
Collaborator Author

mtfishman commented May 21, 2025

Another option is one that I suggested above which is to pass more information to findtruncated about the factorization, i.e.:

function findtruncated(::typeof(svd_trunc!), (U, S, Vᴴ), strategy::TruncationKeepBelow)
    values = diagview(S)
    atol = max(strategy.atol, strategy.rtol * norm(values))
    return searchsortedfirst(values, atol; rev=true):length(values)
end

function truncate!(::typeof(svd_trunc!), (U, S, Vᴴ), strategy::TruncationStrategy)
    ind = findtruncated(svd_trunc!, (U, S, Vᴴ), strategy)
    return U[:, ind], Diagonal(diagview(S)[ind]), Vᴴ[ind, :]
end

There could still be a general fallback of findtruncated based on just the values that doesn't assume anything about the spectrum.

@mtfishman
Collaborator Author

I think I've addressed all of the comments. In the latest commits, I:

  • changed sortby to by in TruncationKeepSorted,
  • simplified the logic of findtruncated, so now there is findtruncated that is general and doesn't assume the input is sorted, and findtruncated_sorted that assumes the input is reverse sorted by absolute value. findtruncated_sorted is only used in the truncation step of svd_trunc!. Also, findtruncated_sorted calls findtruncated for strategies where findtruncated_sorted isn't defined, which helps with generality so not all strategies have to define it,
  • changed TruncationKeepAbove and TruncationKeepBelow to use the 2-norm by default for the relative tolerance, and the norm value is customizable with a new field p, and
  • simplified implementations of findtruncated based on suggestions from Lukas.

This is good to go from my end, let me know what you think.

lkdvos
lkdvos previously approved these changes May 22, 2025
Member

@lkdvos lkdvos left a comment


This definitely looks good to me.
Given that we now really heavily use and rely on findtruncated and findtruncated_sorted, it might make sense to add docstrings to these (especially to document the assumptions for the latter), but otherwise I would be happy to merge this.

Just double-checking here: I don't think this counts as a breaking change, since fieldnames are internal and the rest is fixes and improvements, so we might actually just tag this?

@mtfishman
Collaborator Author

mtfishman commented May 22, 2025

This definitely looks good to me. Given that we now really heavily use and rely on findtruncated and findtruncated_sorted, it might make sense to add docstrings to these (especially to document the assumptions for the latter), but otherwise I would be happy to merge this.

Sure, I can add docstrings and make them public.

Just double-checking here: I don't think this counts as a breaking change, since fieldnames are internal and the rest is fixes and improvements, so we might actually just tag this?

The issue is that the main branch has breaking changes that haven't been registered (the last registered version was v0.1.2, but main is up to v0.2 because of the breaking changes), and this is built on top of that. So I think after we merge this we should register v0.2.

Co-authored-by: Jutho <[email protected]>
lkdvos
lkdvos previously approved these changes May 22, 2025
Member

@lkdvos lkdvos left a comment


Oh, I thought v0.2 got tagged last week, my bad. That makes everything even easier, we can simply tag 0.2?

Jutho
Jutho previously approved these changes May 22, 2025
@mtfishman mtfishman dismissed stale reviews from Jutho and lkdvos via ee4ea74 May 22, 2025 15:44
@mtfishman mtfishman merged commit 1e86aea into main May 22, 2025
9 checks passed
@mtfishman mtfishman deleted the mf/unsorted_truncation branch May 22, 2025 16:15