[Performance] `TreeTransformer` refactor + multithreading #251

lkdvos · 2025-07-01T15:30:35Z

This PR changes multiple things, following a number of discussions and brainstorming sessions with @ogauthe over the past weeks.

Firstly, I simplified the AbelianTreeTransformer so that it only stores the relevant information, i.e. the object no longer needs to hold on to the entire fusion-tree structure of the tensor and simply keeps the required data in a Vector. This also changes the implementation from "struct-of-arrays" to "array-of-structs", which should slightly improve data locality (although this is probably negligible).
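
To make the layout change concrete, here is a minimal sketch of the two storage strategies for the Abelian case, where each source block maps to a single destination block with a scalar coefficient. The type and field names (`BlockEntry`, `TransformerSoA`, `TransformerAoS`, `apply!`) are hypothetical and chosen for illustration only; they are not the actual fields of `AbelianTreeTransformer`.

```julia
# Hypothetical per-block record; names are illustrative, not TensorKit's.
struct BlockEntry
    coeff::Float64        # scalar recoupling coefficient
    src::UnitRange{Int}   # location of the block in the source data
    dst::UnitRange{Int}   # location of the block in the destination data
end

# "Struct-of-arrays": one vector per field, so entry i is spread over three arrays.
struct TransformerSoA
    coeffs::Vector{Float64}
    srcs::Vector{UnitRange{Int}}
    dsts::Vector{UnitRange{Int}}
end

# "Array-of-structs": everything needed for block i is stored side by side.
struct TransformerAoS
    entries::Vector{BlockEntry}
end

function apply!(dst::Vector{Float64}, src::Vector{Float64}, t::TransformerSoA)
    for i in eachindex(t.coeffs)
        dst[t.dsts[i]] .= t.coeffs[i] .* view(src, t.srcs[i])
    end
    return dst
end

function apply!(dst::Vector{Float64}, src::Vector{Float64}, t::TransformerAoS)
    for e in t.entries
        dst[e.dst] .= e.coeff .* view(src, e.src)
    end
    return dst
end

entries = [BlockEntry(2.0, 1:2, 3:4), BlockEntry(-1.0, 3:4, 1:2)]
aos = TransformerAoS(entries)
soa = TransformerSoA([e.coeff for e in entries],
                     [e.src for e in entries],
                     [e.dst for e in entries])

data = [1.0, 2.0, 3.0, 4.0]
out_aos = apply!(zeros(4), data, aos)
out_soa = apply!(zeros(4), data, soa)
```

Both layouts produce the same result; the only difference is that the array-of-structs loop touches one contiguous record per block instead of three separate arrays.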

Secondly, motivated by @ogauthe, I experimented with grouping the GenericTreeTransformer over the different subblocks with the same uncoupled charges. These subblocks have the same sizes, and as a result it is possible to concatenate them into a contiguous buffer to perform the recoupling through a BLAS mul! call. If this transformation is indeed given by a dense matrix of N x N coefficients with subblocks of length L, this changes the implementation from:

previously: N^2 calls to tensoradd!, i.e. N^2 scaled permutations of L elements
now: N almost-contiguous copies of L elements, one BLAS call costing ~N^2 L operations, and N permutations of L elements.

The main idea is that at least for the dominant part of this algorithm, the data is kept contiguous and we can make use of efficient BLAS routines, effectively changing the number of non-optimal access operations from ~ N^2 to ~ N.
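
The two strategies can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR: the function names `recouple_pairwise!` and `recouple_buffered!` are made up, the blocks are plain dense arrays, and the final scatter is a plain copy, whereas the real implementation performs a permutation at that step.

```julia
using LinearAlgebra

# U is the N×N recoupling matrix; src_blocks and dst_blocks hold N blocks of L elements.
# Old strategy: N^2 scaled additions, one per coefficient U[i, j].
function recouple_pairwise!(dst_blocks, U, src_blocks)
    N = length(src_blocks)
    for i in 1:N
        fill!(dst_blocks[i], 0)
        for j in 1:N
            dst_blocks[i] .+= U[i, j] .* src_blocks[j]
        end
    end
    return dst_blocks
end

# New strategy: gather into a contiguous L×N buffer, one matrix product, scatter back.
function recouple_buffered!(dst_blocks, U, src_blocks)
    N = length(src_blocks)
    L = length(first(src_blocks))
    buf_in = Matrix{eltype(U)}(undef, L, N)
    buf_out = similar(buf_in)
    for j in 1:N                          # N copies of L elements
        copyto!(view(buf_in, :, j), src_blocks[j])
    end
    mul!(buf_out, buf_in, transpose(U))   # single BLAS call, ~N^2 L operations
    for i in 1:N                          # N copies (permutations in the real code)
        copyto!(dst_blocks[i], view(buf_out, :, i))
    end
    return dst_blocks
end

N, L = 4, 6
U = [sin(i + 2 * j) for i in 1:N, j in 1:N]   # deterministic test coefficients
src_blocks = [reshape(collect(1.0:L) .+ 10 * j, 2, 3) for j in 1:N]

d1 = recouple_pairwise!([zeros(2, 3) for _ in 1:N], U, src_blocks)
d2 = recouple_buffered!([zeros(2, 3) for _ in 1:N], U, src_blocks)
```

Column `i` of `buf_out` equals the sum over `j` of `U[i, j]` times block `j`, so both functions compute the same result; only the memory access pattern differs.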

Finally, I also went ahead and actually implemented the multi-threading, motivated by the implementation with an atomic Int in #100. This seems like a very lightweight implementation that balances loads quite well, and has no external dependencies.
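
For readers unfamiliar with the pattern, a minimal sketch of load balancing with a shared atomic counter looks roughly like this. The helper name `foreach_dynamic` and its keyword `nworkers` are hypothetical; the sketch only illustrates the idea and is not the code from #100 or from this PR.

```julia
using Base.Threads: Atomic, atomic_add!, @spawn, nthreads

# Run work!(i) for i in 1:ntasks. Each worker repeatedly claims the next
# unprocessed index from a shared counter, so fast workers take more items.
function foreach_dynamic(work!, ntasks::Int; nworkers::Int = nthreads())
    counter = Atomic{Int}(1)
    @sync for _ in 1:min(nworkers, ntasks)
        @spawn while true
            i = atomic_add!(counter, 1)   # returns the old value, then increments
            i > ntasks && break
            work!(i)
        end
    end
    return nothing
end

costs = [1000 * i for i in 1:20]          # deliberately uneven work items
results = zeros(Int, length(costs))
foreach_dynamic(length(costs)) do i
    results[i] = sum(1:costs[i])          # each index is written by exactly one task
end
```

Because every index is handed out exactly once, no locking is needed as long as the work items write to disjoint memory, and the only dependency is `Base.Threads`.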

I think the major remaining tasks are to benchmark these implementations, presumably in real-world scenarios such as MPSKit and PEPSKit, and possibly to find some heuristic that disables the multi-threading for tensors that are too small for the benefits to outweigh the overhead.

!!! note
Before merging I'd like to add @ogauthe to the co-author list of this PR, given his involvement with figuring this out.

codecov · 2025-07-01T19:18:09Z

Codecov Report

Attention: Patch coverage is 84.65347% with 31 lines in your changes missing coverage. Please review.

Project coverage is 82.90%. Comparing base (ffa6cf9) to head (1016b5a).
Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/tensors/indexmanipulations.jl | 75.26% | 23 Missing ⚠️ |
| src/TensorKit.jl | 14.28% | 6 Missing ⚠️ |
| src/tensors/treetransformers.jl | 97.67% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #251      +/-   ##
==========================================
- Coverage   83.08%   82.90%   -0.19%     
==========================================
  Files          44       44              
  Lines        5647     5749     +102     
==========================================
+ Hits         4692     4766      +74     
- Misses        955      983      +28

Jutho

This is a great PR, and I am fully supportive of the new strategy, which is clearly (in hindsight) much more sensible. I left a few comments, but I don't think any of them is really essential.

lkdvos · 2025-07-07T16:06:04Z

Regarding the sorting and the cost model of the steps: given the current multi-threading implementation, and the fact that the buffer sizes are determined independently, these are no longer really relevant. I'm happy to leave them in, in case anyone wants to experiment with different threading strategies, but otherwise I don't think they have any meaningful impact.

Rewrite treetransformers

Introduce `StridedStructure`

lkdvos · 2025-07-08T00:33:38Z

So, after running some benchmarks, the results are in. This is clearly an improvement, but it is also important to note that it only starts making a difference once the number of legs becomes large; for the MPO contractions it doesn't seem to be as relevant.

These are the results for running everything single-threaded.
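
As a reading aid for the `judge` output in this comment: the percentage is the relative change in time of the first argument with respect to the second, so a negative value means the first argument is faster. The sketch below converts a few of the reported percentages into time ratios and speedup factors. The helper names are made up, the classification rule is my understanding of how BenchmarkTools assigns its verdicts, and the tolerance of 0.15 is an assumption inferred from the reported verdicts (for example -14.86% is `invariant` while -15.14% is `improvement`); it is not stated anywhere in this thread.

```julia
# Time ratio (first argument / second argument) from a reported percentage change.
ratio_from_percent(p) = 1 + p / 100

# Speedup factor; values > 1 mean the first argument of `judge` is faster.
speedup(p) = 1 / ratio_from_percent(p)

# Assumed classification rule; `tol` is a relative tolerance (0.15 is inferred, not confirmed).
function classify(p; tol = 0.15)
    r = ratio_from_percent(p)
    r - tol > 1 && return :regression
    r + tol < 1 && return :improvement
    return :invariant
end

for p in (-92.40, -70.38, -14.86, -15.14, 18.60, 249.25)
    println(rpad(string(p, "%"), 10),
            " ratio = ", round(ratio_from_percent(p); digits = 3),
            "  speedup = ", round(speedup(p); digits = 2), "x",
            "  => ", classify(p))
end
```

For example, -92.40% corresponds to roughly a 13x speedup, while +249.25% means the run takes about 3.5 times as long.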

MPO contraction

julia> group = "mpo";

julia> comparison = collect(judge(result_est_new[group], result_est_main[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", "[160, 5, 3]", 2), TrialJudgement(+6.98% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[200, 20, 20]", 2), TrialJudgement(-8.49% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[2560, 5, 3]", 2), TrialJudgement(-1.18% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[40, 5, 3]", 2), TrialJudgement(+0.16% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 20, 20]", 2), TrialJudgement(+3.42% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 40, 40]", 2), TrialJudgement(+2.80% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6120, 5, 3]", 2), TrialJudgement(+0.35% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[640, 5, 3]", 2), TrialJudgement(-10.36% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[10, 4, 3]", "nothing"), TrialJudgement(-1.89% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[100, 10, 10]", "nothing"), TrialJudgement(-4.63% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[160, 4, 3]", "nothing"), TrialJudgement(-6.55% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[200, 10, 10]", "nothing"), TrialJudgement(-2.63% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[2560, 4, 3]", "nothing"), TrialJudgement(-0.02% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[300, 20, 20]", "nothing"), TrialJudgement(-1.21% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[40, 4, 3]", "nothing"), TrialJudgement(-5.92% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[640, 4, 3]", "nothing"), TrialJudgement(-0.38% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[160, 5, 3]", 0.5), TrialJudgement(-13.16% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[200, 20, 20]", 0.5), TrialJudgement(-8.85% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[2560, 5, 3]", 0.5), TrialJudgement(+0.55% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[40, 5, 3]", 0.5), TrialJudgement(-5.27% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[400, 20, 20]", 0.5), TrialJudgement(-4.36% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[400, 40, 40]", 0.5), TrialJudgement(-0.82% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6120, 5, 3]", 0.5), TrialJudgement(+0.78% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[640, 5, 3]", 0.5), TrialJudgement(-3.15% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[10, 4, 4]", 0.5), TrialJudgement(-4.24% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[100, 10, 10]", 0.5), TrialJudgement(-11.83% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[160, 4, 4]", 0.5), TrialJudgement(-18.93% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[200, 10, 10]", 0.5), TrialJudgement(-6.15% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[2560, 4, 4]", 0.5), TrialJudgement(+0.11% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[300, 20, 20]", 0.5), TrialJudgement(-6.32% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[40, 4, 4]", 0.5), TrialJudgement(-4.59% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[640, 4, 4]", 0.5), TrialJudgement(-3.05% => invariant))

MERA contraction

julia> group = "mera";

julia> comparison = collect(judge(result_est_new[group], result_est_main[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
24-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", 4, 2.0), TrialJudgement(-70.38% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 8, 2.0), TrialJudgement(-86.74% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 12, 2.0), TrialJudgement(-83.67% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 16, 2.0), TrialJudgement(-92.40% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 22, 2.0), TrialJudgement(-87.01% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 28, 2.0), TrialJudgement(-81.28% => improvement))
 Pair{Any, Any}(("Float64", "Trivial", 2, "nothing"), TrialJudgement(-1.19% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 3, "nothing"), TrialJudgement(-3.45% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 4, "nothing"), TrialJudgement(-11.41% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 8, "nothing"), TrialJudgement(-1.69% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 12, "nothing"), TrialJudgement(+1.75% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 16, "nothing"), TrialJudgement(+1.55% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", 4, 0.5), TrialJudgement(-1.29% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", 8, 0.5), TrialJudgement(-4.32% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", 12, 0.5), TrialJudgement(+4.31% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", 16, 0.5), TrialJudgement(+1.63% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", 22, 0.5), TrialJudgement(+2.38% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", 28, 0.5), TrialJudgement(-0.75% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 2, 0.5), TrialJudgement(-2.61% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 4, 0.5), TrialJudgement(-2.52% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 8, 0.5), TrialJudgement(-1.75% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 12, 0.5), TrialJudgement(-2.78% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 16, 0.5), TrialJudgement(+1.90% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 20, 0.5), TrialJudgement(+1.64% => invariant))

PEPO contraction

julia> group = "pepo";

julia> comparison = collect(judge(result_est_new[group], result_est_main[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 2, 2, 50]", 2.0), TrialJudgement(-90.86% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 3, 2, 100]", 2.0), TrialJudgement(-87.67% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 2, 2, 100]", 2.0), TrialJudgement(-81.93% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 4, 4, 200]", 2.0), TrialJudgement(-88.23% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 2, 2, 100]", 2.0), TrialJudgement(-78.96% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 3, 4, 200]", 2.0), TrialJudgement(-73.29% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 2, 100]", 2.0), TrialJudgement(-87.60% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 4, 200]", 2.0), TrialJudgement(-87.39% => improvement))
 Pair{Any, Any}(("Float64", "Trivial", "[3, 2, 2, 50]", "nothing"), TrialJudgement(-5.06% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[3, 3, 3, 100]", "nothing"), TrialJudgement(+2.74% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[4, 2, 2, 50]", "nothing"), TrialJudgement(+3.78% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[4, 3, 3, 100]", "nothing"), TrialJudgement(+5.92% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 2, 50]", "nothing"), TrialJudgement(-1.71% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 3, 100]", "nothing"), TrialJudgement(+1.51% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[6, 2, 2, 50]", "nothing"), TrialJudgement(-1.55% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[6, 3, 2, 100]", "nothing"), TrialJudgement(-0.11% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[10, 2, 2, 50]", 0.5), TrialJudgement(+10.12% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[10, 3, 2, 100]", 0.5), TrialJudgement(-2.37% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[4, 2, 2, 100]", 0.5), TrialJudgement(-1.15% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[4, 4, 4, 200]", 0.5), TrialJudgement(-1.30% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6, 2, 2, 100]", 0.5), TrialJudgement(-7.47% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6, 3, 4, 200]", 0.5), TrialJudgement(+2.63% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 2, 100]", 0.5), TrialJudgement(-1.32% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 4, 200]", 0.5), TrialJudgement(-2.61% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 2, 2, 50]", 0.5), TrialJudgement(-9.19% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 4, 4, 100]", 0.5), TrialJudgement(+2.24% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 2, 2, 50]", 0.5), TrialJudgement(-2.64% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 3, 4, 100]", 0.5), TrialJudgement(+2.56% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 2, 50]", 0.5), TrialJudgement(+5.36% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 4, 100]", 0.5), TrialJudgement(+0.42% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 2, 2, 50]", 0.5), TrialJudgement(+1.03% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 3, 2, 100]", 0.5), TrialJudgement(+3.10% => invariant))

For completeness, I did a similar experiment, enabling only the transformer multi-threading with 2 and 4 threads and without the limitation on the data sizes, to get an idea of the point at which threading starts to help. There are definitely gains to be had, but it would be interesting to experiment with distributing threads among the different drivers in real-world settings to actually gauge the effect.

2 threads vs 1 thread

julia> group = "mpo";

julia> comparison = collect(judge(result_est_new2[group], result_est_new[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", "[160, 5, 3]", 2), TrialJudgement(+31.86% => regression))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[200, 20, 20]", 2), TrialJudgement(-1.91% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[2560, 5, 3]", 2), TrialJudgement(-13.55% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[40, 5, 3]", 2), TrialJudgement(+60.35% => regression))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 20, 20]", 2), TrialJudgement(-12.53% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 40, 40]", 2), TrialJudgement(-19.24% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6120, 5, 3]", 2), TrialJudgement(-13.69% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[640, 5, 3]", 2), TrialJudgement(+0.17% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[10, 4, 3]", "nothing"), TrialJudgement(+18.60% => regression))
 Pair{Any, Any}(("Float64", "Trivial", "[100, 10, 10]", "nothing"), TrialJudgement(-0.72% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[160, 4, 3]", "nothing"), TrialJudgement(-2.88% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[200, 10, 10]", "nothing"), TrialJudgement(+11.17% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[2560, 4, 3]", "nothing"), TrialJudgement(+0.06% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[300, 20, 20]", "nothing"), TrialJudgement(-1.11% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[40, 4, 3]", "nothing"), TrialJudgement(+7.79% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[640, 4, 3]", "nothing"), TrialJudgement(-0.92% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[160, 5, 3]", 0.5), TrialJudgement(+7.49% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[200, 20, 20]", 0.5), TrialJudgement(-8.89% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[2560, 5, 3]", 0.5), TrialJudgement(-1.87% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[40, 5, 3]", 0.5), TrialJudgement(+107.91% => regression))
 Pair{Any, Any}(("Float64", "U1Irrep", "[400, 20, 20]", 0.5), TrialJudgement(-6.33% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[400, 40, 40]", 0.5), TrialJudgement(-7.37% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6120, 5, 3]", 0.5), TrialJudgement(-0.76% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[640, 5, 3]", 0.5), TrialJudgement(-4.32% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[10, 4, 4]", 0.5), TrialJudgement(+249.25% => regression))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[100, 10, 10]", 0.5), TrialJudgement(-3.48% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[160, 4, 4]", 0.5), TrialJudgement(+6.42% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[200, 10, 10]", 0.5), TrialJudgement(-11.77% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[2560, 4, 4]", 0.5), TrialJudgement(-12.16% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[300, 20, 20]", 0.5), TrialJudgement(-22.22% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[40, 4, 4]", 0.5), TrialJudgement(+87.99% => regression))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[640, 4, 4]", 0.5), TrialJudgement(-10.93% => invariant))

julia> group = "mera";

julia> comparison = collect(judge(result_est_new2[group], result_est_new[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
24-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", 4, 2.0), TrialJudgement(-0.35% => invariant))
 Pair{Any, Any}(("Float64", "SU2Irrep", 8, 2.0), TrialJudgement(-34.05% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 12, 2.0), TrialJudgement(-39.23% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 16, 2.0), TrialJudgement(-46.35% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 22, 2.0), TrialJudgement(-35.32% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 28, 2.0), TrialJudgement(-37.79% => improvement))
 Pair{Any, Any}(("Float64", "Trivial", 2, "nothing"), TrialJudgement(+6.86% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 3, "nothing"), TrialJudgement(+6.61% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 4, "nothing"), TrialJudgement(+13.17% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 8, "nothing"), TrialJudgement(+0.89% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 12, "nothing"), TrialJudgement(-2.70% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 16, "nothing"), TrialJudgement(-0.93% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", 4, 0.5), TrialJudgement(+131.32% => regression))
 Pair{Any, Any}(("Float64", "U1Irrep", 8, 0.5), TrialJudgement(+18.74% => regression))
 Pair{Any, Any}(("Float64", "U1Irrep", 12, 0.5), TrialJudgement(-24.76% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", 16, 0.5), TrialJudgement(-16.73% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", 22, 0.5), TrialJudgement(-15.23% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", 28, 0.5), TrialJudgement(-10.36% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 2, 0.5), TrialJudgement(+122.77% => regression))
 Pair{Any, Any}(("Float64", "Z2Irrep", 4, 0.5), TrialJudgement(+53.27% => regression))
 Pair{Any, Any}(("Float64", "Z2Irrep", 8, 0.5), TrialJudgement(+1.62% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 12, 0.5), TrialJudgement(-17.29% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", 16, 0.5), TrialJudgement(-21.12% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", 20, 0.5), TrialJudgement(-17.54% => improvement))

julia> group = "pepo";

julia> comparison = collect(judge(result_est_new2[group], result_est_new[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 2, 2, 50]", 2.0), TrialJudgement(-48.27% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 3, 2, 100]", 2.0), TrialJudgement(-41.85% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 2, 2, 100]", 2.0), TrialJudgement(-42.04% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 4, 4, 200]", 2.0), TrialJudgement(-42.23% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 2, 2, 100]", 2.0), TrialJudgement(-41.14% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 3, 4, 200]", 2.0), TrialJudgement(-39.20% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 2, 100]", 2.0), TrialJudgement(-42.91% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 4, 200]", 2.0), TrialJudgement(-37.54% => improvement))
 Pair{Any, Any}(("Float64", "Trivial", "[3, 2, 2, 50]", "nothing"), TrialJudgement(+6.42% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[3, 3, 3, 100]", "nothing"), TrialJudgement(-7.24% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[4, 2, 2, 50]", "nothing"), TrialJudgement(+0.42% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[4, 3, 3, 100]", "nothing"), TrialJudgement(-5.51% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 2, 50]", "nothing"), TrialJudgement(-4.61% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 3, 100]", "nothing"), TrialJudgement(-5.79% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[6, 2, 2, 50]", "nothing"), TrialJudgement(-1.53% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[6, 3, 2, 100]", "nothing"), TrialJudgement(-0.97% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[10, 2, 2, 50]", 0.5), TrialJudgement(-25.02% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", "[10, 3, 2, 100]", 0.5), TrialJudgement(-14.86% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[4, 2, 2, 100]", 0.5), TrialJudgement(-10.95% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[4, 4, 4, 200]", 0.5), TrialJudgement(-3.78% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6, 2, 2, 100]", 0.5), TrialJudgement(-8.11% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6, 3, 4, 200]", 0.5), TrialJudgement(-16.78% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 2, 100]", 0.5), TrialJudgement(-13.30% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 4, 200]", 0.5), TrialJudgement(-10.33% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 2, 2, 50]", 0.5), TrialJudgement(-9.33% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 4, 4, 100]", 0.5), TrialJudgement(-20.25% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 2, 2, 50]", 0.5), TrialJudgement(-5.60% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 3, 4, 100]", 0.5), TrialJudgement(-21.14% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 2, 50]", 0.5), TrialJudgement(-23.69% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 4, 100]", 0.5), TrialJudgement(-20.99% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 2, 2, 50]", 0.5), TrialJudgement(-16.59% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 3, 2, 100]", 0.5), TrialJudgement(-22.42% => improvement))

4 threads vs 1 thread

julia> group = "mpo";

julia> comparison = collect(judge(result_est_new4[group], result_est_new[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", "[160, 5, 3]", 2), TrialJudgement(+19.33% => regression))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[200, 20, 20]", 2), TrialJudgement(-29.03% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[2560, 5, 3]", 2), TrialJudgement(-17.28% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[40, 5, 3]", 2), TrialJudgement(+53.86% => regression))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 20, 20]", 2), TrialJudgement(-30.31% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 40, 40]", 2), TrialJudgement(-26.26% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6120, 5, 3]", 2), TrialJudgement(-17.71% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[640, 5, 3]", 2), TrialJudgement(-15.14% => improvement))
 Pair{Any, Any}(("Float64", "Trivial", "[10, 4, 3]", "nothing"), TrialJudgement(+1.95% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[100, 10, 10]", "nothing"), TrialJudgement(-3.29% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[160, 4, 3]", "nothing"), TrialJudgement(-4.33% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[200, 10, 10]", "nothing"), TrialJudgement(-2.75% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[2560, 4, 3]", "nothing"), TrialJudgement(+0.27% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[300, 20, 20]", "nothing"), TrialJudgement(-0.13% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[40, 4, 3]", "nothing"), TrialJudgement(-10.46% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[640, 4, 3]", "nothing"), TrialJudgement(-1.99% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[160, 5, 3]", 0.5), TrialJudgement(+2.99% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[200, 20, 20]", 0.5), TrialJudgement(-12.24% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[2560, 5, 3]", 0.5), TrialJudgement(-1.93% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[40, 5, 3]", 0.5), TrialJudgement(+80.98% => regression))
 Pair{Any, Any}(("Float64", "U1Irrep", "[400, 20, 20]", 0.5), TrialJudgement(-8.21% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[400, 40, 40]", 0.5), TrialJudgement(-4.58% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6120, 5, 3]", 0.5), TrialJudgement(+0.90% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[640, 5, 3]", 0.5), TrialJudgement(-5.74% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[10, 4, 4]", 0.5), TrialJudgement(+298.92% => regression))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[100, 10, 10]", 0.5), TrialJudgement(-14.76% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[160, 4, 4]", 0.5), TrialJudgement(-7.60% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[200, 10, 10]", 0.5), TrialJudgement(-19.96% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[2560, 4, 4]", 0.5), TrialJudgement(-11.07% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[300, 20, 20]", 0.5), TrialJudgement(-26.18% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[40, 4, 4]", 0.5), TrialJudgement(+90.16% => regression))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[640, 4, 4]", 0.5), TrialJudgement(-14.73% => invariant))

julia> group = "mera";

julia> comparison = collect(judge(result_est_new4[group], result_est_new[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
24-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", 4, 2.0), TrialJudgement(-49.64% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 8, 2.0), TrialJudgement(-64.71% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 12, 2.0), TrialJudgement(-56.83% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 16, 2.0), TrialJudgement(-67.05% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 22, 2.0), TrialJudgement(-54.39% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", 28, 2.0), TrialJudgement(-53.22% => improvement))
 Pair{Any, Any}(("Float64", "Trivial", 2, "nothing"), TrialJudgement(+1.30% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 3, "nothing"), TrialJudgement(+14.43% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 4, "nothing"), TrialJudgement(+3.81% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 8, "nothing"), TrialJudgement(+2.96% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 12, "nothing"), TrialJudgement(-0.11% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", 16, "nothing"), TrialJudgement(+0.75% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", 4, 0.5), TrialJudgement(+171.77% => regression))
 Pair{Any, Any}(("Float64", "U1Irrep", 8, 0.5), TrialJudgement(+19.75% => regression))
 Pair{Any, Any}(("Float64", "U1Irrep", 12, 0.5), TrialJudgement(-23.47% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", 16, 0.5), TrialJudgement(-21.67% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", 22, 0.5), TrialJudgement(-18.58% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", 28, 0.5), TrialJudgement(-15.78% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", 2, 0.5), TrialJudgement(+178.65% => regression))
 Pair{Any, Any}(("Float64", "Z2Irrep", 4, 0.5), TrialJudgement(+75.30% => regression))
 Pair{Any, Any}(("Float64", "Z2Irrep", 8, 0.5), TrialJudgement(-14.56% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", 12, 0.5), TrialJudgement(-23.38% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", 16, 0.5), TrialJudgement(-25.95% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", 20, 0.5), TrialJudgement(-24.06% => improvement))

julia> group = "pepo";

julia> comparison = collect(judge(result_est_new4[group], result_est_new[group]));

julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
 Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 2, 2, 50]", 2.0), TrialJudgement(-68.36% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 3, 2, 100]", 2.0), TrialJudgement(-64.31% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 2, 2, 100]", 2.0), TrialJudgement(-62.56% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 4, 4, 200]", 2.0), TrialJudgement(-61.43% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 2, 2, 100]", 2.0), TrialJudgement(-56.69% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 3, 4, 200]", 2.0), TrialJudgement(-53.83% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 2, 100]", 2.0), TrialJudgement(-64.40% => improvement))
 Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 4, 200]", 2.0), TrialJudgement(-58.21% => improvement))
 Pair{Any, Any}(("Float64", "Trivial", "[3, 2, 2, 50]", "nothing"), TrialJudgement(+8.14% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[3, 3, 3, 100]", "nothing"), TrialJudgement(-4.44% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[4, 2, 2, 50]", "nothing"), TrialJudgement(+0.33% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[4, 3, 3, 100]", "nothing"), TrialJudgement(-1.24% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 2, 50]", "nothing"), TrialJudgement(-5.95% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 3, 100]", "nothing"), TrialJudgement(-1.74% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[6, 2, 2, 50]", "nothing"), TrialJudgement(+1.60% => invariant))
 Pair{Any, Any}(("Float64", "Trivial", "[6, 3, 2, 100]", "nothing"), TrialJudgement(+0.09% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[10, 2, 2, 50]", 0.5), TrialJudgement(-23.18% => improvement))
 Pair{Any, Any}(("Float64", "U1Irrep", "[10, 3, 2, 100]", 0.5), TrialJudgement(-12.33% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[4, 2, 2, 100]", 0.5), TrialJudgement(-6.54% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[4, 4, 4, 200]", 0.5), TrialJudgement(-2.42% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6, 2, 2, 100]", 0.5), TrialJudgement(-8.01% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[6, 3, 4, 200]", 0.5), TrialJudgement(-13.01% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 2, 100]", 0.5), TrialJudgement(-9.83% => invariant))
 Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 4, 200]", 0.5), TrialJudgement(-7.28% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 2, 2, 50]", 0.5), TrialJudgement(-13.39% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 4, 4, 100]", 0.5), TrialJudgement(-26.58% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 2, 2, 50]", 0.5), TrialJudgement(-12.03% => invariant))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 3, 4, 100]", 0.5), TrialJudgement(-30.31% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 2, 50]", 0.5), TrialJudgement(-29.04% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 4, 100]", 0.5), TrialJudgement(-27.33% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 2, 2, 50]", 0.5), TrialJudgement(-22.15% => improvement))
 Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 3, 2, 100]", 0.5), TrialJudgement(-26.79% => improvement))

In summary though, I think it is safe to conclude that this is an overall improvement, and we can safely merge this.

As a fair warning again: I want to add the `Co-authored-by` tag for Olivier, so if you approve merging tomorrow, I'll gladly merge it myself to make sure this is done.

lkdvos · 2025-07-08T16:59:23Z

As discussed before, I made the last changes concerning the interaction of `copyto!` and `copy!` with Strided. I'll let the tests finish to make sure I have everything right now, and then I'll merge.

lkdvos requested a review from Jutho July 3, 2025 15:12

Jutho reviewed Jul 4, 2025
