[Performance] TreeTransformer refactor + multithreading #251
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #251 +/- ##
==========================================
- Coverage 83.08% 82.90% -0.19%
==========================================
Files 44 44
Lines 5647 5749 +102
==========================================
+ Hits 4692 4766 +74
- Misses 955 983 +28
This is a great PR, and I am fully supportive of the new strategy, which is clearly (in hindsight) much more sensible. I left a few comments, but I don't think any of them are essential.
As for the sorting and the cost model of the steps: given the current multi-threading implementation, and the fact that the buffer sizes are determined independently, these are no longer really relevant. I'm happy to leave them in, in case anyone wants to experiment with different threading strategies, but otherwise I don't think they have any meaningful impact.
Rewrite treetransformers
Introduce `StridedStructure`
So, after running some benchmarks, the results are in. This is clearly an improvement, but it is important to note that it only starts making a difference once the number of legs becomes large; for the MPO contractions it doesn't seem to be as relevant. These are the results for running everything single-threaded.
MPO contraction
julia> group = "mpo";
julia> comparison = collect(judge(result_est_new[group], result_est_main[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", "[160, 5, 3]", 2), TrialJudgement(+6.98% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[200, 20, 20]", 2), TrialJudgement(-8.49% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[2560, 5, 3]", 2), TrialJudgement(-1.18% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[40, 5, 3]", 2), TrialJudgement(+0.16% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 20, 20]", 2), TrialJudgement(+3.42% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 40, 40]", 2), TrialJudgement(+2.80% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6120, 5, 3]", 2), TrialJudgement(+0.35% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[640, 5, 3]", 2), TrialJudgement(-10.36% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[10, 4, 3]", "nothing"), TrialJudgement(-1.89% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[100, 10, 10]", "nothing"), TrialJudgement(-4.63% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[160, 4, 3]", "nothing"), TrialJudgement(-6.55% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[200, 10, 10]", "nothing"), TrialJudgement(-2.63% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[2560, 4, 3]", "nothing"), TrialJudgement(-0.02% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[300, 20, 20]", "nothing"), TrialJudgement(-1.21% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[40, 4, 3]", "nothing"), TrialJudgement(-5.92% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[640, 4, 3]", "nothing"), TrialJudgement(-0.38% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[160, 5, 3]", 0.5), TrialJudgement(-13.16% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[200, 20, 20]", 0.5), TrialJudgement(-8.85% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[2560, 5, 3]", 0.5), TrialJudgement(+0.55% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[40, 5, 3]", 0.5), TrialJudgement(-5.27% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[400, 20, 20]", 0.5), TrialJudgement(-4.36% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[400, 40, 40]", 0.5), TrialJudgement(-0.82% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6120, 5, 3]", 0.5), TrialJudgement(+0.78% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[640, 5, 3]", 0.5), TrialJudgement(-3.15% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[10, 4, 4]", 0.5), TrialJudgement(-4.24% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[100, 10, 10]", 0.5), TrialJudgement(-11.83% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[160, 4, 4]", 0.5), TrialJudgement(-18.93% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[200, 10, 10]", 0.5), TrialJudgement(-6.15% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[2560, 4, 4]", 0.5), TrialJudgement(+0.11% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[300, 20, 20]", 0.5), TrialJudgement(-6.32% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[40, 4, 4]", 0.5), TrialJudgement(-4.59% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[640, 4, 4]", 0.5), TrialJudgement(-3.05% => invariant))
MERA contraction
julia> group = "mera";
julia> comparison = collect(judge(result_est_new[group], result_est_main[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
24-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", 4, 2.0), TrialJudgement(-70.38% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 8, 2.0), TrialJudgement(-86.74% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 12, 2.0), TrialJudgement(-83.67% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 16, 2.0), TrialJudgement(-92.40% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 22, 2.0), TrialJudgement(-87.01% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 28, 2.0), TrialJudgement(-81.28% => improvement))
Pair{Any, Any}(("Float64", "Trivial", 2, "nothing"), TrialJudgement(-1.19% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 3, "nothing"), TrialJudgement(-3.45% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 4, "nothing"), TrialJudgement(-11.41% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 8, "nothing"), TrialJudgement(-1.69% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 12, "nothing"), TrialJudgement(+1.75% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 16, "nothing"), TrialJudgement(+1.55% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", 4, 0.5), TrialJudgement(-1.29% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", 8, 0.5), TrialJudgement(-4.32% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", 12, 0.5), TrialJudgement(+4.31% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", 16, 0.5), TrialJudgement(+1.63% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", 22, 0.5), TrialJudgement(+2.38% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", 28, 0.5), TrialJudgement(-0.75% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 2, 0.5), TrialJudgement(-2.61% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 4, 0.5), TrialJudgement(-2.52% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 8, 0.5), TrialJudgement(-1.75% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 12, 0.5), TrialJudgement(-2.78% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 16, 0.5), TrialJudgement(+1.90% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 20, 0.5), TrialJudgement(+1.64% => invariant))
PEPO contraction
julia> group = "pepo";
julia> comparison = collect(judge(result_est_new[group], result_est_main[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 2, 2, 50]", 2.0), TrialJudgement(-90.86% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 3, 2, 100]", 2.0), TrialJudgement(-87.67% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 2, 2, 100]", 2.0), TrialJudgement(-81.93% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 4, 4, 200]", 2.0), TrialJudgement(-88.23% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 2, 2, 100]", 2.0), TrialJudgement(-78.96% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 3, 4, 200]", 2.0), TrialJudgement(-73.29% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 2, 100]", 2.0), TrialJudgement(-87.60% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 4, 200]", 2.0), TrialJudgement(-87.39% => improvement))
Pair{Any, Any}(("Float64", "Trivial", "[3, 2, 2, 50]", "nothing"), TrialJudgement(-5.06% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[3, 3, 3, 100]", "nothing"), TrialJudgement(+2.74% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[4, 2, 2, 50]", "nothing"), TrialJudgement(+3.78% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[4, 3, 3, 100]", "nothing"), TrialJudgement(+5.92% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 2, 50]", "nothing"), TrialJudgement(-1.71% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 3, 100]", "nothing"), TrialJudgement(+1.51% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[6, 2, 2, 50]", "nothing"), TrialJudgement(-1.55% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[6, 3, 2, 100]", "nothing"), TrialJudgement(-0.11% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[10, 2, 2, 50]", 0.5), TrialJudgement(+10.12% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[10, 3, 2, 100]", 0.5), TrialJudgement(-2.37% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[4, 2, 2, 100]", 0.5), TrialJudgement(-1.15% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[4, 4, 4, 200]", 0.5), TrialJudgement(-1.30% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6, 2, 2, 100]", 0.5), TrialJudgement(-7.47% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6, 3, 4, 200]", 0.5), TrialJudgement(+2.63% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 2, 100]", 0.5), TrialJudgement(-1.32% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 4, 200]", 0.5), TrialJudgement(-2.61% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 2, 2, 50]", 0.5), TrialJudgement(-9.19% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 4, 4, 100]", 0.5), TrialJudgement(+2.24% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 2, 2, 50]", 0.5), TrialJudgement(-2.64% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 3, 4, 100]", 0.5), TrialJudgement(+2.56% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 2, 50]", 0.5), TrialJudgement(+5.36% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 4, 100]", 0.5), TrialJudgement(+0.42% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 2, 2, 50]", 0.5), TrialJudgement(+1.03% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 3, 2, 100]", 0.5), TrialJudgement(+3.10% => invariant))
For completeness I did a similar experiment, enabling only the transformer multithreading with 2 and 4 threads, without the limitation on the data sizes, to get an idea of the point at which threading starts to help. There are definitely gains to be had, but it would be interesting to experiment with distributing threads among the different drivers in real-world settings to actually gauge the effect.
2 threads vs 1 thread
julia> group = "mpo";
julia> comparison = collect(judge(result_est_new2[group], result_est_new[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", "[160, 5, 3]", 2), TrialJudgement(+31.86% => regression))
Pair{Any, Any}(("Float64", "SU2Irrep", "[200, 20, 20]", 2), TrialJudgement(-1.91% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[2560, 5, 3]", 2), TrialJudgement(-13.55% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[40, 5, 3]", 2), TrialJudgement(+60.35% => regression))
Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 20, 20]", 2), TrialJudgement(-12.53% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 40, 40]", 2), TrialJudgement(-19.24% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6120, 5, 3]", 2), TrialJudgement(-13.69% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", "[640, 5, 3]", 2), TrialJudgement(+0.17% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[10, 4, 3]", "nothing"), TrialJudgement(+18.60% => regression))
Pair{Any, Any}(("Float64", "Trivial", "[100, 10, 10]", "nothing"), TrialJudgement(-0.72% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[160, 4, 3]", "nothing"), TrialJudgement(-2.88% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[200, 10, 10]", "nothing"), TrialJudgement(+11.17% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[2560, 4, 3]", "nothing"), TrialJudgement(+0.06% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[300, 20, 20]", "nothing"), TrialJudgement(-1.11% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[40, 4, 3]", "nothing"), TrialJudgement(+7.79% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[640, 4, 3]", "nothing"), TrialJudgement(-0.92% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[160, 5, 3]", 0.5), TrialJudgement(+7.49% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[200, 20, 20]", 0.5), TrialJudgement(-8.89% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[2560, 5, 3]", 0.5), TrialJudgement(-1.87% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[40, 5, 3]", 0.5), TrialJudgement(+107.91% => regression))
Pair{Any, Any}(("Float64", "U1Irrep", "[400, 20, 20]", 0.5), TrialJudgement(-6.33% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[400, 40, 40]", 0.5), TrialJudgement(-7.37% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6120, 5, 3]", 0.5), TrialJudgement(-0.76% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[640, 5, 3]", 0.5), TrialJudgement(-4.32% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[10, 4, 4]", 0.5), TrialJudgement(+249.25% => regression))
Pair{Any, Any}(("Float64", "Z2Irrep", "[100, 10, 10]", 0.5), TrialJudgement(-3.48% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[160, 4, 4]", 0.5), TrialJudgement(+6.42% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[200, 10, 10]", 0.5), TrialJudgement(-11.77% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[2560, 4, 4]", 0.5), TrialJudgement(-12.16% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[300, 20, 20]", 0.5), TrialJudgement(-22.22% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[40, 4, 4]", 0.5), TrialJudgement(+87.99% => regression))
Pair{Any, Any}(("Float64", "Z2Irrep", "[640, 4, 4]", 0.5), TrialJudgement(-10.93% => invariant))
julia> group = "mera";
julia> comparison = collect(judge(result_est_new2[group], result_est_new[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
24-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", 4, 2.0), TrialJudgement(-0.35% => invariant))
Pair{Any, Any}(("Float64", "SU2Irrep", 8, 2.0), TrialJudgement(-34.05% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 12, 2.0), TrialJudgement(-39.23% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 16, 2.0), TrialJudgement(-46.35% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 22, 2.0), TrialJudgement(-35.32% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 28, 2.0), TrialJudgement(-37.79% => improvement))
Pair{Any, Any}(("Float64", "Trivial", 2, "nothing"), TrialJudgement(+6.86% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 3, "nothing"), TrialJudgement(+6.61% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 4, "nothing"), TrialJudgement(+13.17% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 8, "nothing"), TrialJudgement(+0.89% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 12, "nothing"), TrialJudgement(-2.70% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 16, "nothing"), TrialJudgement(-0.93% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", 4, 0.5), TrialJudgement(+131.32% => regression))
Pair{Any, Any}(("Float64", "U1Irrep", 8, 0.5), TrialJudgement(+18.74% => regression))
Pair{Any, Any}(("Float64", "U1Irrep", 12, 0.5), TrialJudgement(-24.76% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", 16, 0.5), TrialJudgement(-16.73% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", 22, 0.5), TrialJudgement(-15.23% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", 28, 0.5), TrialJudgement(-10.36% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 2, 0.5), TrialJudgement(+122.77% => regression))
Pair{Any, Any}(("Float64", "Z2Irrep", 4, 0.5), TrialJudgement(+53.27% => regression))
Pair{Any, Any}(("Float64", "Z2Irrep", 8, 0.5), TrialJudgement(+1.62% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 12, 0.5), TrialJudgement(-17.29% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", 16, 0.5), TrialJudgement(-21.12% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", 20, 0.5), TrialJudgement(-17.54% => improvement))
julia> group = "pepo";
julia> comparison = collect(judge(result_est_new2[group], result_est_new[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 2, 2, 50]", 2.0), TrialJudgement(-48.27% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 3, 2, 100]", 2.0), TrialJudgement(-41.85% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 2, 2, 100]", 2.0), TrialJudgement(-42.04% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 4, 4, 200]", 2.0), TrialJudgement(-42.23% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 2, 2, 100]", 2.0), TrialJudgement(-41.14% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 3, 4, 200]", 2.0), TrialJudgement(-39.20% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 2, 100]", 2.0), TrialJudgement(-42.91% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 4, 200]", 2.0), TrialJudgement(-37.54% => improvement))
Pair{Any, Any}(("Float64", "Trivial", "[3, 2, 2, 50]", "nothing"), TrialJudgement(+6.42% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[3, 3, 3, 100]", "nothing"), TrialJudgement(-7.24% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[4, 2, 2, 50]", "nothing"), TrialJudgement(+0.42% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[4, 3, 3, 100]", "nothing"), TrialJudgement(-5.51% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 2, 50]", "nothing"), TrialJudgement(-4.61% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 3, 100]", "nothing"), TrialJudgement(-5.79% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[6, 2, 2, 50]", "nothing"), TrialJudgement(-1.53% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[6, 3, 2, 100]", "nothing"), TrialJudgement(-0.97% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[10, 2, 2, 50]", 0.5), TrialJudgement(-25.02% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", "[10, 3, 2, 100]", 0.5), TrialJudgement(-14.86% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[4, 2, 2, 100]", 0.5), TrialJudgement(-10.95% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[4, 4, 4, 200]", 0.5), TrialJudgement(-3.78% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6, 2, 2, 100]", 0.5), TrialJudgement(-8.11% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6, 3, 4, 200]", 0.5), TrialJudgement(-16.78% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 2, 100]", 0.5), TrialJudgement(-13.30% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 4, 200]", 0.5), TrialJudgement(-10.33% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 2, 2, 50]", 0.5), TrialJudgement(-9.33% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 4, 4, 100]", 0.5), TrialJudgement(-20.25% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 2, 2, 50]", 0.5), TrialJudgement(-5.60% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 3, 4, 100]", 0.5), TrialJudgement(-21.14% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 2, 50]", 0.5), TrialJudgement(-23.69% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 4, 100]", 0.5), TrialJudgement(-20.99% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 2, 2, 50]", 0.5), TrialJudgement(-16.59% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 3, 2, 100]", 0.5), TrialJudgement(-22.42% => improvement))
4 threads vs 1 thread
julia> group = "mpo";
julia> comparison = collect(judge(result_est_new4[group], result_est_new[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", "[160, 5, 3]", 2), TrialJudgement(+19.33% => regression))
Pair{Any, Any}(("Float64", "SU2Irrep", "[200, 20, 20]", 2), TrialJudgement(-29.03% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[2560, 5, 3]", 2), TrialJudgement(-17.28% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[40, 5, 3]", 2), TrialJudgement(+53.86% => regression))
Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 20, 20]", 2), TrialJudgement(-30.31% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[400, 40, 40]", 2), TrialJudgement(-26.26% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6120, 5, 3]", 2), TrialJudgement(-17.71% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[640, 5, 3]", 2), TrialJudgement(-15.14% => improvement))
Pair{Any, Any}(("Float64", "Trivial", "[10, 4, 3]", "nothing"), TrialJudgement(+1.95% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[100, 10, 10]", "nothing"), TrialJudgement(-3.29% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[160, 4, 3]", "nothing"), TrialJudgement(-4.33% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[200, 10, 10]", "nothing"), TrialJudgement(-2.75% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[2560, 4, 3]", "nothing"), TrialJudgement(+0.27% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[300, 20, 20]", "nothing"), TrialJudgement(-0.13% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[40, 4, 3]", "nothing"), TrialJudgement(-10.46% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[640, 4, 3]", "nothing"), TrialJudgement(-1.99% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[160, 5, 3]", 0.5), TrialJudgement(+2.99% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[200, 20, 20]", 0.5), TrialJudgement(-12.24% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[2560, 5, 3]", 0.5), TrialJudgement(-1.93% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[40, 5, 3]", 0.5), TrialJudgement(+80.98% => regression))
Pair{Any, Any}(("Float64", "U1Irrep", "[400, 20, 20]", 0.5), TrialJudgement(-8.21% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[400, 40, 40]", 0.5), TrialJudgement(-4.58% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6120, 5, 3]", 0.5), TrialJudgement(+0.90% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[640, 5, 3]", 0.5), TrialJudgement(-5.74% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[10, 4, 4]", 0.5), TrialJudgement(+298.92% => regression))
Pair{Any, Any}(("Float64", "Z2Irrep", "[100, 10, 10]", 0.5), TrialJudgement(-14.76% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[160, 4, 4]", 0.5), TrialJudgement(-7.60% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[200, 10, 10]", 0.5), TrialJudgement(-19.96% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[2560, 4, 4]", 0.5), TrialJudgement(-11.07% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[300, 20, 20]", 0.5), TrialJudgement(-26.18% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[40, 4, 4]", 0.5), TrialJudgement(+90.16% => regression))
Pair{Any, Any}(("Float64", "Z2Irrep", "[640, 4, 4]", 0.5), TrialJudgement(-14.73% => invariant))
julia> group = "mera";
julia> comparison = collect(judge(result_est_new4[group], result_est_new[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
24-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", 4, 2.0), TrialJudgement(-49.64% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 8, 2.0), TrialJudgement(-64.71% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 12, 2.0), TrialJudgement(-56.83% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 16, 2.0), TrialJudgement(-67.05% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 22, 2.0), TrialJudgement(-54.39% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", 28, 2.0), TrialJudgement(-53.22% => improvement))
Pair{Any, Any}(("Float64", "Trivial", 2, "nothing"), TrialJudgement(+1.30% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 3, "nothing"), TrialJudgement(+14.43% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 4, "nothing"), TrialJudgement(+3.81% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 8, "nothing"), TrialJudgement(+2.96% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 12, "nothing"), TrialJudgement(-0.11% => invariant))
Pair{Any, Any}(("Float64", "Trivial", 16, "nothing"), TrialJudgement(+0.75% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", 4, 0.5), TrialJudgement(+171.77% => regression))
Pair{Any, Any}(("Float64", "U1Irrep", 8, 0.5), TrialJudgement(+19.75% => regression))
Pair{Any, Any}(("Float64", "U1Irrep", 12, 0.5), TrialJudgement(-23.47% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", 16, 0.5), TrialJudgement(-21.67% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", 22, 0.5), TrialJudgement(-18.58% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", 28, 0.5), TrialJudgement(-15.78% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", 2, 0.5), TrialJudgement(+178.65% => regression))
Pair{Any, Any}(("Float64", "Z2Irrep", 4, 0.5), TrialJudgement(+75.30% => regression))
Pair{Any, Any}(("Float64", "Z2Irrep", 8, 0.5), TrialJudgement(-14.56% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", 12, 0.5), TrialJudgement(-23.38% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", 16, 0.5), TrialJudgement(-25.95% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", 20, 0.5), TrialJudgement(-24.06% => improvement))
julia> group = "pepo";
julia> comparison = collect(judge(result_est_new4[group], result_est_new[group]));
julia> comparison = sort!(comparison, by=x -> (first(x)[2], first(x)[3]))
32-element Vector{Any}:
Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 2, 2, 50]", 2.0), TrialJudgement(-68.36% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[10, 3, 2, 100]", 2.0), TrialJudgement(-64.31% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 2, 2, 100]", 2.0), TrialJudgement(-62.56% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[4, 4, 4, 200]", 2.0), TrialJudgement(-61.43% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 2, 2, 100]", 2.0), TrialJudgement(-56.69% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[6, 3, 4, 200]", 2.0), TrialJudgement(-53.83% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 2, 100]", 2.0), TrialJudgement(-64.40% => improvement))
Pair{Any, Any}(("Float64", "SU2Irrep", "[8, 2, 4, 200]", 2.0), TrialJudgement(-58.21% => improvement))
Pair{Any, Any}(("Float64", "Trivial", "[3, 2, 2, 50]", "nothing"), TrialJudgement(+8.14% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[3, 3, 3, 100]", "nothing"), TrialJudgement(-4.44% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[4, 2, 2, 50]", "nothing"), TrialJudgement(+0.33% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[4, 3, 3, 100]", "nothing"), TrialJudgement(-1.24% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 2, 50]", "nothing"), TrialJudgement(-5.95% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[5, 2, 3, 100]", "nothing"), TrialJudgement(-1.74% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[6, 2, 2, 50]", "nothing"), TrialJudgement(+1.60% => invariant))
Pair{Any, Any}(("Float64", "Trivial", "[6, 3, 2, 100]", "nothing"), TrialJudgement(+0.09% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[10, 2, 2, 50]", 0.5), TrialJudgement(-23.18% => improvement))
Pair{Any, Any}(("Float64", "U1Irrep", "[10, 3, 2, 100]", 0.5), TrialJudgement(-12.33% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[4, 2, 2, 100]", 0.5), TrialJudgement(-6.54% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[4, 4, 4, 200]", 0.5), TrialJudgement(-2.42% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6, 2, 2, 100]", 0.5), TrialJudgement(-8.01% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[6, 3, 4, 200]", 0.5), TrialJudgement(-13.01% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 2, 100]", 0.5), TrialJudgement(-9.83% => invariant))
Pair{Any, Any}(("Float64", "U1Irrep", "[8, 2, 4, 200]", 0.5), TrialJudgement(-7.28% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 2, 2, 50]", 0.5), TrialJudgement(-13.39% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[4, 4, 4, 100]", 0.5), TrialJudgement(-26.58% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 2, 2, 50]", 0.5), TrialJudgement(-12.03% => invariant))
Pair{Any, Any}(("Float64", "Z2Irrep", "[5, 3, 4, 100]", 0.5), TrialJudgement(-30.31% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 2, 50]", 0.5), TrialJudgement(-29.04% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[6, 2, 4, 100]", 0.5), TrialJudgement(-27.33% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 2, 2, 50]", 0.5), TrialJudgement(-22.15% => improvement))
Pair{Any, Any}(("Float64", "Z2Irrep", "[8, 3, 2, 100]", 0.5), TrialJudgement(-26.79% => improvement))
In summary though, I think it is safe to conclude that this is an overall improvement, and we can safely merge this. As a fair warning again: I want to add the
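To help read the tables above: each row reports the relative change in time between two runs, followed by a verdict. The sketch below mimics how a tolerance-based judgement maps a time ratio (new / old) to `improvement`, `invariant` or `regression`. The function name `classify` is hypothetical, and the 15% tolerance is an assumption inferred from the verdicts in the listings (for example -14.86% is reported as invariant while -15.14% is an improvement); it is not taken from the benchmark scripts themselves.

```julia
# Hypothetical helper: classify a time ratio (new / old) against a tolerance.
# The default of 0.15 is an assumption inferred from the reported verdicts.
function classify(ratio::Real; tolerance::Real = 0.15)
    ratio - tolerance > 1 && return :regression
    ratio + tolerance < 1 && return :improvement
    return :invariant
end

# A few percentage changes copied from the listings above.
for change in (-18.93, -13.16, 18.60, 14.43)
    ratio = 1 + change / 100
    println(change, "% => ", classify(ratio))
end
```

With this reading, most of the small positive and negative percentages in the `Trivial` rows are within noise rather than real slowdowns or speedups.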
As discussed before, I changed the last things considering the
This is a PR that changes multiple things, after a bunch of discussions and brainstorming with @ogauthe over the past weeks.
Firstly, I simplified the `AbelianTreeTransformer` by only storing the relevant information, i.e. this object no longer needs to hold on to the entire tensor fusion-tree structure, and simply has the required data in a `Vector`. This also changes the implementation to "array-of-struct" instead of "struct-of-arrays", which should slightly improve data locality (although this is probably negligible).
Secondly, motivated by @ogauthe, I experimented with grouping the `GenericTreeTransformer` over the different subblocks with the same uncoupled charges. These subblocks have the same sizes, and as a result it is possible to concatenate them into a contiguous buffer to perform the recoupling through a BLAS `mul!` call. If this transformation is indeed given by a dense matrix of `N x N` coefficients with subblocks of length `L`, this changes the implementation from `N^2` calls to `tensoradd!`, i.e. `N^2` scaled permutations of `L` elements, to `N` almost contiguous copies of `L` elements, a BLAS call of cost `N^2 L`, and `N` permutations of `L` elements.
The main idea is that, at least for the dominant part of this algorithm, the data is kept contiguous and we can make use of efficient BLAS routines, effectively changing the number of non-optimal access operations from `~ N^2` to `~ N`.
Finally, I also went ahead and actually implemented the multi-threading, motivated by the implementation with an atomic `Int` in #100. This seems like a very lightweight implementation that balances loads quite well, and has no external dependencies.
I think the major remaining components are to benchmark these implementations, presumably in real-world scenarios such as MPSKit and PEPSKit, and possibly find some heuristic to disable the multi-threading for tensors that are too small for the benefits to outweigh the overhead.
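The regrouping of same-size subblocks into one BLAS call can be illustrated with a small standalone sketch. This is not the code from the PR: the function names `recouple_naive!` and `recouple_batched!` are hypothetical, the blocks are plain matrices, and a fixed `(2, 1)` permutation stands in for the general index permutation. It only shows that `N^2` scaled, permuted additions give the same result as `N` contiguous copies, one `mul!`, and `N` permutations.

```julia
using LinearAlgebra

# Naive scheme: N^2 scaled, permuted additions of L elements each.
function recouple_naive!(dst, src, U)
    N = length(src)
    for j in 1:N
        fill!(dst[j], 0)
        for i in 1:N
            dst[j] .+= U[j, i] .* permutedims(src[i], (2, 1))
        end
    end
    return dst
end

# Batched scheme: N contiguous copies, one BLAS call, N permutations.
function recouple_batched!(dst, src, U)
    N = length(src)
    L = length(first(src))
    buf_src = Matrix{eltype(U)}(undef, L, N)
    buf_dst = similar(buf_src)
    for i in 1:N
        copyto!(view(buf_src, :, i), src[i])    # gather block i as column i
    end
    mul!(buf_dst, buf_src, transpose(U))        # (L x N) * (N x N), ~ N^2 L operations
    for j in 1:N
        block = reshape(view(buf_dst, :, j), size(first(src)))
        permutedims!(dst[j], block, (2, 1))     # scatter, applying the permutation once
    end
    return dst
end

# Usage: both schemes agree on random data.
N, d1, d2 = 4, 3, 2
U = rand(N, N)
src = [rand(d1, d2) for _ in 1:N]
a = recouple_naive!([zeros(d2, d1) for _ in 1:N], src, U)
b = recouple_batched!([zeros(d2, d1) for _ in 1:N], src, U)
println(all(a[j] ≈ b[j] for j in 1:N))
```

In the batched version the permutation is applied once per destination block instead of once per coefficient, which is where the reduction from `~ N^2` to `~ N` non-contiguous passes comes from.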
!!! note
Before merging I'd like to add @ogauthe to the co-author list of this PR, given his involvement with figuring this out.
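The atomic-`Int` scheduling mentioned in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the PR or from #100: the function name `foreach_dynamic` and the `min_items` cutoff are hypothetical, the latter standing in for the kind of heuristic that would disable threading for small workloads.

```julia
using Base.Threads

# Hypothetical helper: apply `f` to every element of `work`.
# Tasks pull the next index from a shared atomic counter, so faster
# tasks simply take more items and the load balances itself.
function foreach_dynamic(f, work; ntasks = nthreads(), min_items = 2)
    n = length(work)
    if ntasks == 1 || n < min_items
        foreach(f, work)          # serial fallback for small workloads (assumed heuristic)
        return nothing
    end
    counter = Atomic{Int}(1)
    @sync for _ in 1:ntasks
        Threads.@spawn begin
            while true
                i = atomic_add!(counter, 1)   # returns the old value, then increments
                i > n && break
                f(work[i])
            end
        end
    end
    return nothing
end

# Usage: every index is handled exactly once, so the writes do not race.
results = zeros(Int, 100)
foreach_dynamic(i -> (results[i] = i^2), 1:100)
println(results == (1:100) .^ 2)
```

The only shared state is the counter, which is what keeps this approach lightweight and free of external dependencies.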