Hi Lukas,
I am exploring the new version of MPSKit. Compared with the previous version, the new one seems to drop some support for parallel computation, especially for finite-size systems and algorithms.
For example, setting `MPSKit.Defaults.set_parallelization("derivatives" => true)` was very useful when I ran DMRG2 on a finite-size lattice in the previous version:
```julia
# julia -t 8 hubbard_m.jl
using LinearAlgebra
BLAS.set_num_threads(1)  # leave the threading to MPSKit rather than BLAS
using TensorKit
using MPSKit
using MPSKitModels: FiniteCylinder, FiniteStrip
using DynamicalCorrelators
using JLD2: save, load

# enable multithreaded evaluation of the derivatives
MPSKit.Defaults.set_parallelization("derivatives" => true)

filling = (1, 1)
lattice = FiniteStrip(4, 12)
H = hubbard(Float64, U1Irrep, U1Irrep, lattice; filling=filling, t=1, U=8, μ=0)
N = length(lattice)
st = randFiniteMPS(Float64, U1Irrep, U1Irrep, N; filling=filling)
err = 1e-6
@time gs, envs, delta = find_groundstate(st, H, DMRG2(; trscheme=truncerr(err)));
E0 = expectation_value(gs, H)
```
```
[ Info: DMRG2 1: obj = -4.911146147560e+00 err = 9.9914573155e-01 time = 51.10 sec
[ Info: DMRG2 2: obj = -4.913259207333e+00 err = 1.6855045657e-04 time = 1.29 min
[ Info: DMRG2 3: obj = -4.913259209043e+00 err = 2.7500646205e-10 time = 24.01 sec
[ Info: DMRG2 conv 4: obj = -4.913259209043e+00 err = 4.3298697960e-14 time = 3.27 min
209.001721 seconds (1.13 G allocations: 624.127 GiB, 21.34% gc time, 288 lock conflicts, 38.48% compilation time)
-4.913259209043462
```
For comparison, a single-threaded run of the same script costs:
```
[ Info: DMRG2 1: obj = -4.912078856370e+00 err = 9.9976380773e-01 time = 1.77 min
[ Info: DMRG2 2: obj = -4.913259207169e+00 err = 1.0282147643e-04 time = 1.61 min
[ Info: DMRG2 3: obj = -4.913259209043e+00 err = 3.0078417534e-10 time = 57.36 sec
[ Info: DMRG2 conv 4: obj = -4.913259209043e+00 err = 4.4075854078e-14 time = 5.79 min
357.544613 seconds (914.24 M allocations: 572.318 GiB, 16.88% gc time, 4.63% compilation time)
-4.9132592090434555
```
In the new version, however, a lot has changed, and I noticed that the two-site derivative function ∂AC2 no longer uses multithreading, which seems to remove the last parallelization support for DMRG2. The timings below confirm this:
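For reference, the only change I made to the script above for the new-version runs is the scheduler setting (a minimal sketch of the modified line; everything else is identical):

```julia
# The old parallelization switch is gone in the new version; the runs below
# instead use the new scheduler interface. This line replaces
# MPSKit.Defaults.set_parallelization("derivatives" => true) in the script above:
MPSKit.Defaults.set_scheduler!(:dynamic)
```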
```
# 8 threads in the new version of MPSKit, with MPSKit.Defaults.set_scheduler!(:dynamic)
[ Info: DMRG2 1: obj = -4.911500799784e+00 err = 9.4429522527e-01 time = 1.26 min
[ Info: DMRG2 2: obj = -4.913259207112e+00 err = 2.2780024577e-04 time = 48.19 sec
[ Info: DMRG2 3: obj = -4.913259209043e+00 err = 2.9917901490e-10 time = 35.59 sec
[ Info: DMRG2 conv 4: obj = -4.913259209043e+00 err = 4.3964831775e-14 time = 4.68 min
291.332853 seconds (888.91 M allocations: 450.021 GiB, 11.08% gc time, 15.20% compilation time: <1% of which was recompilation)
-4.913259209043462
```
```
# single thread in the new version of MPSKit
[ Info: DMRG2 1: obj = -4.911706979069e+00 err = 9.9925886189e-01 time = 1.06 min
[ Info: DMRG2 2: obj = -4.913259207177e+00 err = 1.6968202792e-04 time = 52.09 sec
[ Info: DMRG2 3: obj = -4.913259209043e+00 err = 3.0106028781e-10 time = 34.27 sec
[ Info: DMRG2 conv 4: obj = -4.913259209043e+00 err = 4.4408920985e-14 time = 4.58 min
284.791112 seconds (762.70 M allocations: 432.346 GiB, 16.69% gc time, 8.58% compilation time: <1% of which was recompilation)
-4.913259209043463
```
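Summarizing the four `@time` results above:

| MPSKit version | Threads | Wall time | Allocations |
| --- | --- | --- | --- |
| previous | 8 | 209.0 s | 624.1 GiB |
| previous | 1 | 357.5 s | 572.3 GiB |
| new | 8 | 291.3 s | 450.0 GiB |
| new | 1 | 284.8 s | 432.3 GiB |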
For single-threaded computations, the new version has a clear advantage (284.8 s vs 357.5 s here). However, since it no longer supports multithreading for finite-size DMRG, it falls behind the previous version once multiple threads are available (291.3 s vs 209.0 s with 8 threads).