-
Notifications
You must be signed in to change notification settings - Fork 36
Description
Since DynamicPPL 0.31.1 (#696), using the now-deprecated @submodel
syntax in testsets is very slow.
In the below timings, I ran sample(model, spl, iters)
for:
demo_assume_observe_literal
as a control;- the two demo models with submodels. Here,
..._old
is the version which uses@submodel
, and..._new
a version which usesto_submodel()
. DynamicPPL@master is still using the old versions.
I tested with both DynamicPPL.SampleFromUniform()
as well as a 'real' sampler NUTS()
. All tests used Julia 1.11, 1 thread.
From the timings, it's clear that 0.31.1 introduced a significant slowdown when using @submodel
. This is already very clear locally and on macOS CI (2-4x slowdown), but is even worse on Ubuntu CI (20-30x slowdown).
I don't know what's causing this, but I believe this is the root cause for why CI has been so ridiculously slow recently. Thus, to speed up the tests for both DPPL and Turing, it should be a matter of priority to replace any remaining instances of @submodel
with to_submodel()
and fix any associated changes in behaviour (such as variable prefixing).
Curiously, I can't replicate this slowdown with Chairmarks (see section below), so it's likely some interaction with the testset.
Testset timings
You can see the code that was run @ https://github.com/penelopeysm/Shaymin.jl/blob/main/test/runtests.jl
Ubuntu CI @ 0.31.1: https://github.com/penelopeysm/Shaymin.jl/actions/runs/12315810160/job/34374779974#step:6:563
macOS CI @ 0.31.1: https://github.com/penelopeysm/Shaymin.jl/actions/runs/12315810160/job/34374780294#step:6:558
Ubuntu CI @ 0.31.0: https://github.com/penelopeysm/Shaymin.jl/actions/runs/12315894358/job/34375022027?pr=20#step:6:555
macOS CI @ 0.31.0: https://github.com/penelopeysm/Shaymin.jl/actions/runs/12315894358/job/34375022380?pr=20#step:6:287
------ DPPL 0.31.1 ------- ------ DPPL 0.30.0 -------
Test Summary: | Total Local CI-ubuntu CI-macOS Local CI-ubuntu CI-macOS
DynamicPPL.jl | 0
|
demo_assume_observe_literal | 0
SampleFromUniform @ 1000 iters | 0 1.5s 2.9s 1.7s 1.5s 3.0s 2.8s
NUTS @ 100 iters | 0 7.8s 15.6s 8.6s 8.6s 16.3s 9.7s
|
demo_assume_submodel_observe_index_literal_old | 0
SampleFromUniform @ 1000 iters | 0 5.3s 1m00.8s 6.1s 2.7s 5.2s 3.2s
NUTS @ 100 iters | 0 10.6s 1m23.3s 11.9s 7.9s 15.8s 10.4s
|
demo_assume_submodel_observe_index_literal_new | 0
SampleFromUniform @ 1000 iters | 0 1.8s 3.7s 1.9s
NUTS @ 100 iters | 0 5.8s 12.2s 6.6s
|
demo_dot_assume_observe_submodel_old | 0
SampleFromUniform @ 1000 iters | 0 2.9s 56.2s 3.1s 0.7s 1.5s 0.9s
NUTS @ 100 iters | 0 5.3s 1m12.4s 6.2s 1.9s 4.0s 3.0s
|
demo_dot_assume_observe_submodel_new | 0
SampleFromUniform @ 1000 iters | 0 0.5s 1.2s 0.6s
NUTS @ 100 iters | 0 1.8s 3.5s 1.9s
Chairmarks 0.31.1
julia> @be sample(DynamicPPL.TestUtils.demo_assume_submodel_observe_index_literal(), SampleFromUniform(), 1000; progress=false)
Benchmark: 17 samples with 1 evaluation
min 5.080 ms (111500 allocs: 6.844 MiB)
median 5.157 ms (111500 allocs: 6.844 MiB)
mean 5.906 ms (111500 allocs: 6.844 MiB, 3.99% gc time)
max 16.734 ms (111500 allocs: 6.844 MiB, 67.82% gc time)
julia> @be sample(DynamicPPL.TestUtils.demo_assume_submodel_observe_index_literal(), NUTS(), 100; progress=false)
Benchmark: 8 samples with 1 evaluation
min 10.173 ms (197197 allocs: 13.027 MiB)
median 11.206 ms (209143.50 allocs: 14.009 MiB)
mean 12.884 ms (207954.62 allocs: 13.838 MiB, 7.18% gc time)
max 26.406 ms (219314 allocs: 14.599 MiB, 57.44% gc time)
Chairmarks 0.31.0
julia> @be sample(DynamicPPL.TestUtils.demo_assume_submodel_observe_index_literal(), SampleFromUniform(), 1000; progress=false)
Benchmark: 11 samples with 1 evaluation
min 5.264 ms (111500 allocs: 6.844 MiB)
median 5.339 ms (111500 allocs: 6.844 MiB)
mean 18.684 ms (111500 allocs: 6.844 MiB, 15.98% gc time)
max 131.152 ms (111500 allocs: 6.844 MiB, 95.94% gc time)
julia> @be sample(DynamicPPL.TestUtils.demo_assume_submodel_observe_index_literal(), NUTS(), 100; progress=false)
Benchmark: 7 samples with 1 evaluation
min 10.221 ms (193852 allocs: 12.863 MiB)
median 11.400 ms (218755 allocs: 14.557 MiB)
mean 14.288 ms (216068.14 allocs: 14.384 MiB, 9.04% gc time)
max 32.843 ms (234745 allocs: 15.744 MiB, 63.25% gc time)