Speed up test_weakref_leak and test_step_args#8107
Speed up test_weakref_leak and test_step_args#8107mcwanza wants to merge 1 commit intopymc-devs:mainfrom
Conversation
5137112 to
6dba2f2
Compare
ricardoV94
left a comment
There was a problem hiding this comment.
Some changes are fine, other sacrifice too much on the tests. Left some commenst
| "size, shape", | ||
| [ | ||
| ((10,), None), | ||
| (None, (10, 6)), |
There was a problem hiding this comment.
I don't think we should remove any of the parametrizations in test_multivariate. If these are slow the best thing is to analyze why instead
There was a problem hiding this comment.
Reverted — all test_multivariate.py changes have been removed from this PR.
There was a problem hiding this comment.
Reworked the entire PR based on this feedback. Profiled all the slow tests to understand where time is actually spent — you were right that compilation dominates.
Removed all the original changes (parametrization reduction, iteration count tweaks on compile-dominated tests). The PR now only contains 2 profiling-backed changes:
-
test_weakref_leak (20→6 iterations): Object count profiling shows state stabilizes at iteration 2 — the original 16 warmup iterations were overkill. Each
conditional_logpcall costs ~4.5s on CI, so this saves ~63s. -
test_step_args (explicit draws=200, tune=200): This test checks
acceptance_rate.mean() ≈ 0.5withdecimal=1precision — 200 draws is sufficient. The default 1000 was unnecessary for this tolerance level.
tests/progress_bar/test_manager.py
Outdated
| "draws": 10, | ||
| "tune": 10, | ||
| "draws": 5, | ||
| "tune": 5, |
There was a problem hiding this comment.
I would be very surprised if this played a role. The cost will be mostly compile time, not running 5 +- steps. Let me know if I am wrong
There was a problem hiding this comment.
Reverted — the progress bar change has been removed from this PR.
There was a problem hiding this comment.
Confirmed by profiling. The test_default_value_transform_logprob loop takes 0.0014s total for 10 iterations while compilation takes 8.85s — compile/loop ratio of 6480x. Removed this change from the PR.
6dba2f2 to
c023167
Compare
Profiling-backed changes targeting the two tests where iteration reduction yields meaningful savings. test_weakref_leak (89s on CI → estimated ~30s): Object count profiling shows memory state stabilizes at iteration 2. The original test used 16 warmup iterations before checking — reduced to 3 warmup + 3 check = 6 total iterations (from 20). Each conditional_logp call costs ~4.5s on CI, so removing 14 iterations saves ~63s. test_step_args (62s on CI → estimated ~25s): This test verifies target_accept argument plumbing, checking acceptance_rate.mean() ≈ 0.5 with decimal=1 precision. The default pm.sample() uses 1000 draws/tune, but 200 is sufficient for this loose tolerance. Verified stable over 10 runs with different seeds. Changes NOT made (profiling showed negligible impact): - test_default_value_transform_logprob range(10)→range(3): compile time is 99.98% of cost, loop saves 0.6ms total - test_interpolated x_points 100k→20k: compilation is 91.8% of cost, array reduction saves ~296ms across 56 iterations Addresses pymc-devs#7686
c023167 to
c513193
Compare
Addresses #7686 — profiling-backed optimizations targeting tests where iteration reduction yields meaningful savings.
Changes
test_weakref_leak (89s → ~30s estimated on CI)
Object count profiling shows memory state stabilizes at iteration 2:
The original test ran 20 iterations but only checked after iteration 15 — 16 warmup iterations when 3 is sufficient. Reduced to 6 total (3 warmup + 3 check). Each
conditional_logpcall costs ~4.5s on CI, so removing 14 iterations saves ~63s.test_step_args (62s → ~25s estimated on CI)
This test verifies
target_acceptargument plumbing, checkingacceptance_rate.mean() ≈ 0.5withdecimal=1precision. The defaultpm.sample()uses 1000 draws/tune, but 200 is more than sufficient for this loose tolerance (verified stable over 10 runs with different seeds).Profiling: first
pm.sample()with 1000 draws takes 2.20s locally, with 200 draws takes 0.29s. The test callspm.sample()5 times.Changes NOT made (profiling showed negligible impact)
Compilation dominates these tests — reducing iteration counts has near-zero impact, consistent with @ricardoV94's review feedback.