
Batch test execution #324

Merged

jmafoster1 merged 13 commits into main from 323-batch-test-execution on Jun 2, 2025

Conversation

jmafoster1
Contributor

For the EA case study, trying to execute 87,000 test cases fills up the RAM. The run_tests_in_batches method was meant to fix this, but didn't. I really want to be able to run all 87K tests at once, so I dug a bit deeper into this. The problem is that, when we execute a test case, the estimator.model attribute becomes non-None. It is this that fills up the RAM, but we don't use it here, so I just reset it to None after the test is executed. I'm not sure how I feel about this as a solution going forward, so any better suggestions would be much appreciated. I think test.estimator.model is used elsewhere, though (perhaps as part of test adequacy?), but we should be able to do everything we need to do before resetting it to None. On the other hand, it just feels a bit inelegant somehow.
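
To make the idea concrete, here is a minimal sketch. The internals of run_tests_in_batches and the execute_test helper are hypothetical stand-ins, not the framework's actual API:

```python
# Minimal sketch, not the framework's actual code: execute tests in
# batches and drop each fitted model so it can be garbage-collected.
from itertools import islice

def run_tests_in_batches(test_cases, execute_test, batch_size=100):
    results = []
    iterator = iter(test_cases)
    while batch := list(islice(iterator, batch_size)):
        for test_case in batch:
            results.append(execute_test(test_case))
            # Executing a test leaves the fitted model on the estimator;
            # we don't use it here, so drop the reference to free the RAM.
            test_case.estimator.model = None
    return results
```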

@jmafoster1 jmafoster1 requested a review from f-allian May 16, 2025 14:39

github-actions bot commented May 16, 2025

🦙 MegaLinter status: ✅ SUCCESS

| Descriptor | Linter | Files | Errors | Elapsed time |
| --- | --- | --- | --- | --- |
| ✅ PYTHON | black | 31 | 0 | 0.96s |
| ✅ PYTHON | pylint | 31 | 0 | 5.81s |

See detailed report in MegaLinter reports



codecov bot commented May 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.69%. Comparing base (40faa83) to head (fcab481).
Report is 15 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main     #324      +/-   ##
==========================================
+ Coverage   93.84%   95.69%   +1.84%     
==========================================
  Files          27       27              
  Lines        1593     1602       +9     
==========================================
+ Hits         1495     1533      +38     
+ Misses         98       69      -29     
| Files with missing lines | Coverage Δ |
| --- | --- |
| causal_testing/__main__.py | 96.96% <100.00%> (+13.63%) ⬆️ |
| ...esting/estimation/abstract_regression_estimator.py | 95.83% <100.00%> (-0.17%) ⬇️ |
| ...ausal_testing/estimation/cubic_spline_estimator.py | 96.55% <100.00%> (ø) |
| ..._testing/estimation/linear_regression_estimator.py | 100.00% <100.00%> (ø) |
| ...esting/estimation/logistic_regression_estimator.py | 100.00% <100.00%> (ø) |
| causal_testing/main.py | 95.89% <100.00%> (+12.48%) ⬆️ |

... and 1 file with indirect coverage changes



Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 15513be...fcab481.


@f-allian
Contributor

@jmafoster1 I agree, definitely not the most elegant solution.

If it is because of the test_case.estimator.model attribute, it looks like it's coming from the abstract RegressionEstimator class where self.model is defined and later used for the prediction stages.
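
For reference, a rough sketch of that pattern; this is an assumed shape, not the repository's exact code:

```python
# Sketch of the described pattern: the abstract estimator keeps the
# fitted statsmodels result on self.model for later prediction, which
# is what accumulates in RAM across many tests.
import statsmodels.formula.api as smf

class RegressionEstimator:
    def __init__(self, formula, data):
        self.formula = formula
        self.data = data
        self.model = None  # set when fitted, reused at prediction time

    def fit(self):
        self.model = smf.ols(self.formula, data=self.data).fit()
        return self.model
```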

Can you explain a few things for me first? You say:

> The problem is that, when we execute a test case, the estimator.model attribute becomes non-None. It is this that fills up the RAM, but we don't use it here, so I just reset it to None after the test is executed

  1. Isn't it that the test case can be executed because the estimator.model attribute is non-None?
  2. Why isn't the estimator.model used here? It looks like it's needed to run the regression algorithm on the input data.
  3. Using this hacky method of setting estimator.model to None, how much time does it actually save you compared to not doing it?
  4. This isn't too important but I'm curious - what batch sizes have you tested?

@jmafoster1
Contributor Author

1 and 2. Yes and no. Of course we need the model to evaluate the test case, but we don't need it to be assigned to test_case.estimator to do that. The basic process at the moment is (sketched at the end of this comment):

  1. Train the model.

  2. Use this to evaluate the test.

  3. Assign it to the estimator instance so it can be used later if needed (we don't need this here).

3. It's not really about saving time: I physically can't run all 87,000 tests in one hit without this. What I'd been doing before was breaking the causal tests into two batches, running them separately, and concatenating the results. That has a comparable runtime, but it's annoying: I have to run three things instead of one, babysit the process a bit more, and discover by trial and error how many batches I need. With this, I can just press go and come back when it's done, so it's a lot easier.

4. I set the default batch size to 100, which is completely arbitrary. That seems to use negligible extra RAM and runs acceptably fast, so I left it at that.
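
Here's a rough sketch of that flow with step 3 skipped; the method names (train_model, evaluate) are illustrative:

```python
# Illustrative only: evaluate a test case with a locally scoped model,
# so nothing is retained on the estimator between tests.
def evaluate_test_case(test_case):
    model = test_case.estimator.train_model()  # 1. train the model
    result = test_case.evaluate(model)         # 2. use it to evaluate the test
    # 3. skipped: the model is not assigned back onto the estimator, so
    #    an 87,000-test run doesn't accumulate fitted models in RAM.
    return result
```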

@jmafoster1
Contributor Author

@f-allian I've removed the estimator.model property. It was only being used in tests, so I've either removed redundant assertions or modified them accordingly. There was only one that really needed the statsmodels model explicitly, which was for Richard's validation stuff, so I've made the train_model method public so that it can just be accessed directly.
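
Hypothetical usage after this change, assuming an already-constructed estimator:

```python
# The one test that needs the raw statsmodels fit can now call the
# public train_model() directly instead of reading a cached attribute.
model = estimator.train_model()  # returns the fitted statsmodels model
print(model.summary())           # e.g. inspect the fit for validation
```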

@jmafoster1 jmafoster1 marked this pull request as ready for review May 20, 2025 13:38
Contributor

@f-allian f-allian left a comment

@jmafoster1 Excellent, nice one Michael!

@f-allian
Contributor

f-allian commented Jun 2, 2025

@jmafoster1 Is this ready to be merged?

@jmafoster1 jmafoster1 merged commit 7da3aef into main Jun 2, 2025
13 checks passed
@jmafoster1
Contributor Author

Yeah I think so. I must have missed your review or something

@jmafoster1 jmafoster1 deleted the 323-batch-test-execution branch June 2, 2025 08:51
@jmafoster1
Contributor Author

Closes #323
