Compare tests by dguittet · Pull Request #1289 · NatLabRockies/ssc

dguittet · 2025-03-03T20:50:59Z

Follow up to #1286

test_times.yml

test_times.yml is a new workflow that is triggered by the completion of ci.yml: GitHub docs here

Once ci.yml completes on any branch (call this the feature branch even if it's patch/dev), it uploads a CSV of elapsed test times.

test_times.yml will be triggered to run on the default branch (patch or dev). It will download the CSV of the triggering workflow, which measures the elapsed test times of the feature branch. It compares that downloaded CSV to the most recent CSV produced by a workflow on the default branch.

There are two parameters in test_times.yml:
HEAD_DIFF_TOL: For comparison against the default branch head, relative value in time elapsed above which the check will fail. Currently set at 0.05
REL_DIFF_TOL: For comparison against the release, CSVs stored in test/elapsed_time_release. Set to 0.10
DIFF_THRESHOLD_MS: in milliseconds, the length of time under which comparisons don't need to be run. Currently set at 3 (don't compare tests that take <3 ms)

Note:: this test_times.yml needs to be on the default branch to run as a triggered workflow. So the github actions workflows for this PR won't run test_times.yml

Examples on my fork

Failing comparison (ssc/compare_tests vs ssc/patch): https://github.com/dguittet/ssc/actions/runs/13642276580
Successful comparison (ssc/patch vs ssc/patch): https://github.com/dguittet/ssc/actions/runs/13642494160

Other references:

release.yml

This is triggered by a new release and will update the CSVs in test/elapsed_time_release by downloading the workflow artifacts produced by the workflow associated with the SHA tagged in the release

brtietz · 2025-03-03T23:52:39Z

The code and artifacts look great! I have one concern with the threshold. It looks like this would catch a 10% increase in a single PR getting merged into patch or develop. Do we need to be concerned about a slow increase (e.g. 3-5% in each PR adding up to more than 10% over a larger number of PRs)? If so, how difficult is it to tag the reference times to the most recent released version?

sjanzou

Great work!

In building on what @brtietz said about tying the comparison to a release to see how the times change over time, how about doing a comparison of total test time to capture any below DIFF_THRESHOLD_MS that may add up...

There are 369 tests (of the 896 listed) in the artifact with elapsed time below 3ms (DIFF_THRESHOLD_MS) that would not be considered in the comparison.

The artifact is great:

Also, the total time is in the artifact

which seems like a good thing to track and compare as we add/remove/modify tests.

dguittet · 2025-03-04T16:00:45Z

The code and artifacts look great! I have one concern with the threshold. It looks like this would catch a 10% increase in a single PR getting merged into patch or develop. Do we need to be concerned about a slow increase (e.g. 3-5% in each PR adding up to more than 10% over a larger number of PRs)? If so, how difficult is it to tag the reference times to the most recent released version?

We could compare it to references times in the latest release, but we'd have to save a copy of a releases' elapsed test times since Github Actions Artifacts expire after 90 days. And so with each release, that CSV would have to be updated, which I can do via a Github action that is triggered by a release. Would this be in addition to the stepwise comparison, or a replacement?

dguittet · 2025-03-04T16:26:39Z

In building on what @brtietz said about tying the comparison to a release to see how the times change over time, how about doing a comparison of total test time to capture any below DIFF_THRESHOLD_MS that may add up...

The Total time row is included in the comparisons so if the Total time is increased by more than 10%, it'll be printed in the csv uploaded by test_times.yml (was named compare_times.csv, but I'm going to rename it as failing_test_times.csv to be more explicit).

dguittet · 2025-03-10T20:29:35Z

@brtietz @sjanzou

Added comparison to release times
Added release.yml to update the CSVs with a new release

brtietz

This looks great! Love the automated PR for releases.

sjanzou

Extremely useful for the "Operation Speedy Gonzales"!

dguittet · 2025-03-27T16:40:33Z

Actually, there's still an issue here where when there's a new Release, the Release-triggered action that sets up the PR is run at the same time as the action that creates the gtest time elapsed CSV. Since the update-via-PR action depends on the CSV, it'll fail. I have to think of how to get the dependency right

sjanzou · 2025-03-28T06:26:35Z

Actually, there's still an issue here where when there's a new Release, the Release-triggered action that sets up the PR is run at the same time as the action that creates the gtest time elapsed CSV. Since the update-via-PR action depends on the CSV, it'll fail. I have to think of how to get the dependency right

Ahh... good point - not that we have more than one ci running (ci.yml and coverage.yml) and multiple runners in ci.yml, i think we all need to become more aware of the interdependencies.

We should be testing the new Release paradigm within the next couple of weeks!

Thanks for the critical review of your own pull request!

dguittet · 2025-04-14T17:17:54Z

Added a retry loop to the release.yml which downloads the updated test CSVs produced by the CI of the release. This worked on my fork: https://github.com/dguittet/ssc/pulls

Every 20 minutes, check if the gtest elapsed times csv is available. Timeout after 3 hours

coveralls · 2025-04-14T19:26:37Z

Pull Request Test Coverage Report for Build 14451664057

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

0 of 0 changed or added relevant lines in 0 files are covered.
2430 unchanged lines in 31 files lost coverage.
Overall coverage increased (+4.3%) to 54.706%

Files with Coverage Reduction	New Missed Lines	%
ssc/shared/lib_battery_dispatch.h	1	73.68%
ssc/shared/lib_util.h	2	69.92%
ssc/ssc/cmod_mhk_tidal.cpp	2	90.6%
ssc/tcs/csp_solver_pc_Rankine_indirect_224.cpp	2	60.36%
ssc/tcs/csp_solver_util.cpp	2	84.04%
ssc/tcs/numeric_solvers.h	3	85.71%
ssc/ssc/common_financial.cpp	7	72.64%
ssc/shared/lib_battery_dispatch_manual.cpp	8	87.14%
ssc/ssc/cmod_wavefile.cpp	8	73.26%
ssc/ssc/tckernel.cpp	11	54.55%

Totals
Change from base Build 13861376566:	4.3%
Covered Lines:	64318
Relevant Lines:	117571

💛 - Coveralls

dguittet added 21 commits February 27, 2025 12:05

add compare_elapsed_time.py

72e4b07

fix

136c6c4

upload artifacts separately

9252c2a

add functions

10d013d

add test_times.yml

6f8104a

access context

27813f7

fix context

090ff62

Merge branch 'gtest_times' into compare_tests

797c107

download after workflow

f706f58

temp gtest_filter to test quickly

9106665

add comparison

b2c1394

update

af98184

remove filter and fix names

ddfffd9

remove filter and fix names

53811ed

fix windows upload

1a5b497

Merge branch 'gtest_times' into compare_tests

fd23375

Merge remote-tracking branch 'origin/patch' into compare_tests

399e32d

update comparison

a74e53e

Merge branch 'patch' into compare_tests

6f66a26

Add comparison

839649b

convert to float

ae93bbb

dguittet requested review from brtietz and sjanzou March 3, 2025 21:32

always upload

115e294

sjanzou approved these changes Mar 4, 2025

View reviewed changes

dguittet added 2 commits March 4, 2025 09:01

fix windows

e4f01d2

add test times

f69056f

dguittet added 6 commits March 4, 2025 09:44

fix naming

c4258e0

Merge branch 'release_tests' into compare_tests

d69ac43

compare with release

6c46816

add release.yml

b5f57c8

switch base to develop

0e498a3

fix REL_DIFF_TOL

2095f6d

brtietz approved these changes Mar 10, 2025

View reviewed changes

brtietz changed the base branch from patch to develop March 14, 2025 17:28

sjanzou approved these changes Mar 26, 2025

View reviewed changes

during release: add retries to request for test csv artifact download

e6de64e

dguittet merged commit 8fafd75 into develop Apr 14, 2025
9 checks passed

dguittet deleted the compare_tests branch April 14, 2025 19:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compare tests#1289

Compare tests#1289
dguittet merged 31 commits intodevelopfrom
compare_tests

dguittet commented Mar 3, 2025 •

edited

Loading

Uh oh!

brtietz commented Mar 3, 2025

Uh oh!

sjanzou left a comment

Uh oh!

dguittet commented Mar 4, 2025

Uh oh!

dguittet commented Mar 4, 2025 •

edited

Loading

Uh oh!

dguittet commented Mar 10, 2025

Uh oh!

brtietz left a comment

Uh oh!

sjanzou left a comment

Uh oh!

dguittet commented Mar 27, 2025

Uh oh!

sjanzou commented Mar 28, 2025

Uh oh!

dguittet commented Apr 14, 2025 •

edited

Loading

Uh oh!

Uh oh!

coveralls commented Apr 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

dguittet commented Mar 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

test_times.yml

release.yml

Uh oh!

brtietz commented Mar 3, 2025

Uh oh!

sjanzou left a comment

Choose a reason for hiding this comment

Uh oh!

dguittet commented Mar 4, 2025

Uh oh!

dguittet commented Mar 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dguittet commented Mar 10, 2025

Uh oh!

brtietz left a comment

Choose a reason for hiding this comment

Uh oh!

sjanzou left a comment

Choose a reason for hiding this comment

Uh oh!

dguittet commented Mar 27, 2025

Uh oh!

sjanzou commented Mar 28, 2025

Uh oh!

dguittet commented Apr 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

coveralls commented Apr 14, 2025

Pull Request Test Coverage Report for Build 14451664057

Warning: This coverage report may be inaccurate.

Details

💛 - Coveralls

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

dguittet commented Mar 3, 2025 •

edited

Loading

dguittet commented Mar 4, 2025 •

edited

Loading

dguittet commented Apr 14, 2025 •

edited

Loading