Based on the discussion during the performance team meeting, notes here: https://acme-climate.atlassian.net/wiki/spaces/EP/pages/3922133143/2023-09-18+Performance+Infrastructure+Meeting+Notes
I am still not 100% sure what our approach should be, so I thought it might be helpful to list the features/behaviors that we want for performance tests:
- In order to save testing time, it would be very nice if we could use some of the existing tests already being run as both "normal" and "performance" tests. This is tricky because, in order to get useful performance data, you have to turn off I/O, which then renders the test useless from the point of view of doing baseline comparisons. It would be nice if there were a setting to exclude I/O from the performance metrics. This would allow a test to serve as both a performance test and a regular baseline test. @jayeshkrishna , is this possible?
- If a test is known to be a performance test, CIME and scripts in $component/cime_config/* should automatically configure things for performance testing without requiring the user to make a lot of additional changes through test-mods etc. @amametjanov , it would be helpful if you could describe the things you need to change in order to get useful performance data.
- In order to implement (2), the component will need to know that the current case is a performance test. This should be easy if we assume SAVE_TIMING=ON implies a performance test. I think with the latest CIME (our submodule doesn't have this yet) we can at least rely on SAVE_TIMING being set to ON for performance tests.
- CIME will need to mark tests with significant performance decreases as FAILs. I think we already have this via the TPUTCOMP test phase, but this CIME ticket (ESMCI/cime#2918, "PFS test should save timing output to the baseline directory") makes me think that maybe more functionality is needed. @billsacks , can you elaborate?
- By default, our Jenkins python job code ignores TPUTCOMP and MEMCOMP FAILs when reporting test statuses to CDash. The user can change this by adding `--check-throughput --check-memory` to the test script (in the E3SM_test_scripts repo), but this is very easy to forget, and it looks like almost no jobs are currently using these flags. Again, in the interest of making life easy for users, I think both of these checks should be on by default for tests that have SAVE_TIMING on.
- We want to be able to "bless" performance changes if we think a significant performance change in a test is acceptable. I think this is the one feature we fully understand, and @wadeburgess and @jasonb5 already have most of what's needed here in place.
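To make items (2), (3), and (5) concrete, here is a minimal sketch of the proposed convention that SAVE_TIMING=ON implies a performance test and should turn the throughput/memory checks on by default. Both helpers are hypothetical, not part of CIME; the dict stands in for the case's XML settings (what `./xmlquery SAVE_TIMING` would report in a real case):

```python
def is_performance_test(case_xml_values: dict) -> bool:
    """Return True when the case should get performance-test handling.

    Mirrors the proposal above: SAVE_TIMING=ON (or TRUE) implies the case
    is a performance test, so components and test scripts can key off it.
    """
    value = str(case_xml_values.get("SAVE_TIMING", "FALSE")).upper()
    return value in ("ON", "TRUE")


def perf_check_flags(case_xml_values: dict) -> list:
    """Flags the Jenkins job would add when performance checks are on."""
    if is_performance_test(case_xml_values):
        return ["--check-throughput", "--check-memory"]
    return []


print(perf_check_flags({"SAVE_TIMING": "TRUE"}))   # ['--check-throughput', '--check-memory']
print(perf_check_flags({"SAVE_TIMING": "FALSE"}))  # []
```

With this convention, components in `$component/cime_config/*` could call the same predicate instead of requiring the user to pass extra test-mods.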
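For item (4), the shape of a TPUTCOMP-style check is a relative comparison of current throughput against the baseline. This is an illustrative sketch only, not CIME's actual implementation; the 10% tolerance is an assumed default for the example:

```python
def throughput_comparison(current_sypd: float, baseline_sypd: float,
                          tolerance: float = 0.1) -> str:
    """Return "PASS" or "FAIL" for a throughput-regression check.

    current_sypd / baseline_sypd are simulated years per day; a slowdown
    larger than `tolerance` (assumed 10% here) marks the test as FAIL.
    """
    if baseline_sypd <= 0:
        return "PASS"  # no baseline to compare against
    slowdown = (baseline_sypd - current_sypd) / baseline_sypd
    return "FAIL" if slowdown > tolerance else "PASS"


print(throughput_comparison(9.5, 10.0))  # PASS: 5% slowdown, within tolerance
print(throughput_comparison(8.0, 10.0))  # FAIL: 20% slowdown
```

"Blessing" a performance change would then amount to replacing the stored baseline throughput with the current value so that future comparisons use the new number.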
Feel free to edit to add features.