Refactor analyses CI #923
Conversation
…endpoint, error-ignoring methods
… flag failing update_analyses to GitHub CI
```bash
LOGFILE=output.log
ERR_MSGS=()
ERR_COUNT=0
pytest -vv tests/*_test.py tests/test_*.py -m "strict_endpoints_test" > $LOGFILE 2>&1 || \
while read FAIL; do
  ANALYSIS=`echo ${FAIL##*strict\[} | cut -d "]" -f1`  # get analysis name from pattern "strict[<ANALYSIS>]"
  ERR_COUNT=$(( ${#ERR_MSGS[@]} + 1 ))
  REASON=`grep "^E " $LOGFILE | head -n $ERR_COUNT | tail -n 1`  # get fail reason
  REASON=${REASON:8}  # remove prefix
  ERR_MSGS+=("::warning ::Analyses endpoint '$ANALYSIS' was not available: $REASON")  # use GitHub's warning syntax
done <<< `grep "FAILED.*\[" $LOGFILE | grep -v "%"`  # get summary lines with failing tests, ignore progress lines
cat $LOGFILE
for ERR_MSG in "${ERR_MSGS[@]}"; do echo $ERR_MSG; done  # flag errors
if [ $ERR_COUNT>0 ]; then; exit 1; fi  # fail if there were any errors
```
It's unfortunately not trivial to get the "short summary" information out of pytest. These bash lines run pytest and, upon failure, collect the error messages and raise them as warnings using a syntax that is picked up by the GitHub CI. Given this is somewhat lengthy at this point, I can also move it to a separate bash script in tests/ if that is preferred.
Yes, better to move it to a separate script.
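For reference, a minimal sketch of what the extracted script might look like (the path `tests/check_endpoints.sh` is a placeholder, and the bash fixes suggested further down this review are folded in):

```bash
#!/usr/bin/env bash
# Hypothetical tests/check_endpoints.sh -- a sketch, not the PR's final code.
# Runs the strict endpoint tests and turns each failure into a GitHub warning.
LOGFILE=output.log
ERR_COUNT=0

pytest -vv tests/*_test.py tests/test_*.py -m "strict_endpoints_test" > "$LOGFILE" 2>&1 || true

# One warning per failed parametrized test, taken from the "FAILED ...[<ANALYSIS>]" summary lines.
while read -r FAIL; do
    [ -z "$FAIL" ] && continue
    ANALYSIS=$(echo "${FAIL##*strict\[}" | cut -d "]" -f1)
    echo "::warning ::Analyses endpoint '$ANALYSIS' was not available"
    ERR_COUNT=$((ERR_COUNT + 1))
done <<< "$(grep "FAILED.*\[" "$LOGFILE" | grep -v "%")"

cat "$LOGFILE"
[ "$ERR_COUNT" -gt 0 ] && exit 1
exit 0
```

The workflow step would then just run `bash tests/check_endpoints.sh`, keeping the `continue-on-error` behaviour already used in this PR.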
The AI reviewing tools indicate syntax errors in this script. Has it been tested? Can it be run locally? Note that the "Running the tests" section of the HEPData developer docs mentions the act tool for running GitHub Actions locally. I haven't used act recently, but it might be helpful for debugging if you can get it to work.
By the way, the "Running the tests" section of the HEPData developer docs should be updated to explain local running of the tests with/without the new `strict_endpoints_test` marker.
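For example, assuming the marker keeps the name `strict_endpoints_test`, the docs could show invocations along these lines:

```bash
# Only the strict endpoint-availability tests (may fail if a remote analysis server is down):
pytest -vv tests/records_test.py -m "strict_endpoints_test"

# The regular suite, excluding the strict endpoint tests:
pytest -vv tests/*_test.py tests/test_*.py -m "not strict_endpoints_test"
```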
Pull Request Overview
This PR refactors the analyses CI system to improve error handling and prevent CI failures when analysis endpoint servers are unavailable. The main goal is to separate strict error handling (for development) from forgiving error handling (for CI), while still capturing diagnostic information about endpoint issues.
Key changes:
- Split `update_analyses()` into two functions: a strict version that raises exceptions (`update_analyses_single_tool()`) and a forgiving version that logs warnings (`update_analyses()`)
- Refactored tests to use `pytest.mark.parametrize` with test data from a YAML configuration file (sketched below)
- Added a new `strict_endpoints_test` marker and a separate CI step to check endpoint availability without failing the entire workflow
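A minimal sketch of the parametrization pattern described above; the YAML layout, test name and loading code are illustrative assumptions, not necessarily what the PR implements:

```python
# Sketch only: assumes tests/test_data/analyses_tests.yaml is a mapping keyed
# by analysis endpoint name; the real file's structure may differ.
import os

import pytest
import yaml

from hepdata.modules.records.utils.analyses import update_analyses_single_tool

with open(os.path.join("tests", "test_data", "analyses_tests.yaml")) as f:
    ANALYSES_TESTS = yaml.safe_load(f)


@pytest.mark.strict_endpoints_test
@pytest.mark.parametrize("endpoint", sorted(ANALYSES_TESTS))
def test_analyses_endpoint_strict(endpoint):
    # Strict variant: any exception raised during the endpoint update propagates,
    # so the "Check availability of analyses endpoints" CI step sees a failure
    # it can turn into a ::warning annotation.
    update_analyses_single_tool(endpoint)
```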
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_data/analyses_tests.yaml | New YAML configuration file defining test parameters for each analysis endpoint |
| tests/records_test.py | Refactored tests to use parametrization and split into multiple test functions with forgiving error handling |
| pytest.ini | Added marker definition for strict endpoint tests |
| hepdata/modules/records/utils/analyses.py | Split update function into strict and forgiving versions, changed error handling from logging to exceptions in strict version |
| .github/workflows/ci.yml | Added new CI step to check endpoint availability with continue-on-error flag and warning generation |
```python
            continue

        try:
            update_analyses_single_tool(endpoint)
```
Copilot AI · Oct 21, 2025
The function is called with endpoint instead of analysis_endpoint. This will cause incorrect behavior when iterating through endpoints, as it will repeatedly call the same endpoint instead of the current one in the loop.
Suggested change:

```diff
-            update_analyses_single_tool(endpoint)
+            update_analyses_single_tool(analysis_endpoint)
```
```python
        except LookupError as e:
            log.error(str(e))
        except jsonschema.exceptions.ValidationError as e:
            log.error("Validation error for analyses schema {0} in {1}: {2}".format(schema_version, analysis_endpoint, e))
```
Copilot AI · Oct 21, 2025
The variable schema_version is not defined in this scope. It was defined inside update_analyses_single_tool() but is being referenced in the exception handler of update_analyses(). This will cause a NameError when this exception is caught.
Suggested change:

```diff
-            log.error("Validation error for analyses schema {0} in {1}: {2}".format(schema_version, analysis_endpoint, e))
+            log.error("Validation error for analyses schema in {0}: {1}".format(analysis_endpoint, e))
```
```bash
done <<< `grep "FAILED.*\[" $LOGFILE | grep -v "%"`  # get summary lines with failing tests, ignore progress lines
cat $LOGFILE
for ERR_MSG in "${ERR_MSGS[@]}"; do echo $ERR_MSG; done  # flag errors
if [ $ERR_COUNT>0 ]; then; exit 1; fi  # fail if there were any errors
```
Copilot AI · Oct 21, 2025
Bash syntax error: comparison should use -gt operator with spaces, and there's an extra semicolon. Should be: if [ $ERR_COUNT -gt 0 ]; then exit 1; fi
Suggested change:

```diff
-if [ $ERR_COUNT>0 ]; then; exit 1; fi  # fail if there were any errors
+if [ $ERR_COUNT -gt 0 ]; then exit 1; fi  # fail if there were any errors
```
```python
    update_analyses('TestAnalysis')

    # Call forgiving version of update_analyses_single_tool to make sure it works as intended
    assert update_analyses_single_tool_forgiving("TestAnalysis") == False
```
Copilot AI · Oct 21, 2025
Use is False instead of == False for boolean comparisons in Python, as per PEP 8 guidelines.
Suggested change:

```diff
-    assert update_analyses_single_tool_forgiving("TestAnalysis") == False
+    assert update_analyses_single_tool_forgiving("TestAnalysis") is False
```
| """ Test update of Rivet, MadAnalyses 5, etc. analyses | ||
| Be strict about encountered errors, i.e. flag even if error is (presumably) on tool side |
Copilot AI · Oct 21, 2025
Missing closing quotes for docstring. The triple-quoted string should be closed on the same or following line.
GraemeWatt left a comment
Thanks for the PR. It looks complicated and I'm still not convinced it's needed, so I wouldn't give it a high priority. I will need to take a closer look, but here are some initial comments.
- Your branch is out-of-date and has conflicts with `main`, so it is difficult to see changes. Please update your branch.
- You've added "Check availability of analyses endpoints" as a step in the main `test` job. Would it be better to run it as a separate job, sharing most of the steps of the `test` job? See Reusing workflow configurations. Maybe the simplest approach would be to use a matrix strategy? The "Run end-to-end tests" step could also be run as a separate job.
- Does the "Check availability of analyses endpoints" step test lines of code that are not tested by "Run tests"? If so, you would need to define a `COVERAGE_FILE` that you later include in the "Run coveralls" step. A related point is that I generally do not merge PRs that decrease test coverage, so if "Check availability of analyses endpoints" is needed to ensure test coverage (but fails), it would still block development.
- Unavailability of a remote JSON file can cause exceptions that are not being caught by the current code, making this PR not useful. Two examples: (i) `ConnectionRefusedError` when HepForge was down for a weekend (subsequently, I asked Krzysztof to move the CheckMATE JSON file from HepForge to GitHub), (ii) `JSONDecodeError` when the HackAnalysis JSON file was missing a comma so that it was not valid JSON, meaning that it would fail at the `response.json()` stage before validation. Maybe the exceptions being caught need to be made more general? (See the sketch after this list.)
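To make the last point concrete, a minimal sketch (not the PR's code; the helper name `fetch_endpoint_json` is hypothetical) of catching the broader failure modes around the endpoint fetch and re-raising them as the `LookupError` that `update_analyses()` already handles:

```python
import logging

import requests

log = logging.getLogger(__name__)


def fetch_endpoint_json(url):
    """Fetch and decode an analyses JSON file, translating common
    'endpoint is broken or unreachable' failures into a LookupError
    that the forgiving update_analyses() wrapper can catch and log."""
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        return response.json()
    except (requests.exceptions.RequestException, ValueError) as e:
        # RequestException covers ConnectionError (e.g. a ConnectionRefusedError
        # during a HepForge outage), timeouts and bad HTTP statuses; ValueError
        # covers json.JSONDecodeError from response.json() on malformed JSON.
        raise LookupError("Could not update analyses from {0}: {1}".format(url, e)) from e
```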
```python
    # Remove resources from 'analysis_resources' list.
    resources = list(filter(lambda a: a.file_location == _resource_url, analysis_resources))
    for resource in resources:
        analysis_resources.remove(resource)
```
These lines don't seem to be covered by tests:
https://coveralls.io/jobs/172976690/source_files/8867976225#L174
@codecov-ai-reviewer review
```yaml
          if [ $ERR_COUNT>0 ]; then; exit 1; fi  # fail if there were any errors
      - name: Setup Sauce Connect
```
The bash conditional has multiple syntax errors. The comparison operator should be -gt (greater than) instead of >, and there's an unnecessary semicolon after then. The correct syntax is: if [ $ERR_COUNT -gt 0 ]; then exit 1; fi
```bash
pytest -vv tests/*_test.py tests/test_*.py -m "strict_endpoints_test" > $LOGFILE 2>&1 || \
while read FAIL; do
  ANALYSIS=`echo ${FAIL##*strict\[} | cut -d "]" -f1`  # get analysis name from pattern "strict[<ANALYSIS>]"
  ERR_COUNT=$(( ${#ERR_MSGS[@]} + 1 ))
  REASON=`grep "^E " $LOGFILE | head -n $ERR_COUNT | tail -n 1`  # get fail reason
  REASON=${REASON:8}  # remove prefix
  ERR_MSGS+=("::warning ::Analyses endpoint '$ANALYSIS' was not available: $REASON")  # use GitHub's warning syntax
```
The ERR_COUNT variable tracking logic is flawed. Inside the while loop (line 172), ERR_COUNT is reassigned to ${#ERR_MSGS[@]} + 1 on each iteration, which overwrites the previous value instead of accumulating the count. This defeats the purpose of tracking error count across all failures. Consider tracking this outside the loop or using a different approach.
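As a small illustration of the alternative (variable names are placeholders, not the script's actual ones), the counter could simply be incremented once per iteration instead of being re-derived from the array length:

```bash
# Sketch: FAILED_LINES stands in for the grep output used in the real script.
ERR_COUNT=0
while read -r FAIL; do
    # ...build and store the warning message for this failure...
    ERR_COUNT=$((ERR_COUNT + 1))   # accumulate across iterations
done <<< "$FAILED_LINES"
echo "Detected $ERR_COUNT failing endpoint tests"
```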
```ini
markers =
    strict_endpoints_test: tests analyses endpoints, raising errors (which might have HEPData-unrelated issues, deselect with '-m "not strict_endpoints_test"')
```
The pytest marker defined is strict_endpoints_test, but the test file uses @pytest.mark.endpoints_test (line 1063 in records_test.py), which is not defined. This will cause pytest warnings. Either define both markers or use the same marker name consistently throughout.
```python
            log.error(str(e))
        except jsonschema.exceptions.ValidationError as e:
```
The update_analyses() function calls update_analyses_single_tool(endpoint) but it should call update_analyses_single_tool(analysis_endpoint). The variable endpoint is the optional filter parameter, while analysis_endpoint is the current item being processed in the loop.
```python
    import_records(['ins1811596'], synchronous=True)
    analysis_resources = DataResource.query.filter_by(file_type='MadAnalysis').all()
```
The test decorator at line 1063 references @pytest.mark.endpoints_test but the pytest.ini file defines the marker as strict_endpoints_test. Ensure the marker name matches what's defined in pytest.ini, or update pytest.ini to define both markers.
We already define a …
This PR addresses #907. It changes the following things:

- `update_analyses(endpoint=None)` is split into `update_analyses_single_tool(analysis_endpoint)` and `update_analyses(endpoint=None)`. `update_analyses_single_tool(analysis_endpoint)` will attempt to update only the analyses for the given `analysis_endpoint` and will raise an exception if the update doesn't succeed. `update_analyses(endpoint=None)` calls `update_analyses_single_tool()` either for all analyses endpoints (if `endpoint == None`) or only for the given `endpoint`. In any case, it will raise a warning if a known, server-side error is encountered in `update_analyses_single_tool()` and therefore addresses the issue of missing debug information if a status 200 is returned, as in Add GAMBIT analyses JSON #908. It will not raise an exception, though, and will therefore always process all requested endpoints.

  The idea behind this architecture is that `update_analyses()` can now be called whenever multiple endpoints shall be updated and/or no known errors are desired to be raised, e.g. if the server of the analysis endpoint is not available. `update_analyses_single_tool()`, on the other hand, can be called if (known) errors are desired to be raised.
- The tests for `update_analyses()` are refactored to use `pytest.mark.parametrize`, as recommended in Prevent CI pipeline from failing if analyses backend broken #907. They use the parameters specified in `tests/test_data/analyses_tests.yaml` and call `update_analyses_single_tool_forgiving()`, which allows aborting the test early if one of two known errors associated with a problem on the endpoint side is raised.
- A new pytest marker `strict_endpoints_test` is introduced, which is used in a new test step "Check availability of analyses endpoints". This test is allowed to fail without failing the full workflow, but will raise a warning (looking like the annotation sketched below) in the workflow overview. (See here for a pipeline example.)
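For illustration, the warning comes from GitHub's workflow-command syntax; a line such as the following (the endpoint name and reason are example values) produces a warning annotation in the workflow overview:

```bash
echo "::warning ::Analyses endpoint 'MadAnalysis' was not available: <reason extracted from the pytest log>"
```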
Caveats
The pipeline will pass even if one or more analysis endpoints were unavailable, because there is only "pass" or "fail" for GitHub. (As opposed to GitLab, which also supports a "warning" state that would be useful here.) The pipeline should nonetheless only pass if a known error is thrown and handled, so hopefully no real HEPData code issues are obscured by this setup.
Apologies for the long PR and text. I'm very much open to suggestions!