
Conversation

@mhabedan
Collaborator

This PR addresses #907. It changes the following things:

  • Split update_analyses(endpoint=None) into update_analyses_single_tool(analysis_endpoint) and update_analyses(endpoint=None).
    • update_analyses_single_tool(analysis_endpoint) will attempt to update only the given analysis_endpoint and will raise an exception if the update doesn't succeed.
    • update_analyses(endpoint=None) calls update_analyses_single_tool() either for all analyses endpoints (if endpoint==None) or only for the given endpoint. In any case, it will raise a warning if a known, server-side error is encountered in update_analyses_single_tool(), and therefore addresses the issue of missing debug information when a status 200 is returned, as in Add GAMBIT analyses JSON #908. It will not raise an exception, though, and will therefore always process all requested endpoints.
      The idea behind this architecture is that update_analyses() can be called whenever multiple endpoints shall be updated and/or known errors should not be raised, e.g. if the server of an analysis endpoint is not available. update_analyses_single_tool(), on the other hand, can be called if (known) errors should be raised. (A rough sketch of this split is shown after this list.)
  • The previously existing tests of update_analyses()
    • have been refactored to make use of pytest.mark.parametrize, as recommended in Prevent CI pipeline from failing if analyses backend broken #907. They use the parameters specified in tests/test_data/analyses_tests.yaml.
    • no longer fail if the analysis endpoint server is not available or provides the wrong JSON schema. Technically, this is solved by calling update_analyses_single_tool_forgiving(), which allows the test to abort early if one of two known errors associated with a problem on the endpoint side is raised. (A sketch of the parametrised tests is shown after this list.)
  • To collect as much information as possible about problems with the endpoints, a new CI marker strict_endpoints_test is introduced, which is used in a new test step "Check availability of analyses endpoints". This step is allowed to fail without failing the full workflow but will raise a warning looking like this
    image
    in the workflow overview. (See here for a pipeline example.)
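
For illustration, here is a rough sketch of the intended split (not the actual implementation in this PR; the helper functions, the placeholder ANALYSES_ENDPOINTS dictionary and the exception types treated as "known" errors are stand-ins only):

import logging

log = logging.getLogger(__name__)

# Stand-in for the real endpoint configuration (illustration only).
ANALYSES_ENDPOINTS = {'rivet': {}, 'MadAnalysis': {}}

def fetch_and_validate_json(analysis_endpoint):
    """Hypothetical helper: download the endpoint's JSON and validate it,
    raising LookupError / ValueError on known, server-side problems."""
    ...

def store_analyses(analysis_endpoint, payload):
    """Hypothetical helper: write the analyses into the database."""
    ...

def update_analyses_single_tool(analysis_endpoint):
    """Strict variant: update a single endpoint and raise if the update fails."""
    payload = fetch_and_validate_json(analysis_endpoint)
    store_analyses(analysis_endpoint, payload)

def update_analyses(endpoint=None):
    """Forgiving variant: update one or all endpoints, warning on known errors."""
    endpoints = [endpoint] if endpoint else list(ANALYSES_ENDPOINTS)
    for analysis_endpoint in endpoints:
        try:
            update_analyses_single_tool(analysis_endpoint)
        except (LookupError, ValueError) as e:  # stand-ins for the known, server-side errors
            # Warn and continue so that all requested endpoints are still processed.
            log.warning("Known problem with analyses endpoint '%s': %s", analysis_endpoint, e)

The parametrised tests might then look roughly like this (the app fixture, import paths, YAML layout and skip behaviour are assumptions; only the function and file names are taken from this PR):

import os

import pytest
import yaml

# Import path of the forgiving helper is an assumption.
from hepdata.modules.records.utils.analyses import (
    update_analyses_single_tool,
    update_analyses_single_tool_forgiving,
)

# Load the per-endpoint test parameters once at collection time (YAML layout assumed).
with open(os.path.join('tests', 'test_data', 'analyses_tests.yaml')) as f:
    ANALYSES_TEST_PARAMS = yaml.safe_load(f)

@pytest.mark.parametrize('analysis_endpoint', sorted(ANALYSES_TEST_PARAMS))
def test_update_analyses(app, analysis_endpoint):
    # Forgiving path: abort early if a known endpoint-side problem occurred,
    # so CI does not fail because of an external outage.
    if not update_analyses_single_tool_forgiving(analysis_endpoint):
        pytest.skip(f"Endpoint '{analysis_endpoint}' unavailable or invalid")
    # ... assertions on the imported analyses would follow here ...

@pytest.mark.strict_endpoints_test
@pytest.mark.parametrize('analysis_endpoint', sorted(ANALYSES_TEST_PARAMS))
def test_update_analyses_strict(app, analysis_endpoint):
    # Strict path: any known error propagates and fails the test, which the
    # separate "Check availability of analyses endpoints" CI step then reports.
    update_analyses_single_tool(analysis_endpoint)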

Caveats

  • While this does achieve the goal set out in Prevent CI pipeline from failing if analyses backend broken #907, namely that the CI pipeline no longer fails if an analysis endpoint server is not available, the warning about the unavailable (and therefore unchecked) analysis endpoint is somewhat hidden. GitHub will display the whole pipeline as succeeded, e.g.
    image
    even if one or more analysis endpoints were unavailable, because GitHub only knows "pass" or "fail". (As opposed to GitLab, which also supports a "warning" state that would be useful here.) The pipeline should nonetheless only pass if a known error is thrown and handled, so hopefully no real HEPData code issues are obscured by this setup.
  • Parallelising different CI jobs did not work. I tested the setup here; alternatively, see logs.zip if the pipeline result is unavailable in the future. The problem seems to be the parallel database accesses, which cause (seemingly random) tests to fail. I'm not enough of an expert to fix this, so I abstained from using parallel jobs after all.

Apologies for the long PR and text. I'm very much open to suggestions!

Comment on lines +167 to +180
LOGFILE=output.log
ERR_MSGS=()
ERR_COUNT=0
pytest -vv tests/*_test.py tests/test_*.py -m "strict_endpoints_test" > $LOGFILE 2>&1 || \
while read FAIL; do
ANALYSIS=`echo ${FAIL##*strict\[} | cut -d "]" -f1` # get analysis name from pattern "strict[<ANALYSIS>]"
ERR_COUNT=$(( ${#ERR_MSGS[@]} + 1 ))
REASON=`grep "^E " $LOGFILE | head -n $ERR_COUNT | tail -n 1` # get fail reason
REASON=${REASON:8} # remove prefix
ERR_MSGS+=("::warning ::Analyses endpoint '$ANALYSIS' was not available: $REASON") # use GitHub's warning syntax
done <<< `grep "FAILED.*\[" $LOGFILE | grep -v "%"` # get summary lines with failing tests, ignore progress lines
cat $LOGFILE
for ERR_MSG in "${ERR_MSGS[@]}"; do echo $ERR_MSG; done # flag errors
if [ $ERR_COUNT>0 ]; then; exit 1; fi # fail if there were any errors
Collaborator Author

It's unfortunately not trivial to get the "short summary" information from pytest. These bash lines run pytest and, upon failure, collect the error messages and raise them as warnings using a syntax that is picked up by the GitHub CI. Given this is somewhat lengthy at this point, I can also move it to a separate bash script in tests/ if that is preferred.

Member

Yes, better to move it to a separate script.

Member

The AI reviewing tools indicate syntax errors in this script. Has it been tested? Can it be run locally? Note that the "Running the tests" section of the HEPData developer docs mentions the act tool for running GitHub Actions locally. I haven't used act recently, but it might be helpful for debugging if you can get it to work.

Member

By the way, the "Running the tests" section of the HEPData developer docs should be updated to explain local running of the tests with/without the new strict_endpoints_test marker.

@coveralls

Coverage Status

coverage: 84.345% (-0.1%) from 84.463%
when pulling 1874654 on mh-refactorAnalysesCI
into 7983744 on main.

@GraemeWatt GraemeWatt requested a review from Copilot October 21, 2025 14:22
Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the analyses CI system to improve error handling and prevent CI failures when analysis endpoint servers are unavailable. The main goal is to separate strict error handling (for development) from forgiving error handling (for CI), while still capturing diagnostic information about endpoint issues.

Key changes:

  • Split update_analyses() into two functions: a strict version that raises exceptions (update_analyses_single_tool()) and a forgiving version that logs warnings (update_analyses())
  • Refactored tests to use pytest.mark.parametrize with test data from YAML configuration file
  • Added new strict_endpoints_test marker and separate CI step to check endpoint availability without failing the entire workflow

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Summary per file:

  • tests/test_data/analyses_tests.yaml: New YAML configuration file defining test parameters for each analysis endpoint
  • tests/records_test.py: Refactored tests to use parametrization and split into multiple test functions with forgiving error handling
  • pytest.ini: Added marker definition for strict endpoint tests
  • hepdata/modules/records/utils/analyses.py: Split update function into strict and forgiving versions; changed error handling from logging to exceptions in the strict version
  • .github/workflows/ci.yml: Added new CI step to check endpoint availability, with continue-on-error flag and warning generation

continue

try:
update_analyses_single_tool(endpoint)
Copilot AI Oct 21, 2025

The function is called with endpoint instead of analysis_endpoint. This will cause incorrect behavior when iterating through endpoints, as it will repeatedly call the same endpoint instead of the current one in the loop.

Suggested change
update_analyses_single_tool(endpoint)
update_analyses_single_tool(analysis_endpoint)

except LookupError as e:
log.error(str(e))
except jsonschema.exceptions.ValidationError as e:
log.error("Validation error for analyses schema {0} in {1}: {2}".format(schema_version, analysis_endpoint, e))
Copilot AI Oct 21, 2025

The variable schema_version is not defined in this scope. It was defined inside update_analyses_single_tool() but is being referenced in the exception handler of update_analyses(). This will cause a NameError when this exception is caught.

Suggested change
log.error("Validation error for analyses schema {0} in {1}: {2}".format(schema_version, analysis_endpoint, e))
log.error("Validation error for analyses schema in {0}: {1}".format(analysis_endpoint, e))

done <<< `grep "FAILED.*\[" $LOGFILE | grep -v "%"` # get summary lines with failing tests, ignore progress lines
cat $LOGFILE
for ERR_MSG in "${ERR_MSGS[@]}"; do echo $ERR_MSG; done # flag errors
if [ $ERR_COUNT>0 ]; then; exit 1; fi # fail if there were any errors
Copilot AI Oct 21, 2025

Bash syntax error: comparison should use -gt operator with spaces, and there's an extra semicolon. Should be: if [ $ERR_COUNT -gt 0 ]; then exit 1; fi

Suggested change
if [ $ERR_COUNT>0 ]; then; exit 1; fi # fail if there were any errors
if [ $ERR_COUNT -gt 0 ]; then exit 1; fi # fail if there were any errors

update_analyses('TestAnalysis')

# Call forgiving version of update_analyses_single_tool to make sure it works as intended
assert update_analyses_single_tool_forgiving("TestAnalysis") == False
Copilot AI Oct 21, 2025

Use is False instead of == False for boolean comparisons in Python, as per PEP 8 guidelines.

Suggested change
assert update_analyses_single_tool_forgiving("TestAnalysis") == False
assert update_analyses_single_tool_forgiving("TestAnalysis") is False

Comment on lines +1089 to +1090
""" Test update of Rivet, MadAnalyses 5, etc. analyses
Be strict about encountered errors, i.e. flag even if error is (presumably) on tool side
Copilot AI Oct 21, 2025

Missing closing quotes for docstring. The triple-quoted string should be closed on the same or following line.

Member

@GraemeWatt GraemeWatt left a comment

Thanks for the PR. It looks complicated and I'm still not convinced it's needed, so I wouldn't give it a high priority. I will need to take a closer look, but here are some initial comments.

  1. Your branch is out-of-date and has conflicts with main, so it is difficult to see changes. Please update your branch.
  2. You've added "Check availability of analyses endpoints" as a step in the main test job. Would it be better to run it as a separate job, sharing most of the steps of the test job? See Reusing workflow configurations. Maybe the simplest approach would be to use a matrix strategy? The "Run end-to-end tests" step could also be run as a separate job.
  3. Does the "Check availability of analyses endpoints" step test lines of code that are not tested by "Run tests"? If so, you would need to define a COVERAGE_FILE that you later include in the "Run coveralls" step. A related point is that I generally do not merge PRs that decrease test coverage, so if "Check availability of analyses endpoints" is needed to ensure test coverage (but fails), it would still block development.
  4. Unavailability of a remote JSON file can cause exceptions that are not being caught by the current code, making this PR not useful. Two examples: (i) ConnectionRefusedError when HepForge was down for a weekend (subsequently, I asked Krzysztof to move the CheckMATE JSON file from HepForge to GitHub), (ii) JSONDecodeError when the HackAnalysis JSON file was missing a comma so that it was not valid JSON, meaning that it would fail at the response.json() stage before validation. Maybe the exceptions being caught need to be made more general? (See the sketch below.)
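
For illustration, broader exception handling could fold those failure modes into a single known error, roughly along these lines (a sketch only, not a proposal for the exact HEPData code; the helper name fetch_endpoint_json and the re-raising as LookupError are assumptions):

import json

import jsonschema
import requests

def fetch_endpoint_json(url, schema):
    """Hypothetical helper: fetch and validate an analyses JSON file, folding the
    various network and parsing failures into a single 'known' error type."""
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        payload = response.json()  # raises if the file is not valid JSON
        jsonschema.validate(payload, schema)
        return payload
    except (requests.exceptions.RequestException,  # connection refused, timeouts, HTTP errors
            json.JSONDecodeError,                  # e.g. a missing comma in the JSON file
            jsonschema.exceptions.ValidationError) as e:
        raise LookupError("Analyses endpoint {0} unusable: {1}".format(url, e)) from e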


Comment on lines +173 to +176
# Remove resources from 'analysis_resources' list.
resources = list(filter(lambda a: a.file_location == _resource_url, analysis_resources))
for resource in resources:
analysis_resources.remove(resource)
Member

These lines don't seem to be covered by tests:
https://coveralls.io/jobs/172976690/source_files/8867976225#L174

@GraemeWatt
Member

@codecov-ai-reviewer review

@codecov-ai

This comment has been minimized.

Comment on lines +180 to 181
if [ $ERR_COUNT>0 ]; then; exit 1; fi # fail if there were any errors
- name: Setup Sauce Connect
Contributor

The bash conditional has multiple syntax errors. The comparison operator should be -gt (greater than) instead of >, and there's an unnecessary semicolon after then. The correct syntax is: if [ $ERR_COUNT -gt 0 ]; then exit 1; fi


Comment on lines +170 to +176
pytest -vv tests/*_test.py tests/test_*.py -m "strict_endpoints_test" > $LOGFILE 2>&1 || \
while read FAIL; do
ANALYSIS=`echo ${FAIL##*strict\[} | cut -d "]" -f1` # get analysis name from pattern "strict[<ANALYSIS>]"
ERR_COUNT=$(( ${#ERR_MSGS[@]} + 1 ))
REASON=`grep "^E " $LOGFILE | head -n $ERR_COUNT | tail -n 1` # get fail reason
REASON=${REASON:8} # remove prefix
ERR_MSGS+=("::warning ::Analyses endpoint '$ANALYSIS' was not available: $REASON") # use GitHub's warning syntax
Contributor

The ERR_COUNT variable tracking logic is flawed. Inside the while loop (line 172), ERR_COUNT is reassigned to ${#ERR_MSGS[@]} + 1 on each iteration, which overwrites the previous value instead of accumulating the count. This defeats the purpose of tracking error count across all failures. Consider tracking this outside the loop or using a different approach.


Comment on lines +28 to +29
markers =
strict_endpoints_test: tests analyses endpoints, raising errors (which might have HEPData-unrelated issues, deselect with '-m "not strict_endpoints_test"')
Contributor

The pytest marker defined is strict_endpoints_test, but the test file uses @pytest.mark.endpoints_test (line 1063 in records_test.py), which is not defined. This will cause pytest warnings. Either define both markers or use the same marker name consistently throughout.


Comment on lines +266 to +267
log.error(str(e))
except jsonschema.exceptions.ValidationError as e:
Contributor

The update_analyses() function calls update_analyses_single_tool(endpoint) but it should call update_analyses_single_tool(analysis_endpoint). The variable endpoint is the optional filter parameter, while analysis_endpoint is the current item being processed in the loop.


Comment on lines -1062 to -1063
import_records(['ins1811596'], synchronous=True)
analysis_resources = DataResource.query.filter_by(file_type='MadAnalysis').all()
Contributor

The test decorator at line 1063 references @pytest.mark.endpoints_test but the pytest.ini file defines the marker as strict_endpoints_test. Ensure the marker name matches what's defined in pytest.ini, or update pytest.ini to define both markers.


@GraemeWatt
Member

GraemeWatt commented Oct 22, 2025

  1. You've added "Check availability of analyses endpoints" as a step in the main test job. Would it be better to run it as a separate job, sharing most of the steps of the test job? See Reusing workflow configurations. Maybe the simplest approach would be to use a matrix strategy? The "Run end-to-end tests" step could also be run as a separate job.

We already define a matrix with postgres-version, os-version and python-version. Currently, there is only one value of each, but in the past we have run jobs with multiple values. See also other @HEPData repos for uses of matrix with multiple values. The matrix could be defined so that the "Run tests", "Check availability of analyses endpoints" and "Run end-to-end tests" steps are run in three separate jobs using an if condition. The coverage file from each of the three jobs could be saved as an artifact, and the "Run coveralls" step would then be moved to a separate job, running after the three previous jobs have finished, that downloads the three coverage files and combines them before running coveralls. This is probably not too difficult to implement, and the failure of the "Check availability of analyses endpoints" job would be more visible than at present. I don't have experience of the qoomon/actions--parallel-steps action, but it is not an official tool and doesn't seem to be widely used, as running steps in parallel within one job is not common usage in GitHub Actions.
