Refactor analyses CI #923
Conversation
…endpoint, error-ignoring methods
… flag failing update_analyses to GitHub CI
```bash
LOGFILE=output.log
ERR_MSGS=()
ERR_COUNT=0
pytest -vv tests/*_test.py tests/test_*.py -m "strict_endpoints_test" > $LOGFILE 2>&1 || \
while read FAIL; do
  ANALYSIS=`echo ${FAIL##*strict\[} | cut -d "]" -f1`  # get analysis name from pattern "strict[<ANALYSIS>]"
  ERR_COUNT=$(( ${#ERR_MSGS[@]} + 1 ))
  REASON=`grep "^E " $LOGFILE | head -n $ERR_COUNT | tail -n 1`  # get fail reason
  REASON=${REASON:8}  # remove prefix
  ERR_MSGS+=("::warning ::Analyses endpoint '$ANALYSIS' was not available: $REASON")  # use GitHub's warning syntax
done <<< `grep "FAILED.*\[" $LOGFILE | grep -v "%"`  # get summary lines with failing tests, ignore progress lines
cat $LOGFILE
for ERR_MSG in "${ERR_MSGS[@]}"; do echo $ERR_MSG; done  # flag errors
if [ $ERR_COUNT>0 ]; then; exit 1; fi  # fail if there were any errors
```
It's unfortunately not trivial to get the "short summary" information out of pytest. These bash lines run pytest and, upon failure, collect the error messages and raise them as warnings using a syntax that is picked up by the GitHub CI. Given this is somewhat lengthy at this point, I can also move it to a separate bash script in tests/ if that is preferred.
Yes, better to move it to a separate script.
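For reference, a minimal sketch of what the extracted script might look like (the path `tests/check_endpoints.sh` is a placeholder, and the bash fixes suggested further down this review are folded in):

```bash
#!/usr/bin/env bash
# Hypothetical tests/check_endpoints.sh -- a sketch, not the PR's final code.
# Runs the strict endpoint tests and turns each failure into a GitHub warning.
LOGFILE=output.log
ERR_COUNT=0

pytest -vv tests/*_test.py tests/test_*.py -m "strict_endpoints_test" > "$LOGFILE" 2>&1 || true

# One warning per failed parametrized test, taken from the "FAILED ...[<ANALYSIS>]" summary lines.
while read -r FAIL; do
    [ -z "$FAIL" ] && continue
    ANALYSIS=$(echo "${FAIL##*strict\[}" | cut -d "]" -f1)
    echo "::warning ::Analyses endpoint '$ANALYSIS' was not available"
    ERR_COUNT=$((ERR_COUNT + 1))
done <<< "$(grep "FAILED.*\[" "$LOGFILE" | grep -v "%")"

cat "$LOGFILE"
[ "$ERR_COUNT" -gt 0 ] && exit 1
exit 0
```

The workflow step would then just run `bash tests/check_endpoints.sh`, keeping the `continue-on-error` behaviour already used in this PR.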
The AI reviewing tools indicate syntax errors in this script. Has it been tested? Can it be run locally? Note that the "Running the tests" section of the HEPData developer docs mentions the act tool for running GitHub Actions locally. I haven't used act recently, but it might be helpful for debugging if you can get it to work.
By the way, the "Running the tests" section of the HEPData developer docs should be updated to explain local running of the tests with/without the new `strict_endpoints_test` marker.
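For example, assuming the marker keeps the name `strict_endpoints_test`, the docs could show invocations along these lines:

```bash
# Only the strict endpoint-availability tests (may fail if a remote analysis server is down):
pytest -vv tests/records_test.py -m "strict_endpoints_test"

# The regular suite, excluding the strict endpoint tests:
pytest -vv tests/*_test.py tests/test_*.py -m "not strict_endpoints_test"
```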
Pull Request Overview
This PR refactors the analyses CI system to improve error handling and prevent CI failures when analysis endpoint servers are unavailable. The main goal is to separate strict error handling (for development) from forgiving error handling (for CI), while still capturing diagnostic information about endpoint issues.
Key changes:
- Split `update_analyses()` into two functions: a strict version that raises exceptions (`update_analyses_single_tool()`) and a forgiving version that logs warnings (`update_analyses()`)
- Refactored tests to use `pytest.mark.parametrize` with test data from a YAML configuration file (sketched below)
- Added a new `strict_endpoints_test` marker and a separate CI step to check endpoint availability without failing the entire workflow
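A minimal sketch of the parametrization pattern described above; the YAML layout, test name and loading code are illustrative assumptions, not necessarily what the PR implements:

```python
# Sketch only: assumes tests/test_data/analyses_tests.yaml is a mapping keyed
# by analysis endpoint name; the real file's structure may differ.
import os

import pytest
import yaml

from hepdata.modules.records.utils.analyses import update_analyses_single_tool

with open(os.path.join("tests", "test_data", "analyses_tests.yaml")) as f:
    ANALYSES_TESTS = yaml.safe_load(f)


@pytest.mark.strict_endpoints_test
@pytest.mark.parametrize("endpoint", sorted(ANALYSES_TESTS))
def test_analyses_endpoint_strict(endpoint):
    # Strict variant: any exception raised during the endpoint update propagates,
    # so the "Check availability of analyses endpoints" CI step sees a failure
    # it can turn into a ::warning annotation.
    update_analyses_single_tool(endpoint)
```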
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_data/analyses_tests.yaml | New YAML configuration file defining test parameters for each analysis endpoint |
| tests/records_test.py | Refactored tests to use parametrization and split into multiple test functions with forgiving error handling |
| pytest.ini | Added marker definition for strict endpoint tests |
| hepdata/modules/records/utils/analyses.py | Split update function into strict and forgiving versions, changed error handling from logging to exceptions in strict version |
| .github/workflows/ci.yml | Added new CI step to check endpoint availability with continue-on-error flag and warning generation |
```python
            continue

        try:
            update_analyses_single_tool(endpoint)
```
Copilot AI · Oct 21, 2025
The function is called with endpoint instead of analysis_endpoint. This will cause incorrect behavior when iterating through endpoints, as it will repeatedly call the same endpoint instead of the current one in the loop.
Suggested change:

```diff
-            update_analyses_single_tool(endpoint)
+            update_analyses_single_tool(analysis_endpoint)
```
```python
        except LookupError as e:
            log.error(str(e))
        except jsonschema.exceptions.ValidationError as e:
            log.error("Validation error for analyses schema {0} in {1}: {2}".format(schema_version, analysis_endpoint, e))
```
Copilot AI · Oct 21, 2025
The variable schema_version is not defined in this scope. It was defined inside update_analyses_single_tool() but is being referenced in the exception handler of update_analyses(). This will cause a NameError when this exception is caught.
Suggested change:

```diff
-            log.error("Validation error for analyses schema {0} in {1}: {2}".format(schema_version, analysis_endpoint, e))
+            log.error("Validation error for analyses schema in {0}: {1}".format(analysis_endpoint, e))
```
```bash
done <<< `grep "FAILED.*\[" $LOGFILE | grep -v "%"`  # get summary lines with failing tests, ignore progress lines
cat $LOGFILE
for ERR_MSG in "${ERR_MSGS[@]}"; do echo $ERR_MSG; done  # flag errors
if [ $ERR_COUNT>0 ]; then; exit 1; fi  # fail if there were any errors
```
Copilot AI · Oct 21, 2025
Bash syntax error: comparison should use -gt operator with spaces, and there's an extra semicolon. Should be: if [ $ERR_COUNT -gt 0 ]; then exit 1; fi
Suggested change:

```diff
-if [ $ERR_COUNT>0 ]; then; exit 1; fi  # fail if there were any errors
+if [ $ERR_COUNT -gt 0 ]; then exit 1; fi  # fail if there were any errors
```
```python
    update_analyses('TestAnalysis')

    # Call forgiving version of update_analyses_single_tool to make sure it works as intended
    assert update_analyses_single_tool_forgiving("TestAnalysis") == False
```
Copilot AI · Oct 21, 2025
Use is False instead of == False for boolean comparisons in Python, as per PEP 8 guidelines.
Suggested change:

```diff
-    assert update_analyses_single_tool_forgiving("TestAnalysis") == False
+    assert update_analyses_single_tool_forgiving("TestAnalysis") is False
```
| """ Test update of Rivet, MadAnalyses 5, etc. analyses | ||
| Be strict about encountered errors, i.e. flag even if error is (presumably) on tool side |
Copilot AI · Oct 21, 2025
Missing closing quotes for docstring. The triple-quoted string should be closed on the same or following line.
GraemeWatt left a comment
Thanks for the PR. It looks complicated and I'm still not convinced it's needed, so I wouldn't give it a high priority. I will need to take a closer look, but here are some initial comments.
- Your branch is out-of-date and has conflicts with `main`, so it is difficult to see changes. Please update your branch.
- You've added "Check availability of analyses endpoints" as a step in the main `test` job. Would it be better to run it as a separate job, sharing most of the steps of the `test` job? See Reusing workflow configurations. Maybe the simplest approach would be to use a matrix strategy? The "Run end-to-end tests" step could also be run as a separate job.
- Does the "Check availability of analyses endpoints" step test lines of code that are not tested by "Run tests"? If so, you would need to define a `COVERAGE_FILE` that you later include in the "Run coveralls" step. A related point is that I generally do not merge PRs that decrease test coverage, so if "Check availability of analyses endpoints" is needed to ensure test coverage (but fails), it would still block development.
- Unavailability of a remote JSON file can cause exceptions that are not being caught by the current code, making this PR not useful. Two examples: (i) `ConnectionRefusedError` when HepForge was down for a weekend (subsequently, I asked Krzysztof to move the CheckMATE JSON file from HepForge to GitHub), (ii) `JSONDecodeError` when the HackAnalysis JSON file was missing a comma so that it was not valid JSON, meaning that it would fail at the `response.json()` stage before validation. Maybe the exceptions being caught need to be made more general? (See the sketch after this list.)
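To make the last point concrete, a minimal sketch (not the PR's code; the helper name `fetch_endpoint_json` is hypothetical) of catching the broader failure modes around the endpoint fetch and re-raising them as the `LookupError` that `update_analyses()` already handles:

```python
import logging

import requests

log = logging.getLogger(__name__)


def fetch_endpoint_json(url):
    """Fetch and decode an analyses JSON file, translating common
    'endpoint is broken or unreachable' failures into a LookupError
    that the forgiving update_analyses() wrapper can catch and log."""
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        return response.json()
    except (requests.exceptions.RequestException, ValueError) as e:
        # RequestException covers ConnectionError (e.g. a ConnectionRefusedError
        # during a HepForge outage), timeouts and bad HTTP statuses; ValueError
        # covers json.JSONDecodeError from response.json() on malformed JSON.
        raise LookupError("Could not update analyses from {0}: {1}".format(url, e)) from e
```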
```python
    # Remove resources from 'analysis_resources' list.
    resources = list(filter(lambda a: a.file_location == _resource_url, analysis_resources))
    for resource in resources:
        analysis_resources.remove(resource)
```
These lines don't seem to be covered by tests:
https://coveralls.io/jobs/172976690/source_files/8867976225#L174
@codecov-ai-reviewer review
```yaml
          if [ $ERR_COUNT>0 ]; then; exit 1; fi  # fail if there were any errors
      - name: Setup Sauce Connect
```
The bash conditional has multiple syntax errors. The comparison operator should be -gt (greater than) instead of >, and there's an unnecessary semicolon after then. The correct syntax is: if [ $ERR_COUNT -gt 0 ]; then exit 1; fi
```bash
pytest -vv tests/*_test.py tests/test_*.py -m "strict_endpoints_test" > $LOGFILE 2>&1 || \
while read FAIL; do
  ANALYSIS=`echo ${FAIL##*strict\[} | cut -d "]" -f1`  # get analysis name from pattern "strict[<ANALYSIS>]"
  ERR_COUNT=$(( ${#ERR_MSGS[@]} + 1 ))
  REASON=`grep "^E " $LOGFILE | head -n $ERR_COUNT | tail -n 1`  # get fail reason
  REASON=${REASON:8}  # remove prefix
  ERR_MSGS+=("::warning ::Analyses endpoint '$ANALYSIS' was not available: $REASON")  # use GitHub's warning syntax
```
The ERR_COUNT variable tracking logic is flawed. Inside the while loop (line 172), ERR_COUNT is reassigned to ${#ERR_MSGS[@]} + 1 on each iteration, which overwrites the previous value instead of accumulating the count. This defeats the purpose of tracking error count across all failures. Consider tracking this outside the loop or using a different approach.
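As a small illustration of the alternative (variable names are placeholders, not the script's actual ones), the counter could simply be incremented once per iteration instead of being re-derived from the array length:

```bash
# Sketch: FAILED_LINES stands in for the grep output used in the real script.
ERR_COUNT=0
while read -r FAIL; do
    # ...build and store the warning message for this failure...
    ERR_COUNT=$((ERR_COUNT + 1))   # accumulate across iterations
done <<< "$FAILED_LINES"
echo "Detected $ERR_COUNT failing endpoint tests"
```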
```ini
markers =
    strict_endpoints_test: tests analyses endpoints, raising errors (which might have HEPData-unrelated issues, deselect with '-m "not strict_endpoints_test"')
```
The pytest marker defined is strict_endpoints_test, but the test file uses @pytest.mark.endpoints_test (line 1063 in records_test.py), which is not defined. This will cause pytest warnings. Either define both markers or use the same marker name consistently throughout.
```python
            log.error(str(e))
        except jsonschema.exceptions.ValidationError as e:
```
The update_analyses() function calls update_analyses_single_tool(endpoint) but it should call update_analyses_single_tool(analysis_endpoint). The variable endpoint is the optional filter parameter, while analysis_endpoint is the current item being processed in the loop.
```python
    import_records(['ins1811596'], synchronous=True)
    analysis_resources = DataResource.query.filter_by(file_type='MadAnalysis').all()
```
The test decorator at line 1063 references @pytest.mark.endpoints_test but the pytest.ini file defines the marker as strict_endpoints_test. Ensure the marker name matches what's defined in pytest.ini, or update pytest.ini to define both markers.
We already define a …
This PR addresses #907. It changes the following things:

- `update_analyses(endpoint=None)` is split into `update_analyses_single_tool(analysis_endpoint)` and `update_analyses(endpoint=None)`. `update_analyses_single_tool(analysis_endpoint)` will attempt to update only the analyses for the given `analysis_endpoint` and will raise an exception if the update doesn't succeed. `update_analyses(endpoint=None)` calls `update_analyses_single_tool()` either for all analyses endpoints (if `endpoint == None`) or only for the given `endpoint`. In any case, it will raise a warning if a known, server-side error is encountered in `update_analyses_single_tool()` and therefore addresses the issue of missing debug information if a status 200 is returned, as in Add GAMBIT analyses JSON #908. It will not raise an exception, though, and will therefore always process all requested endpoints.

  The idea behind this architecture is that `update_analyses()` can now be called whenever multiple endpoints shall be updated and/or no known errors are desired to be raised, e.g. if the server of the analysis endpoint is not available. `update_analyses_single_tool()`, on the other hand, can be called if (known) errors are desired to be raised.
- The tests for `update_analyses()` are refactored to use `pytest.mark.parametrize`, as recommended in Prevent CI pipeline from failing if analyses backend broken #907. They use the parameters specified in `tests/test_data/analyses_tests.yaml` and call `update_analyses_single_tool_forgiving()`, which allows aborting the test early if one of two known errors associated with a problem on the endpoint side is raised.
- A new pytest marker `strict_endpoints_test` is introduced, which is used in a new test step "Check availability of analyses endpoints". This test is allowed to fail without failing the full workflow, but will raise a warning (looking like the annotation sketched below) in the workflow overview. (See here for a pipeline example.)
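For illustration, the warning comes from GitHub's workflow-command syntax; a line such as the following (the endpoint name and reason are example values) produces a warning annotation in the workflow overview:

```bash
echo "::warning ::Analyses endpoint 'MadAnalysis' was not available: <reason extracted from the pytest log>"
```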
Caveats
The pipeline will pass even if one or more analysis endpoints were unavailable, because there is only "pass" or "fail" for GitHub. (As opposed to GitLab, which also supports a "warning" state that would be useful here.) The pipeline should nonetheless only pass if a known error is thrown and handled, so hopefully no real HEPData code issues are obscured by this setup.
Apologies for the long PR and text. I'm very much open to suggestions!