Add `--keep-going` option to fail at the end not at first failing installation + use proper exit code as determined by `main` function by Flamefire · Pull Request #5022 · easybuilders/easybuild-framework

Flamefire · 2025-10-09T09:17:26Z

Allows installing every easyconfig that can be installed, without stopping when a single one fails.

The `main` function will call `sys.exit` with the returned exit code, as if it had failed with an `EasyBuildError`, so `eb --keep-going ec1.eb ec2.eb` returns an error code that can be used in scripts.
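
For readers who want to see the pattern in isolation, here is a minimal Python sketch of "keep going, fail at the end". It is not the framework's code: `install_all`, `demo_build`, `BuildError`, and the exit-code constants are hypothetical names made up for illustration. It only shows how failures can be collected and turned into a single integer exit code.

```python
# Hypothetical exit codes for this sketch; not EasyBuild's real values.
EXIT_SUCCESS = 0
EXIT_BUILD_FAILED = 1


class BuildError(Exception):
    """Stand-in for a failed installation (hypothetical)."""


def install_all(names, build_one, keep_going=False):
    """Build each item and return (exit_code, results).

    results is a list of (name, succeeded) pairs in build order.
    Without keep_going the loop stops at the first failure.
    """
    results = []
    for name in names:
        try:
            build_one(name)
            results.append((name, True))
        except BuildError:
            results.append((name, False))
            if not keep_going:
                break
    all_ok = all(ok for _, ok in results)
    return (EXIT_SUCCESS if all_ok else EXIT_BUILD_FAILED), results


def demo_build(name):
    # Pretend that anything whose name starts with "bad" fails to install.
    if name.startswith("bad"):
        raise BuildError(name)


code, results = install_all(["bad-1.0", "good-2.0"], demo_build, keep_going=True)
for name, ok in results:
    print(f"* [{'SUCCESS' if ok else 'FAILED'}] {name}")
```

A script entry point would then pass the returned integer to `sys.exit`, so the calling shell sees a non-zero status whenever at least one build failed.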

Co-authored-by: Jan André Reuter <jan.andre.reuter@hotmail.de>

Thyre · 2025-10-12T12:19:18Z

Here's the output for a very simple test, trying to install Intel compilers on aarch64 and some other EasyConfig at the same time:

Used command:

[reuter1@jrc0900 ~]$ eb --configfile=../jedi/.config/easybuild/config.cfg --keep-going intel-compilers-2025.2.0.eb M4-1.4.20.eb --rebuild --accept-eula-for=".*" --force-download sources

Output:

== Temporary log file in case of crash /tmp/eb-r600p1g2/easybuild-s_pf9cns.log
== processing EasyBuild easyconfig /p/project1/cswmanage/reuter1/EasyBuild/prog/easybuild/easyconfigs/i/intel-compilers/intel-compilers-2025.2.0.eb
== building and installing Core/intel-compilers/2025.2.0...
  >> installation prefix: /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/intel-compilers/2025.2.0
== fetching files and verifying checksums...

WARNING: Found file intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh at /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh, but re-downloading it anyway...

  >> download succeeded: https://registrationcenter-download.intel.com/akdlm/IRC_NAS/39c79383-66bf-4f44-a6dd-14366e34e255/intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh

WARNING: Found file intel-fortran-compiler-2025.2.0.534_offline.sh at /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-fortran-compiler-2025.2.0.534_offline.sh, but re-downloading it anyway...

  >> download succeeded: https://registrationcenter-download.intel.com/akdlm/IRC_NAS/2c69ab6a-dfff-4d8f-ae1c-8368c79a1709/intel-fortran-compiler-2025.2.0.534_offline.sh
  >> sources:
  >> /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh [SHA256: aea3c1ccb97728db138b4f11f771411264292ba7bbec313782229510c9b831bc]
  >> /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-fortran-compiler-2025.2.0.534_offline.sh [SHA256: 3808000bbcef15f17b608156b956e0114393a1b64ee6d9fb29be06450fa40083]
== ... (took 42 secs)
== creating build dir, resetting environment...
  >> build dir: /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system
== ... (took < 1 sec)
== unpacking...
  >> running shell command:
        cp -dR /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh .
        [started at: 2025-10-12 14:16:49]
        [working dir: /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/cp-jqz62zt8]
  >> command completed: exit 0, ran in < 1s
  >> running shell command:
        cp -dR /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-fortran-compiler-2025.2.0.534_offline.sh .
        [started at: 2025-10-12 14:16:49]
        [working dir: /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/cp-2afxutgh]
  >> command completed: exit 0, ran in < 1s
== ... (took < 1 sec)
== patching...
== ... (took < 1 sec)
== preparing...
  >> (no build dependencies specified)
  >> loading modules for (runtime) dependencies:
  >>  * GCCcore/14.3.0
  >>  * binutils/2.44
== ... (took < 1 sec)
== configuring...
== ... (took < 1 sec)
== building...
== ... (took < 1 sec)
== testing...
== ... (took < 1 sec)
== installing...
== installing part 1/2 (intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh)...
  >> running shell command:
        HOME=/dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system  ./intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh -a --action install --silent --eula accept --install-dir /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/intel-compilers/2025.2.0
        [started at: 2025-10-12 14:16:49]
        [working dir: /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/system-system-ml9cm8kw]

ERROR: Shell command failed!
    full command              ->  HOME=/dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system  ./intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh -a --action install --silent --eula accept --install-dir /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/intel-compilers/2025.2.0
    exit code                 ->  126
    called from               ->  'install_step_oneapi' function in /p/project1/cswmanage/reuter1/EasyBuild/prog/lib/python3.9/site-packages/easybuild/easyblocks/generic/intelbase.py (line 441)
    working directory         ->  /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system
    output (stdout + stderr)  ->  /tmp/eb-r600p1g2/run-shell-cmd-output/system-system-ml9cm8kw/out.txt
    interactive shell script  ->  /tmp/eb-r600p1g2/run-shell-cmd-output/system-system-ml9cm8kw/cmd.sh

== ... (took 4 secs)
== FAILED: Installation ended unsuccessfully: shell command 'system-system ...' failed with exit code 126 in install step for intel-compilers-2025.2.0.eb (took 47 secs)
== Results of the build can be found in the log file(s) /tmp/eb-r600p1g2/easybuild-intel-compilers-2025.2.0-20251012.141606.eGYin.log
== processing EasyBuild easyconfig /p/project1/cswmanage/reuter1/EasyBuild/prog/easybuild/easyconfigs/m/M4/M4-1.4.20.eb
== building and installing Core/M4/1.4.20...
  >> installation prefix: /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/M4/1.4.20
== fetching files and verifying checksums...

WARNING: Found file m4-1.4.20.tar.gz at /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/m/M4/m4-1.4.20.tar.gz, but re-downloading it anyway...

  >> download succeeded: https://ftpmirror.gnu.org/gnu/m4/m4-1.4.20.tar.gz
  >> sources:
  >> /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/m/M4/m4-1.4.20.tar.gz [SHA256: 6ac4fc31ce440debe63987c2ebbf9d7b6634e67a7c3279257dc7361de8bdb3ef]

WARNING: Found file config.guess at /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/generic/eb_v5.1.3.dev0/ConfigureMake/config.guess, but re-downloading it anyway...

  >> download succeeded: https://git.savannah.gnu.org/cgit/config.git/plain/config.guess?id=28ea239c53a2d5d8800c472bc2452eaa16e37af2
== ... (took 2 secs)
== creating build dir, resetting environment...
  >> build dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system
== ... (took < 1 sec)
== unpacking...
  >> running shell command:
        tar xzf /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/m/M4/m4-1.4.20.tar.gz
        [started at: 2025-10-12 14:16:56]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/tar-bwj1k9qe]
  >> command completed: exit 0, ran in < 1s
== ... (took < 1 sec)
== patching...
== ... (took < 1 sec)
== preparing...
== ... (took < 1 sec)
== configuring...
  >> running shell command:
        /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/generic/eb_v5.1.3.dev0/ConfigureMake/config.guess
        [started at: 2025-10-12 14:16:57]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system/m4-1.4.20]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/configguess-xz_q3btp]
  >> command completed: exit 0, ran in < 1s
  >> running shell command:
        ./configure --prefix=/p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/M4/1.4.20  --build=aarch64-unknown-linux-gnu  --host=aarch64-unknown-linux-gnu --enable-c++ CPPFLAGS=-fgnu89-inline
        [started at: 2025-10-12 14:16:57]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system/m4-1.4.20]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/configure-73f346ds]
  >> command completed: exit 0, ran in 00h00m41s
== ... (took 41 secs)
== building...
  >> running shell command:
        make  -j 16
        [started at: 2025-10-12 14:17:38]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system/m4-1.4.20]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/make-rn8l8dka]
  >> command completed: exit 0, ran in 00h00m01s
== ... (took 1 secs)
== testing...
== ... (took < 1 sec)
== installing...
  >> running shell command:
        make install
        [started at: 2025-10-12 14:17:41]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system/m4-1.4.20]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/make-kpai64u_]
  >> command completed: exit 0, ran in < 1s
== ... (took < 1 sec)
== taking care of extensions...
== ... (took < 1 sec)
== restore after iterating...
== ... (took < 1 sec)
== postprocessing...
== ... (took < 1 sec)
== sanity checking...
  >> file 'bin/m4' found: OK
  >> loading modules: M4/1.4.20...
== ... (took < 1 sec)
== cleaning up...
== ... (took < 1 sec)
== creating module...
  >> generating module file @ /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/modules/all/Core/M4/1.4.20.lua
== ... (took < 1 sec)
== permissions...
== ... (took < 1 sec)
== packaging...
== ... (took < 1 sec)
== COMPLETED: Installation ended successfully (took 47 secs)
== Results of the build can be found in the log file(s) /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/M4/1.4.20/easybuild/easybuild-M4-1.4.20-20251012.141742.log
== Build succeeded for 1 out of 2
== Summary:
   * [FAILED]  Core/intel-compilers/2025.2.0
   * [SUCCESS] Core/M4/1.4.20

So this is working as expected. Might come in very handy in preparing 2025a on this system for PR testing...
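
When such a run is driven from a script, the summary block at the end of the output can be turned into a list of failed installations. The sketch below is an assumption-laden illustration: `parse_summary` is a hypothetical helper, and the line format is taken only from the output shown above, so other EasyBuild versions or other status labels may not match.

```python
import re

# Matches summary lines such as "   * [FAILED]  Core/intel-compilers/2025.2.0".
# The format is inferred from the output above and is an assumption.
SUMMARY_LINE = re.compile(r"^\s*\*\s+\[(SUCCESS|FAILED)\]\s+(\S+)\s*$")


def parse_summary(output):
    """Return {module_name: succeeded} for each summary line in output."""
    status = {}
    for line in output.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            status[match.group(2)] = match.group(1) == "SUCCESS"
    return status


sample = """\
== Build succeeded for 1 out of 2
== Summary:
   * [FAILED]  Core/intel-compilers/2025.2.0
   * [SUCCESS] Core/M4/1.4.20
"""

failed = [name for name, ok in parse_summary(sample).items() if not ok]
print(failed)
```

For a simple pass/fail decision the process exit code is enough; parsing is only useful when the script needs to know which installations failed.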

Thyre · 2025-10-13T20:42:23Z

There are errors which this doesn't catch. Mainly, I've encountered these two so far:

ERROR: Failed to process easyconfig /p/project1/cswmanage/reuter1/EasyBuild/prog/easybuild/easyconfigs/n/nvtop/nvtop-3.2.0-GCCcore-14.2.0.eb: One or more OS dependencies were not found: [('libsystemd-dev', 'libudev-dev', 'systemd-devel')]

== Temporary log file in case of crash /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/log/easybuild-t5ux6ild.log
ERROR: One or more files not found: non_existent_config.eb (search paths: /p/project1/cswmanage/reuter1/EasyBuild/prog/easybuild/easyconfigs)

The second case is contrived here, but I hit it because the EasyStack file I used referenced an EasyConfig that does not exist upstream. I don't know how easy it would be to catch such cases as well, since at the moment they also abort --upload-test-report and --dump-test-report.

This honestly isn't a blocker though, as one can still use flags to work around the former and should make sure that the files exist for the latter.

Flamefire · 2025-10-14T07:43:45Z

I think having no or missing easyconfigs should error in any case as it might hint that you mistyped an option or similar issue where it might not do what you expected.

And we can argue that this option is documented to continue on a failed build and if it fails to parse it is a different issue.

Thyre · 2025-10-14T07:46:46Z

> I think having no or missing easyconfigs should error in any case as it might hint that you mistyped an option or similar issue where it might not do what you expected.
>
> And we can argue that this option is documented to continue on a failed build and if it fails to parse it is a different issue.

Absolutely, a missing EasyConfig should error out in any case.
Missing OS deps (i.e. fails to parse) is a separate issue, and I think this should be handled in a separate PR, if at all.

I'm fine with keeping the current behavior for all three (--upload-test-report, --dump-test-report & --keep-going).

Flamefire · 2025-10-14T08:04:13Z

> Missing OS deps (i.e. fails to parse) is a separate issue, and I think this should be handled in a separate PR, if at all.

IIRC we have --ignore-os-deps for that.

And failing to parse an easyconfig could just as well mean that you accidentally passed an easyblock or patch instead of an easyconfig, so again it isn't a build issue of the kind we want to ignore with this option. If this can be made clearer in the description, we could do that. But as it fails right at the start, I think it is fine as-is, hence I wouldn't handle that and would just let it fail.
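
The distinction drawn in this exchange, input errors always abort while build errors may be deferred, can be sketched as two phases. This is an illustrative sketch only: `run`, `ParseError`, `BuildError`, and the demo helpers are hypothetical names and do not mirror the framework's internals.

```python
class ParseError(Exception):
    """Hypothetical: the input could not be turned into a build request."""


class BuildError(Exception):
    """Hypothetical: the installation itself failed."""


def run(paths, parse, build, keep_going=False):
    """Parse everything first, then build; return the requests that failed.

    ParseError always propagates, even with keep_going, because it may
    point to a mistyped option or a wrong file rather than a broken build.
    """
    requests = [parse(path) for path in paths]  # may raise ParseError
    failed = []
    for request in requests:
        try:
            build(request)
        except BuildError:
            if not keep_going:
                raise
            failed.append(request)
    return failed


def demo_parse(path):
    # Pretend that only files ending in ".eb" can be parsed.
    if not path.endswith(".eb"):
        raise ParseError(path)
    return path


def demo_build(request):
    # Pretend that anything whose name starts with "broken" fails to build.
    if request.startswith("broken"):
        raise BuildError(request)


print(run(["broken.eb", "ok.eb"], demo_parse, demo_build, keep_going=True))
```

Because parsing happens before any build starts, a bad input is reported within seconds instead of after hours of successful builds.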

boegel · 2025-12-03T07:11:12Z

easybuild/main.py

+    if options.dump_test_report or options.upload_test_report:
+        # Generation test reports is successful even when software failed to build
+        return EasyBuildExit.SUCCESS


Hmm, not sure about this...

Exit code should always indicate whether (all) installations were successful or not, regardless of whether a test report was uploaded/dumped? That seems like the least surprising behavior to me...

Flamefire · 2025-12-03

Shouldn't the exit code indicate an error in the requested operation? A failed installation while explicitly testing for failures isn't an error in that sense.

So it's a question of definition, and I'm not fully sure how to resolve it; either way is fine with me. Hence I'm changing this to an error as suggested.
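
The two positions in this thread can be written side by side as small functions. This is a hypothetical sketch: `ExitCode`, `exit_code_lenient`, and `exit_code_strict` are invented names with assumed values, and only the idea (whether generating a test report masks a build failure) comes from the discussion.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    # Hypothetical values for this sketch only.
    SUCCESS = 0
    ERROR = 1


def exit_code_lenient(all_ok, reporting):
    """Earlier variant: producing a test report counts as success."""
    if reporting:
        return ExitCode.SUCCESS
    return ExitCode.SUCCESS if all_ok else ExitCode.ERROR


def exit_code_strict(all_ok, reporting):
    """Variant agreed on in the review: build failures always show up.

    The reporting flag is accepted but deliberately ignored.
    """
    return ExitCode.SUCCESS if all_ok else ExitCode.ERROR


# A failed build while dumping a test report:
print(int(exit_code_lenient(False, True)), int(exit_code_strict(False, True)))
```

The strict variant keeps the exit code a function of the build results alone, which is the "least surprising" behavior argued for above.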

boegel · 2025-12-06T16:44:27Z

@Flamefire merge conflict to fix...

Flamefire · 2025-12-09T11:27:02Z

Resolved

boegel

lgtm

Add --keep-going option to fail at the end

c183086

Allows to install any possible easyconfig without stopping if a single one fails

Flamefire force-pushed the keep-going branch from 4f7d912 to e0d4cbe on October 9, 2025 10:14

Flamefire added 3 commits October 9, 2025 12:45

Style fixes

e250315

Don't fail if generating test reports

3de55d1

Exit with integer exit code

45e4e6a

Flamefire force-pushed the keep-going branch from e0d4cbe to 45e4e6a on October 9, 2025 10:45

Thyre reviewed Oct 9, 2025

easybuild/tools/build_log.py (outdated, resolved)

test/framework/options.py (outdated, resolved)

Flamefire and others added 2 commits October 9, 2025 16:48

Fix typo

74e0ce4

Co-authored-by: Jan André Reuter <jan.andre.reuter@hotmail.de>

Remove comment

fb8b94f

boegel added the enhancement label Oct 22, 2025

boegel added this to the next release (5.2.0?) milestone Oct 22, 2025

Merge branch 'develop' into keep-going

1de1314

boegel requested changes Dec 3, 2025

Thyre reviewed Dec 3, 2025

easybuild/main.py (outdated, resolved)

Flamefire added 2 commits December 3, 2025 16:03

Reduce parens in exit call

306e43e

Report error even for test reports

d87d8a0

Flamefire force-pushed the keep-going branch from afb25c8 to d87d8a0 on December 3, 2025 15:03

Merge branch 'develop' into keep-going

b03db3b

boegel changed the title ~~Add --keep-going option to fail at the end not at first failing installation~~ Add --keep-going option to fail at the end not at first failing installation + use proper exit code as determined by main function Dec 14, 2025

boegel approved these changes Dec 14, 2025

boegel merged commit 7d33d7b into easybuilders:develop Dec 14, 2025
40 checks passed

Flamefire deleted the keep-going branch December 15, 2025 10:15

boegel merged 10 commits into easybuilders:develop from Flamefire:keep-going
