@casparvl
Contributor

@casparvl casparvl commented Aug 6, 2025

Related to #56

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Aug 6, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/13594692

date | job status | comment
Aug 06 10:05:24 UTC 2025 | submitted | job id 13594692 will be eligible to start in about 20 seconds
Aug 06 10:05:33 UTC 2025 | received | job awaits launch by Slurm scheduler
Aug 06 10:05:46 UTC 2025 | running | job 13594692 is running
Aug 06 10:14:02 UTC 2025 | finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13594692.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-17544751500.tar.gz
size: 0 MiB (45 bytes)
entries: 0
modules under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90/software
no software packages in tarball
reprod directories under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90/reprod
no reprod directories in tarball
other under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90
no other files in tarball
Aug 06 10:14:02 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (2/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (3/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (4/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (5/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (6/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (7/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (8/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ PASSED ] Ran 0/8 test case(s) from 8 check(s) (0 failure(s), 8 skipped, 0 aborted)
Details
✅ job output file slurm-13594692.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Hm, it's not running into the issue but it's also not bind-mounting:

BIND_PATHS after processing 'eessi.io-2023.06-software,mode=bind'
  BIND_PATHS=/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa/eessi.6LAYiqpy0O/var-lib-cvmfs:/var/lib/cvmfs,/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa/eessi.6LAYiqpy0O/var-run-cvmfs:/var/run/cvmfs,/projects/eessibot/eessi-bot-surf/SHARED/host-injections:/opt/eessi,/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa/eessi.6LAYiqpy0O:/tmp,/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa,/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_dcad9470-72ac-11f0-9544-da349bf14e99/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software,/dev,/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa/eessi.6LAYiqpy0O/var-log:/var/log,/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa/eessi.6LAYiqpy0O/usr-local-cuda:/usr/local/cuda,/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa/eessi.6LAYiqpy0O/repos_cfg/default.local:/etc/cvmfs/default.local,/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa/eessi.6LAYiqpy0O/repos_cfg/eessi.io/eessi.io.pub:/etc/cvmfs/keys/eessi.io/eessi.io.pub,/tmp/eessibot/EESSI/eessi_job.veSqtbGnHa/eessi.6LAYiqpy0O/repos_cfg/eessi.io.conf:/etc/cvmfs/domain.d/eessi.io.conf

This should contain something like /cvmfs/software.eessi.io:/cvmfs_ro/software.eessi.io for bind-mounting the host's CVMFS repo into the container. Instead, I see the container being started with:

singularity  run --nv --contain --fusemount container:cvmfs2 software.eessi.io /cvmfs_ro/software.eessi.io  ...

Which also suggests no bind-mounting is happening.
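
A quick way to check a logged value like this is to split it on commas and look for the expected mapping. The sketch below is only an illustration: `parse_bind_paths` and `has_cvmfs_bind` are hypothetical helpers, not part of the bot or the container script, and the `src[:dest]` entry format is assumed from the log above.

```python
def parse_bind_paths(bind_paths: str) -> dict:
    """Map each container destination to its host source.

    An entry without ':' (for example '/dev') is bound to the same path.
    """
    mapping = {}
    for entry in bind_paths.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parts = entry.split(":")
        src = parts[0]
        dest = parts[1] if len(parts) > 1 and parts[1] else src
        mapping[dest] = src
    return mapping


def has_cvmfs_bind(bind_paths: str, repo: str = "software.eessi.io") -> bool:
    """Return True if the host's /cvmfs/<repo> is bound to /cvmfs_ro/<repo>."""
    return parse_bind_paths(bind_paths).get(f"/cvmfs_ro/{repo}") == f"/cvmfs/{repo}"


# Shortened stand-in for the logged value (most entries omitted).
logged = "/dev,/projects/eessibot/eessi-bot-surf/SHARED/host-injections:/opt/eessi"
print(has_cvmfs_bind(logged))  # False
print(has_cvmfs_bind(logged + ",/cvmfs/software.eessi.io:/cvmfs_ro/software.eessi.io"))  # True
```

None of the entries in the logged BIND_PATHS has /cvmfs_ro/software.eessi.io as its destination, so a check like this would report False for it.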

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

add fusemount options for CVMFS repo 'eessi.io-2023.06-software,mode=bind'
Using a fuse mount for /cvmfs/eessi.io-2023.06-software

That doesn't look as intended...
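
Both log lines read as if the whole string, including `,mode=bind`, was taken as the repository name. A minimal sketch of separating the two parts is below. It assumes the `REPO_ID,key=value` shape seen in the log; `split_repo_spec` is a hypothetical name, and how the container script really parses this argument is not shown in this thread.

```python
def split_repo_spec(spec: str):
    """Split 'REPO_ID,key=value,...' into (repo_id, options)."""
    repo_id, *raw_options = spec.split(",")
    options = {}
    for item in raw_options:
        key, _, value = item.partition("=")
        options[key.strip()] = value.strip()
    return repo_id.strip(), options


repo_id, options = split_repo_spec("eessi.io-2023.06-software,mode=bind")
print(repo_id)  # eessi.io-2023.06-software
print(options)  # {'mode': 'bind'}
```

With the options split off first, the fuse-versus-bind decision could be made on `options.get("mode")` rather than on the raw string.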

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Aug 6, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/13595155

date | job status | comment
Aug 06 10:38:07 UTC 2025 | submitted | job id 13595155 will be eligible to start in about 20 seconds
Aug 06 10:38:20 UTC 2025 | received | job awaits launch by Slurm scheduler
Aug 06 10:38:33 UTC 2025 | running | job 13595155 is running
Aug 06 10:40:26 UTC 2025 | finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13595155.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
Details
No artefacts were created or found.
Aug 06 10:40:26 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (2/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (3/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (4/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (5/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (6/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (7/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (8/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ PASSED ] Ran 0/8 test case(s) from 8 check(s) (0 failure(s), 8 skipped, 0 aborted)
Details
✅ job output file slurm-13595155.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Original error reproduced in 13595155. E.g.

mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/eessi_defaults.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_6ebc7df0-72b1-11f0-8949-492a928ad525/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 51: : No such file or directory
sed command failed
mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/lmod/bash.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_6ebc7df0-72b1-11f0-8949-492a928ad525/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 51: : No such file or directory
sed command failed
mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/lmod/csh.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_6ebc7df0-72b1-11f0-8949-492a928ad525/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 51: : No such file or directory
sed command failed
mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/lmod/fish.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_6ebc7df0-72b1-11f0-8949-492a928ad525/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 51: : No such file or directory
sed command failed
mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/lmod/ksh.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_6ebc7df0-72b1-11f0-8949-492a928ad525/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 51: : No such file or directory
sed command failed
mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/lmod/zsh.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_6ebc7df0-72b1-11f0-8949-492a928ad525/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 51: : No such file or directory
sed command failed
mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/modules/EESSI/2023.06.lua.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_6ebc7df0-72b1-11f0-8949-492a928ad525/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 51: : No such file or directory
sed command failed
mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/easybuild/eb_hooks.py.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_6ebc7df0-72b1-11f0-8949-492a928ad525/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 51: : No such file or directory
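
The repeating triplet above (mktemp fails, then `: No such file or directory` with an empty file name, then `sed command failed`) looks like a temp file that could not be created next to the target, followed by a write to an empty path. As a rough sketch of the same "update only if changed" idea with the failure surfaced at the first step, assuming nothing about how install_scripts.sh actually implements it (`update_if_changed` is a hypothetical name):

```python
import os
import stat
import tempfile


def update_if_changed(path: str, transform) -> bool:
    """Rewrite `path` with transform(text) if the result differs.

    Returns True if the file was replaced. Raises PermissionError with a
    clear message if no temp file can be created next to `path`.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, encoding="utf-8") as handle:
        original = handle.read()
    updated = transform(original)
    if updated == original:
        return False
    try:
        # Same directory as the target, so the final rename stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(prefix=os.path.basename(path) + ".", dir=directory)
    except PermissionError as err:
        raise PermissionError(f"cannot create temp file in {directory}: {err}") from err
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(updated)
        # mkstemp creates the file with mode 0600; copy the target's mode bits.
        os.chmod(tmp_path, stat.S_IMODE(os.stat(path).st_mode))
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return True
```

Stopping at the failed temp-file creation would produce one error per file instead of three, which makes the root cause (no write permission in the target directory) easier to spot.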

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Let's try to do an ls very early on, see if it helps

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Aug 6, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/13595265

date | job status | comment
Aug 06 10:41:16 UTC 2025 | submitted | job id 13595265 will be eligible to start in about 20 seconds
Aug 06 10:41:20 UTC 2025 | received | job awaits launch by Slurm scheduler
Aug 06 10:41:43 UTC 2025 | running | job 13595265 is running
Aug 06 10:43:39 UTC 2025 | finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13595265.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
Details
No artefacts were created or found.
Aug 06 10:43:39 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (2/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (3/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (4/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (5/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (6/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (7/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (8/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ PASSED ] Ran 0/8 test case(s) from 8 check(s) (0 failure(s), 8 skipped, 0 aborted)
Details
✅ job output file slurm-13595265.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Let's try to do a touch very early on, see if that works

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Aug 6, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/13595379

date | job status | comment
Aug 06 10:43:57 UTC 2025 | submitted | job id 13595379 will be eligible to start in about 20 seconds
Aug 06 10:44:03 UTC 2025 | received | job awaits launch by Slurm scheduler
Aug 06 10:44:26 UTC 2025 | running | job 13595379 is running
Aug 06 10:46:18 UTC 2025 | finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13595379.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
Details
No artefacts were created or found.
Aug 06 10:46:18 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (2/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (3/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (4/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (5/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (6/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (7/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (8/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ PASSED ] Ran 0/8 test case(s) from 8 check(s) (0 failure(s), 8 skipped, 0 aborted)
Details
✅ job output file slurm-13595379.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Get another ls output, so we can see if foo was written:

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Aug 6, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/13595421

date | job status | comment
Aug 06 10:45:27 UTC 2025 | submitted | job id 13595421 will be eligible to start in about 20 seconds
Aug 06 10:45:37 UTC 2025 | received | job awaits launch by Slurm scheduler
Aug 06 10:45:52 UTC 2025 | running | job 13595421 is running
Aug 06 10:47:51 UTC 2025 | finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13595421.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
Details
No artefacts were created or found.
Aug 06 10:47:51 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (2/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (3/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (4/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (5/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (6/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (7/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (8/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ PASSED ] Ran 0/8 test case(s) from 8 check(s) (0 failure(s), 8 skipped, 0 aborted)
Details
✅ job output file slurm-13595421.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Ok, so there was a point at which I could write:

TRYING TO LS FIRST
total 4
drwxrwxr-x. 1 eessibot eessibot   40 Aug  6 12:45 .
drwxr-xr-x. 4 nobody   nogroup     0 Aug  6 12:41 ..
-rw-r--r--. 1 nobody   nogroup  1070 May 19 14:06 .cvmfsdirtab
-rw-r--r--. 1 nobody   nogroup   565 Dec  2  2023 README.eessi
lrwxrwxrwx. 1 nobody   nogroup    25 Oct  3  2023 host_injections -> /gpfs/admin/hpc/sw/EESSI/
drwxr-xr-x. 1 nobody   nogroup    21 Sep  5  2024 init
drwxr-xr-x. 1 nobody   nogroup    21 Jul 22  2024 versions
TRYING TO WRITE AT THIS LEVEL
SEE IF WE HAVE A FOO FILE
total 4
drwxrwxr-x. 1 eessibot eessibot   60 Aug  6 12:45 .
drwxr-xr-x. 4 nobody   nogroup     0 Aug  6 12:41 ..
-rw-r--r--. 1 nobody   nogroup  1070 May 19 14:06 .cvmfsdirtab
-rw-r--r--. 1 nobody   nogroup   565 Dec  2  2023 README.eessi
-rw-rw-r--. 1 eessibot eessibot    0 Aug  6 12:45 foo
lrwxrwxrwx. 1 nobody   nogroup    25 Oct  3  2023 host_injections -> /gpfs/admin/hpc/sw/EESSI/
drwxr-xr-x. 1 nobody   nogroup    21 Sep  5  2024 init
drwxr-xr-x. 1 nobody   nogroup    21 Jul 22  2024 versions

Next, let's see if we can do it closer to the mktemp commands

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Aug 6, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/13595490

date | job status | comment
Aug 06 10:49:16 UTC 2025 | submitted | job id 13595490 will be eligible to start in about 20 seconds
Aug 06 10:49:25 UTC 2025 | received | job awaits launch by Slurm scheduler
Aug 06 10:49:49 UTC 2025 | running | job 13595490 is running
Aug 06 10:51:36 UTC 2025 | finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13595490.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
Details
No artefacts were created or found.
Aug 06 10:51:36 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (2/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (3/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (4/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (5/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (6/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (7/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (8/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ PASSED ] Ran 0/8 test case(s) from 8 check(s) (0 failure(s), 8 skipped, 0 aborted)
Details
✅ job output file slurm-13595490.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Ok, writing also works at the level of EESSI-software-layer.sh.

EESSI-software-layer.sh: TRYING TO LS
total 4
drwxrwxr-x 1 eessibot eessibot   60 Aug  6 12:49 .
drwxr-xr-x 4 nobody   nogroup     0 Aug  6 12:26 ..
-rw-r--r-- 1 nobody   nogroup  1070 May 19 14:06 .cvmfsdirtab
-rw-r--r-- 1 nobody   nogroup   565 Dec  2  2023 README.eessi
-rw-rw-r-- 1 eessibot eessibot    0 Aug  6 12:49 foo
lrwxrwxrwx 1 nobody   nogroup    25 Oct  3  2023 host_injections -> /gpfs/admin/hpc/sw/EESSI/
drwxr-xr-x 1 nobody   nogroup    21 Sep  5  2024 init
drwxr-xr-x 1 nobody   nogroup    21 Jul 22  2024 versions
EESSI-software-layer.sh: TRYING TO TOUCH
EESSI-software-layer.sh CHECK FOR NEW FILE
total 4
drwxrwxr-x 1 eessibot eessibot   80 Aug  6 12:49 .
drwxr-xr-x 4 nobody   nogroup     0 Aug  6 12:26 ..
-rw-r--r-- 1 nobody   nogroup  1070 May 19 14:06 .cvmfsdirtab
-rw-r--r-- 1 nobody   nogroup   565 Dec  2  2023 README.eessi
-rw-rw-r-- 1 eessibot eessibot    0 Aug  6 12:49 bar
-rw-rw-r-- 1 eessibot eessibot    0 Aug  6 12:49 foo
lrwxrwxrwx 1 nobody   nogroup    25 Oct  3  2023 host_injections -> /gpfs/admin/hpc/sw/EESSI/
drwxr-xr-x 1 nobody   nogroup    21 Sep  5  2024 init
drwxr-xr-x 1 nobody   nogroup    21 Jul 22  2024 versions

Going one level deeper, into install_scripts.sh

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Aug 6, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/13595742

date | job status | comment
Aug 06 11:01:52 UTC 2025 | submitted | job id 13595742 will be eligible to start in about 20 seconds
Aug 06 11:02:01 UTC 2025 | received | job awaits launch by Slurm scheduler
Aug 06 11:02:24 UTC 2025 | running | job 13595742 is running
Aug 06 11:04:18 UTC 2025 | finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13595742.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
Details
No artefacts were created or found.
Aug 06 11:04:18 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (2/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (3/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (4/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (5/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (6/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (7/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (8/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ PASSED ] Ran 0/8 test case(s) from 8 check(s) (0 failure(s), 8 skipped, 0 aborted)
Details
✅ job output file slurm-13595742.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Now try from within the function that fails

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Aug 6, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/13596894

date | job status | comment
Aug 06 11:40:58 UTC 2025 | submitted | job id 13596894 will be eligible to start in about 20 seconds
Aug 06 11:41:09 UTC 2025 | received | job awaits launch by Slurm scheduler
Aug 06 11:41:32 UTC 2025 | running | job 13596894 is running
Aug 06 11:43:17 UTC 2025 | finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13596894.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
Details
No artefacts were created or found.
Aug 06 11:43:17 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (2/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (3/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (4/8) Skipping GPU test : only 1 GPU available for this test case
[ SKIP ] (5/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (6/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (7/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ SKIP ] (8/8) Skipping test : 1 GPU(s) available for this test case, need exactly 2
[ PASSED ] Ran 0/8 test case(s) from 8 check(s) (0 failure(s), 8 skipped, 0 aborted)
Details
✅ job output file slurm-13596894.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Contributor Author

casparvl commented Aug 6, 2025

Hmmm

total 45
drwxrwxr-x 1 nobody nogroup   23 Nov 25  2023 .
drwxr-xr-x 1 nobody nogroup   22 Oct  3  2023 ..
drwxrwxr-x 1 nobody nogroup   39 Nov 25  2023 Magic_Castle
-rw-rw-r-- 1 nobody nogroup 1587 Apr 12  2024 README.md
drwxrwxr-x 1 nobody nogroup   87 Nov 25  2023 arch_specs
-rw-rw-r-- 1 nobody nogroup 1656 Jun 25 12:22 bash
drwxr-xr-x 1 nobody nogroup   25 Oct 18  2024 easybuild
-rwxrwxr-x 1 nobody nogroup 8451 Jan 10  2025 eessi_archdetect.sh
-rw-rw-r-- 1 nobody nogroup 1282 Jun 15 13:48 eessi_defaults
-rw-rw-r-- 1 nobody nogroup 8601 Jun 25 12:22 eessi_environment_variables
-rwxrwxr-x 1 nobody nogroup 4410 Apr 12  2024 eessi_software_subdir_for_host.py
drwxr-xr-x 1 nobody nogroup 4096 Sep 13  2024 lmod
-rw-rw-r-- 1 nobody nogroup  164 Aug  8  2024 lmod_eessi_archdetect_wrapper.sh
-rw-rw-r-- 1 nobody nogroup  157 Oct 11  2024 lmod_eessi_archdetect_wrapper_accel.sh
-rw-rw-r-- 1 nobody nogroup 1075 Oct 14  2024 minimal_eessi_env
drwxr-xr-x 1 nobody nogroup 4096 Sep  5  2024 modules
-rw-rw-r-- 1 nobody nogroup 3086 Apr 12  2024 test.py
install-scripts.sh:sed_update_if_changed: TRYING TO TOUCH
touch: cannot touch '/cvmfs/software.eessi.io/versions/2023.06/init/foo_install_scripts_sed_update': Permission denied
foo.iw8hQp5
install-scripts.sh:sed_update_if_changed: CHECK FOR NEW FILE
total 45
drwxrwxr-x 1 nobody nogroup   23 Nov 25  2023 .
drwxr-xr-x 1 nobody nogroup   22 Oct  3  2023 ..
drwxrwxr-x 1 nobody nogroup   39 Nov 25  2023 Magic_Castle
-rw-rw-r-- 1 nobody nogroup 1587 Apr 12  2024 README.md
drwxrwxr-x 1 nobody nogroup   87 Nov 25  2023 arch_specs
-rw-rw-r-- 1 nobody nogroup 1656 Jun 25 12:22 bash
drwxr-xr-x 1 nobody nogroup   25 Oct 18  2024 easybuild
-rwxrwxr-x 1 nobody nogroup 8451 Jan 10  2025 eessi_archdetect.sh
-rw-rw-r-- 1 nobody nogroup 1282 Jun 15 13:48 eessi_defaults
-rw-rw-r-- 1 nobody nogroup 8601 Jun 25 12:22 eessi_environment_variables
-rwxrwxr-x 1 nobody nogroup 4410 Apr 12  2024 eessi_software_subdir_for_host.py
drwxr-xr-x 1 nobody nogroup 4096 Sep 13  2024 lmod
-rw-rw-r-- 1 nobody nogroup  164 Aug  8  2024 lmod_eessi_archdetect_wrapper.sh
-rw-rw-r-- 1 nobody nogroup  157 Oct 11  2024 lmod_eessi_archdetect_wrapper_accel.sh
-rw-rw-r-- 1 nobody nogroup 1075 Oct 14  2024 minimal_eessi_env
drwxr-xr-x 1 nobody nogroup 4096 Sep  5  2024 modules
-rw-rw-r-- 1 nobody nogroup 3086 Apr 12  2024 test.py
mktemp: failed to create file via template '/cvmfs/software.eessi.io/versions/2023.06/init/eessi_defaults.XXXXXX': Permission denied
/gpfs/work1/1/eessibot/eessi-bot-surf/jobs/2025.08/pr_58/event_361b7ce0-72ba-11f0-9603-6dfd3359a125/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/install_scripts.sh: line 61: : No such file or directory
sed command failed

Note how

/cvmfs/software.eessi.io/versions/2023.06/init

is owned by nobody nogroup, whereas the top level /cvmfs/software.eessi.io is owned by eessibot (and therefore writing there succeeds!)
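
To compare the two directories without reading `ls -l` output by eye, something like the sketch below could be run inside the container. It is an illustration only: `describe_write_access` is a hypothetical helper, and it relies on `os.access`, which reports what the current user may do according to the permission bits.

```python
import os
import pwd
import stat


def describe_write_access(path: str) -> dict:
    """Report owner, mode, and whether the current user may create files in `path`."""
    info = os.stat(path)
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)  # uid without a passwd entry
    return {
        "path": path,
        "owner": owner,
        "mode": stat.filemode(info.st_mode),
        # Creating a file needs write and search (execute) permission on the directory.
        "can_create": os.access(path, os.W_OK | os.X_OK),
    }


# The two directories compared above; skipped on systems where they do not exist.
for directory in ("/cvmfs/software.eessi.io", "/cvmfs/software.eessi.io/versions/2023.06/init"):
    if os.path.isdir(directory):
        print(describe_write_access(directory))
```

Given the listings above, the expectation would be `can_create` being true for the top-level directory (owned by eessibot) and false for `init` (owned by nobody, mode drwxrwxr-x), matching the touch and mktemp results.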

@casparvl
Contributor Author

Closing this. I managed to reproduce the issue from #56, but as discussed in that issue there doesn't seem to be an easy fix, and it's not a high enough priority to investigate further.

@casparvl casparvl closed this Aug 12, 2025