Add bindexec #103
2a976a5 to c4e808b
c4e808b to 211b6b1
|
I feel bad for not trying this out yet given that I asked for it, but I do plan to come back to it. My understanding from reading the code is that this can't mount a squashfs, is that correct? (Sorry for my ignorance, I don't even know if that would be possible.) Right now a squashfs is our most likely use case, because we expect the parallel filesystem to groan under a fully unpacked software stack.
|
If you need to use a squashfs filesystem, I think the thing to do is to use squashfuse_ll on it and then bind-mount its mountpoint into /cvmfs using this script. If that works for you but is too complicated, I might consider integrating that piece of complication into this script too, similar to the way that cvmfsexec supports fuse2fs mounts.

It also occurs to me that the problems you experienced with apptainer and MPI applications might reappear with just the bindexec script, depending on what the root cause was.

FYI, I have some known issues with this version; in particular, it doesn't work within apptainer. When I last worked on it I was running out of things to try, and since then it hasn't made it back to the top of the priority stack. I do hope to get back to it sometime, however. Since you were trying to avoid apptainer anyway, that particular issue probably won't bother you.
|
I pushed an updated version. This now works for me within apptainer and nested. By default it places a writable overlay over everything except for nfs filesystems, /tmp, and /var/tmp. If there's anything that you want to be able to do persistent writes on, you need to bind it yourself from the host onto the same place with a

This may be too onerous a user interface. Maybe it needs to be redone with an "underlay" algorithm like we used to have in apptainer rather than using fuse-overlayfs, where it binds everything in from the host except for places where custom bind mounts are added. The algorithm can get kind of hairy, however, depending on how deep the custom bind points are added.

For now fuse-overlayfs needs to be in its PATH. I usually do when testing because that has a very recent version of fuse-overlayfs. It could also come from
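Since fuse-overlayfs has to be found on PATH, a quick check before launching can save a confusing failure later. The following is only a minimal sketch: the helper name `have_tool` and the messages are hypothetical and are not part of bindexec.

```shell
#!/bin/sh
# Hypothetical helper (not part of bindexec): succeed if the named tool is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# fuse-overlayfs is the tool the comment above says must be on PATH.
if have_tool fuse-overlayfs; then
    echo "fuse-overlayfs found at: $(command -v fuse-overlayfs)"
else
    echo "fuse-overlayfs not found in PATH" >&2
fi
```

If the tool is missing, prepending a directory that contains it to PATH (as done later in this thread with an apptainer install directory) is one way to make it visible.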
|
@DrDaveD The |
Oops. It's there now. |
|
I took this for a test drive on Ubuntu, and apart from the warning it worked very well:
ocaisa@~$ ls test
random_file
ocaisa@~$ mksquashfs $PWD/test test.sqsh
Parallel mksquashfs: Using 8 processors
Creating 4.0 filesystem on test.sqsh, block size 131072.
....
ocaisa@~$ squashfuse_ll test.sqsh chicken
ocaisa@~$ ls chicken/
random_file
ocaisa@~$ bash bindexec $PWD/chicken:/cvmfake -- ls /cvmfake
mount: /dev/shm/bindexec/overlay/tmp: wrong fs type, bad option, bad superblock on /tmp, missing codepage or helper program, or other error.
random_file |
|
I tried the same test at Barcelona Supercomputing Centre. I could find
I tried to debug this a bit via bash (
I tried a few things to see if I could pin it down, but without success.
|
fuse-overlayfs v1.14 should be the latest, but just to be sure that's not the cause, you could install an unprivileged version of apptainer and get fuse-overlayfs from there. What is the HPC operating system? squashfuse_ll should not be used by bindexec.
[ub686081@alogin1 ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="9.2 (Plow)"
I tried The I don't know if it matters, but there are two things about the system to note:
|
|
I think I see what is going wrong now. If I disable the check for nfs, things work:
[ub686081@alogin1 ~]$ diff bindexec bindexec.mod
136c136
< if [[ \$TYPE = nfs* ]]; then
---
> if [[ \$TYPE = nfsnnn* ]]; then
[ub686081@alogin1 ~]$ bash ~/bindexec $PWD/chicken:/cvmfake -- ls /cvmfake
mkdir: cannot create directory '.old-root': Read-only file system
pivot_root: failed to change root from `.' to `.old-root': No such file or directory
rmdir: failed to remove '.old-root': Read-only file system
ls: cannot access '/cvmfake': No such file or directory
[ub686081@alogin1 ~]$ bash ~/bindexec.mod $PWD/chicken:/cvmfake -- ls /cvmfake
random_file
[ub686081@alogin1 ~]$ ls chicken/
random_file
[ub686081@alogin1 ~]$ ps ax | grep squash
84296 ? Ssl 0:00 squashfuse_ll /home/ub/ub686081/test.sqsh /home/ub/ub686081/chicken |
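The diff above turns on a pattern match of a filesystem type string against `nfs*`. As a minimal sketch of that kind of check (not the actual bindexec code), the helper below uses a POSIX `case` in place of `[[ ... ]]`. The names `is_nfs_type` and `fstype_of` are hypothetical, and using `stat -f -c %T` to obtain the type is an assumption; how bindexec itself determines `TYPE` is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a filesystem type string starts with "nfs"
# (for example nfs, nfs3, nfs4), mirroring the `nfs*` pattern in the diff above.
is_nfs_type() {
    case $1 in
        nfs*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Hypothetical helper: print the filesystem type backing a path.
# Assumes a stat that supports `-f -c %T` (GNU coreutils does).
fstype_of() {
    stat -f -c %T "$1"
}

# Classify a few sample type strings without touching any real mounts.
for t in nfs nfs4 tmpfs; do
    if is_nfs_type "$t"; then
        echo "$t: treated as nfs"
    else
        echo "$t: not nfs"
    fi
done
```

Running something like `fstype_of /tmp` on the affected login node would show which type string the check there actually sees.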
|
On the Lumi system I also saw a failure, but the reason is more obvious:
ocaisala@uan04:~> export PATH=$PWD/apptainer/x86_64/libexec/apptainer/bin:$PATH
ocaisala@uan04:~> squashfuse_ll test.sqsh chicken
ocaisala@uan04:~> ls chicken/
random_file
ocaisala@uan04:~> bash bindexec $PWD/chicken:/cvmfake -- ls /cvmfake
unshare: unshare failed: No space left on device
ocaisala@uan04:~> cat /proc/sys/user/max_user_namespaces
0 |
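A limit of 0 in /proc/sys/user/max_user_namespaces means no user namespaces can be created, which matches the `unshare` failure above. A preflight check along these lines could report that up front. This is a sketch only: the function name `userns_limit_ok` is hypothetical, and treating a missing or unreadable file as "unavailable" is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the limit file holds a positive integer.
# The path defaults to the sysctl file shown in the session above; an
# alternative path can be passed as $1 (useful for testing).
userns_limit_ok() {
    limit_file=${1:-/proc/sys/user/max_user_namespaces}
    # Assumption: a missing or unreadable file is treated as "not available".
    [ -r "$limit_file" ] || return 1
    limit=$(cat "$limit_file")
    # A non-numeric value makes the test fail, which is also reported as not ok.
    [ "$limit" -gt 0 ] 2>/dev/null
}

if userns_limit_ok; then
    echo "user namespaces appear to be enabled"
else
    echo "user namespaces appear to be disabled or unavailable" >&2
fi
```

A positive limit does not guarantee that `unshare` will succeed, since other restrictions can apply, so this only catches the case seen here.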
I saw it cause other problems though. Maybe it should only be applied for non- |
|
You know much better than me what the consequences might be. Certainly checking for EDIT: Ah, I see
so indeed,
[ub686081@alogin1 ~]$ df /tmp /dev/shm
Filesystem     1K-blocks    Used Available Use% Mounted on
rw             263987692 1057648 262930044   1% /tmp
tmpfs          263987692   66704 263920988   1% /dev/shm
so strictly speaking they do seem to meet that requirement.
1f03965 to d1db859
d1db859 to 135741a