Commit 1f03965

add bindexec
1 parent 36012ef commit 1f03965

4 files changed: 255 additions, 10 deletions

ChangeLog

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+- Add a bindexec command.
+- Stop using $TMPDIR as a temporary variable name in cvmfsexec because it
+might be already set and exported.
+
 cvmfsexec-4.46 - 2 April 2025
 - Go back to selecting the cvmfs version from the egi and osg distribution.
 - Properly sort the cvmfs version number from the downloaded list of packages.

README.md

Lines changed: 41 additions & 0 deletions
@@ -42,6 +42,10 @@ do this in 4 different ways:
 unprivileged user namespaces enabled,
 this can also be used with unprivileged singularity or apptainer.
 
+In addition, this package contains a related tool called
+[bindexec](#bindexec) which starts a new user namespace with given
+bind mounts added.
+
 # Supported operating systems
 
 Operating systems currently supported by this package are Red Hat
@@ -370,3 +374,40 @@ $ mkfs.ext3 -F -O ^has_journal -d tmp scratch.img
 By default the cvmfs logs are written to a top-level `log` directory, alongside
 the top-level `dist` directory. The variable `SINGCVMFS_LOGDIR` can be used to
 write them to a different directory, which will be created if it doesn't exist.
+
+# bindexec
+
+As a bonus, this package also includes a separate tool called `bindexec`
+that accepts any set of bind mounts to add into a new unprivileged user
+mount namespace. The usage is much like `cvmfsexec` except that instead
+of cvmfs repository names you give it `src:dest` pairs where `src` is a
+source directory or file and `dest` is a destination path. For example:
+
+```
+$ bindexec /etc/motd:/var/lib/mydir/motd -- ls /var/lib/mydir
+motd
+```
+
+Like `cvmfsexec`, if no command is supplied after `--` it runs an
+interactive shell.
+
+Bind mounts require target destinations to exist, but if they are
+missing `bindexec` will automatically create them. This requires the
+fuse-overlayfs command to be in the PATH, although if there is demand
+for it a script for making that easily distributable as well will be
+supplied (probably through a `makedist` option).
+
+Some system directories (`/proc`, `/sys`, `/dev`, and `/run`) are
+included as-is on top of the overlay so anything bound into those
+directories will not appear. In addition, any `nfs` filesystem types
+are automatically added on top of the overlay because they don't work
+properly through overlay, so no bind mounts will appear in those paths
+either.
+
+`bindexec` always creates a new process namespace because that's the
+easiest way to make sure that the fuse-overlayfs process will exit when
+the command exits. This means that processes start over at pid 1 and no
+process can be seen outside of the namespace. Also because it is using
+an unprivileged user namespace, any files owned by anyone other than the
+current user will show up as being owned by `nobody` (just as it does in
+`cvmfsexec`).
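
The `src:dest` syntax described in the new README section can be illustrated with a small standalone sketch. The function name `split_bind` is hypothetical and is not part of this commit; it only mirrors the colon and leading-slash checks and the `%:*` / `#*:` parameter expansions that the `bindexec` script in this commit applies to each argument, and it creates no namespaces or mounts.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): validate one src:dest argument
# with the same checks bindexec uses and print "src dest" on success.
split_bind() {
    arg=$1
    case $arg in
        *:*) ;;
        *)  echo "split_bind: $arg does not contain a colon" >&2
            return 1 ;;
    esac
    # both halves must be absolute paths
    case $arg in
        /*:/*) ;;
        *)  echo "split_bind: source or destination in $arg do not start with \"/\"" >&2
            return 1 ;;
    esac
    src=${arg%:*}    # drop the shortest suffix matching ":*"
    dst=${arg#*:}    # drop the shortest prefix matching "*:"
    echo "$src $dst"
}

split_bind /etc/motd:/var/lib/mydir/motd
```

With exactly one colon the two expansions split at the same place. With more than one colon, `%:*` cuts at the last colon and `#*:` at the first, so the two halves overlap; the sketch inherits that behaviour from the expansions it copies.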

bindexec

Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
+#!/bin/bash
+# Add bind mounts in a user namespace and change to that space.
+# Requires being able to run unshare -rm and the ability to do fuse mounts
+# (kernel >= 4.18) and requires fuse-overlayfs.
+# Written by Dave Dykstra November 2024, based heavily on cvmfsexec.
+
+#set -x
+#PS4='c$$+ '
+
+VERSION=4.42
+
+usage()
+{
+    echo "Usage: bindexec [-v] [src:dest ...] -- [command]" >&2
+    echo " Bind mount each src to dest in new user mount namespace and run command" >&2
+    echo " -v: print current version and exit" >&2
+    exit 1
+}
+
+# needed for pivot_root
+PATH=$PATH:/usr/sbin
+
+TMPD=/dev/shm/bindexec
+STARTFIFO=$TMPD/start
+WAITFIFO=$TMPD/wait
+
+# bash syntax {NAME}<&N doesn't work on older bashes such as the
+# version 3.2.x on macOS Big Sur, and in fact it fails with an error
+# message but not an error code, so test for it first to be able to
+# gracefully die
+
+if [ -n "$({TESTX}<&0 2>&1)" ]; then
+    echo "Cannot assign file descriptors to variables, bash version too old" >&2
+    exit 1
+fi
+
+# make a copy of stdin fd, for sending to the final command
+exec {STDINCOPYFD}<&0
+
+ORIGPWD=$PWD
+
+# can't use OPTIND because it can't distinguish between -- there or missing
+NOPTS=0
+while getopts "v" OPTION; do
+    let NOPTS+=1
+    case $OPTION in
+        v)  echo "$VERSION"
+            exit
+            ;;
+        \?) usage
+            ;;
+    esac
+done
+shift $NOPTS
+
+BINDS=""
+for ARG; do
+    if [ "$ARG" == "--" ]; then
+        break
+    fi
+    if [[ "$ARG" != *:* ]]; then
+        echo "bindexec: $ARG does not contain a colon" >&2
+        usage
+    fi
+    if [[ "$ARG" != /* ]] || [[ "$ARG" != *:/* ]]; then
+        echo "bindexec: source or destination in $ARG do not start with \"/\"" >&2
+        usage
+    fi
+    BINDS="$BINDS $ARG"
+    shift
+done
+
+if [ "$ARG" != "--" ]; then
+    echo "bindexec: no double-hyphen found" >&2
+    usage
+fi
+shift
+
+ORIGUID="$(id -u)"
+ORIGGID="$(id -g)"
+
+UNSHAREOPTS="--propagation unchanged"
+
+# Note that within the HERE document, unprotected $ substitutions are
+# done by the surrounding shell, and \$ is within the unshare shell
+unshare -rm -pf $UNSHAREOPTS /bin/bash /dev/stdin "${@:-$SHELL}" <<!EOF-1!
+#set -x
+#PS4='c\$$+ '
+
+# mount a private /dev/shm
+mount -t tmpfs tmpfs /dev/shm
+mkdir $TMPD
+
+# now in the first "fake root" namespace
+mount -t proc proc /proc
+mkdir -p $TMPD/upper $TMPD/work $TMPD/overlay
+
+# put the bind mounts into the upper dir
+for BIND in $BINDS; do
+    SRC="\${BIND%:*}"
+    DST="\${BIND#*:}"
+    if [ -d "\$SRC" ]; then
+        mkdir -p $TMPD/upper\$DST
+    elif [ -f "\$SRC" ]; then
+        DSTDIR="\${DST%/*}"
+        if [ "\$DST" != "\$DSTDIR" ]; then
+            mkdir -p $TMPD/upper\$DSTDIR
+        fi
+        touch $TMPD/upper\$DST
+    else
+        echo "bindexec: \$SRC not found, skipping" >&2
+    fi
+    mount --rbind \$SRC $TMPD/upper\$DST
+done
+
+# Leave this bash running as PID 1, because most other
+# programs won't handle signals & child reaping correctly.
+# Note that all other processes in the namespaces will get
+# a SIGKILL when PID 1 exits.
+trap "" 1 2 3 15 # ignore all ordinary signals
+
+fuse-overlayfs -o allow_other,noacl,squash_to_root,lowerdir=/,upperdir=$TMPD/upper,workdir=$TMPD/work $TMPD/overlay 2> >(grep -v lazytime >&2)
+# Put original system dirs on top of the overlay
+mount -t proc proc $TMPD/overlay/proc
+mount --rbind /sys $TMPD/overlay/sys
+mount --rbind /dev $TMPD/overlay/dev
+
+# Add cvmfs on top if it is present
+if [ -d /cvmfs ]; then
+    mkdir -p $TMPD/overlay/cvmfs
+    mount --rbind /cvmfs $TMPD/overlay/cvmfs
+fi
+
+# Also bind on top nfs mounts because they don't work through fuse-overlayfs
+mount|while read FROM X TO X TYPE REST; do
+    if [[ \$TYPE = nfs* ]]; then
+        mkdir -p $TMPD/overlay\$TO
+        # this sometimes fails with weird bind mount combinations
+        # under apptainer so just save the output in a variable so
+        # it can be seen with debugging enabled
+        MSG="\$(mount --rbind \$TO $TMPD/overlay\$TO 2>&1)"
+    fi
+done
+
+# Also bind /tmp and /var/tmp so files created there go into the system
+# directories instead of into a ram disk
+for D in /tmp /var/tmp; do
+    mount --rbind \$D $TMPD/overlay\$D
+done
+
+# Start a second fake root namespace so we don't interfere with the
+# fuse-overlayfs mount space when we do the pivot_root.
+# Quoting the HERE document's delimeter makes this nested shell not
+# interpret $ substitutions, but the previous one still does so
+# need to use \$ when don't want first shell to expand.
+unshare -rm $UNSHAREOPTS /bin/bash /dev/stdin "\${@:-$SHELL}" <<'!EOF-2!'
+#set -x
+#PS4='c\$$+ '
+
+mkfifo $STARTFIFO $WAITFIFO
+
+(
+    # This is a background process for setting up the child's uid map
+    trap "" 1 2 3 15 # ignore ordinary signals
+    read PID
+    # set up uid/gid map
+    echo "$ORIGGID 0 1" >/proc/"\$PID"/gid_map
+    echo "$ORIGUID 0 1" >/proc/"\$PID"/uid_map
+    echo "ready" >$WAITFIFO
+) <$STARTFIFO &
+
+# Change to the new root. Would use chroot but it doesn't work.
+mount --rbind $TMPD/overlay $TMPD/overlay # pivot_root requires this
+cd $TMPD/overlay
+mkdir -p .old-root
+pivot_root . .old-root
+umount -l .old-root 2>/dev/null
+rmdir .old-root
+cd /
+
+# Finally, start the user namespace with the original uid/gid
+# This HERE document is also quoted and so the shell does not expand
+exec unshare -U $UNSHAREOPTS /bin/bash /dev/stdin "\${@:-$SHELL}" <<'!EOF-3!'
+#set -x
+#PS4='c\$$+ '
+
+# now in the user namespace
+
+cd $ORIGPWD
+
+echo "\$$" >$STARTFIFO
+# wait for the uid/gid maps to be set up
+read X <$WAITFIFO
+
+exec "\$@" <&$STDINCOPYFD $STDINCOPYFD<&-
+!EOF-3!
+
+!EOF-2!
+
+!EOF-1!
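
The comments in the script describe which shell expands `$` inside the nested HERE documents. A minimal sketch of that rule, independent of namespaces, is given here. The names `OUTER`, `EOF_1`, `EOF_2`, and `nested_heredoc_demo` are illustrative only and do not appear in `bindexec`, and the sketch uses `bash -s` where the script passes `/dev/stdin` as the script file.

```shell
#!/bin/bash
# Sketch only: shows which shell expands what in nested here-documents.
nested_heredoc_demo() {
    OUTER="from-outer"   # set only in the outermost shell, not exported
    # Unquoted delimiter: the outermost shell expands $OUTER everywhere in
    # the body, including inside the nested quoted here-document, because
    # to the outermost shell the whole body is just text.
    bash -s -- inner-arg <<EOF_1
echo "1: $OUTER"
echo "2: \$1"
bash -s -- "\$1" <<'EOF_2'
echo "3: $OUTER"
echo "4: \$1"
EOF_2
EOF_1
}

nested_heredoc_demo
```

Lines 1 and 3 print the outer variable even though it is never exported, which shows that the outermost shell performed the substitution before either child shell ran. Lines 2 and 4 print the positional argument of the shell that finally executes them, because `\$` reached that shell as a plain `$`.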

cvmfsexec

Lines changed: 10 additions & 10 deletions
@@ -39,13 +39,13 @@ elif [ "$MAJORKERN" -eq 3 -a "$MINORKERN" -eq 10 -a "$REVKERN" -ge 1127 ]; then
 USERFUSE=true
 fi
 
-TMPDIR=$(mktemp -d)
-trap "rm -rf $TMPDIR" 0 # note that trap does not carry past exec
-CMDFIFO1=$TMPDIR/cmd1
-WAITFIFO1=$TMPDIR/wait1
-CMDFIFO2=$TMPDIR/cmd2
-WAITFIFO2=$TMPDIR/wait2
-FUNCS=$TMPDIR/funcs
+TMPD=$(mktemp -d)
+trap "rm -rf $TMPD" 0 # note that trap does not carry past exec
+CMDFIFO1=$TMPD/cmd1
+WAITFIFO1=$TMPD/wait1
+CMDFIFO2=$TMPD/cmd2
+WAITFIFO2=$TMPD/wait2
+FUNCS=$TMPD/funcs
 
 # create the fifos used for interprocess communication
 mkfifo $CMDFIFO1 $WAITFIFO1 $CMDFIFO2 $WAITFIFO2
@@ -238,7 +238,7 @@ else
 fi
 ./umountrepo $REPO >/dev/null
 done
-rm -rf $TMPDIR
+rm -rf $TMPD
 ) &
 fi
 
@@ -252,7 +252,7 @@ unshare -rm $UNSHAREOPTS /bin/bash /dev/stdin "${@:-$SHELL}" <<!EOF-1!
 #set -x
 #PS4='c\$$+ '
 # now in the "fakeroot" namespace
-trap "rm -rf $TMPDIR" 0 # note that this does not carry through "exec"
+trap "rm -rf $TMPD" 0 # note that this does not carry through "exec"
 
 mkdir -p $HERE/mnt
 mount --rbind $HERE/mnt $HERE/mnt # pivot_root requires this mountpoint
@@ -411,7 +411,7 @@ unshare -rm $UNSHAREOPTS /bin/bash /dev/stdin "${@:-$SHELL}" <<!EOF-1!
 # processes in the namespaces will get a SIGKILL when
 # PID 1 exits.
 EXEC=""
-trap "rm -rf $TMPDIR" 0
+trap "rm -rf $TMPD" 0
 trap "" 1 2 3 15 # ignore all ordinary signals
 else
 EXEC=exec
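
The ChangeLog entry gives the reason for this rename: `TMPDIR` may already be set and exported. A minimal sketch of the effect follows; the paths and the function names `reuse_tmpdir` and `use_tmpd` are made up for illustration and are not part of `cvmfsexec`.

```shell
#!/bin/sh
# Sketch only: illustrative paths, no directories are created.

# Reusing the name TMPDIR: assigning to an already-exported variable keeps
# it exported, so child processes see the script's scratch path.
reuse_tmpdir() (
    export TMPDIR=/var/tmp/user      # pretend the caller exported this
    TMPDIR=/tmp/scratch.example      # script-internal assignment
    sh -c 'echo "$TMPDIR"'
)

# Using a private name such as TMPD leaves the exported value untouched.
use_tmpd() (
    export TMPDIR=/var/tmp/user
    TMPD=/tmp/scratch.example
    sh -c 'echo "$TMPDIR"'
)

reuse_tmpdir   # prints /tmp/scratch.example
use_tmpd       # prints /var/tmp/user
```

Each function body runs in a subshell, so the `export` in the sketch does not leak into the calling shell.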
