Commit 2a976a5

add bindexec
1 parent 672ceed commit 2a976a5

4 files changed: +224 -10 lines changed

ChangeLog

Lines changed: 3 additions & 0 deletions
@@ -1,5 +1,8 @@
+- Add a bindexec command.
 - Add the variable SINGCVMFS_LOGDIR to override the location of the
   cvmfs logs.
+- Stop using $TMPDIR as a temporary variable name in cvmfsexec because it
+  might be already set and exported.
 
 cvmfsexec-4.42 - 24 September 2024
 - Add rhel9-aarch64 and rhel9-ppc64le machine types.

README.md

Lines changed: 41 additions & 0 deletions
@@ -42,6 +42,10 @@ do this in 4 different ways:
 unprivileged user namespaces enabled,
 this can also be used with unprivileged singularity or apptainer.
 
+In addition, this package contains a related tool called
+[bindexec](#bindexec) which starts a new user namespace with the given
+bind mounts added.
+
 # Supported operating systems
 
 Operating systems currently supported by this package are Red Hat
@@ -356,3 +360,40 @@ $ mkfs.ext3 -F -O ^has_journal -d tmp scratch.img
 By default the cvmfs logs are written to a top-level `log` directory, alongside
 the top-level `dist` directory. The variable `SINGCVMFS_LOGDIR` can be used to
 write them to a different directory, which will be created if it doesn't exist.
+
+# bindexec
+
+As a bonus, this package also includes a separate tool called `bindexec`
+that accepts any set of bind mounts to add into a new unprivileged user
+mount namespace. The usage is much like `cvmfsexec` except that instead
+of cvmfs repository names you give it `src:dest` pairs where `src` is a
+source directory or file and `dest` is a destination path. For example:
+
+```
+$ bindexec /etc/motd:/var/lib/mydir/motd -- ls /var/lib/mydir
+motd
+```
+
+Like `cvmfsexec`, if no command is supplied after `--` it runs an
+interactive shell.
+
+Bind mounts require target destinations to exist, but if they are
+missing `bindexec` will automatically create them. This requires the
+`fuse-overlayfs` command to be in the PATH. If there is demand for it,
+a script for making that easily distributable as well will be
+supplied (probably through a `makedist` option).
+
+Some system directories (`/proc`, `/sys`, `/dev`, and `/run`) are
+included as-is on top of the overlay, so anything bound into those
+directories will not appear. In addition, any mounted `nfs` filesystems
+are automatically added on top of the overlay because they don't work
+properly through overlay, so no bind mounts will appear in those paths
+either.
+
+`bindexec` always creates a new process namespace because that's the
+easiest way to make sure that the fuse-overlayfs process will exit when
+the command exits. This means that process IDs start over at 1 and
+processes outside of the namespace cannot be seen. Also, because it uses
+an unprivileged user namespace, any files owned by anyone other than the
+current user will show up as being owned by `nobody` (just as they do in
+`cvmfsexec`).
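
The README text above says `bindexec` needs `fuse-overlayfs` in the PATH, and the script header also mentions `unshare`. A minimal sketch of checking for both up front, assuming a hypothetical helper named `missing_tools` that is not part of this commit:

```shell
# Hypothetical helper, not part of bindexec: print the name of each
# requested command that cannot be found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands that the bindexec script and README name as requirements.
# Prints nothing when both are available.
missing_tools unshare fuse-overlayfs
```

An empty result only shows that the commands exist. It does not show that unprivileged user namespaces or fuse mounts are actually permitted on the host.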

bindexec

Lines changed: 170 additions & 0 deletions
@@ -0,0 +1,170 @@
+#!/bin/bash
+# Add bind mounts in a user namespace and change to that space.
+# Requires being able to run unshare -rm and the ability to do fuse mounts
+# (kernel >= 4.18) and requires fuse-overlayfs.
+# Written by Dave Dykstra November 2024, based heavily on cvmfsexec.
+
+#set -x
+#PS4='c$$+ '
+
+VERSION=4.42
+
+usage()
+{
+    echo "Usage: bindexec [-v] [src:dest ...] -- [command]" >&2
+    echo "       Bind mount each src to dest in a new user mount namespace" >&2
+    echo "  -v: print current version and exit" >&2
+    exit 1
+}
+
+# needed for pivot_root
+PATH=$PATH:/usr/sbin
+
+TMPD="$(mktemp -d /dev/shm/bindexec.XXXXXXXXXX)"
+trap "rm -rf $TMPD" 0 # note that trap does not carry past exec
+STARTFIFO=$TMPD/start
+WAITFIFO=$TMPD/wait
+mkfifo $STARTFIFO $WAITFIFO
+
+# bash syntax {NAME}<&N doesn't work on older bashes such as the
+# version 3.2.x on macOS Big Sur, and in fact it fails with an error
+# message but not an error code, so test for it first to be able to
+# gracefully die
+
+if [ -n "$({TESTX}<&0 2>&1)" ]; then
+    echo "Cannot assign file descriptors to variables, bash version too old" >&2
+    exit 1
+fi
+
+# make a copy of stdin fd, for sending to the final command
+exec {STDINCOPYFD}<&0
+
+ORIGPWD=$PWD
+
+# can't use OPTIND because it can't distinguish between -- there or missing
+NOPTS=0
+while getopts "v" OPTION; do
+    let NOPTS+=1
+    case $OPTION in
+        v)  echo "$VERSION"
+            exit
+            ;;
+        \?) usage
+            ;;
+    esac
+done
+shift $NOPTS
+
+BINDS=""
+for ARG; do
+    if [ "$ARG" == "--" ]; then
+        break
+    fi
+    if [[ "$ARG" != *:* ]]; then
+        usage
+    fi
+    BINDS="$BINDS $ARG"
+    shift
+done
+
+if [ "$ARG" != "--" ]; then
+    usage
+fi
+shift
+
+ORIGUID="$(id -u)"
+ORIGGID="$(id -g)"
+
+# Note that within the HERE document, unprotected $ substitutions are
+# done by the surrounding shell, and \$ is within the unshare shell
+unshare -rm -pf /bin/bash /dev/stdin "${@:-$SHELL}" <<!EOF-1!
+#set -x
+#PS4='c\$$+ '
+
+# now in the first "fake root" namespace
+mount -t proc proc /proc
+mkdir -p $TMPD/upper $TMPD/work $TMPD/overlay
+
+# put the bind mounts into the upper dir
+for BIND in $BINDS; do
+    SRC="\${BIND%:*}"
+    DST="\${BIND#*:}"
+    if [ -d "\$SRC" ]; then
+        mkdir -p $TMPD/upper/\$DST
+    elif [ -f "\$SRC" ]; then
+        DSTDIR="\${DST%/*}"
+        if [ "\$DST" != "\$DSTDIR" ]; then
+            mkdir -p $TMPD/upper/\$DSTDIR
+        fi
+        touch $TMPD/upper/\$DST
+    else
+        echo "bindexec: \$SRC not found, skipping" >&2; continue
+    fi
+    mount --bind \$SRC $TMPD/upper/\$DST
+done
+
+# Leave this bash running as PID 1, because most other
+# programs won't handle signals & child reaping correctly.
+# Note that all other processes in the namespaces will get
+# a SIGKILL when PID 1 exits.
+trap "" 1 2 3 15 # ignore all ordinary signals
+
+fuse-overlayfs -o lowerdir=/,upperdir=$TMPD/upper,workdir=$TMPD/work $TMPD/overlay 2> >(grep -v lazytime >&2)
+# put original system dirs on top of the overlay
+mount -t proc proc $TMPD/overlay/proc
+mount --rbind /sys $TMPD/overlay/sys
+mount --rbind /dev $TMPD/overlay/dev
+mount --rbind /run $TMPD/overlay/run
+
+# also overlay nfs mounts because they don't work through overlay
+mount|while read FROM X TO X TYPE REST; do
+    if [[ \$TYPE = nfs* ]]; then
+        mkdir -p $TMPD/overlay/\$TO
+        mount --bind \$TO $TMPD/overlay/\$TO
+    fi
+done
+
+# Start a second fake root namespace so we don't interfere with the
+# fuse-overlayfs mount space when we do the pivot_root.
+# Quoting the HERE document's delimiter makes this nested shell not
+# interpret $ substitutions, but the previous one still does, so
+# use \$ where the first shell should not expand.
+unshare -rm /bin/bash /dev/stdin "\${@:-$SHELL}" <<'!EOF-2!'
+#set -x
+#PS4='c\$$+ '
+
+(
+    # This is a background process for setting up the child's uid map
+    trap "" 1 2 3 15 # ignore ordinary signals
+    read PID
+    # set up uid/gid map
+    echo "$ORIGGID 0 1" >/proc/"\$PID"/gid_map
+    echo "$ORIGUID 0 1" >/proc/"\$PID"/uid_map
+    echo "ready" >$WAITFIFO
+) <$STARTFIFO &
+
+# Change to the new root.  Would use chroot but it doesn't work.
+mount --rbind $TMPD/overlay $TMPD/overlay # pivot_root requires this
+cd $TMPD/overlay
+mkdir -p .old-root
+pivot_root . .old-root
+cd $ORIGPWD
+
+# Finally, start the user namespace with the original uid/gid
+# This HERE document is also quoted and so the shell does not expand
+exec unshare -U /bin/bash /dev/stdin "\${@:-$SHELL}" <<'!EOF-3!'
+#set -x
+#PS4='c\$$+ '
+
+# now in the user namespace
+
+echo "\$$" >$STARTFIFO
+# wait for the uid/gid maps to be set up
+read X <$WAITFIFO
+
+exec "\$@" <&$STDINCOPYFD $STDINCOPYFD<&-
+!EOF-3!
+
+!EOF-2!
+
+!EOF-1!
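
The script above splits each `src:dest` argument with the parameter expansions `${BIND%:*}` and `${BIND#*:}`. A minimal sketch of how they behave in isolation, where the function names `bind_src` and `bind_dst` are hypothetical and exist only for illustration:

```shell
# Split a src:dest argument with the same expansions the bindexec loop uses.
bind_src() { printf '%s\n' "${1%:*}"; }   # remove the shortest suffix matching ":*"
bind_dst() { printf '%s\n' "${1#*:}"; }   # remove the shortest prefix matching "*:"

bind_src /etc/motd:/var/lib/mydir/motd    # prints /etc/motd
bind_dst /etc/motd:/var/lib/mydir/motd    # prints /var/lib/mydir/motd
```

Because both expansions remove the shortest match, an argument containing more than one colon, such as `a:b:c`, yields `a:b` as the source and `b:c` as the destination. The sketch therefore suggests that paths containing colons are best avoided.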

cvmfsexec

Lines changed: 10 additions & 10 deletions
@@ -39,13 +39,13 @@ elif [ "$MAJORKERN" -eq 3 -a "$MINORKERN" -eq 10 -a "$REVKERN" -ge 1127 ]; then
     USERFUSE=true
 fi
 
-TMPDIR=$(mktemp -d)
-trap "rm -rf $TMPDIR" 0 # note that trap does not carry past exec
-CMDFIFO1=$TMPDIR/cmd1
-WAITFIFO1=$TMPDIR/wait1
-CMDFIFO2=$TMPDIR/cmd2
-WAITFIFO2=$TMPDIR/wait2
-FUNCS=$TMPDIR/funcs
+TMPD=$(mktemp -d)
+trap "rm -rf $TMPD" 0 # note that trap does not carry past exec
+CMDFIFO1=$TMPD/cmd1
+WAITFIFO1=$TMPD/wait1
+CMDFIFO2=$TMPD/cmd2
+WAITFIFO2=$TMPD/wait2
+FUNCS=$TMPD/funcs
 
 # create the fifos used for interprocess communication
 mkfifo $CMDFIFO1 $WAITFIFO1 $CMDFIFO2 $WAITFIFO2
@@ -238,7 +238,7 @@ else
             fi
             ./umountrepo $REPO >/dev/null
         done
-        rm -rf $TMPDIR
+        rm -rf $TMPD
     ) &
 fi
 
@@ -252,7 +252,7 @@ unshare -rm $UNSHAREOPTS /bin/bash /dev/stdin "${@:-$SHELL}" <<!EOF-1!
 #set -x
 #PS4='c\$$+ '
 # now in the "fakeroot" namespace
-trap "rm -rf $TMPDIR" 0 # note that this does not carry through "exec"
+trap "rm -rf $TMPD" 0 # note that this does not carry through "exec"
 
 mkdir -p $HERE/mnt
 mount --rbind $HERE/mnt $HERE/mnt # pivot_root requires this mountpoint
@@ -411,7 +411,7 @@ unshare -rm $UNSHAREOPTS /bin/bash /dev/stdin "${@:-$SHELL}" <<!EOF-1!
     # processes in the namespaces will get a SIGKILL when
     # PID 1 exits.
     EXEC=""
-    trap "rm -rf $TMPDIR" 0
+    trap "rm -rf $TMPD" 0
     trap "" 1 2 3 15 # ignore all ordinary signals
 else
     EXEC=exec
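
The ChangeLog gives the reason for the `TMPDIR` to `TMPD` rename: `TMPDIR` might already be set and exported. A minimal sketch of the shell behavior behind that concern, using a hypothetical function `child_sees` and a made-up path `/tmp/demo.private`, neither of which comes from the commit:

```shell
# Illustration only: assigning to a variable that is already exported
# changes the value every child process inherits.
child_sees() {
    # Run in a subshell so the demonstration leaves the caller's environment alone.
    (
        export TMPDIR=/tmp          # stands in for a value exported by the caller
        unset TMPD                  # make sure the new name starts out unset
        TMPDIR=/tmp/demo.private    # reusing the exported name (old behavior)
        TMPD=/tmp/demo.private      # private, unexported name (new behavior)
        sh -c 'echo "TMPDIR=$TMPDIR TMPD=${TMPD:-unset}"'
    )
}

child_sees    # prints: TMPDIR=/tmp/demo.private TMPD=unset
```

One likely consequence, offered here as an assumption and not taken from the commit, is that commands started by the old `cvmfsexec` would have placed their own temporary files in a directory that the script removes on exit.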
