Commit 133e2d3

avagin authored and kees committed
fs/exec: allow to unshare a time namespace on vfork+exec
Right now, a new process can't be forked in another time namespace if it shares mm with its parent. It is prohibited, because each time namespace has its own vvar page that is mapped into a process address space.

When a process calls exec, it gets a new mm and so it could be "legal" to switch time namespace in that case. This was not implemented and now if we want to do this, we need to add another clone flag to not break backward compatibility.

We don't have any user requests to switch times on exec except the vfork+exec combination, so there is no reason to add a new clone flag. As for vfork+exec, this should be safe to allow switching timens with the current clone flag. Right now, vfork (CLONE_VFORK | CLONE_VM) fails if a child is forked into another time namespace. With this change, vfork creates a new process in parent's timens, and the following exec does the actual switch to the target time namespace.

Suggested-by: Florian Weimer <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
Acked-by: Christian Brauner (Microsoft) <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
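The vfork+exec scenario described in the message can be sketched from userspace roughly as follows. This is an illustration only, not part of the commit: the helper name `try_vfork_exec_in_new_timens` and the use of `/bin/true` as the exec target are assumptions, and `unshare(CLONE_NEWTIME)` requires CAP_SYS_ADMIN and a kernel built with time namespace support. Per the message, on a kernel without this change the `vfork()` call is expected to fail with EINVAL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_NEWTIME
#define CLONE_NEWTIME 0x00000080 /* value from <linux/sched.h> */
#endif

/*
 * Hypothetical helper (not from the commit): returns 0 if the vfork
 * child exec'd and exited with status 0, otherwise a negative errno.
 */
int try_vfork_exec_in_new_timens(void)
{
	pid_t pid;
	int status;

	/* From now on, children of this task target a new time namespace. */
	if (unshare(CLONE_NEWTIME) < 0)
		return -errno;

	/* vfork is the CLONE_VM | CLONE_VFORK case named in the message. */
	pid = vfork();
	if (pid < 0)
		return -errno;

	if (pid == 0) {
		/* Only exec and _exit are safe in a vfork child. */
		execl("/bin/true", "true", (char *)NULL);
		_exit(127);
	}

	if (waitpid(pid, &status, 0) < 0)
		return -errno;

	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return 0;

	/* Arbitrary choice for this sketch: report a failed child as ECHILD. */
	return -ECHILD;
}
```

Calling the helper from an unprivileged process should simply return `-EPERM` from the `unshare()` step, so it can be tried without special setup.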
1 parent b13bacc commit 133e2d3

3 files changed, 13 additions and 2 deletions

fs/exec.c

Lines changed: 7 additions & 0 deletions
@@ -65,6 +65,7 @@
 #include <linux/io_uring.h>
 #include <linux/syscall_user_dispatch.h>
 #include <linux/coredump.h>
+#include <linux/time_namespace.h>
 
 #include <linux/uaccess.h>
 #include <asm/mmu_context.h>
@@ -982,10 +983,12 @@ static int exec_mmap(struct mm_struct *mm)
 {
 	struct task_struct *tsk;
 	struct mm_struct *old_mm, *active_mm;
+	bool vfork;
 	int ret;
 
 	/* Notify parent that we're no longer interested in the old VM */
 	tsk = current;
+	vfork = !!tsk->vfork_done;
 	old_mm = current->mm;
 	exec_mm_release(tsk, old_mm);
 	if (old_mm)
@@ -1030,6 +1033,10 @@ static int exec_mmap(struct mm_struct *mm)
 	tsk->mm->vmacache_seqnum = 0;
 	vmacache_flush(tsk);
 	task_unlock(tsk);
+
+	if (vfork)
+		timens_on_fork(tsk->nsproxy, tsk);
+
 	if (old_mm) {
 		mmap_read_unlock(old_mm);
 		BUG_ON(active_mm != old_mm);

kernel/fork.c

Lines changed: 4 additions & 1 deletion
@@ -2033,8 +2033,11 @@ static __latent_entropy struct task_struct *copy_process(
 	/*
 	 * If the new process will be in a different time namespace
 	 * do not allow it to share VM or a thread group with the forking task.
+	 *
+	 * On vfork, the child process enters the target time namespace only
+	 * after exec.
 	 */
-	if (clone_flags & (CLONE_THREAD | CLONE_VM)) {
+	if ((clone_flags & (CLONE_VM | CLONE_VFORK)) == CLONE_VM) {
 		if (nsp->time_ns != nsp->time_ns_for_children)
 			return ERR_PTR(-EINVAL);
 	}
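
To make the effect of the changed condition in copy_process() easier to see, here is a small userspace model of the old and new checks. It is a sketch, not kernel code: the function names and the `DEMO_` prefixed constants are made up for illustration (the values mirror `CLONE_VM`, `CLONE_VFORK` and `CLONE_THREAD` from `<linux/sched.h>`), and `timens_differs` stands in for `nsp->time_ns != nsp->time_ns_for_children`.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror <linux/sched.h>; the DEMO_ prefix avoids name clashes. */
#define DEMO_CLONE_VM     0x00000100UL
#define DEMO_CLONE_VFORK  0x00004000UL
#define DEMO_CLONE_THREAD 0x00010000UL

/* Pre-patch condition: any clone sharing VM or a thread group is rejected. */
bool old_check_rejects(unsigned long clone_flags, bool timens_differs)
{
	return (clone_flags & (DEMO_CLONE_THREAD | DEMO_CLONE_VM)) &&
	       timens_differs;
}

/* Post-patch condition: CLONE_VM is rejected only when CLONE_VFORK is absent. */
bool new_check_rejects(unsigned long clone_flags, bool timens_differs)
{
	return (clone_flags & (DEMO_CLONE_VM | DEMO_CLONE_VFORK)) == DEMO_CLONE_VM &&
	       timens_differs;
}

/* Walks through the cases of interest; every assertion is expected to hold. */
void compare_checks(void)
{
	const unsigned long vfork_flags = DEMO_CLONE_VM | DEMO_CLONE_VFORK;
	const unsigned long thread_flags = DEMO_CLONE_VM | DEMO_CLONE_THREAD;

	/* vfork into another timens: rejected before, allowed now. */
	assert(old_check_rejects(vfork_flags, true));
	assert(!new_check_rejects(vfork_flags, true));

	/* A thread sharing VM is still rejected. */
	assert(old_check_rejects(thread_flags, true));
	assert(new_check_rejects(thread_flags, true));

	/* A plain fork (no CLONE_VM) was and remains allowed. */
	assert(!old_check_rejects(0, true));
	assert(!new_check_rejects(0, true));

	/* Nothing is rejected when the namespaces already match. */
	assert(!new_check_rejects(DEMO_CLONE_VM, false));
}
```

Dropping `CLONE_THREAD` from the mask does not loosen the check for ordinary threads, since copy_process() already requires `CLONE_SIGHAND` with `CLONE_THREAD` and `CLONE_VM` with `CLONE_SIGHAND`.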

kernel/nsproxy.c

Lines changed: 2 additions & 1 deletion
@@ -179,7 +179,8 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
 	if (IS_ERR(new_ns))
 		return PTR_ERR(new_ns);
 
-	timens_on_fork(new_ns, tsk);
+	if ((flags & CLONE_VM) == 0)
+		timens_on_fork(new_ns, tsk);
 
 	tsk->nsproxy = new_ns;
 	return 0;
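
Taken together, the three hunks decide where a child whose target time namespace differs from its parent's actually switches. The following userspace model summarizes that reading of the diff; the enum, the function name and the `DEMO_` constants are hypothetical, and the model assumes `time_ns != time_ns_for_children` for the forking task.

```c
#include <assert.h>

/* Values mirror <linux/sched.h>; the DEMO_ prefix avoids name clashes. */
#define DEMO_CLONE_VM    0x00000100UL
#define DEMO_CLONE_VFORK 0x00004000UL

enum timens_switch {
	SWITCH_AT_FORK, /* copy_namespaces() calls timens_on_fork() */
	SWITCH_AT_EXEC, /* exec_mmap() calls timens_on_fork() */
	CLONE_REJECTED, /* copy_process() returns -EINVAL */
};

/* Hypothetical summary of the post-patch behaviour for a given flag set. */
enum timens_switch timens_switch_point(unsigned long clone_flags)
{
	/* Separate mm: the switch can happen right away at fork time. */
	if ((clone_flags & DEMO_CLONE_VM) == 0)
		return SWITCH_AT_FORK;

	/* Shared mm via vfork: deferred until exec installs a new mm. */
	if (clone_flags & DEMO_CLONE_VFORK)
		return SWITCH_AT_EXEC;

	/* Shared mm without vfork: not allowed in another time namespace. */
	return CLONE_REJECTED;
}
```

In this model a vfork child that exits without calling exec never reaches the switch and stays in the parent's time namespace, which matches the comment added to copy_process().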
