Double indirection for userns=auto #26851
Replies: 4 comments 1 reply
-
Partial answer to my own question: this is only safe for rootful containers, because otherwise there is a risk of UID impersonation. My rootless user may have UID 1000, but it does not own the next 64k UIDs beyond that. If the container needs any of those, it needs the current system. However, my question stands for the rootful case: if the container grabs ~1024 users starting from 0 inside, and we want privilege isolation at the kernel level but want the IDs smashed down to real local IDs at the filesystem level, what prevents the kernel from doing that, other than the lack of code to make it so?
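To make the impersonation risk above concrete, here is a minimal Python sketch of the kind of check a rootless setup has to pass. It assumes the usual `name:start:count` layout of `/etc/subuid`; the user name `alice`, the sample entry, and the helper names `parse_subuid` and `may_map` are hypothetical and only serve the illustration. It is a simplification, not the actual logic of any tool.

```python
from typing import List, Tuple


def parse_subuid(text: str, user: str) -> List[Tuple[int, int]]:
    """Return the (start, count) ranges delegated to `user` in /etc/subuid-style text."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, start, count = line.split(":")
        if name == user:
            ranges.append((int(start), int(count)))
    return ranges


def may_map(host_uid: int, own_uid: int, ranges: List[Tuple[int, int]]) -> bool:
    """A rootless user may map only its own UID or a UID inside a delegated range."""
    if host_uid == own_uid:
        return True
    return any(start <= host_uid < start + count for start, count in ranges)


# Hypothetical /etc/subuid content for a user with UID 1000.
SUBUID = "alice:100000:65536\n"
ranges = parse_subuid(SUBUID, "alice")

print(may_map(1000, 1000, ranges))    # own UID
print(may_map(1001, 1000, ranges))    # the neighbouring "real" UID
print(may_map(100005, 1000, ranges))  # inside the delegated subordinate range
```

Under these assumptions, UID 1001 is rejected: it belongs to some other real account, which is exactly why squashing a rootless container's IDs onto "the next few UIDs after mine" would amount to impersonation.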
-
This is called idmapped mounts, and it has been supported for a while now.
-
Aha! I do believe this is a reasonable test case of what you're telling me:
Without the I have tested this with overlapping instances to force multiple Thank you!
-
Sorry for the follow-up, but I actually have no clue what this error means:
No matter what size you're giving in
-
One of the problems with `--userns=auto` comes when you use it with a container that makes use of multiple IDs within it, preventing the `--volume …:U` flag from doing the right thing.

Another is that it makes debugging difficult on the host side: when looking at `ls -l` output on a mapped-in volume, you have to map high-valued UIDs back down to the actual ones outside.

I have no reason to suppose what I'm about to propose exists in any form, but I felt I needed to put it out there into the world regardless. It is this: what precludes having a double-indirection system around sub[ug]id ranges such that `$UID-$UID+64k` maps to 100000+ inside the kernel, but this then gets reverse-mapped to regular user IDs before files hit persistent storage?

If possible, it would mean `:U` isn't necessary. If the kernel gives a different UID range for a given container instance than the prior run, it simply goes through a different mapping, but the UID/GIDs on disk remain compatible.
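The double indirection proposed above can be sketched as plain arithmetic. The following Python toy model is not kernel or Podman code: it borrows the `(inside, outside, count)` extent layout of `/proc/<pid>/uid_map`, and the helper names `translate`, `mount_map_for`, and `on_disk_uid`, as well as the two example ranges, are assumptions made for the illustration.

```python
from typing import List, Tuple

# (inside_start, outside_start, count), the same column order as /proc/<pid>/uid_map
Extent = Tuple[int, int, int]


def translate(uid: int, extents: List[Extent]) -> int:
    """Map `uid` through the first extent that covers it."""
    for inside, outside, count in extents:
        if inside <= uid < inside + count:
            return outside + (uid - inside)
    raise ValueError(f"UID {uid} is not mapped")


def mount_map_for(userns_map: List[Extent], disk_base: int = 0) -> List[Extent]:
    """Second hop: send each kernel-side ID back to a stable on-disk ID."""
    return [(outside, disk_base + inside, count) for inside, outside, count in userns_map]


def on_disk_uid(container_uid: int, userns_map: List[Extent]) -> int:
    """Container ID -> kernel ID (isolation) -> on-disk ID (storage)."""
    kernel_uid = translate(container_uid, userns_map)
    return translate(kernel_uid, mount_map_for(userns_map))


# Two runs of the same container receive different (hypothetical) automatic ranges.
run1 = [(0, 100000, 1024)]
run2 = [(0, 231072, 1024)]

for name, userns_map in (("run1", run1), ("run2", run2)):
    print(name, "kernel:", translate(33, userns_map), "disk:", on_disk_uid(33, userns_map))
```

In this model, container UID 33 is isolated as a different kernel-side ID on each run, yet lands on the same on-disk ID both times, so nothing on the volume needs to be re-owned between runs.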