-
Notifications
You must be signed in to change notification settings - Fork 364
Optionally disallow the use of 'root' user for Processes and Tasks of docker lifecycle Apps #4452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Optionally disallow the use of 'root' user for Processes and Tasks of docker lifecycle Apps #4452
Conversation
Edit: I still had to click sign again. But it seems to be working now. |
7c07dd0
to
c5fcbde
Compare
@acosta11 you will also need to open a PR to add this new property to https://github.com/cloudfoundry/capi-release |
Draft PR: cloudfoundry/capi-release#561 Still need to test this end to end with the bosh release config. |
app/models/runtime/droplet_model.rb
Outdated
@@ -138,7 +138,7 @@ def docker_user | |||
end | |||
end | |||
|
|||
container_user.presence || AppModel::DEFAULT_DOCKER_CONTAINER_USER | |||
container_user.presence || (Config.config.get(:allow_process_root_user) ? AppModel::DEFAULT_DOCKER_CONTAINER_USER : AppModel::DEFAULT_CONTAINER_USER) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is semantically a bit confusing, since it means in some cases the docker_user
default is not the DEFAULT_DOCKER_USER
, but is instead the DEFAULT_CONTAINER_USER
. Also we have to duplicate the same ternary in both droplet and process model.
Maybe instead (roughly):
- Encapsulate the ternary in a method (on AppModel?)
default_docker_user
DEFAULT_DOCKER_CONTAINER_USER
goes away (since it's only used in one place)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I could see trying to move this up higher in the hierarchy in general to avoid duplication. I'll give it a try and see how it looks.
I haven't validated this myself yet, but I don't think this PR covers the case where the |
Indeed it's not exercised through that complete flow. I can look at adding a test to explicitly configure the user to root in the docker metadata. I think the setup that I copied was only setting image information so would have to update the call to the factory. |
There is some complexity there, since the user specified in the Dockerfile ->
I think we need to check the user-that-will-be-used when actually starting to run the process/task (but, I'm open to other ideas). I'm fine if you want to break that into a separate PR, to keep things more manageable with this one. |
c5fcbde
to
ce9433b
Compare
Initial pass on the runtime check for 'root' user leverages the desired_lrp and task_action builder layer to avoid disrupting staging. Talking with Greg, I also considered going into all the relevant action implementations to start the error message a bit earlier in the call chain. That said, the builder level check ended up fairly succinct and easy to test, so I'm slightly favoring this implementation to avoid missing something in the action layer with a larger set of required changes. Curious if it looks reasonable to y'all. |
d7fd389
to
a3f85ca
Compare
Wrapped up end to end testing, added some error handling from an issue detected as part of that process, and rebased on latest main with the task handler and ccng config updates. Should be good for review now. See updated PR description for nsync edge case that I also validated in the e2e env. If anything, taking one last look at our handling of the |
a3f85ca
to
f31d46d
Compare
ef38937
to
824088f
Compare
spec/unit/lib/cloud_controller/diego/docker/lifecycle_protocol_spec.rb
Outdated
Show resolved
Hide resolved
TestConfig.override(allow_docker_root_user: false) | ||
end | ||
|
||
context 'and the process does not set a user' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also want to block the case where:
allowed_users
includesroot
- process/task user is set to
root
allow_docker_root_user
is set tofalse
This case that may never happen in the real world, but I think that's the correct behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need equivalent logic for surfacing the error when running tasks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, I'm not familiar with task execution error cases and only found this particular layer with some guidance from Tim. Is it a problem for a new execution of a task to fail after the cutover to blocking the root user? I assume that a previously running task isn't an availability concern the same way an already deployed running app would be.
* Ignore user in droplet docker execution metadata via dockerfile for staging because user may be subsequently overridden on Process or Task model * Enforce that 'root' or '0' user is not used at runtime by task_action and desired_lrp builders * Re-raise the error in desire app handler. If not re-raised, error is supressed with no clear user feedback.
824088f
to
188d860
Compare
Rebased on main again and ultimately removed all of the model layer changes in favor of the user allow list. Added more explicit tests at the diego builder layer for the cases where user is '0' or absent in the Dockerfile to round out the ways by which a root user may be specified. |
Proposed Change
Add a configuration option to allow or disallow the use of the 'root' and '0' user for processes and tasks of docker lifecycle apps.
Use 'vcap' as the default user in the case that the 'root' user is disallowed.Edit: opting to leave the user as is given docker images need to add the desired user anyway, which makes a non-root default less clear.Use Case
Operators can secure their runtime infrastructure by preventing the possibility of cascading user privilege escalation to the root user of the underlying os namespace construct realizing the desired Process.
Dev Notes
Given the scope of docker lifecycle apps, it seems like the validation logic for this feature has to live in the policy layer. If implemented in the message layer, I think it may have the unintended effect of allowing the root user in the net new context of buildpack lifecycle apps when configured to match the current default of allowed for docker apps (assuming the flag exists at this level of coarseness).Edit: the user allow list functionality in the policy layer works to this end, but we still need to ensure docker apps that set the user in the Dockerfile are appropriately handled.Confirmed in manual end to end testing that disallowing 'root' user on the cloud_controller_clock will cause an error in the nsync process for apps with the root user that were previously valid. Triggered a new sync by manually deleting the lrp on the diego cell. The error does not get surfaced to the api/cli. Instead, a user would have to go into the bosh logs and see the attempted sync on the scheduler job. Note that allowing the 'root' user on the clock job avoids this error, but still prevents new process/task updates from using the 'root' user, so it seems like a reasonable option for backwards compatibility.
Example error:
Checklist
I have reviewed the contributing guide
I have viewed, signed, and submitted the Contributor License Agreement
I have made this pull request to the
main
branchI have run all the unit tests using
bundle exec rake
I have run CF Acceptance Tests