Add opa policy to restrict PDBs, always allow at least 1 disruption (#2459)
Conversation
Xartos left a comment
Super nice! This should definitely fix some issues we have in some clusters
What happens if a PDB that violates any of the rules already exists? Will it have to be manually removed?
That is a good point. The PDB will be left in place, but you cannot edit it, or any deployment, in a way that continues to violate these rules. There should not be any issue with modifying the PDB or deployment so that it becomes valid. Alternatively, remove the PDB and start over.
This was one of the patterns you mentioned that was not included in this PR, was it written correctly? If yes, could you expand? I like that the PR description is very extensive and informative, thanks for that! I think it would also be nice if you listed in plain text which disallowed patterns were added in this PR.
I think this is good information, but it doesn't explain what is actually happening under the hood 🙂
I think the text is accurate, but maybe not very easy to read. Regardless, the idea is that I think PDBs should not be allowed for daemonsets, cronjobs, and jobs. Does that make sense?
Thanks, I will try to clarify this.
I have now updated the PR description with some more details. Please let me know if this explains it well or if I should clarify further.
davidumea left a comment
This PR also included some whitespace fixes that I stumbled upon. I hope it is ok that it is fixed in this PR. Otherwise let me know and I will move that to a separate PR.
I think this is fine, thanks for the cleanup!
I have now updated the PR description with some more details. Please let me know if this explains it well or if I should clarify further.
Thanks, it's crystal clear now!
I plan on continuing my review tomorrow
Force-pushed from e6ee32a to 8d6c09e
Added a bunch of tests now. I have probably missed some tests, but this should cover most things. I at least have full code coverage:

❯ opa test restrict-pod-disruption-budgets.rego tests/restrict-pod-disruption-budgets.rego -v -c
{
  "files": {
    "restrict-pod-disruption-budgets.rego": {
      ...
      "covered_lines": 98,
      "coverage": 100
    },
    "tests/restrict-pod-disruption-budgets.rego": {
      ...
      "covered_lines": 257,
      "coverage": 100
    }
  },
  "covered_lines": 355,
  "not_covered_lines": 0,
  "coverage": 100
}
input_wrap(obj) = input {
    input := {"review": {"object": obj}}
}
pod_controller_groups_kinds := [
    {"group": "apps/v1", "kind": "Deployment"},
    {"group": "apps/v1", "kind": "StatefulSet"},
    {"group": "apps/v1", "kind": "ReplicaSet"},
    {"group": "v1", "kind": "ReplicationController"}
]
You only use the group here in 1 place (below), might want to split these into separate functions, one for group and one for kind.
Or do they need to be together to make sure it's the same object? If I understand it correctly, it should take one object at a time, so even if the information is fetched from two functions that shouldn't be a problem.
The idea was to ensure that it just uses these pairs of group and kind. This was the easiest way that I could think of to do this, but there are probably other ways.
I could also just have two separate lists of groups and kinds and then let it go through all combinations. I assume that is slightly less efficient, but probably not significantly.
IMO the current version feels nice to read and understand which groups and kinds belong to each other. But I think we should use the version that most people find easiest to read and understand. So I'm ok with changing it if that is what you and others want.
Okay, I would be fine with leaving it but I don't think it reads super well when referencing this pod_controller_group_kind.group
I will keep it like this for now then.
davidumea left a comment
I think the code looks good. Really nice work 🙂
I assume the plan is to add public docs and update the links here before you want to merge this?
Public docs PR is now up as well: https://github.com/elastisys/welkin/pull/1073
davidumea left a comment
Really nice start on this 🙌
Force-pushed from 1f6d705 to 80516c9
Task for the scenarios that were not covered in this PR: https://github.com/elastisys/welkin-apps/issues/68
Warning
This is a public repository, ensure not to disclose:
What kind of PR is this?
Required: Mark one of the following that is applicable:
Optional: Mark one or more of the following that are applicable:
Important
Breaking changes should be marked kind/admin-change or kind/dev-change depending on type.
Critical security fixes should be marked with kind/security.
Platform Administrator notice
A new gatekeeper policy has been added that will deny any PodDisruptionBudget and connected Pod controller if the PodDisruptionBudget does not allow at least 1 Pod disruption. Note that this applies in both sc and wc, and it also applies to namespaces even if they have the label owner=operator.
Application Developer notice
A new gatekeeper policy has been added that will deny any PodDisruptionBudget and connected Pod controller if the PodDisruptionBudget does not allow at least 1 Pod disruption.
What does this PR do / why do we need this PR?
This adds a new gatekeeper policy that will deny any PodDisruptionBudget and connected Pod controller if the PodDisruptionBudget does not allow at least 1 Pod disruption. Note that this applies in both sc and wc, and it also applies to namespaces even if they have the label owner=operator.
Pod controllers only include: Deployment, ReplicaSet, StatefulSet, ReplicationController.
The general logic for this is:
In sentences this means that we want to stop either PDBs or pod controllers if they together do not allow for any disruption. If the PDB uses maxUnavailable, this happens when it is set to 0 or 0%, regardless of the replicas in the pod controller. If the PDB uses minAvailable, this happens when it is equal to or higher than the replicas in the pod controller. The policy includes logic to stop requests that would create or edit both PDBs and pod controllers.
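To make the arithmetic concrete, the core check can be sketched in Python (this is not the actual Rego; the helper name and input shape are made up for illustration):

```python
import math

def allows_disruption(pdb_spec: dict, replicas: int) -> bool:
    """Return True if a PDB with this spec allows at least 1 pod
    disruption for a controller with the given replica count.
    Hypothetical helper, for illustration only."""
    if "maxUnavailable" in pdb_spec:
        # maxUnavailable of 0 or "0%" never allows a disruption,
        # regardless of the replica count.
        return pdb_spec["maxUnavailable"] not in (0, "0", "0%")
    if "minAvailable" in pdb_spec:
        value = pdb_spec["minAvailable"]
        if isinstance(value, str) and value.endswith("%"):
            # Percentage-based minAvailable: the required minimum
            # is rounded up.
            required = math.ceil(replicas * int(value.rstrip("%")) / 100)
        else:
            required = int(value)
        # minAvailable equal to or higher than replicas leaves no
        # room for any disruption.
        return required < replicas
    # A PDB with neither field set does not block disruptions.
    return True
```

For example, minAvailable: 3 against a controller with 3 replicas would be denied, while minAvailable: 2 against 3 replicas would be allowed.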
There is also an extra check to allow replicasets to violate this policy if they are controlled by a deployment. This is because any matching PDB will look at all of the pods for the deployment, not any single replicaset.
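The replicaset exception boils down to an ownerReferences check; roughly like this Python sketch (the function name is made up, and this is not the actual Rego):

```python
def is_deployment_controlled(replicaset: dict) -> bool:
    """True if the ReplicaSet is controlled by a Deployment, in which
    case the PDB check should apply to the Deployment's replicas
    rather than this single ReplicaSet. Illustrative sketch only."""
    owners = replicaset.get("metadata", {}).get("ownerReferences", [])
    return any(
        ref.get("kind") == "Deployment" and ref.get("controller") is True
        for ref in owners
    )
```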
The logic has been based on an upstream policy, but it has almost entirely been reworked. The logic of matching selectors has been based on the logic in our networkpolicy gatekeeper policy.
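For context, deciding whether a PDB matches a pod controller is essentially a label-subset check; a simplified Python sketch (names are made up, and it ignores matchExpressions):

```python
def selector_matches(pdb_selector: dict, pod_labels: dict) -> bool:
    """True if every key/value pair in the PDB's matchLabels is
    present in the controller's pod template labels. Simplified
    sketch that ignores matchExpressions."""
    match_labels = pdb_selector.get("matchLabels", {})
    return all(pod_labels.get(key) == value
               for key, value in match_labels.items())
```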
I was planning on adding more checks to this. But this took longer to implement than planned, so I'm now planning on turning the rest into a separate task. The things that were not implemented are:
This PR adds more resources that will be synced (cached) by gatekeeper. This is needed so that we can compare PDBs to pod controllers (otherwise you just have access to the object that is being validated).
But adding resources here will increase the resource usage (primarily memory) for gatekeeper. Syncing PDBs will likely not increase the usage significantly, since there are relatively few PDBs in a cluster. However, pod controllers are a lot more common and will noticeably increase the memory usage, but as I show below, I think the usage has not increased so much that it should prevent us from adding this feature. I did some testing to see how much the resource usage increased. Note that some of the testing included pods, because that would be needed for one of the extra features mentioned above that I was planning on adding but am now skipping.
Even though this is part of an autoscaling task, this policy is good to have for any type of cluster, since PDBs can disrupt any scenario where we want to drain nodes or similar.
Information to reviewers
As of creating the PR I have not yet added tests or public documentation. I will work on that next, but the code and config should be ok to review for now.
This PR also included some whitespace fixes that I stumbled upon. I hope it is ok that it is fixed in this PR. Otherwise let me know and I will move that to a separate PR.
Checklist