| 1 | +--- |
| 2 | +layout: blog |
| 3 | +title: "Live and let live with Kluctl and Server Side Apply" |
| 4 | +date: 2022-11-04 |
| 5 | +slug: live-and-let-live-with-kluctl-and-ssa |
| 6 | +--- |
| 7 | + |
| 8 | +**Author:** Alexander Block |
| 9 | + |
| 10 | +This blog post was inspired by a previous Kubernetes blog post about |
| 11 | +[Advanced Server Side Apply](https://kubernetes.io/blog/2022/10/20/advanced-server-side-apply/). |
| 12 | +The author of said blog post listed multiple benefits for applications and |
| 13 | +controllers when switching to server-side apply (from now on abbreviated as
| 14 | +SSA). The section about
| 15 | +[CI/CD systems](https://kubernetes.io/blog/2022/10/20/advanced-server-side-apply/#ci-cd-systems)
| 16 | +in particular motivated me to respond and write down my thoughts and experiences.
| 17 | + |
| 18 | +These thoughts and experiences are the result of my work on [Kluctl](https://kluctl.io)
| 19 | +for the past two years. I describe Kluctl as "The missing glue to put together
| 20 | +large Kubernetes deployments, composed of multiple smaller parts |
| 21 | +(Helm/Kustomize/...) in a manageable and unified way." |
| 22 | + |
| 23 | +To get a basic understanding of Kluctl, I suggest visiting the [kluctl.io](https://kluctl.io)
| 24 | +website and reading through the documentation and tutorials, for example the
| 25 | +[microservices demo tutorial](https://kluctl.io/docs/guides/tutorials/microservices-demo/).
| 26 | +As an alternative, you can watch [Hands-on Introduction to kluctl](https://www.youtube.com/watch?v=9LoYLjDjOdg)
| 27 | +from the Rawkode Academy YouTube channel, which shows a hands-on demo session.
| 28 | + |
| 29 | +There is also a [Kluctl delivery scenario](https://github.com/codablock/podtato-head/tree/kluctl/delivery/kluctl) |
| 30 | +available in my fork of the [podtato-head](https://github.com/codablock/podtato-head) demo project. |
| 31 | + |
| 32 | +## Live and let live |
| 33 | + |
| 34 | +One of the main philosophies that Kluctl follows is ["live and let live"](https://kluctl.io/docs/philosophy/#live-and-let-live), |
| 35 | +meaning that it will try its best to work in conjunction with any other tool or |
| 36 | +controller running outside or inside your clusters. Kluctl will not overwrite |
| 37 | +any fields that it lost ownership of, unless you explicitly tell it to do so. |
| 38 | + |
| 39 | +Achieving this would not have been possible (or at least several orders of
| 40 | +magnitude harder) without the use of SSA. Server-side apply allows Kluctl
| 41 | +to detect when it has lost ownership of a field, for example when another controller
| 42 | +or operator updates that field to another value. Kluctl can then decide on a
| 43 | +field-by-field basis whether force-applying is required, and retry the apply
| 44 | +based on these decisions.
| 45 | + |
| 46 | +## The days before SSA |
| 47 | + |
| 48 | +The first versions of Kluctl were based on shelling out to `kubectl` and thus |
| 49 | +implicitly relied on client-side apply. At that time, SSA was |
| 50 | +still alpha and quite buggy. And to be honest, I didn't even know it was a |
| 51 | +thing at that time. |
| 52 | + |
| 53 | +The way client-side apply worked had some serious drawbacks. The most obvious one |
| 54 | +(you were guaranteed to stumble on it yourself, given enough time)
| 55 | +is that it relied on an annotation (`kubectl.kubernetes.io/last-applied-configuration`)
| 56 | +being added to the object, bringing in all the limitations and issues of huge
| 57 | +annotation values. A good example of such issues is
| 58 | +[CRDs being so large](https://github.com/prometheus-operator/prometheus-operator/issues/4439)
| 59 | +that they don't fit into the annotation's value anymore.
| 60 | + |
| 61 | +Another drawback can be seen just by looking at the name (**client**-side apply). |
| 62 | +Being **client** side means that each client has to provide the apply-logic on |
| 63 | +its own, which at that time was only properly implemented inside `kubectl`, |
| 64 | +making it hard to replicate inside controllers.
| 65 | + |
| 66 | +This added `kubectl` as a dependency (either as an executable or in the form of |
| 67 | +Go packages) to all controllers that wanted to leverage the apply-logic. |
| 68 | + |
| 69 | +However, even if you managed to get client-side apply running from inside a
| 70 | +controller, you ended up with a solution that gave you no control over how it
| 71 | +worked internally. As an example, there was no way to individually decide which |
| 72 | +fields to overwrite in case of external changes and which ones to let go. |
| 73 | + |
| 74 | +## Discovering SSA
| 75 | + |
| 76 | +I was never happy with the solution described above and then somehow stumbled |
| 77 | +across [server-side apply](/docs/reference/using-api/server-side-apply/), |
| 78 | +which was still in beta at that time. Experimenting with it via |
| 79 | +`kubectl apply --server-side` immediately revealed that the true power of
| 80 | +SSA cannot be easily leveraged by shelling out to `kubectl`.
| 81 | + |
| 82 | +The way SSA is implemented in `kubectl` does not allow enough |
| 83 | +control over conflict resolution as it can only switch between |
| 84 | +"not force-applying anything and erroring out" and "force-applying everything |
| 85 | +without showing any mercy!". |
| 86 | + |
| 87 | +The API documentation, however, made it clear that SSA is able to
| 88 | +control conflict resolution at the field level, simply by choosing which fields
| 89 | +to include and which fields to omit from the supplied object.
| 90 | + |
| 91 | +## Moving away from kubectl |
| 92 | + |
| 93 | +This meant that Kluctl had to move away from shelling out to `kubectl` first. Only
| 94 | +after that was done would I be able to properly implement SSA
| 95 | +with its powerful conflict resolution.
| 96 | + |
| 97 | +To achieve this, I first implemented access to the target clusters via a |
| 98 | +Kubernetes client library. This had the nice side effect of dramatically |
| 99 | +speeding up Kluctl as well. It also improved the security and usability of |
| 100 | +Kluctl by ensuring that a running Kluctl command could not be tampered
| 101 | +with by someone externally modifying the kubeconfig while it was running.
| 102 | + |
| 103 | +## Implementing SSA |
| 104 | + |
| 105 | +After switching to a Kubernetes client library, leveraging SSA |
| 106 | +felt easy. Kluctl now sends each manifest to the API server as part of a
| 107 | +`PATCH` request with the `application/apply-patch+yaml` content type, which signals
| 108 | +that Kluctl wants to perform an SSA operation. The API server then
| 109 | +responds with an OK response (HTTP status code 200), or with a Conflict response
| 110 | +(HTTP status code 409).
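
As a rough illustration of such a request, here is a minimal Go sketch that only builds the `PATCH` request using the standard library and does not send it. The helper name `newApplyRequest`, the host `cluster.example.invalid`, the abbreviated manifest and the field manager name `kluctl` are assumptions made for this example; this is not Kluctl's actual implementation, which uses a Kubernetes client library.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
)

// newApplyRequest (hypothetical helper) builds, but does not send, a
// server-side apply request for a single manifest.
func newApplyRequest(baseURL, resourcePath, fieldManager string, manifest []byte, force bool) (*http.Request, error) {
	query := url.Values{}
	// The fieldManager query parameter names the actor that claims the applied fields.
	query.Set("fieldManager", fieldManager)
	if force {
		// force=true asks the API server to take over conflicting fields.
		query.Set("force", "true")
	}
	target := baseURL + resourcePath + "?" + query.Encode()
	req, err := http.NewRequest(http.MethodPatch, target, bytes.NewReader(manifest))
	if err != nil {
		return nil, err
	}
	// This content type is what marks the PATCH as a server-side apply.
	req.Header.Set("Content-Type", "application/apply-patch+yaml")
	return req, nil
}

func main() {
	// Abbreviated manifest, for illustration only.
	manifest := []byte("apiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: demo\n  namespace: default\nspec:\n  replicas: 2\n")
	req, err := newApplyRequest(
		"https://cluster.example.invalid", // placeholder API server address
		"/apis/apps/v1/namespaces/default/deployments/demo",
		"kluctl", // assumed field manager name
		manifest,
		false,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(req.Header.Get("Content-Type"))
}
```

A first attempt would typically be sent without `force`, so that conflicts are reported instead of silently resolved.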
| 111 | + |
| 112 | +In case of a Conflict response, the body of that response includes machine-readable |
| 113 | +details about the conflicts. Kluctl can then use these details to figure out |
| 114 | +which fields are in conflict and which actors (field managers) have taken |
| 115 | +ownership of the conflicted fields. |
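
To make the shape of these details more concrete, the following Go sketch parses a hand-written sample of a Conflict response and extracts the conflicting fields and their field managers. The sample body, the type names and the regular expression used to pull the manager name out of the cause message are assumptions for illustration; the exact message format returned by a real API server may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// statusCause and status mirror only the parts of a Status object that
// this sketch needs.
type statusCause struct {
	Reason  string `json:"reason"`
	Message string `json:"message"`
	Field   string `json:"field"`
}

type status struct {
	Code    int `json:"code"`
	Details struct {
		Causes []statusCause `json:"causes"`
	} `json:"details"`
}

// conflict is a hypothetical, simplified view of one conflicting field.
type conflict struct {
	Field   string
	Manager string
}

// managerRe assumes the cause message names the manager in double quotes.
var managerRe = regexp.MustCompile(`conflict with "([^"]+)"`)

// parseConflicts returns one entry per field manager conflict in the body.
func parseConflicts(body []byte) ([]conflict, error) {
	var st status
	if err := json.Unmarshal(body, &st); err != nil {
		return nil, err
	}
	if st.Code != 409 {
		return nil, nil
	}
	var out []conflict
	for _, c := range st.Details.Causes {
		if c.Reason != "FieldManagerConflict" {
			continue
		}
		manager := ""
		if m := managerRe.FindStringSubmatch(c.Message); m != nil {
			manager = m[1]
		}
		out = append(out, conflict{Field: c.Field, Manager: manager})
	}
	return out, nil
}

// sampleBody is a hand-written example, not captured from a real cluster.
const sampleBody = `{
  "kind": "Status",
  "apiVersion": "v1",
  "status": "Failure",
  "reason": "Conflict",
  "code": 409,
  "details": {
    "causes": [
      {
        "reason": "FieldManagerConflict",
        "message": "conflict with \"kubectl\" using apps/v1",
        "field": ".spec.replicas"
      }
    ]
  }
}`

func main() {
	conflicts, err := parseConflicts([]byte(sampleBody))
	if err != nil {
		panic(err)
	}
	for _, c := range conflicts {
		fmt.Printf("%s is claimed by %q\n", c.Field, c.Manager)
	}
}
```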
| 116 | + |
| 117 | +Then, for each field, Kluctl will decide if the conflict should be ignored or |
| 118 | +if it should be force-applied. If any field needs to be force-applied, Kluctl |
| 119 | +will retry the apply operation with the ignored fields omitted and the `force` |
| 120 | +flag being set on the API call. |
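
Omitting an ignored field before the retry can be sketched as follows. This is a simplified illustration that only handles plain nested maps addressed by a dotted path such as `.spec.replicas`; the helper name `omitField` is hypothetical, and list entries or keys containing dots are deliberately out of scope.

```go
package main

import (
	"fmt"
	"strings"
)

// omitField (hypothetical helper) removes the field at a dotted path such
// as ".spec.replicas" from a decoded manifest. Paths that do not exist are
// silently ignored.
func omitField(obj map[string]interface{}, path string) {
	parts := strings.Split(strings.TrimPrefix(path, "."), ".")
	cur := obj
	for _, p := range parts[:len(parts)-1] {
		next, ok := cur[p].(map[string]interface{})
		if !ok {
			return // an intermediate element is missing or not a map
		}
		cur = next
	}
	delete(cur, parts[len(parts)-1])
}

func main() {
	manifest := map[string]interface{}{
		"apiVersion": "apps/v1",
		"kind":       "Deployment",
		"spec": map[string]interface{}{
			"replicas": 2,
			"paused":   false,
		},
	}
	// Suppose the conflict on .spec.replicas was ignored: leave the field
	// out so that the forced retry does not claim it.
	omitField(manifest, ".spec.replicas")
	fmt.Println(manifest["spec"])
}
```

Because the field is no longer part of the applied object, the forced retry only takes over the fields that were meant to be force-applied.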
| 121 | + |
| 122 | +In case a conflict is ignored, Kluctl will issue a warning to the user so that |
| 123 | +the user can react properly (or ignore it forever...). |
| 124 | + |
| 125 | +That's basically it. That is all that is required to leverage SSA. |
| 126 | +Big thanks and thumbs-up to the Kubernetes developers who made this possible! |
| 127 | + |
| 128 | +## Conflict resolution
| 129 | + |
| 130 | +Kluctl has a few simple rules to figure out if a conflict should be ignored |
| 131 | +or force-applied. |
| 132 | + |
| 133 | +It first checks the field's actor (the field manager) against a list of known |
| 134 | +field manager strings from tools that are frequently used to perform manual modifications.
| 135 | +Examples are `kubectl` and `k9s`. Any modifications performed with these tools
| 136 | +are considered "temporary" and will be overwritten by Kluctl. |
| 137 | + |
| 138 | +If you're using Kluctl along with `kubectl` and you don't want the changes from
| 139 | +`kubectl` to be overwritten (for example, when using it in a script), you can specify
| 140 | +`--field-manager=<manager-name>` on the command line to `kubectl`, and Kluctl
| 141 | +won't apply its special heuristic.
| 142 | + |
| 143 | +If the field manager is not known by Kluctl, it will check if force-applying is |
| 144 | +requested for that field. Force-applying can be requested in different ways: |
| 145 | + |
| 146 | +1. By passing `--force-apply` to Kluctl. This will cause ALL fields to be force-applied on conflicts. |
| 147 | +2. By adding the [`kluctl.io/force-apply=true`](https://kluctl.io/docs/reference/deployments/annotations/all-resources/#kluctlioforce-apply) annotation to the object in question. This will cause all fields of that object to be force-applied on conflicts. |
| 148 | +3. By adding the [`kluctl.io/force-apply-field=my.json.path`](https://kluctl.io/docs/reference/deployments/annotations/all-resources/#kluctlioforce-apply-field) annotation to the object in question. This causes only fields matching the JSON path to be force-applied on conflicts. |
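
The rules above can be sketched as a small decision function. This is a simplified illustration and not Kluctl's actual code: the `options` type and the function name are hypothetical, the list of overwritten managers only contains the two examples named above, and the JSON path from the annotation is compared by plain string equality instead of being evaluated as a real JSON path.

```go
package main

import (
	"fmt"
	"strings"
)

// overwrittenManagers lists field managers whose changes are treated as
// temporary. Only the two examples from the text are included here.
var overwrittenManagers = map[string]bool{"kubectl": true, "k9s": true}

// options is a hypothetical container for the inputs of the decision.
type options struct {
	ForceApplyAll bool              // corresponds to passing --force-apply
	Annotations   map[string]string // annotations of the object in question
}

// shouldForceApply decides whether a conflicting field is force-applied.
func shouldForceApply(field, manager string, opts options) bool {
	if overwrittenManagers[manager] {
		return true // manual modifications are considered temporary
	}
	if opts.ForceApplyAll {
		return true
	}
	if opts.Annotations["kluctl.io/force-apply"] == "true" {
		return true
	}
	// Simplification: compare the annotation value with the conflicting
	// field as plain strings, ignoring a leading dot.
	if path, ok := opts.Annotations["kluctl.io/force-apply-field"]; ok {
		return strings.TrimPrefix(path, ".") == strings.TrimPrefix(field, ".")
	}
	return false
}

func main() {
	opts := options{Annotations: map[string]string{
		"kluctl.io/force-apply-field": "spec.nodeSets",
	}}
	fmt.Println(shouldForceApply(".spec.replicas", "kubectl", opts))
	fmt.Println(shouldForceApply(".spec.replicas", "some-operator", opts))
	fmt.Println(shouldForceApply(".spec.nodeSets", "some-operator", opts))
}
```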
| 149 | + |
| 150 | +Marking a field to be force-applied is required whenever some other actor is
| 151 | +known to erroneously claim fields (the ECK operator does this to the nodeSets
| 152 | +field, for example). This way, you can ensure that Kluctl always overwrites these
| 153 | +fields with the original or a new value.
| 154 | + |
| 155 | +In the future, Kluctl will allow even more control over conflict resolution.
| 156 | +For example, the CLI will allow controlling force-applying at the field level.
| 157 | + |
| 158 | +## DevOps vs Controllers |
| 159 | + |
| 160 | +So how does SSA in Kluctl lead to "live and let live"? |
| 161 | + |
| 162 | +It allows the co-existence of classical pipelines (e.g. GitHub Actions or
| 163 | +GitLab CI), controllers (e.g. the HPA controller or GitOps style controllers)
| 164 | +and even admins running deployments from their local machines. |
| 165 | + |
| 166 | +Wherever you are on your infrastructure automation journey, Kluctl has a place |
| 167 | +for you. From running deployments using a script on your PC, all the way to |
| 168 | +fully automated CI/CD with the pipelines themselves defined in code, Kluctl |
| 169 | +aims to complement the workflow that's right for you. |
| 170 | + |
| 171 | +And even after fully automating everything, you can intervene with your admin |
| 172 | +permissions if required and run a `kubectl` command that will modify a field |
| 173 | +and prevent Kluctl from overwriting it. You'd just have to switch to a
| 174 | +field manager (e.g. "admin-override") that is not overwritten by Kluctl.
| 175 | + |
| 176 | +## A few takeaways |
| 177 | + |
| 178 | +Server-side apply is a great feature and essential for the future of |
| 179 | +controllers and tools in Kubernetes. The number of controllers involved
| 180 | +will only grow, and proper ways of working together are a must.
| 181 | + |
| 182 | +I believe that CI/CD-related controllers and tools should leverage |
| 183 | +SSA to perform proper conflict resolution. I also believe that |
| 184 | +other controllers (e.g. Flux and Argo CD) would benefit from the same kind
| 185 | +of conflict resolution control at the field level.
| 186 | + |
| 187 | +It might even be a good idea to come together and work on a standardized |
| 188 | +set of annotations to control conflict resolution for CI/CD-related tooling. |
| 189 | + |
| 190 | +On the other hand, non-CI/CD-related controllers should ensure that they don't
| 191 | +cause unnecessary conflicts when modifying objects. According to
| 192 | +[the server-side apply documentation](https://kubernetes.io/docs/reference/using-api/server-side-apply/#using-server-side-apply-in-a-controller), |
| 193 | +it is strongly recommended for controllers to always perform force-applying. When |
| 194 | +following this recommendation, controllers should really make sure that only |
| 195 | +fields related to the controller are included in the applied object. |
| 196 | +Otherwise, unnecessary conflicts are guaranteed. |
| 197 | + |
| 198 | +In many cases, controllers are meant to only modify the status subresource |
| 199 | +of the objects they manage. In this case, controllers should only patch the |
| 200 | +status subresource and not touch the actual object. If this is followed,
| 201 | +conflicts become impossible.
| 202 | + |
| 203 | +If you are a developer of such a controller and unsure whether your controller
| 204 | +adheres to the above, simply try to retrieve an object managed by your
| 205 | +controller and look at the `managedFields` (you'll need to pass |
| 206 | +`--show-managed-fields -oyaml` to `kubectl get`) to see if some field got |
| 207 | +claimed unexpectedly. |
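
If you prefer to check this programmatically, the following Go sketch lists which fields each field manager owns. It assumes the object was retrieved as JSON (for example with `-o json` instead of `-oyaml`) and only handles plain `f:` field keys; the sample object, the manager name `some-controller` and the helper names are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// managedEntry and object mirror only the parts of the object that this
// sketch needs.
type managedEntry struct {
	Manager   string                 `json:"manager"`
	Operation string                 `json:"operation"`
	FieldsV1  map[string]interface{} `json:"fieldsV1"`
}

type object struct {
	Metadata struct {
		ManagedFields []managedEntry `json:"managedFields"`
	} `json:"metadata"`
}

// collectPaths walks a fieldsV1 tree and records the dotted path of every
// leaf. List item keys (k:, v:) are kept as-is, which is a simplification.
func collectPaths(prefix string, node map[string]interface{}, out *[]string) {
	leaf := true
	for key, val := range node {
		child, ok := val.(map[string]interface{})
		if key == "." || !ok {
			continue
		}
		leaf = false
		collectPaths(prefix+"."+strings.TrimPrefix(key, "f:"), child, out)
	}
	if leaf && prefix != "" {
		*out = append(*out, prefix)
	}
}

// ownedFields maps each field manager to the sorted paths it owns.
func ownedFields(objJSON []byte) (map[string][]string, error) {
	var obj object
	if err := json.Unmarshal(objJSON, &obj); err != nil {
		return nil, err
	}
	result := map[string][]string{}
	for _, entry := range obj.Metadata.ManagedFields {
		var paths []string
		collectPaths("", entry.FieldsV1, &paths)
		sort.Strings(paths)
		result[entry.Manager] = append(result[entry.Manager], paths...)
	}
	return result, nil
}

// sampleObject is a hand-written example, not taken from a real cluster.
const sampleObject = `{
  "metadata": {
    "name": "demo",
    "managedFields": [
      {"manager": "kluctl", "operation": "Apply",
       "fieldsV1": {"f:spec": {"f:selector": {}, "f:strategy": {}}}},
      {"manager": "some-controller", "operation": "Update",
       "fieldsV1": {"f:spec": {"f:replicas": {}}}}
    ]
  }
}`

func main() {
	owned, err := ownedFields([]byte(sampleObject))
	if err != nil {
		panic(err)
	}
	for _, manager := range []string{"kluctl", "some-controller"} {
		fmt.Println(manager, owned[manager])
	}
}
```

Any path that shows up under your controller's manager name but is not something the controller is responsible for is a candidate for an unnecessary conflict.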