Commit df20e98

Merge pull request #37563 from codablock/blog-kluctl-ssa
Add a blog post about Kluctl and its use of SSA
2 parents 68f50cb + c8dd619 commit df20e98

1 file changed: +207 -0 lines changed

@@ -0,0 +1,207 @@

---
layout: blog
title: "Live and let live with Kluctl and Server Side Apply"
date: 2022-11-04
slug: live-and-let-live-with-kluctl-and-ssa
---

**Author:** Alexander Block

This blog post was inspired by a previous Kubernetes blog post about
[Advanced Server Side Apply](https://kubernetes.io/blog/2022/10/20/advanced-server-side-apply/).
The author of said blog post listed multiple benefits for applications and
controllers when switching to server-side apply (from now on abbreviated as
SSA). The chapter about
[CI/CD systems](https://kubernetes.io/blog/2022/10/20/advanced-server-side-apply/#ci-cd-systems)
in particular motivated me to respond and write down my thoughts and experiences.

These thoughts and experiences are the results of me working on [Kluctl](https://kluctl.io)
for the past 2 years. I describe Kluctl as "The missing glue to put together
large Kubernetes deployments, composed of multiple smaller parts
(Helm/Kustomize/...) in a manageable and unified way."

To get a basic understanding of Kluctl, I suggest visiting the [kluctl.io](https://kluctl.io)
website and reading through the documentation and tutorials, for example the
[microservices demo tutorial](https://kluctl.io/docs/guides/tutorials/microservices-demo/).
As an alternative, you can watch [Hands-on Introduction to kluctl](https://www.youtube.com/watch?v=9LoYLjDjOdg)
from the Rawkode Academy YouTube channel, which shows a hands-on demo session.

There is also a [Kluctl delivery scenario](https://github.com/codablock/podtato-head/tree/kluctl/delivery/kluctl)
available in my fork of the [podtato-head](https://github.com/codablock/podtato-head) demo project.

## Live and let live

One of the main philosophies that Kluctl follows is ["live and let live"](https://kluctl.io/docs/philosophy/#live-and-let-live),
meaning that it will try its best to work in conjunction with any other tool or
controller running outside or inside your clusters. Kluctl will not overwrite
any fields that it lost ownership of, unless you explicitly tell it to do so.

Achieving this would not have been possible (or at least several orders of
magnitude harder) without the use of SSA. Server-side apply allows Kluctl
to detect when ownership of a field was lost, for example when another controller
or operator updates that field to another value. Kluctl can then decide on a
field-by-field basis whether force-applying is required, and retry based on these
decisions.

## The days before SSA

The first versions of Kluctl were based on shelling out to `kubectl` and thus
implicitly relied on client-side apply. At that time, SSA was
still alpha and quite buggy. And to be honest, I didn't even know it was a
thing at that time.

The way client-side apply worked had some serious drawbacks. The most obvious one
(it was guaranteed that you'd stumble on this by yourself if enough time passed)
is that it relied on an annotation (`kubectl.kubernetes.io/last-applied-configuration`)
being added to the object, bringing in all the limitations and issues with huge
annotation values. A good example of such issues is
[CRDs being so large](https://github.com/prometheus-operator/prometheus-operator/issues/4439)
that they no longer fit into the annotation's value.

Another drawback can be seen just by looking at the name (**client**-side apply).
Being **client** side means that each client has to provide the apply-logic on
its own, which at that time was only properly implemented inside `kubectl`,
making it hard to replicate inside controllers.

This added `kubectl` as a dependency (either as an executable or in the form of
Go packages) to all controllers that wanted to leverage the apply-logic.

However, even if you managed to get client-side apply running from inside a
controller, you ended up with a solution that gave no control over how it
worked internally. As an example, there was no way to individually decide which
fields to overwrite in case of external changes and which ones to let go.

## Discovering SSA

I was never happy with the solution described above and then somehow stumbled
across [server-side apply](/docs/reference/using-api/server-side-apply/),
which was still in beta at that time. Experimenting with it via
`kubectl apply --server-side` revealed immediately that the true power of
SSA cannot be easily leveraged by shelling out to `kubectl`.

The way SSA is implemented in `kubectl` does not allow enough
control over conflict resolution, as it can only switch between
"not force-applying anything and erroring out" and "force-applying everything
without showing any mercy!".

The API documentation, however, made it clear that SSA is able to
control conflict resolution at the field level, simply by choosing which fields
to include and which fields to omit from the supplied object.

## Moving away from kubectl

This meant that Kluctl had to move away from shelling out to `kubectl` first. Only
after that was done would I be able to properly implement SSA
with its powerful conflict resolution.

To achieve this, I first implemented access to the target clusters via a
Kubernetes client library. This had the nice side effect of dramatically
speeding up Kluctl as well. It also improved the security and usability of
Kluctl by ensuring that a running Kluctl command could not be interfered
with by external modifications to the kubeconfig while it was running.

## Implementing SSA

After switching to a Kubernetes client library, leveraging SSA
felt easy. Kluctl now has to send each manifest to the API server as part of a
`PATCH` request whose `application/apply-patch+yaml` content type signals
that Kluctl wants to perform an SSA operation. The API server then
responds with an OK response (HTTP status code 200), or with a Conflict response
(HTTP status 409).
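
The request described above can be sketched with nothing but the Go standard library. This is a minimal illustration, not Kluctl's actual code: `newApplyRequest`, the host `kubernetes.example`, and the field manager name `example-manager` are made up for the example, and the request is only built, never sent. The content type and the `fieldManager` and `force` query parameters are the ones documented for server-side apply.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
)

// newApplyRequest builds (but does not send) a server-side apply request.
// The function name and its parameters are hypothetical, chosen for this sketch.
func newApplyRequest(baseURL, resourcePath, fieldManager string, manifest []byte, force bool) (*http.Request, error) {
	query := url.Values{}
	query.Set("fieldManager", fieldManager) // identifies who is applying
	if force {
		query.Set("force", "true") // take over conflicting fields
	}
	target := baseURL + resourcePath + "?" + query.Encode()

	req, err := http.NewRequest(http.MethodPatch, target, bytes.NewReader(manifest))
	if err != nil {
		return nil, err
	}
	// This content type is what marks the PATCH as a server-side apply.
	req.Header.Set("Content-Type", "application/apply-patch+yaml")
	return req, nil
}

func main() {
	manifest := []byte("apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: demo\n  namespace: default\ndata:\n  key: value\n")

	req, err := newApplyRequest(
		"https://kubernetes.example", // placeholder API server address
		"/api/v1/namespaces/default/configmaps/demo",
		"example-manager",
		manifest,
		false,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(req.Header.Get("Content-Type"))
}
```

A real client would also need authentication and TLS configuration, which a Kubernetes client library takes care of.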

In case of a Conflict response, the body of that response includes machine-readable
details about the conflicts. Kluctl can then use these details to figure out
which fields are in conflict and which actors (field managers) have taken
ownership of the conflicted fields.
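
To make the shape of these details more concrete, here is a hedged Go sketch that decodes a Conflict response and lists the conflicting fields with their field managers. The sample body is hand-written for illustration, and `parseConflicts` and the other names are hypothetical. One assumption to be aware of: the manager name is pulled out of the human-readable `message` with a regular expression, which relies on the message starting with `conflict with "<manager>"`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// statusCause mirrors the JSON keys of one entry in details.causes.
type statusCause struct {
	Reason  string `json:"reason"`
	Message string `json:"message"`
	Field   string `json:"field"`
}

// status holds only the parts of the Status object this sketch needs.
type status struct {
	Details struct {
		Causes []statusCause `json:"causes"`
	} `json:"details"`
}

// conflict is a hypothetical helper type for this example.
type conflict struct {
	Field   string
	Manager string
}

// Assumption: the message embeds the manager name in double quotes.
var managerPattern = regexp.MustCompile(`conflict with "([^"]+)"`)

func parseConflicts(body []byte) ([]conflict, error) {
	var s status
	if err := json.Unmarshal(body, &s); err != nil {
		return nil, err
	}
	var conflicts []conflict
	for _, cause := range s.Details.Causes {
		if cause.Reason != "FieldManagerConflict" {
			continue // not an ownership conflict
		}
		manager := ""
		if m := managerPattern.FindStringSubmatch(cause.Message); m != nil {
			manager = m[1]
		}
		conflicts = append(conflicts, conflict{Field: cause.Field, Manager: manager})
	}
	return conflicts, nil
}

// sampleBody is an illustrative response, not captured from a real cluster.
const sampleBody = `{
  "kind": "Status",
  "apiVersion": "v1",
  "status": "Failure",
  "reason": "Conflict",
  "code": 409,
  "details": {
    "causes": [
      {
        "reason": "FieldManagerConflict",
        "message": "conflict with \"kubectl\" using apps/v1",
        "field": ".spec.replicas"
      }
    ]
  }
}`

func main() {
	conflicts, err := parseConflicts([]byte(sampleBody))
	if err != nil {
		panic(err)
	}
	for _, c := range conflicts {
		fmt.Printf("field %s is owned by %q\n", c.Field, c.Manager)
	}
}
```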

Then, for each field, Kluctl will decide if the conflict should be ignored or
if it should be force-applied. If any field needs to be force-applied, Kluctl
will retry the apply operation with the ignored fields omitted and the `force`
flag set on the API call.
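
The "omit the ignored fields, then retry with force" step could look roughly like the following Go sketch. It is a simplification under stated assumptions: `removeField` and `prepareRetry` are hypothetical helpers, the object is a plain nested map, and only simple dotted paths such as `.spec.replicas` are handled (no list item selectors).

```go
package main

import (
	"fmt"
	"strings"
)

// removeField deletes the value at a simple dotted path such as
// ".spec.replicas". Paths that address list items are not handled here.
func removeField(obj map[string]any, path string) {
	keys := strings.Split(strings.TrimPrefix(path, "."), ".")
	current := obj
	for _, key := range keys[:len(keys)-1] {
		next, ok := current[key].(map[string]any)
		if !ok {
			return // path does not exist, nothing to remove
		}
		current = next
	}
	delete(current, keys[len(keys)-1])
}

// prepareRetry drops every ignored field from the object and reports
// whether the retry has to be sent with the force flag.
func prepareRetry(obj map[string]any, ignored, forced []string) bool {
	for _, path := range ignored {
		removeField(obj, path)
	}
	return len(forced) > 0
}

func main() {
	deployment := map[string]any{
		"apiVersion": "apps/v1",
		"kind":       "Deployment",
		"spec": map[string]any{
			"replicas": 3,
			"paused":   false,
		},
	}

	// Example decision: leave replicas to the other manager, force paused.
	force := prepareRetry(deployment, []string{".spec.replicas"}, []string{".spec.paused"})

	_, stillThere := deployment["spec"].(map[string]any)["replicas"]
	fmt.Println("retry with force:", force)
	fmt.Println("replicas still in applied object:", stillThere)
}
```

Because an omitted field is simply not part of the applied object, the retry makes no claim on it, and the other field manager keeps its value.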

In case a conflict is ignored, Kluctl will issue a warning to the user so that
the user can react properly (or ignore it forever...).

That's basically it. That is all that is required to leverage SSA.
Big thanks and thumbs-up to the Kubernetes developers who made this possible!

## Conflict Resolution

Kluctl has a few simple rules to figure out if a conflict should be ignored
or force-applied.

It first checks the field's actor (the field manager) against a list of known
field manager strings from tools that are frequently used to perform manual modifications. These
are for example `kubectl` and `k9s`. Any modifications performed with these tools
are considered "temporary" and will be overwritten by Kluctl.

If you're using Kluctl along with `kubectl` where you don't want the changes from
`kubectl` to be overwritten (for example, when using it in a script), then you can specify
`--field-manager=<manager-name>` on the command line to `kubectl`, and Kluctl
won't apply its special heuristic.

If the field manager is not known by Kluctl, it will check if force-applying is
requested for that field. Force-applying can be requested in different ways:

1. By passing `--force-apply` to Kluctl. This will cause ALL fields to be force-applied on conflicts.
2. By adding the [`kluctl.io/force-apply=true`](https://kluctl.io/docs/reference/deployments/annotations/all-resources/#kluctlioforce-apply) annotation to the object in question. This will cause all fields of that object to be force-applied on conflicts.
3. By adding the [`kluctl.io/force-apply-field=my.json.path`](https://kluctl.io/docs/reference/deployments/annotations/all-resources/#kluctlioforce-apply-field) annotation to the object in question. This causes only fields matching the JSON path to be force-applied on conflicts.
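
The rules above can be condensed into a small decision function. The Go sketch below is only an approximation of the described behavior, not Kluctl's implementation: `shouldForceApply` and `options` are hypothetical names, the manager list contains just the two tools named above, and the `kluctl.io/force-apply-field` check uses exact string comparison as a stand-in for real JSON path matching.

```go
package main

import "fmt"

// options is a hypothetical container for the inputs to the decision.
type options struct {
	ForceApplyAll bool              // --force-apply was passed on the command line
	Annotations   map[string]string // annotations of the object being applied
}

// Managers whose changes are treated as temporary. Only the two tools
// named in the text are listed; the real list may differ.
var overwritableManagers = map[string]bool{
	"kubectl": true,
	"k9s":     true,
}

// shouldForceApply decides whether a conflicting field is force-applied.
func shouldForceApply(manager, fieldPath string, opts options) bool {
	if overwritableManagers[manager] {
		return true // manual modification, overwrite it
	}
	if opts.ForceApplyAll {
		return true // rule 1: force everything
	}
	if opts.Annotations["kluctl.io/force-apply"] == "true" {
		return true // rule 2: force every field of this object
	}
	// Rule 3, simplified: exact match instead of JSON path evaluation.
	wanted, ok := opts.Annotations["kluctl.io/force-apply-field"]
	return ok && wanted == fieldPath
}

func main() {
	opts := options{
		Annotations: map[string]string{
			"kluctl.io/force-apply-field": "spec.nodeSets",
		},
	}
	fmt.Println(shouldForceApply("kubectl", "spec.replicas", opts))
	fmt.Println(shouldForceApply("some-operator", "spec.nodeSets", opts))
	fmt.Println(shouldForceApply("some-operator", "spec.replicas", opts))
}
```

Any conflict for which this function returns `false` would be ignored and reported as a warning.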

Marking a field to be force-applied is required whenever some other actor is
known to erroneously claim fields (the ECK operator does this to the `nodeSets`
field, for example). This way, you can ensure that Kluctl always overwrites these
fields with the original or a new value.

In the future, Kluctl will allow even more control over conflict resolution.
For example, the CLI will allow controlling force-applying at the field level.

## DevOps vs Controllers

So how does SSA in Kluctl lead to "live and let live"?

It allows the co-existence of classical pipelines (e.g. GitHub Actions or
GitLab CI), controllers (e.g. the HPA controller or GitOps-style controllers)
and even admins running deployments from their local machines.

Wherever you are on your infrastructure automation journey, Kluctl has a place
for you. From running deployments using a script on your PC, all the way to
fully automated CI/CD with the pipelines themselves defined in code, Kluctl
aims to complement the workflow that's right for you.

And even after fully automating everything, you can intervene with your admin
permissions if required and run a `kubectl` command that will modify a field
and prevent Kluctl from overwriting it. You'd just have to switch to a
field manager (e.g. "admin-override") that is not overwritten by Kluctl.

## A few takeaways

Server-side apply is a great feature and essential for the future of
controllers and tools in Kubernetes. The number of controllers involved
will only grow, and proper modes of working together are a must.

I believe that CI/CD-related controllers and tools should leverage
SSA to perform proper conflict resolution. I also believe that
other controllers (e.g. Flux and Argo CD) would benefit from the same kind
of conflict resolution control at the field level.

It might even be a good idea to come together and work on a standardized
set of annotations to control conflict resolution for CI/CD-related tooling.

On the other side, non-CI/CD-related controllers should ensure that they don't
cause unnecessary conflicts when modifying objects. According to
[the server-side apply documentation](https://kubernetes.io/docs/reference/using-api/server-side-apply/#using-server-side-apply-in-a-controller),
it is strongly recommended for controllers to always perform force-applying. When
following this recommendation, controllers should really make sure that only
fields related to the controller are included in the applied object.
Otherwise, unnecessary conflicts are guaranteed.

In many cases, controllers are meant to only modify the status subresource
of the objects they manage. In this case, controllers should only patch the
status subresource and not touch the actual object. If this is followed,
conflicts become impossible.

If you are a developer of such a controller and unsure whether your controller
adheres to the above, simply try to retrieve an object managed by your
controller and look at the `managedFields` (you'll need to pass
`--show-managed-fields -oyaml` to `kubectl get`) to see if some field got
claimed unexpectedly.
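
If you prefer to check this programmatically, the same information can be read from the object's JSON representation. The following Go sketch lists the fields each manager owns according to `metadata.managedFields`. It is a rough illustration: `ownedFields` and `collectPaths` are hypothetical helpers, the sample object is hand-written, and only the `f:` prefix of plain fields is stripped, so keys for list items (`k:`, `v:`, `i:`) are printed as they are.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

type managedFieldsEntry struct {
	Manager  string          `json:"manager"`
	FieldsV1 json.RawMessage `json:"fieldsV1"`
}

type object struct {
	Metadata struct {
		ManagedFields []managedFieldsEntry `json:"managedFields"`
	} `json:"metadata"`
}

// collectPaths flattens a fieldsV1 tree into dotted leaf paths.
func collectPaths(node map[string]any, prefix string, out *[]string) {
	for key, value := range node {
		if key == "." {
			continue // marks the parent itself, not a child field
		}
		path := prefix + "." + strings.TrimPrefix(key, "f:")
		child, _ := value.(map[string]any)
		before := len(*out)
		collectPaths(child, path, out)
		if len(*out) == before {
			*out = append(*out, path) // nothing below, so this is a leaf
		}
	}
}

// ownedFields maps each field manager to the sorted paths it owns.
func ownedFields(objectJSON []byte) (map[string][]string, error) {
	var obj object
	if err := json.Unmarshal(objectJSON, &obj); err != nil {
		return nil, err
	}
	owned := map[string][]string{}
	for _, entry := range obj.Metadata.ManagedFields {
		if len(entry.FieldsV1) == 0 {
			continue
		}
		var tree map[string]any
		if err := json.Unmarshal(entry.FieldsV1, &tree); err != nil {
			return nil, err
		}
		var paths []string
		collectPaths(tree, "", &paths)
		owned[entry.Manager] = append(owned[entry.Manager], paths...)
	}
	for manager := range owned {
		sort.Strings(owned[manager])
	}
	return owned, nil
}

// sampleObject is illustrative; "my-controller" is a made-up manager name.
const sampleObject = `{"metadata":{"managedFields":[
  {"manager":"kluctl","operation":"Apply","fieldsV1":{"f:spec":{"f:selector":{},"f:template":{}}}},
  {"manager":"my-controller","operation":"Apply","fieldsV1":{"f:spec":{"f:replicas":{}}}}
]}}`

func main() {
	owned, err := ownedFields([]byte(sampleObject))
	if err != nil {
		panic(err)
	}
	for manager, paths := range owned {
		fmt.Println(manager, paths)
	}
}
```

In this made-up sample, a controller that is only supposed to write status would stand out because it owns `.spec.replicas`.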
