Commit 0c5b731

sidecar: Add clarifications from Istio devs
In the previous PR, Istio devs commented that some things were not accurate. This commit just updates the text to (hopefully) correctly reflect it now.

Removed the paragraph about this removing the need for an initContainer due to comment here: kubernetes#1913 (comment)

I thought it was okay to insert the iptables rules within the sidecar proxy container, but it is not okay, as that requires more permissions (capabilities) on the sidecar proxy container, which is not considered acceptable by Istio devs.

Signed-off-by: Rodrigo Campos <[email protected]>
1 parent 58590b7 commit 0c5b731

1 file changed

+21
-32
lines changed

keps/sig-node/0753-sidecarcontainers.md

Lines changed: 21 additions & 32 deletions
@@ -275,23 +275,18 @@ Linkerd and Istio), need to do several hacks to have the basic
 functionality. These are explained in detail in the
 [alternatives](#alternatives) section. Nonetheless, here is a quick highlight
 of some of the things some service mesh currently need to do:
-
 * Recommend users to delay starting their apps by using a script to wait for
 the service mesh to be ready. The goal of a service mesh to augment the app
 functionality without modifying it, that goal is lost in this case.
-* To guarantee that traffic goes via the services mesh, an initContainer is
-added to blackhole traffic until the service mesh containers are up. This way,
-other containers that might be started before the service mesh container can't
-use the network until the service mesh container is started and ready. A side
-effect is that traffic is blackholed until the service mesh is up and in a
-ready state.
+* If they don't delay starting their application, the network connection they
+try to establish are blacklisted until the service mesh container is up.
 * Use preStop hooks with a "sleep infinity" to make sure the service mesh
 doesn't terminate before other containers that might be serving requests.
 
-The auto-inject of initContainer [has caused bugs][linkerd-bug-init-last], as it
-competes with other tools auto-injecting a container to be run last too.
-
-[linkerd-bug-init-last]: https://github.com/linkerd/linkerd2/issues/4758#issuecomment-658457737
+This KEP adds guarantees to startup/shutdown behavior, so _those_ problems will
+be solved for service mesh. However, service mesh do have other problems that
+are out of scope for this KEP, e.g. enable service mesh before initContainers
+are started.
 
 ### Problems: Coupling infrastructure with applications
 
@@ -308,9 +303,6 @@ used to completely remove the need for such an initContainer. But this
 alternative requires that nodes have the CNI plugin installed, effectively
 coupling the service mesh app with the infrastructure.
 
-This KEP removes the need for a service mesh to use either an initContainer or a
-CNI plugin: just guarantee that the sidecar container can be started first.
-
 While in this specific example the CNI plugin has some benefits too (removes the
 need for some capabilities in the pod) and might be worth pursuing, it is used
 here as an example to show thee possibility of coupling apps with
@@ -779,11 +771,10 @@ startup, as explained in the next section.
 
 #### Service mesh or metrics sidecars
 
-Let app container be the main app that just has the service mesh extra container
-in the pod.
-
-Service mesh, today, have to do the following workarounds due to lack of startup
-ordering:
+Service mesh, today, have to do the following workarounds due to lack of
+startup/shutdown ordering:
+* Recommend users to delay starting their apps by using a script to wait for
+the service mesh to be ready.
 * Blackhole all the traffic until service mesh container is up (usually using
 an initContainer for this)
 * Some trickery (sleep preStop hooks or some alternative) to not be killed
@@ -792,28 +783,26 @@ ordering:
 
 This means that if the app container is started before the service mesh is
 started and ready, all traffic will be blackholed and the app needs to retry.
-Once the service mesh container is ready, traffic will be allowed.
+Once the service mesh container is ready, traffic will be allowed. A similar
+problem happens for shutdown: if the service mesh container is killed first, the
+network is down for the rest of the containers in the pod.
 
 This has another major disadvantage: several apps crash if traffic is blackholed
 during startup (common in some rails middleware, for example) and have to resort
 to some kind of workaround, like [this one][linkerd-wait] to wait. This makes
 also service mesh miss their goal of augmenting containers functionality without
 modifying the main application.
 
-Istio has an alternative to the initContainer hack. Istio [has an
-option][istio-cni-opt] to integrate with CNI and inject the blackhole from there
-instead of using the initContainer. In that case, it will do (just c&p from the
-link, in case it breaks in the future):
-
-> By default Istio injects an initContainer, istio-init, in pods deployed in the mesh. The istio-init container sets up the pod network traffic redirection to/from the Istio sidecar proxy. This requires the user or service-account deploying pods to the mesh to have sufficient Kubernetes RBAC permissions to deploy containers with the NET_ADMIN and NET_RAW capabilities. Requiring Istio users to have elevated Kubernetes RBAC permissions is problematic for some organizations’ security compliance
-> ...
-> The Istio CNI plugin performs the Istio mesh pod traffic redirection in the Kubernetes pod lifecycle’s network setup phase, thereby removing the requirement for the NET_ADMIN and NET_RAW capabilities for users deploying pods into the Istio mesh. The Istio CNI plugin replaces the functionality provided by the istio-init container.
+This KEP addresses these 3 problems just listed when initContainer are not used
+by the application. If initContainers are used, the first and the last problem
+are solved only. In other words, traffic might still be blackholed for
+initContainers that run after the service mesh iptables rules are inserted.
 
-In other words, Istio has an alternative to configure the traffic blockhole
-without an initContainer. But the other problems and hacks mentioned remain,
-though.
+Such rules are usually inserted as an initContainer (trying to run last, to
+avoid blackholing traffic to other initContainers) or alternatively, in the case
+of Istio, using a [CNI plugin][istio-cni-opt]. When using the CNI plugin all
+traffic from initContainers will be blackholed.
 
-[linkerd-last-container]: https://github.com/linkerd/linkerd2/issues/4758#issuecomment-658457737
 [istio-cni-opt]: https://istio.io/latest/docs/setup/additional-setup/cni/
 [linkerd-wait]: https://github.com/olix0r/linkerd-await
 
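The first workaround the changed text lists, delaying the app with a script that waits for the service mesh to be ready, can be sketched roughly as below. This is only an illustration under assumptions, not how linkerd-await or any particular mesh implements it: the helper names (`http_ready`, `wait_until_ready`) and any readiness URL passed in are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ready(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200.

    `url` stands for a hypothetical readiness endpoint exposed by the
    service mesh proxy; the real endpoint depends on the mesh in use.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: not ready yet.
        return False


def wait_until_ready(
    check: Callable[[], bool],
    attempts: int = 30,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` up to `attempts` times, sleeping `interval` seconds
    between failed tries. Return True as soon as `check` succeeds, or
    False if every attempt fails."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

A wrapper entrypoint would call something like `wait_until_ready(lambda: http_ready("http://127.0.0.1:9999/ready"))` (a made-up address) and only then start the real application. Needing such a wrapper inside the app container is exactly the coupling the KEP text describes as defeating the goal of not modifying the application.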