Strimzi makes repeatable LIST requests causing instability of Kubernetes Control Plane #7794

marseel · 2022-12-13T12:25:01Z

Describe the bug
strimzi-cluster-operator/0.31.1 does not use the Kubernetes API Server cache when listing resources. All LIST calls go directly to etcd, which puts significant load on etcd and destabilizes the Kubernetes Control Plane.

Example logs from Kubernetes API Server:

"HTTP" verb="LIST" URI="/api/v1/namespaces/<PII>/secrets?labelSelector=..." latency="1.492406837s" userAgent="strimzi-cluster-operator/0.31.1" <PII>  apf_pl="workload-low" apf_fs="service-accounts" resp=200

Similarly, Strimzi is also listing configmaps/pods/persistentvolumeclaims/services/...

Short term mitigation:
For each LIST/GET request, set resourceVersion=0 so that it can be served from the Kubernetes API Server cache without any interaction with etcd.
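To make the mitigation concrete, here is a minimal sketch of what such a request path would look like. It only builds the URL and does not call any cluster; the helper name `cached_list_url` is made up for illustration and is not part of Strimzi or Fabric8. Note that a response for resourceVersion=0 may be somewhat stale, since it comes from the cache.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def cached_list_url(namespace: str, resource: str, label_selector: str) -> str:
    """Build a namespaced core/v1 LIST path that asks for resourceVersion=0.

    Hypothetical helper for illustration only.
    """
    query = urlencode({
        "labelSelector": label_selector,
        # "0" tells the API server that any version is acceptable,
        # so the answer can come from its cache instead of etcd.
        "resourceVersion": "0",
    })
    return f"/api/v1/namespaces/{namespace}/{resource}?{query}"


url = cached_list_url(
    "namespace-name",
    "pods",
    "strimzi.io/cluster=cluster-name,strimzi.io/kind=Kafka",
)

# Decode the query string again to check what the server would receive.
params = parse_qs(urlsplit(url).query)
print(urlsplit(url).path)
print(params)
```

The label selector is percent-encoded by `urlencode`, which is why the sketch decodes it again before printing.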

Long term solution:
Migrate to the List and Watch pattern.
Relevant documentation: https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes
https://cloud.google.com/kubernetes-engine/docs/concepts/planning-scalability#use_list_and_watch_pattern_instead_of_periodic_listing
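
As a rough illustration of the pattern (not of how Strimzi or Fabric8 implement it): the client does one LIST, remembers the returned resourceVersion, and afterwards only applies the ADDED/MODIFIED/DELETED events from a watch to a local cache. The sketch below simulates this in memory; the `LocalCache` class, the pod names, and the resource versions are all made up.

```python
from typing import Dict, Iterable, Tuple

# (event type, object name, object, resourceVersion) - simplified event shape
Event = Tuple[str, str, dict, str]


class LocalCache:
    """Local copy of a resource collection, kept current by watch events."""

    def __init__(self) -> None:
        self.objects: Dict[str, dict] = {}
        self.resource_version = "0"

    def replace(self, items: Dict[str, dict], resource_version: str) -> None:
        # Result of the single initial LIST call.
        self.objects = dict(items)
        self.resource_version = resource_version

    def apply(self, events: Iterable[Event]) -> None:
        # Each watch event carries only the object that changed,
        # so no further full LIST is needed to stay up to date.
        for kind, name, obj, rv in events:
            if kind == "DELETED":
                self.objects.pop(name, None)
            else:  # ADDED or MODIFIED
                self.objects[name] = obj
            self.resource_version = rv


cache = LocalCache()
cache.replace({"kafka-0": {"phase": "Running"}}, resource_version="100")
cache.apply([
    ("ADDED", "kafka-1", {"phase": "Pending"}, "101"),
    ("MODIFIED", "kafka-1", {"phase": "Running"}, "102"),
    ("DELETED", "kafka-0", {}, "103"),
])
print(cache.objects, cache.resource_version)
```

Reads that would otherwise be repeated LIST calls can then be answered from `cache.objects`.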

Expected behavior
By default, Strimzi should use the Kubernetes API Server cache, and ideally the List and Watch pattern instead of repeated LIST calls.

Environment (please complete the following information):

Strimzi version: 0.31.1
Installation method: N/A
Kubernetes cluster: 1.23
Infrastructure: GKE

Additional context
Related issue that I've opened in Fabric8: fabric8io/kubernetes-client#4670

marseel (Author) · 2022-12-13T13:13:46Z

More precisely, LIST requests for pods, for example, have this format:

/api/v1/namespaces/namespace-name/pods?labelSelector=strimzi.io/cluster=cluster-name,strimzi.io/name=some-name,strimzi.io/kind=Kafka

Requests of this type make a full LIST call to etcd, and the Kubernetes API Server then does the filtering. I've observed 10 QPS, which is quite significant for such expensive calls (not including LISTs for other resources such as secrets), especially when there are tens of thousands of pods in the cluster.

scholzj (Maintainer) · 2022-12-13T14:02:35Z

I think you should probably start by providing more details about how you use Strimzi, along with proper logs showing what is happening, where, and how. That might allow us to locate the exact area and see whether something can be done about it.

1 reply

nasabah Jan 9, 2023

Hi @scholzj and @marseel,
Sorry for the long delay on our side, but I think I can help with the details of the Strimzi usage in question.

We're running a multitenant logging infrastructure. Each tenant has its own infrastructure consisting of one Kafka cluster (most of them with 3 nodes) and other components. We currently have around 400 tenants, and therefore the same number of Kafka clusters.

As for the logs, I don't think we have them for the period when the incident happened, but let me know if there's anything else I can provide to help with debugging. Thank you.

marseel (Author) · 2022-12-13T14:15:43Z

I will reach out to the customer to see if they can provide that information. I was investigating this from the Kubernetes Control Plane point of view, and unfortunately I cannot say how Strimzi was used or deployed.

2 replies

scholzj Dec 13, 2022
Maintainer

Ok, that would be great.

I quickly looked through the code. There are certainly areas where we use LIST calls. Those done directly by Strimzi are easy to identify. There will be some done by the informers, but those are controlled by the informers (I do not know the exact implementation details of informers in Fabric8 - but I assume these would happen once at startup and possibly when something goes wrong with the watch). Setting the resourceVersion for them might be relatively easy. We would need to check how it works in practice - the contract for resourceVersion="0" does not make many guarantees - but I guess it should work fine in real life.

But I do not see anything in the code that would flood your server with LIST requests. So for that I would definitely need more information. Maybe the user runs a really huge number of operands, or there is some other issue causing some kind of misbehavior.

shawkins Dec 14, 2022

fabric8io/kubernetes-client#4676 will address using a 0 resourceVersion initially for the informers.

> but I guess it should work fine in real life

The PR adds a note to the changelog about the possibility of the informer cache initially being less fresh than it would have been. Since this matches the behavior of the Go client, I would agree that it should be fine in practical terms.

> but I assume these would happen once at startup and possibly when something goes wrong with the watch

That is correct - it takes a watch-version-related failure, something like a resource too old or too large, for the relisting to occur. Anything else will just restart the watch at the last resource version seen.
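
A simplified sketch of that decision, for readers less familiar with informers. This is not the Fabric8 code; the function and type names are invented, and it models only the "too old" case, which the Kubernetes API reports as HTTP 410 Gone. The "too large" case mentioned above is left out.

```python
from dataclasses import dataclass
from typing import Optional

# Kubernetes answers a watch with 410 Gone when the requested
# resourceVersion is too old to be served any more.
HTTP_GONE = 410


@dataclass
class NextStep:
    action: str                      # "relist" or "rewatch"
    resource_version: Optional[str]  # version to resume from, if any


def after_watch_failure(status_code: Optional[int], last_seen: str) -> NextStep:
    """Decide how to recover after a watch ends (hypothetical helper)."""
    if status_code == HTTP_GONE:
        # The last seen version can no longer be used: a full LIST is needed.
        return NextStep("relist", None)
    # Timeouts, dropped connections, etc.: resume where the watch stopped.
    return NextStep("rewatch", last_seen)


print(after_watch_failure(HTTP_GONE, "500"))
print(after_watch_failure(None, "500"))
```

Under this model a relist is the rare path, which matches the expectation that informers alone should not produce a steady stream of LIST calls.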
