---
layout: blog
title: 'Kubernetes v1.31: Accelerating Cluster Performance with Consistent Reads from Cache'
date: 2024-08-15
slug: consistent-read-from-cache-beta
author: >
  Marek Siarkowicz (Google)
---

Kubernetes is renowned for its robust orchestration of containerized applications,
but as clusters grow, the demands on the control plane can become a bottleneck.
A key challenge has been ensuring strongly consistent reads from the etcd datastore,
requiring resource-intensive quorum reads.

Today, the Kubernetes community is excited to announce a major improvement:
_consistent reads from cache_, graduating to Beta in Kubernetes v1.31.

### Why consistent reads matter

Consistent reads are essential for ensuring that Kubernetes components have an accurate view of the latest cluster state.
Guaranteeing consistent reads is crucial for maintaining the accuracy and reliability of Kubernetes operations,
enabling components to make informed decisions based on up-to-date information.
In large-scale clusters, fetching and processing this data can be a performance bottleneck,
especially for requests that involve filtering results.
While Kubernetes can filter data by namespace directly within etcd,
any other filtering by labels or field selectors requires the entire dataset to be fetched from etcd and then filtered in-memory by the Kubernetes API server.
This is particularly impactful for components like the kubelet,
which only needs to list pods scheduled to its node, but previously required the API server and etcd to process all pods in the cluster.
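
To make the cost concrete, here is a hypothetical Python sketch (not actual Kubernetes source; the key layout and object shapes are simplified) of why namespace filtering is cheap while field-selector filtering is not: etcd stores objects under keys like `/registry/pods/<namespace>/<name>`, so a namespace LIST maps to a key-prefix range read, while a selector such as `spec.nodeName` can only be evaluated after fetching every pod.

```python
# Simplified stand-in for the etcd keyspace: key -> object.
etcd = {
    "/registry/pods/default/web-1": {"nodeName": "node-1"},
    "/registry/pods/default/web-2": {"nodeName": "node-2"},
    "/registry/pods/kube-system/dns-1": {"nodeName": "node-1"},
}

def list_by_namespace(namespace):
    # Namespace is part of the key, so etcd itself can narrow the read
    # to a single key-prefix range.
    prefix = f"/registry/pods/{namespace}/"
    return {k: v for k, v in etcd.items() if k.startswith(prefix)}

def list_by_node(node_name):
    # No key structure helps here: the full dataset is fetched from etcd
    # and filtered in memory, which is what the API server previously
    # had to do for every kubelet pod LIST.
    all_pods = dict(etcd)  # simulates reading every pod from etcd
    return {k: v for k, v in all_pods.items() if v["nodeName"] == node_name}
```

Both calls return the same kind of result, but the second one scales with the total number of pods in the cluster rather than the number of matching pods.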

### The breakthrough: Caching with confidence

Kubernetes has long used a watch cache to optimize read operations.
The watch cache stores a snapshot of the cluster state and receives updates through etcd watches.
However, until now, it couldn't serve consistent reads directly, as there was no guarantee the cache was sufficiently up-to-date.

The _consistent reads from cache_ feature addresses this by leveraging etcd's
[progress notifications](https://etcd.io/docs/v3.5/dev-guide/interacting_v3/#watch-progress)
mechanism.
These notifications inform the watch cache about how current its data is compared to etcd.
When a consistent read is requested, the system first checks whether the watch cache is up-to-date.
If the cache is not up-to-date, the system requests progress notifications from etcd until it's confirmed that the cache is sufficiently fresh.
Once ready, the read is served directly from the cache,
which can significantly improve performance,
particularly when the request would otherwise require fetching a large amount of data from etcd.
This enables requests that filter data to be served from the cache,
with only minimal metadata needing to be read from etcd.
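
The flow above can be sketched in a few lines of hypothetical Python (not actual Kubernetes source; all names are illustrative). The key idea is that the cache tracks the etcd revision of the last event it applied, and a consistent read only waits until that revision catches up to etcd's current revision:

```python
class WatchCache:
    """Illustrative watch cache: a snapshot plus the revision it reflects."""

    def __init__(self):
        self.revision = 0   # etcd revision of the latest event applied
        self.objects = {}

    def apply_event(self, revision, key, value):
        # Called for every watch event received from etcd.
        self.objects[key] = value
        self.revision = revision

def consistent_read(cache, etcd_latest_revision, request_progress_notification):
    # 1. Ask etcd for its current revision: this small piece of metadata is
    #    the only thing read from etcd, not the data itself.
    target = etcd_latest_revision()
    # 2. If the cache lags behind, request progress notifications on the
    #    watch stream until the cache is confirmed to have seen everything
    #    up to `target`.
    while cache.revision < target:
        request_progress_notification(cache)
    # 3. The cache now reflects at least revision `target`; serve from it.
    return dict(cache.objects)
```

In the real implementation the wait is event-driven rather than a busy loop, and a fallback to a direct etcd quorum read exists, but the freshness check is the same: compare the cache's revision to etcd's, then serve locally.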

**Important Note:** To benefit from this feature, your Kubernetes cluster must be running etcd version 3.4.31+ or 3.5.13+.
For older etcd versions, Kubernetes will automatically fall back to serving consistent reads directly from etcd.

### Performance gains you'll notice

This seemingly simple change has a profound impact on Kubernetes performance and scalability:

* **Reduced etcd load:** Kubernetes v1.31 can offload work from etcd,
  freeing up resources for other critical operations.
* **Lower latency:** Serving reads from cache is significantly faster than fetching
  and processing data from etcd. This translates to quicker responses for components,
  improving overall cluster responsiveness.
* **Improved scalability:** Large clusters with thousands of nodes and pods will
  see the most significant gains, as the reduction in etcd load allows the
  control plane to handle more requests without sacrificing performance.
63+
64+
**5k Node Scalability Test Results:** In recent scalability tests on 5,000 node
65+
clusters, enabling consistent reads from cache delivered impressive improvements:
66+
67+
* **30% reduction** in kube-apiserver CPU usage
68+
* **25% reduction** in etcd CPU usage
69+
* **Up to 3x reduction** (from 5 seconds to 1.5 seconds) in 99th percentile pod LIST request latency

### What's next?

With the graduation to beta, consistent reads from cache are enabled by default,
offering a seamless performance boost to all Kubernetes users running a supported
etcd version.

Our journey doesn't end here. The Kubernetes community is actively exploring
pagination support in the watch cache, which will unlock even more performance
optimizations in the future.

### Getting started

Upgrading to Kubernetes v1.31 and ensuring you are using etcd version 3.4.31+ or
3.5.13+ is the easiest way to experience the benefits of consistent reads from
cache.
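
As a beta feature it is on by default, so no configuration is required. If you need to disable it for troubleshooting, the behavior is controlled by a kube-apiserver feature gate (gate name per KEP-2340; verify against the feature-gate reference for your release):

```shell
# Assumed gate name: ConsistentListFromCache (beta in v1.31, default on).
# Disabling it restores direct quorum reads from etcd for consistent LISTs.
kube-apiserver --feature-gates=ConsistentListFromCache=false
```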
If you have any questions or feedback, don't hesitate to reach out to the Kubernetes community.

**Let us know how** _consistent reads from cache_ **transforms your Kubernetes experience!**

Special thanks to @ah8ad3 and @p0lyn0mial for their contributions to this feature!
