Commit 45b7604

KEP-4743: Kubernetes-etcd interface
1 parent 6f64800 commit 45b7604

2 files changed

+387
-0
lines changed

Lines changed: 372 additions & 0 deletions
@@ -0,0 +1,372 @@
1+
# KEP-4743: The Kubernetes-etcd interface
2+
<!-- toc -->
3+
- [Summary](#summary)
4+
- [Goals](#goals)
5+
- [Non-Goals](#non-goals)
6+
- [Proposal](#proposal)
7+
- [User Stories](#user-stories)
8+
- [Kubernetes Backporting an etcd client patch version](#kubernetes-backporting-an-etcd-client-patch-version)
9+
- [Making Changes/Patching Bugs in the Interface](#making-changespatching-bugs-in-the-interface)
10+
- [Kubernetes Leveraging New etcd Functionality](#kubernetes-leveraging-new-etcd-functionality)
11+
- [Code location](#code-location)
12+
- [The Interface](#the-interface)
13+
- [KV interface](#kv-interface)
14+
- [Design considerations](#design-considerations)
15+
- [Watch interface](#watch-interface)
16+
- [Design considerations](#design-considerations-1)
17+
- [Alternatives](#alternatives)
18+
- [Code location](#code-location-1)
19+
- [Part of the etcd Client Struct](#part-of-the-etcd-client-struct)
20+
- [New Package in etcd Repository](#new-package-in-etcd-repository)
21+
- [New Repository under etcd-io](#new-repository-under-etcd-io)
22+
<!-- /toc -->
23+
24+
## Summary
25+
26+
This design proposal introduces an etcd-Kubernetes interface to be added to the
27+
etcd client and adopted by Kubernetes. This interface aims to create a clear and
28+
standardized contract between the two projects, codifying the interactions
29+
outlined in the [Implicit Kubernetes-ETCD Contract]. By formalizing this contract,
30+
we will improve the testability of both Kubernetes and etcd, prevent common
31+
errors in their interaction, and establish a framework for the future evolution
32+
of this critical contract.
33+
34+
[Implicit Kubernetes-ETCD Contract]: https://docs.google.com/document/d/1NUZDiJeiIH5vo_FMaTWf0JtrQKCx0kpEaIIuPoj9P6A/edit#heading=h.tlkin1a8b8bl
35+
36+
### Goals
37+
38+
* **Improved Testability:** Enable thorough testing of etcd and Kubernetes
39+
interactions through a well-defined interface, as envisioned in [#15820].
40+
* **Error Prevention:** Reduce incorrect contract usage, addressing issues like Kubernetes [#110210].
41+
* **Reviewable Changes:** Make contract modifications easily reviewable and
42+
trackable, ensuring a transparent and collaborative evolution.
43+
* **Backward Compatibility:** Ensure the interface remains compatible with all
44+
etcd versions supported by Kubernetes at the time of a Kubernetes release.
45+
46+
[#15820]: https://github.com/etcd-io/etcd/issues/15820
47+
[#110210]: https://github.com/kubernetes/kubernetes/issues/110210
48+
49+
In scope
50+
* [etcd3 store]: The primary Kubernetes object storage interface.
51+
* [Master leases]: Lease management for Kubernetes control plane components (utilizing the [etcd3 store])
52+
53+
[etcd3 store]: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/etcd3/store.go
54+
[Master leases]: https://github.com/kubernetes/kubernetes/blob/dae1859c896d742de1ee60a349475f8e28b61995/pkg/controlplane/reconcilers/lease.go#L47-L66
55+
56+
### Non-Goals
57+
* **Alternative Interface Implementations:** This KEP focuses solely on defining
58+
the interface for the existing etcd backend and ensuring its compatibility
59+
with Kubernetes. It does not encompass the development or support of
60+
alternative storage backends or implementations for the interface.
61+
* **Non-Storage Usage of etcd Client:**
62+
* [Kubeadm] - Primarily used for etcd cluster administration, not Kubernetes object storage.
63+
* [Compaction] - Kubernetes aims to encourage native etcd compaction. See [#80513].
64+
* [Monitor] - We see no benefit in standardizing the etcd metrics used by Kubernetes, at least for now.
65+
* [Prober], [Feature checker] - These have planned migrations to native etcd features. See [Design Doc: etcd livez and readyz probes] and [KEP-4647].
66+
* [Lease manager] - Planned removal in favor of one lease per key to address [#110210].
67+
68+
[Kubeadm]: https://github.com/kubernetes/kubernetes/blob/6ba9fa89fb5889550649bfde847c742a55d3d29c/cmd/kubeadm/app/util/etcd/etcd.go#L66-L93
69+
[Compaction]: https://github.com/kubernetes/kubernetes/blob/6ba9fa89fb5889550649bfde847c742a55d3d29c/staging/src/k8s.io/apiserver/pkg/storage/etcd3/compact.go#L133-L162
70+
[#80513]: https://github.com/kubernetes/kubernetes/issues/80513
71+
[Monitor]: https://github.com/kubernetes/kubernetes/blob/6ba9fa89fb5889550649bfde847c742a55d3d29c/staging/src/k8s.io/apiserver/pkg/storage/storagebackend/factory/etcd3.go#L270-L283
72+
[Prober]: https://github.com/kubernetes/kubernetes/blob/6ba9fa89fb5889550649bfde847c742a55d3d29c/staging/src/k8s.io/apiserver/pkg/storage/storagebackend/factory/etcd3.go#L256-L268
73+
[Feature checker]: https://github.com/kubernetes/kubernetes/blob/6ba9fa89fb5889550649bfde847c742a55d3d29c/staging/src/k8s.io/apiserver/pkg/storage/feature/feature_support_checker.go#L143-L151
74+
[Design Doc: etcd livez and readyz probes]: https://docs.google.com/document/d/1SkzmO4RT_GI9YhT0dw4a6nEwKVCciwrwbCDxK0D7ASM/edit?usp=sharing
75+
[KEP-4647]: https://github.com/kubernetes/enhancements/pull/4662
76+
[Lease manager]: https://github.com/kubernetes/kubernetes/blob/6ba9fa89fb5889550649bfde847c742a55d3d29c/staging/src/k8s.io/apiserver/pkg/storage/etcd3/lease_manager.go#L90-L120
77+
[#110210]: https://github.com/kubernetes/kubernetes/issues/110210
78+
79+
## Proposal
80+
81+
This KEP proposes creating an etcd-Kubernetes code interface owned and
82+
maintained by SIG-etcd. The interface will serve as a formalization of the
83+
existing etcd-Kubernetes contract, ensuring the correct usage of etcd within
84+
Kubernetes and enabling improved testing and validation.
85+
86+
The interface will prioritize etcd's existing capabilities and behaviors,
87+
focusing on compatibility with the current etcd API. It will not introduce
88+
features or behaviors not supported by etcd, adhering to the existing
89+
SIG API Machinery policy outlined in [Storage for Extension API Servers].
90+
This policy designates etcd as the sole supported storage backend for Kubernetes
91+
for the foreseeable future.
92+
93+
[Storage for Extension API Servers]: https://docs.google.com/document/d/1i0xzRFB-uGLmLYueLMBTpHrOot9ScFxpkkcVcZHVbyA/edit?usp=sharing
94+
95+
## User Stories
96+
97+
To better understand why the code location matters, consider the following use cases:
98+
99+
### Kubernetes Backporting an etcd client patch version
100+
101+
**The Journey:** Kubernetes regularly updates to newer etcd versions to leverage
102+
bug fixes or security patches. However, ensuring compatibility between the
103+
codified etcd-Kubernetes interface and etcd client is essential.
104+
105+
**Considerations:**
106+
107+
* Even minor etcd client updates might inadvertently introduce changes that
108+
break the interface's assumptions or functionality.
109+
* Tight coupling to the etcd client could necessitate backporting the
110+
interface to older etcd branches, a complex and time-consuming process.
111+
112+
113+
### Making Changes/Patching Bugs in the Interface
114+
115+
**The Journey:** Despite careful design, the complex etcd-Kubernetes contract
116+
might reveal bugs or require adjustments.
117+
118+
**Considerations:**
119+
120+
* Changes and bug fixes need to be implemented and released with minimal
121+
disruption to both Kubernetes and etcd users.
122+
* A tightly coupled interface might require a full etcd
123+
client release for bug fixes, slowing down the process.
124+
125+
### Kubernetes Leveraging New etcd Functionality
126+
127+
**The Journey:** Kubernetes wants to expand its use of the etcd API beyond the
128+
current interface's scope.
129+
130+
**Considerations:** The interface codifies a minimal subset of the etcd API
131+
currently used by Kubernetes, and new features will initially be outside its
132+
scope. Balancing new feature adoption with interface stability is crucial.
133+
134+
**Mitigation:**
135+
136+
* Allow Kubernetes to directly use new etcd client features during alpha/beta
137+
stages, bypassing the interface temporarily.
138+
* Extend the etcd robustness tests to cover the new functionality before
138+
formalizing it in the interface.
140+
* Once a feature is mature and stable, extend the interface, ensuring backward
141+
compatibility for existing Kubernetes versions.
142+
143+
## Code location
144+
145+
We propose locating the interface in a `kubernetes` subdirectory under
146+
https://github.com/etcd-io/etcd/tree/main/client/v3.
147+
This approach allows for seamless integration with the etcd client while
148+
maintaining a dedicated space for the interface code.
149+
150+
The interface will be part of the etcd client package, and its release will be
151+
combined with the etcd release. For immediate Kubernetes use, etcd will backport the
152+
client to the `release-3.5` branch and introduce it in the next etcd patch release for
153+
Kubernetes to consume.
154+
155+
Alternative code locations are discussed at the end of the document.
156+
157+
## The Interface
158+
159+
To ensure a smoother transition, we propose adopting the etcd-Kubernetes interface in two stages:
160+
161+
1. **KV Interface:** Covering the basic get, list, count, put, and delete operations.
162+
2. **Watch Interface:** Covering the Watch operation and requesting progress notifications for it.
163+
164+
165+
### KV interface
166+
167+
For the reasoning please see the section below.
168+
169+
```
// Interface defines the minimal client-side interface that Kubernetes requires
// to interact with etcd. Methods below are standard etcd operations with
// semantics adjusted to better suit Kubernetes' needs.
type Interface interface {
	// Get retrieves a single key-value pair from etcd.
	//
	// If opts.Revision is set to a non-zero value, the key-value pair is retrieved at the specified revision.
	// If the required revision has been compacted, the request will fail with ErrCompacted.
	Get(ctx context.Context, key string, opts GetOptions) (GetResponse, error)

	// List retrieves key-value pairs with the specified prefix.
	//
	// If opts.Revision is non-zero, the key-value pairs are retrieved at the specified revision.
	// If the required revision has been compacted, the request will fail with ErrCompacted.
	// If opts.Limit is greater than zero, the number of returned key-value pairs is bounded by the limit.
	// If opts.Continue is not empty, the listing will start from the key immediately after the one specified by Continue.
	List(ctx context.Context, prefix string, opts ListOptions) (ListResponse, error)

	// Count returns the number of keys with the specified prefix.
	Count(ctx context.Context, prefix string) (int64, error)

	// OptimisticPut creates the key-value pair if expectedRevision is 0 and the key
	// does not exist. Otherwise, it updates the key-value pair only if it hasn't been
	// modified since expectedRevision.
	//
	// If opts.GetOnFailure is true, the modified key-value pair will be returned if the put operation fails due to a revision mismatch.
	// If opts.LeaseID is provided, it overrides the lease associated with the key. If not provided, the existing lease is cleared.
	OptimisticPut(ctx context.Context, key string, value []byte, expectedRevision int64, opts PutOptions) (PutResponse, error)

	// OptimisticDelete deletes the key-value pair if it hasn't been modified since the revision
	// specified in expectedRevision.
	//
	// If opts.GetOnFailure is true, the modified key-value pair will be returned if the delete operation fails due to a revision mismatch.
	OptimisticDelete(ctx context.Context, key string, expectedRevision int64, opts DeleteOptions) (DeleteResponse, error)
}

type GetOptions struct {
	Revision int64
}

type ListOptions struct {
	Revision int64
	Limit    int64
	Continue string
}

type PutOptions struct {
	GetOnFailure bool
	// LeaseID
	// Deprecated: Should be replaced with TTL when Interface starts using one lease per object.
	LeaseID clientv3.LeaseID
}

type DeleteOptions struct {
	GetOnFailure bool
}

type GetResponse struct {
	KV       *mvccpb.KeyValue
	Revision int64
}

type ListResponse struct {
	KVs      []*mvccpb.KeyValue
	Count    int64
	Revision int64
}

type PutResponse struct {
	KV        *mvccpb.KeyValue
	Succeeded bool
	Revision  int64
}

type DeleteResponse struct {
	KV        *mvccpb.KeyValue
	Succeeded bool
	Revision  int64
}
```
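The revision rules in the `Get` comments can be illustrated with a small in-memory sketch. Everything here is hypothetical and not part of the proposal: `history`, `version`, and `errCompacted` are illustrative names, the store holds a single key, and the assumption that a read at exactly the compacted revision still succeeds is made for this sketch only.

```go
package main

import (
	"errors"
	"fmt"
)

// errCompacted is a local stand-in for the ErrCompacted error named in the
// interface comments; the real error value lives in etcd.
var errCompacted = errors.New("required revision has been compacted")

// version is one historical value of the key.
type version struct {
	revision int64
	value    string
}

// history is a hypothetical single-key store that remembers old versions.
type history struct {
	current   int64     // latest revision
	compacted int64     // revisions below this are no longer readable (assumption)
	versions  []version // ascending by revision
}

// put records a new value at the next revision.
func (h *history) put(value string) {
	h.current++
	h.versions = append(h.versions, version{h.current, value})
}

// compact marks every revision older than rev as unreadable.
func (h *history) compact(rev int64) { h.compacted = rev }

// get returns the value visible at the requested revision; 0 means latest.
func (h *history) get(revision int64) (string, error) {
	if revision == 0 {
		revision = h.current
	}
	if revision < h.compacted {
		return "", errCompacted
	}
	value := ""
	for _, v := range h.versions {
		if v.revision <= revision {
			value = v.value
		}
	}
	return value, nil
}

func main() {
	h := &history{}
	h.put("v1")
	h.put("v2")
	h.put("v3")
	fmt.Println(h.get(2)) // read at an older, still available revision
	h.compact(3)
	fmt.Println(h.get(2)) // the same read now fails with errCompacted
	fmt.Println(h.get(0)) // revision 0 reads the latest value
}
```

A caller that receives the compaction error generally has to restart its read from a newer revision, which is why the interface surfaces it explicitly.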
252+
253+
### Design considerations
254+
255+
**How should arguments be passed?** Proposed: Options struct.
256+
257+
* It’s more extensible than a hardcoded list of arguments, allowing adding more fields in future.
258+
* It’s more readable than the variadic options list when arguments are optional.
259+
See the server code that manages [list limit options] for an example.
260+
* The same arguments apply to the response structs.
261+
262+
[list limit options]: https://github.com/kubernetes/kubernetes/blob/97e87e2c40e5b83399a44738d38653fd59c58e99/staging/src/k8s.io/apiserver/pkg/storage/etcd3/store.go#L640-L645
263+
264+
**Prefer Range vs List semantics?** Proposed: List
265+
266+
* List matches the intention of the Kubernetes behavior
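As a rough sketch of the List semantics described in the interface comments (prefix match, optional `Limit`, and `Continue` meaning "start strictly after this key"), the following toy implementation pages through an in-memory map. The `list` function, the plain string keys, and the interpretation of `Count` as the number of matching keys from the continuation point onward are assumptions of this sketch, not part of the proposal.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ListOptions mirrors the Limit and Continue fields from the proposal.
// Revision is omitted because this toy store keeps no history.
type ListOptions struct {
	Limit    int64
	Continue string
}

// ListResponse uses plain string keys instead of *mvccpb.KeyValue to keep
// the sketch free of dependencies.
type ListResponse struct {
	Keys  []string
	Count int64 // matching keys after Continue, ignoring Limit (assumption)
}

// list returns keys with the given prefix in lexical order, starting strictly
// after opts.Continue (when set) and returning at most opts.Limit keys
// (when Limit is greater than zero).
func list(store map[string][]byte, prefix string, opts ListOptions) ListResponse {
	var keys []string
	for k := range store {
		if strings.HasPrefix(k, prefix) && (opts.Continue == "" || k > opts.Continue) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	resp := ListResponse{Count: int64(len(keys))}
	if opts.Limit > 0 && int64(len(keys)) > opts.Limit {
		keys = keys[:opts.Limit]
	}
	resp.Keys = keys
	return resp
}

func main() {
	store := map[string][]byte{
		"/registry/pods/a":   nil,
		"/registry/pods/b":   nil,
		"/registry/pods/c":   nil,
		"/registry/nodes/n1": nil,
	}
	page := list(store, "/registry/pods/", ListOptions{Limit: 2})
	fmt.Println(page.Keys, page.Count)

	// Continue from the last key of the previous page.
	last := page.Keys[len(page.Keys)-1]
	next := list(store, "/registry/pods/", ListOptions{Limit: 2, Continue: last})
	fmt.Println(next.Keys, next.Count)
}
```

The caller only thinks in terms of a prefix and a continuation key; translating those into range boundaries is left to the implementation.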
267+
268+
**Combine Create and Update?** Proposed: Combine them into Put
269+
270+
* They are the same from an argument standpoint. Create is an Update with ExpectedRevision set to 0.
271+
* The difference in behavior on failure can be handled by the optional argument `GetOnFailure`.
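The idea that Create is an Update with an expected revision of 0 can be sketched with a hypothetical in-memory store. `fakeStore` and its methods are illustrative names only, and the simplified return values stand in for the proposed `PutResponse` and `DeleteResponse`. A real implementation would presumably express the same check as an etcd transaction that compares the key's mod revision.

```go
package main

import "fmt"

// entry is a stored value plus the revision of its last modification.
type entry struct {
	value       []byte
	modRevision int64
}

// fakeStore is a hypothetical in-memory stand-in for etcd, used only to
// illustrate the compare-and-swap semantics; it is not part of the proposal.
type fakeStore struct {
	revision int64 // store-wide revision, bumped on every successful write
	data     map[string]entry
}

func newFakeStore() *fakeStore { return &fakeStore{data: map[string]entry{}} }

// optimisticPut writes value only if the key's mod revision equals
// expectedRevision. A missing key has mod revision 0, so an
// expectedRevision of 0 means "create only if absent".
func (s *fakeStore) optimisticPut(key string, value []byte, expectedRevision int64) (bool, int64) {
	if s.data[key].modRevision != expectedRevision {
		return false, s.revision
	}
	s.revision++
	s.data[key] = entry{value: value, modRevision: s.revision}
	return true, s.revision
}

// optimisticDelete removes the key only if it exists and is unchanged
// since expectedRevision.
func (s *fakeStore) optimisticDelete(key string, expectedRevision int64) (bool, int64) {
	e, ok := s.data[key]
	if !ok || e.modRevision != expectedRevision {
		return false, s.revision
	}
	s.revision++
	delete(s.data, key)
	return true, s.revision
}

func main() {
	s := newFakeStore()
	fmt.Println(s.optimisticPut("pod", []byte("v1"), 0)) // create
	fmt.Println(s.optimisticPut("pod", []byte("v2"), 0)) // create again: rejected
	fmt.Println(s.optimisticPut("pod", []byte("v2"), 1)) // update at the current revision
	fmt.Println(s.optimisticDelete("pod", 1))            // stale revision: rejected
	fmt.Println(s.optimisticDelete("pod", 2))            // delete at the current revision
}
```

Because both operations reduce to the same revision comparison, a single `OptimisticPut` covers create and update without a separate code path.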
272+
273+
274+
### Watch interface
275+
276+
For the reasoning please see the section below.
277+
```
type Kubernetes interface {
	Watch(ctx context.Context, key string, opts WatchOptions) KubernetesWatchChan
	RequestProgress(ctx context.Context, opts RequestProgressOptions) error
}

type WatchOptions struct {
	StreamKey string
	Revision  int64
	Prefix    bool
}

type RequestProgressOptions struct {
	StreamKey string
}

type KubernetesWatchChan <-chan KubernetesWatchEvent

type KubernetesEventType string

const (
	Added    KubernetesEventType = "ADDED"
	Modified KubernetesEventType = "MODIFIED"
	Deleted  KubernetesEventType = "DELETED"
	Bookmark KubernetesEventType = "BOOKMARK"
	Error    KubernetesEventType = "ERROR"
)

type KubernetesWatchEvent struct {
	Type KubernetesEventType

	Error         error
	Revision      int64
	Key           string
	Value         []byte
	PreviousValue []byte
}
```
316+
317+
### Design considerations
318+
319+
**What control does the user have over requesting progress?** Proposed: Allow the user to set streamKey when creating a watch and when requesting progress.
320+
321+
* StreamKey is used to separate watch grpc streams.
322+
For Kubernetes we always use one stream as we don’t change grpc metadata between requests (e.g. WithRequireLeader).
323+
Currently the etcd client doesn’t expose streamKey to the user; it calculates it from the grpc metadata taken from the context.
324+
* Having access to streamKey is useful as progress notifications cannot be requested on a per watch basis, only for the whole stream.
325+
This isn’t a big problem in the current setup, as Kubernetes opens only one watch per resource. However, this would become a scalability issue for CRDs.
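The fan-out concern can be pictured with a toy registry that groups watches by stream key. This is only a sketch: `streamRegistry` and its methods are hypothetical, and the real client tracks gRPC streams rather than resource names.

```go
package main

import "fmt"

// streamRegistry tracks which watches share a stream, keyed by stream key.
// It is a hypothetical model, not the etcd client's internal structure.
type streamRegistry map[string][]string

// watch registers a watch for resource on the stream identified by streamKey.
func (r streamRegistry) watch(streamKey, resource string) {
	r[streamKey] = append(r[streamKey], resource)
}

// requestProgress returns every watch that would receive a progress
// notification for streamKey: always the whole stream, never a single watch.
func (r streamRegistry) requestProgress(streamKey string) []string {
	return r[streamKey]
}

func main() {
	// All watches on one stream: a progress request reaches every one of them.
	shared := streamRegistry{}
	for _, res := range []string{"pods", "nodes", "crd-a", "crd-b"} {
		shared.watch("", res)
	}
	fmt.Println(len(shared.requestProgress("")))

	// With caller-chosen stream keys, the request reaches only the chosen stream.
	split := streamRegistry{}
	split.watch("core", "pods")
	split.watch("core", "nodes")
	split.watch("crd-a", "crd-a")
	fmt.Println(split.requestProgress("crd-a"))
}
```

Exposing the stream key lets the caller decide how widely a progress request fans out, instead of inheriting whatever grouping the client derives from the context.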
326+
327+
**Should Kubernetes explicitly pass WithRequireLeader or make it default?** Proposed: Make it the default when the Kubernetes interface is used.
328+
329+
**Should we wrap the watch response?** Proposed: Yes. It allows us to codify the Kubernetes dependencies on a single revision per transaction and on PrevKV.
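One way to picture the wrapping is a conversion from a raw event into the proposed `KubernetesWatchEvent`. In this sketch, `rawEvent`, `convert`, and `errMissingPrevKV` are hypothetical names, and the raw event is a simplified stand-in rather than the etcd client's event type. A put whose create revision equals its mod revision is treated as a creation. Only the PrevKV dependency is modeled; bookmarks and the single-revision-per-transaction check are left out.

```go
package main

import (
	"errors"
	"fmt"
)

type KubernetesEventType string

const (
	Added    KubernetesEventType = "ADDED"
	Modified KubernetesEventType = "MODIFIED"
	Deleted  KubernetesEventType = "DELETED"
	Error    KubernetesEventType = "ERROR"
)

type KubernetesWatchEvent struct {
	Type          KubernetesEventType
	Error         error
	Revision      int64
	Key           string
	Value         []byte
	PreviousValue []byte
}

// rawEvent is a simplified, hypothetical stand-in for an etcd watch event.
type rawEvent struct {
	isDelete       bool
	key            string
	value          []byte
	createRevision int64
	modRevision    int64
	prevValue      []byte
	hasPrev        bool // whether the previous key-value was delivered
}

var errMissingPrevKV = errors.New("watch event is missing the previous key-value")

// convert maps a raw event to the wrapped form and turns a missing previous
// value on an update or delete into an explicit error event.
func convert(e rawEvent) KubernetesWatchEvent {
	out := KubernetesWatchEvent{Revision: e.modRevision, Key: e.key}
	isCreate := !e.isDelete && e.createRevision == e.modRevision
	if !isCreate && !e.hasPrev {
		out.Type, out.Error = Error, errMissingPrevKV
		return out
	}
	switch {
	case e.isDelete:
		out.Type, out.PreviousValue = Deleted, e.prevValue
	case isCreate:
		out.Type, out.Value = Added, e.value
	default:
		out.Type, out.Value, out.PreviousValue = Modified, e.value, e.prevValue
	}
	return out
}

func main() {
	events := []rawEvent{
		{key: "/registry/pods/a", value: []byte("v1"), createRevision: 5, modRevision: 5},
		{key: "/registry/pods/a", value: []byte("v2"), createRevision: 5, modRevision: 6, prevValue: []byte("v1"), hasPrev: true},
		{isDelete: true, key: "/registry/pods/a", modRevision: 7, prevValue: []byte("v2"), hasPrev: true},
		{isDelete: true, key: "/registry/pods/b", modRevision: 8}, // previous value missing
	}
	for _, e := range events {
		ev := convert(e)
		fmt.Println(ev.Type, ev.Revision, ev.Key, ev.Error)
	}
}
```

With the check living in the wrapper, a missing previous value surfaces as an error event rather than as a silently incomplete update.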
330+
331+
## Alternatives
332+
333+
### Code location
334+
335+
#### Part of the etcd Client Struct
336+
337+
**Pros:**
338+
339+
* **Seamless Integration:** The interface becomes inherently part of the client, fostering intuitive usage.
340+
* **Code Reuse:** Leverage existing private client methods, reducing redundancy.
341+
342+
**Cons:**
343+
344+
* **Tight Coupling:** Changes to the interface necessitate updates to the entire etcd client, impacting Kubernetes upgrades.
345+
* **Limited Autonomy:** Release and bug-fix cycles are bound to the etcd project's schedule, which may not align with Kubernetes' needs.
346+
* **Backporting Challenge:** Requires backporting to v3.5 for Kubernetes compatibility, going against the etcd project's goal of minimizing backports.
347+
348+
#### New Package in etcd Repository
349+
350+
**Pros:**
351+
352+
* **Versioning Flexibility:** Allows for independent versioning (e.g., `v3.5.13-interface.1`) to track interface changes separately from the etcd client.
353+
* **Manageable Integration:** Separates the interface from the client but keeps it within the etcd project, simplifying coordination.
354+
355+
**Cons:**
356+
357+
* **Backporting Challenge:** Still requires backporting to v3.5 for initial Kubernetes compatibility.
358+
* **Maintenance Overhead:** Separate versioning introduces some additional maintenance effort to ensure compatibility between the interface and etcd versions.
359+
* **Compatibility Risk:** Incompatibilities may arise between etcd and interface versions if not managed meticulously.
360+
361+
362+
#### New Repository under etcd-io
363+
364+
**Pros:**
365+
366+
* **Maximum Autonomy:** Grants Kubernetes full control over development, releases, and bug fixes.
367+
368+
**Cons:**
369+
370+
* **Increased Overhead:** Demands significant effort for maintenance, versioning, and compatibility across etcd client versions.
371+
* **Dependency Management:** Introduces an additional dependency for Kubernetes, increasing the complexity of version management.
372+
* **Potential for Code Duplication:** Implementing the interface might necessitate changes to internal client behavior, potentially requiring some code to be copied.
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
title: Kubernetes-etcd interface
2+
kep-number: 4743
3+
authors:
4+
- "@serathius"
5+
owning-sig: sig-etcd
6+
participating-sigs:
7+
- sig-api-machinery
8+
status: provisional
9+
creation-date: 2024-06-28
10+
reviewers:
11+
- "@dims"
12+
approvers:
13+
- "@jpbetz"
14+
- "@ahrtr"
15+
- "@wojtek-t"
