Commit e9814d8

Merge pull request kubernetes#3133 from aramase/3130-kms-observability
KEP-3130: kms observability
2 parents 1205bf4 + e183b3a commit e9814d8

3 files changed

+288
-0
lines changed

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
kep-number: 3130
2+
alpha:
3+
approver: "@deads2k"
Lines changed: 259 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,259 @@
1+
# KEP-3130: KMS Observability
2+
3+
<!-- toc -->
4+
- [Release Signoff Checklist](#release-signoff-checklist)
5+
- [Summary](#summary)
6+
- [Motivation](#motivation)
7+
- [Goals](#goals)
8+
- [Non-Goals](#non-goals)
9+
- [Proposal](#proposal)
10+
- [Design Details](#design-details)
11+
- [Test Plan](#test-plan)
12+
- [Graduation Criteria](#graduation-criteria)
13+
- [Alpha](#alpha)
14+
- [Beta](#beta)
15+
- [GA](#ga)
16+
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
17+
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
18+
- [Monitoring Requirements](#monitoring-requirements)
19+
- [Dependencies](#dependencies)
20+
- [Scalability](#scalability)
21+
- [Troubleshooting](#troubleshooting)
22+
- [Implementation History](#implementation-history)
23+
- [Alternatives](#alternatives)
24+
<!-- /toc -->
25+
26+
## Release Signoff Checklist
27+
28+
Items marked with (R) are required *prior to targeting to a milestone / release*.
29+
30+
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
31+
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
32+
- [ ] (R) Design details are appropriately documented
33+
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
34+
- [ ] e2e Tests for all Beta API Operations (endpoints)
35+
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
36+
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
37+
- [ ] (R) Graduation criteria is in place
38+
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
39+
- [ ] (R) Production readiness review completed
40+
- [ ] (R) Production readiness review approved
41+
- [ ] "Implementation History" section is up-to-date for milestone
42+
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
43+
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
44+
45+
[kubernetes.io]: https://kubernetes.io/
46+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
47+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
48+
[kubernetes/website]: https://git.k8s.io/website
49+
50+
## Summary
51+
52+
Currently, it is not possible to correlate (in logs) the sequence of calls involved in an envelope operation: kube-apiserver->kms-plugin->KMS. This KEP proposes extending the signature of the kms-plugin interface to include a transaction ID (to be generated by the kube-apiserver), which kms-plugin could pass to KMS.
53+
54+
## Motivation
55+
56+
Today, the only way to correlate a successful or failed envelope operation is to use the approximate timestamp of the operation to search for matching events in kube-apiserver, kms-plugin and KMS. There is no guarantee that the timestamp of the operation is the same as the timestamp of the corresponding event in KMS. This KEP proposes extending the signature of the kms-plugin interface to include a transaction ID (to be generated by the kube-apiserver), which kms-plugin could pass to KMS. The kube-apiserver will log this transaction ID together with additional metadata for the envelope operation, such as the secret name and namespace. Similarly, the transaction ID will be logged in the kms-plugin and optionally passed to KMS.
57+
58+
### Goals
59+
60+
- Add transaction ID to kms-plugin interface
61+
- Update the logging in kube-apiserver to include the transaction ID and non-sensitive metadata, such as secret name and namespace, for envelope operations
62+
63+
### Non-Goals
64+
65+
- Using this transaction ID for audit logging
66+
67+
## Proposal
68+
69+
- Generate a new UID for each envelope operation in kube-apiserver.
70+
- Add a new UID field to the envelope operation in kms-plugin interface.
71+
72+
## Design Details
73+
74+
<!--
75+
This section should contain enough information that the specifics of your
76+
change are understandable. This may include API specs (though not always
77+
required) or even code snippets. If there's any ambiguity about HOW your
78+
proposal will be implemented, this is the place to discuss them.
79+
-->
80+
81+
This design is centered around generating a new UID for each envelope operation similar to UID generation in admission review requests here: https://github.com/kubernetes/kubernetes/blob/e9e669aa6037c380469b45200e59cff9b52d6d68/staging/src/k8s.io/apiserver/pkg/admission/plugin/webhook/request/admissionreview.go#L137.
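
As an illustration of per-operation UID generation, here is a minimal sketch that uses only the Go standard library. The helper name `newOperationUID` is hypothetical, and the choice of a random (version 4) UUID is an assumption; the KEP itself only states that a new UID is generated for each envelope operation, in the same spirit as admission review requests.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newOperationUID returns a random RFC 4122 version 4 UUID as a string.
// Hypothetical helper for illustration; the real kube-apiserver code may
// use a different UUID library or format.
func newOperationUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version bits to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	// Every envelope operation gets its own UID, even for the same object.
	for _, op := range []string{"encrypt", "decrypt"} {
		uid, err := newOperationUID()
		if err != nil {
			panic(err)
		}
		fmt.Printf("operation=%s uid=%s\n", op, uid)
	}
}
```

Generating the UID at the point where the envelope operation starts, rather than reusing a request-level identifier, keeps a one-to-one mapping between UIDs and calls to the kms-plugin.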
82+
83+
A new UID field will be added to the `EncryptRequest` and `DecryptRequest` structs in the kms-plugin interface. The field is a pointer to a string. If the feature gate is disabled, the UID field will be nil, which results in byte-equivalent data on the wire when compared to a 1.23 API server.
84+
85+
```go
type EncryptRequest struct {
	// UID is a unique identifier for the request.
	UID *string `protobuf:"bytes,3,opt,name=uid,proto3" json:"uid,omitempty"`
	// Version of the KMS plugin API.
	Version string `protobuf:"bytes,1,opt,name=version,proto3" json:"version,omitempty"`
	// The data to be encrypted.
	Plain                []byte   `protobuf:"bytes,2,opt,name=plain,proto3" json:"plain,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}
```
98+
99+
```go
type DecryptRequest struct {
	// UID is a unique identifier for the request.
	UID *string `protobuf:"bytes,3,opt,name=uid,proto3" json:"uid,omitempty"`
	// Version of the KMS plugin API.
	Version string `protobuf:"bytes,1,opt,name=version,proto3" json:"version,omitempty"`
	// The data to be decrypted.
	Cipher               []byte   `protobuf:"bytes,2,opt,name=cipher,proto3" json:"cipher,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}
```
112+
113+
The UID generated in the kube-apiserver will be used:
114+
115+
1. For logging in the kube-apiserver. All envelope operations to the kms-plugin will be logged with the corresponding UID.
116+
1. The UID will be logged using a wrapper in the kube-apiserver to ensure that the UID is logged in the same format and is always logged.
117+
2. In addition to the UID, the kube-apiserver will also log non-sensitive metadata such as name, namespace and GroupVersionResource of the object that triggered the envelope operation.
118+
2. Sent to the kms-plugin as part of the `EncryptRequest` and `DecryptRequest` structs.
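
The logging wrapper described above could look roughly like the following sketch. The type `operationMeta`, the function `formatEnvelopeLog`, and the key names are all hypothetical and chosen for illustration; the KEP does not prescribe a log format. The point is only that a single function owns the format, so the UID and the non-sensitive metadata always appear, and always in the same shape.

```go
package main

import "fmt"

// operationMeta holds the UID and the non-sensitive metadata of the object
// that triggered the envelope operation. Hypothetical type for illustration.
type operationMeta struct {
	UID       string
	Operation string // "encrypt" or "decrypt"
	Group     string // empty for the core API group
	Version   string
	Resource  string
	Namespace string
	Name      string
}

// formatEnvelopeLog renders one log line. Routing every envelope log entry
// through this function keeps the UID present and consistently formatted.
func formatEnvelopeLog(msg string, m operationMeta) string {
	return fmt.Sprintf(
		"%s uid=%q operation=%q group=%q version=%q resource=%q namespace=%q name=%q",
		msg, m.UID, m.Operation, m.Group, m.Version, m.Resource, m.Namespace, m.Name,
	)
}

func main() {
	m := operationMeta{
		UID:       "example-uid", // placeholder; generated per operation in practice
		Operation: "encrypt",
		Version:   "v1",
		Resource:  "secrets",
		Namespace: "default",
		Name:      "db-password",
	}
	fmt.Println(formatEnvelopeLog("envelope operation failed", m))
}
```

Only identifying metadata is passed to the wrapper; the plaintext and ciphertext never reach it, so they cannot be logged by accident.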
119+
120+
### Test Plan
121+
122+
Unit tests covering:
123+
124+
1. Generation of UID for each envelope operation
125+
126+
Integration test covering:
127+
128+
1. Logging of UID in kube-apiserver
129+
2. UID in the `EncryptRequest` and `DecryptRequest`
130+
3. UID set to nil in the `EncryptRequest` and `DecryptRequest` when the feature gate is disabled
131+
1. Confirm this results in byte-equivalent data on the wire when compared to a 1.23 API server.
132+
133+
### Graduation Criteria
134+
135+
#### Alpha
136+
137+
- Feature implemented behind a feature flag
138+
- Initial unit and integration tests completed and enabled
139+
140+
#### Beta
141+
142+
- Gather feedback from providers using the feature
143+
- Any known bugs fixed
144+
145+
#### GA
146+
147+
- This is part of the KMS reference implementation
148+
149+
## Production Readiness Review Questionnaire
150+
151+
### Feature Enablement and Rollback
152+
153+
###### How can this feature be enabled / disabled in a live cluster?
154+
155+
<!--
156+
Pick one of these and delete the rest.
157+
-->
158+
159+
- Feature gate
160+
- Feature gate name: `KMSUID`
161+
- Components depending on the feature gate:
162+
- kube-apiserver
163+
164+
```go
FeatureSpec{
	Default:       false,
	LockToDefault: false,
	PreRelease:    featuregate.Alpha,
}
```
171+
172+
###### Does enabling the feature change any default behavior?
173+
174+
Yes. When the feature gate is enabled, the kube-apiserver sends a UID as part of each envelope operation, which it does not do by default. This change is backwards compatible.
175+
176+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
177+
178+
Yes, via the `KMSUID` feature gate. Disabling this gate will cause the API server to stop sending the UID as part of an `Encrypt` or `Decrypt` envelope operation.
179+
180+
### Monitoring Requirements
181+
182+
###### How can someone using this feature know that it is working for their instance?
183+
184+
- [x] Other (treat as last resort)
185+
- Details: Log entries for an envelope operation in kube-apiserver and kms-plugin will include the corresponding UID, as will log entries in KMS if the kms-plugin passes the UID on.
186+
187+
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
188+
189+
There should be no impact on the SLO with this change.
190+
191+
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
192+
193+
- [x] Other (treat as last resort)
194+
- Details: Log entries for an envelope operation in kube-apiserver and kms-plugin will include the corresponding UID, as will log entries in KMS if the kms-plugin passes the UID on.
195+
196+
### Dependencies
197+
198+
###### Does this feature depend on any specific services running in the cluster?
199+
200+
No.
201+
202+
### Scalability
203+
204+
###### Will enabling / using this feature result in any new API calls?
205+
206+
No.
207+
208+
###### Will enabling / using this feature result in introducing new API types?
209+
210+
No.
211+
212+
###### Will enabling / using this feature result in any new calls to the cloud provider?
213+
214+
No.
215+
216+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
217+
218+
This proposal adds a new field `UID` to the gRPC API for envelope operations.
219+
220+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
221+
222+
No.
223+
224+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
225+
226+
No.
227+
228+
### Troubleshooting
229+
230+
###### How does this feature react if the API server and/or etcd is unavailable?
231+
232+
- etcd data encryption with an external kms-plugin is unavailable
233+
234+
## Implementation History
235+
236+
<!--
237+
Major milestones in the lifecycle of a KEP should be tracked in this section.
238+
Major milestones might include:
239+
- the `Summary` and `Motivation` sections being merged, signaling SIG acceptance
240+
- the `Proposal` section being merged, signaling agreement on a proposed design
241+
- the date implementation started
242+
- the first Kubernetes release where an initial version of the KEP was available
243+
- the version of Kubernetes where the KEP graduated to general availability
244+
- when the KEP was retired or superseded
245+
-->
246+
247+
## Alternatives
248+
249+
<!--
250+
What other approaches did you consider, and why did you rule them out? These do
251+
not need to be as detailed as the proposal, but should include enough
252+
information to express the idea and why it was not acceptable.
253+
-->
254+
255+
We considered using the AuditID from the kube-apiserver request that generated the envelope operation. This approach has the following drawbacks:
256+
257+
1. AuditID can be configured by the user with the `Audit-ID` header in the API server request. Multiple requests can be sent to the kube-apiserver with the same Audit-ID.
258+
2. Not all API server requests will generate an envelope operation. The API server caches DEKs, and for a DEK that is already in the cache, the kube-apiserver will not generate an envelope operation.
259+
3. Since not all calls to the KMS correspond to an audit log, using audit ID is not complete for correlating calls from kube-apiserver->kms-plugin->KMS.
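
The mismatch between API requests and envelope operations described in points 2 and 3 can be illustrated with a toy model of the DEK cache. Everything in this sketch is hypothetical (the `envelopeTransformer` type, its fields, and the stand-in for the KMS call); it only shows that several API requests can map to a single call to the kms-plugin, which is why a per-request identifier such as the Audit-ID does not line up with envelope operations.

```go
package main

import "fmt"

// envelopeTransformer is a toy model of a transformer with a DEK cache.
// Hypothetical type for illustration only.
type envelopeTransformer struct {
	cache    map[string][]byte // encrypted DEK -> plaintext DEK
	kmsCalls int               // number of envelope operations sent to the kms-plugin
}

func newEnvelopeTransformer() *envelopeTransformer {
	return &envelopeTransformer{cache: map[string][]byte{}}
}

// plaintextDEK returns the plaintext DEK, calling the kms-plugin only on a
// cache miss. A cache hit produces no envelope operation and hence no UID.
func (t *envelopeTransformer) plaintextDEK(encryptedDEK string) []byte {
	if dek, ok := t.cache[encryptedDEK]; ok {
		return dek
	}
	t.kmsCalls++
	dek := []byte("plaintext-of-" + encryptedDEK) // stand-in for the KMS decrypt call
	t.cache[encryptedDEK] = dek
	return dek
}

func main() {
	t := newEnvelopeTransformer()
	// Three API requests that need the same DEK.
	for i := 0; i < 3; i++ {
		t.plaintextDEK("dek-A")
	}
	fmt.Printf("API requests=3 envelope operations=%d\n", t.kmsCalls)
}
```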
Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,26 @@
1+
title: KMS Observability
2+
kep-number: 3130
3+
authors:
4+
- "@aramase"
5+
owning-sig: sig-auth
6+
participating-sigs:
7+
- sig-auth
8+
status: implementable
9+
creation-date: 2022-01-12
10+
reviewers:
11+
- "@enj"
12+
- "@ritazh"
13+
approvers:
14+
- "@smarterclayton"
15+
stage: alpha
16+
latest-milestone: "v1.24"
17+
# The milestone at which this feature was, or is targeted to be, at each stage.
18+
milestone:
19+
alpha: "v1.24"
20+
beta: "v1.25"
21+
stable: "v1.26"
22+
feature-gates:
23+
- name: KMSUID
24+
components:
25+
- kube-apiserver
26+
disable-supported: true
