@@ -190,8 +190,10 @@ incorrectly or objects being garbage collected mistakenly.

  ## Proposal

- API change: To the apiservices API, add an "alternates" clause, a list of
- apiservers which believe they can serve the group-version.
+ API changes:
+ * To the apiservices API, add an "alternates" clause, a list of
+ apiservers which believe they can serve the group-version.
+ * To ??? API, add ability to tell which apiservers can serve a resource.

  API server change:
  * A controller adds the apiserver to the list of alternates for its built-in
@@ -202,22 +204,34 @@ API server change:
  - If the request is for a group/version the apiserver doesn't have locally, it
  will proxy the request to one of the alternates instead.

- Unsolved problem: to be completely accurate and achieve the goals in this KEP, we
- will need to track what resources apiservers can serve, not just what
- group-versions.
-

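The proxying rule above can be sketched as a small routing decision. This is only an illustration: the `Router` type, its fields, and the `NoServer` outcome are hypothetical names, since the KEP has not yet specified the API shape or what happens when no alternate is listed.

```go
package main

// GroupVersion identifies an API group and version, for example {"apps", "v1"}.
type GroupVersion struct {
	Group   string
	Version string
}

// Decision is the outcome of routing one request.
type Decision int

const (
	ServeLocally Decision = iota // this apiserver has the group-version
	ProxyToPeer                  // forward the request to an alternate
	NoServer                     // no apiserver has advertised the group-version
)

// Router is a hypothetical stand-in for the state the handler would consult:
// what this apiserver serves itself, and the "alternates" list per group-version.
type Router struct {
	Local      map[GroupVersion]bool
	Alternates map[GroupVersion][]string
}

// Route decides how to handle a request for gv. When proxying, it returns the
// first listed alternate; choosing among several is left open by the proposal.
func (r Router) Route(gv GroupVersion) (Decision, string) {
	if r.Local[gv] {
		return ServeLocally, ""
	}
	if peers := r.Alternates[gv]; len(peers) > 0 {
		return ProxyToPeer, peers[0]
	}
	return NoServer, ""
}
```

A proxy attempt that cannot reach the chosen peer would surface to the client as a 503, as described under Risks and Mitigations.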
  ### User Stories (Optional)

- <!--
- Detail the things that people will be able to do if this KEP is implemented.
- Include as much detail as possible so that people can understand the "how" of
- the system. The goal here is to make this feel real for users without getting
- bogged down.
- -->
+ #### Garbage Collector
+
+ The garbage collector decides to delete an object once all of its owner
+ objects are deleted. A discovery gap / apiserver mismatch, as
+ described above, could result in GC seeing a 404 and assuming an owner has been
+ deleted; this could result in it deleting a dependent object that it should
+ not.

- #### Story 1
+ This proposal will cause the GC to see either the correct object or get a 503
+ (which it handles safely).

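To make the 404 versus 503 distinction concrete, here is a minimal sketch of how a collector could classify owner lookups. It is not the actual garbage collector code; `ClassifyOwnerLookup` and `MayCollectDependent` are hypothetical names used only for illustration.

```go
package main

import "net/http"

// OwnerState is what a single owner lookup tells the collector.
type OwnerState int

const (
	OwnerPresent OwnerState = iota // lookup returned the owner
	OwnerAbsent                    // lookup returned 404
	OwnerUnknown                   // anything else, including 503
)

// ClassifyOwnerLookup maps an HTTP status to an OwnerState. Only a 404 is
// taken as evidence of deletion, which is why a spurious 404 is dangerous.
func ClassifyOwnerLookup(status int) OwnerState {
	switch status {
	case http.StatusOK:
		return OwnerPresent
	case http.StatusNotFound:
		return OwnerAbsent
	default:
		return OwnerUnknown
	}
}

// MayCollectDependent reports whether a dependent may be deleted: every owner
// lookup must have returned 404. A 503 leaves the dependent alone for a retry.
func MayCollectDependent(ownerStatuses []int) bool {
	if len(ownerStatuses) == 0 {
		return false
	}
	for _, s := range ownerStatuses {
		if ClassifyOwnerLookup(s) != OwnerAbsent {
			return false
		}
	}
	return true
}
```

Under this model, turning a mistaken 404 into a correct response or a 503 removes the only path that leads to an incorrect deletion.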
- #### Story 2
+ #### Namespace Lifecycle Controller
+
+ This controller seeks to delete all objects in a namespace when the namespace
+ is deleted.
+ Discovery failures cause NLC to be unable to tell if objects of a given resource
+ are present in a namespace. It fails safe, meaning it refuses to delete the
+ namespace until it can verify it is empty; the resulting slow namespace
+ deletion is a common source of complaints.
+
+ Additionally, if the NLC knows about a resource that the apiserver it is talking
+ to does not, it may incorrectly get a 404, assume a collection is empty, and
+ delete the namespace too early, leaving garbage behind in etcd. This is a
+ correctness problem: the garbage will reappear if a namespace of the same name
+ is recreated.
+
+ This proposal addresses both problems.

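The two failure modes described for the NLC can be seen in a simplified emptiness check. This is a sketch under assumptions, not the controller's real logic; `ListResult` and `CanFinalizeNamespace` are hypothetical names.

```go
package main

import "net/http"

// ListResult is the outcome of listing one resource type in the namespace.
type ListResult struct {
	Resource string
	Status   int // HTTP status of the list call
	Items    int // objects returned when Status is 200
}

// provesEmpty reports whether the result is taken as proof that the
// collection is empty. A 404 counts as empty, which is only sound when the
// answering apiserver actually knows the resource.
func (r ListResult) provesEmpty() bool {
	switch r.Status {
	case http.StatusOK:
		return r.Items == 0
	case http.StatusNotFound:
		return true
	default:
		return false // for example a 503: fail safe and retry later
	}
}

// CanFinalizeNamespace returns true only when every resource is proven empty.
func CanFinalizeNamespace(results []ListResult) bool {
	for _, r := range results {
		if !r.provesEmpty() {
			return false
		}
	}
	return true
}
```

An unverifiable resource blocks finalization (the slowness complaint), while a 404 from an apiserver that does not know the resource lets finalization proceed too early (the correctness problem).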
  ### Notes/Constraints/Caveats (Optional)

@@ -230,26 +244,32 @@ This might be a good place to talk about core concepts and how they relate.

  ### Risks and Mitigations

- <!--
- What are the risks of this proposal, and how do we mitigate? Think broadly.
- For example, consider both security and how this will impact the larger
- Kubernetes ecosystem.
+ Cluster admins might not read the release notes and realize they should enable
+ network/firewall connectivity between apiservers. In this case clients will
+ receive 503s instead of transparently being proxied. 503 is still safer than
+ today's behavior.

- How will security be reviewed, and by whom?
+ Requests will consume egress bandwidth for 2 apiservers when proxied. We can cap
+ the number if needed, but upgrades aren't that frequent and few resources are
+ changed on releases, so these requests should not be common. We will count them
+ with a metric.

- How will UX be reviewed, and by whom?
-
- Consider including folks who also work outside the SIG or subproject.
- -->
+ TODO: security / cert stuff.

  ## Design Details

- <!--
- This section should contain enough information that the specifics of your
- change are understandable. This may include API specs (though not always
- required) or even code snippets. If there's any ambiguity about HOW your
- proposal will be implemented, this is the place to discuss them.
- -->
+ TODO: specific API change (x2)
+
+ TODO: explanation of how the handler will determine a request is for a resource
+ that should be proxied.
+
+ TODO: explanation of how the security handshake between apiservers works.
+ * What we need to fix: random processes / external users / etc should not be
+ able to proxy requests, so the receiving apiserver needs to be able to verify
+ the source apiserver.
+ * Generate self-signed cert on startup, put pubkey in apiserver identity lease
+ object?
+
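As one way to picture the self-signed cert idea floated above, the following sketch generates a certificate at startup and derives a public-key fingerprint that could be published. Everything here is an assumption made for illustration: the `Identity` type, the choice of ECDSA P-256, and the SHA-256 fingerprint format are not part of the proposal, and writing the value into the identity lease is left out.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/hex"
	"math/big"
	"time"
)

// Identity is a hypothetical bundle created once at apiserver startup.
type Identity struct {
	Key         *ecdsa.PrivateKey
	CertDER     []byte // self-signed certificate, DER-encoded
	Fingerprint string // hex SHA-256 of the PKIX-encoded public key
}

// fingerprint hashes the PKIX encoding of a public key.
func fingerprint(pub any) (string, error) {
	der, err := x509.MarshalPKIXPublicKey(pub)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:]), nil
}

// NewIdentity generates a key pair and a self-signed client certificate.
func NewIdentity(name string, validFor time.Duration) (*Identity, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return nil, err
	}
	serial.Add(serial, big.NewInt(1)) // keep the serial strictly positive
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber:          serial,
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             now.Add(-time.Minute),
		NotAfter:              now.Add(validFor),
		KeyUsage:              x509.KeyUsageDigitalSignature,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
		BasicConstraintsValid: true,
	}
	// Template and parent are the same certificate, so it is self-signed.
	certDER, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	fp, err := fingerprint(&key.PublicKey)
	if err != nil {
		return nil, err
	}
	return &Identity{Key: key, CertDER: certDER, Fingerprint: fp}, nil
}

// Matches reports whether a presented certificate carries the public key
// whose fingerprint the source apiserver published.
func Matches(certDER []byte, published string) (bool, error) {
	cert, err := x509.ParseCertificate(certDER)
	if err != nil {
		return false, err
	}
	fp, err := fingerprint(cert.PublicKey)
	if err != nil {
		return false, err
	}
	return fp == published, nil
}
```

A receiving apiserver would call something like `Matches` with the peer certificate from the TLS handshake and the fingerprint it read for the claimed source; the handshake itself proves possession of the private key.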

  ### Test Plan
