@@ -73,6 +73,11 @@ she doesn't have to explicitly name, and can simply trust to exist.
  not in scope for this GEP, in order to have a fighting chance of getting
  functionality into Gateway API 1.4.

+ Additionally, note that providing support for Chihiro to swap the default
+ Gateway without downtime may very well require supporting multiple default
+ Gateways at the same time, since Kubernetes does not support atomic swaps of
+ resources.
+
  - Allow Ana to override Chihiro's choice for the default Gateway for a given
    Route without explicitly specifying the Gateway.

@@ -161,10 +166,10 @@ Gateways.

  ## API

- Most of the API work for this GEP is TBD at this point. The challenge is to
- find a way to allow Ana to use Routes without requiring her to specify the
- Gateway explicitly, while still allowing Chihiro and Ian to retain control
- over the Gateway and its configuration.
+ The main challenge in the API design is to find a way to allow Ana to use
+ Routes without requiring her to specify the Gateway explicitly, while still
+ allowing Chihiro and Ian to retain control over the Gateway and its
+ configuration.

  An additional concern is CD tools and GitOps workflows. In very broad terms,
  these tools function by applying manifests from a Git repository to a
@@ -189,10 +194,274 @@ will need resolution before this GEP can graduate.

  [discussion]: https://github.com/kubernetes-sigs/gateway-api/pull/3852#discussion_r2140117567

+ Finally, although support for multiple default Gateways is a non-goal for this
+ GEP, it's worth noting that allowing Chihiro full control over the default
+ Gateway is very much a goal, which includes giving Chihiro a clean way to swap
+ one default Gateway for another. This is important because a zero-downtime
+ swap implies having two default Gateways running at the same time, since
+ Kubernetes does not support any sort of atomic swap operation.
+
  ### Gateway for Ingress (North/South)

+ There are two main aspects to the API design for default Gateways:
+
+ 1. Giving Ana a way to bind Routes to the default Gateway.
+
+ 2. Giving Chihiro a way to control which Gateway is the default, and to
+    enumerate which Routes are bound to it.
+
+ #### 1. Binding a Route to the Default Gateway
+
+ For Ana to indicate that a Route should use the default Gateway, she MUST
+ leave `parentRefs` unset in the `spec` of the Route. For example, the
+ following Route:
+
+ ```yaml
+ apiVersion: gateway.networking.k8s.io/v1
+ kind: HTTPRoute
+ metadata:
+   name: my-route
+ spec:
+   rules:
+   - backendRefs:
+     - name: my-service
+       port: 80
+ ```
+
+ would route _all_ HTTP traffic arriving at the default Gateway to `my-service`
+ on port 80.
+
+ Note that Ana MUST omit `parentRefs` entirely: specifying an empty array for
+ `parentRefs` MUST fail validation. If a Route with an empty array for
+ `parentRefs` somehow exists in the cluster, all Gateways in the cluster MUST
+ refuse to accept it. (Omitting `parentRefs` entirely will work much more
+ cleanly with GitOps tools than specifying an empty array.)
+
+ Note also that if Ana specifies _any_ `parentRefs`, the default Gateway MUST
+ NOT claim the Route unless one of the `parentRefs` explicitly names the
+ default Gateway. Doing otherwise would make it impossible for Ana to define
+ mesh-only Routes, or to specify a Route that is meant to use only a specific
+ Gateway that is not the default. This implies that for Ana to specify a Route
+ intended to serve both north/south and east/west roles, she MUST explicitly
+ specify the Gateway in `parentRefs`, even if that Gateway happens to be the
+ default Gateway.
+
+ All other characteristics of a Route using the default Gateway MUST behave the
+ same as if the default Gateway were explicitly specified in `parentRefs`.
+
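The binding rules above can be sketched as a single decision function. This is only an illustration: `ParentRef`, `Route`, `Gateway`, and `claims` are simplified, hypothetical stand-ins for the real Gateway API types, a `nil` slice is assumed to model an omitted `parentRefs`, and the usual `AllowedRoutes` checks are deliberately left out.

```go
package main

import "fmt"

// ParentRef, Route and Gateway are hypothetical, simplified stand-ins for
// the real Gateway API types; only the fields this sketch needs are modeled.
type ParentRef struct {
	Kind string // "Gateway" or "Service"
	Name string
}

type Route struct {
	Name       string
	ParentRefs []ParentRef // nil is assumed to mean "parentRefs omitted"
}

type Gateway struct {
	Name      string
	IsDefault *bool // mirrors the proposed spec.isDefault field
}

// claims reports whether gw should claim route under the rules in the text.
// AllowedRoutes and namespaces are intentionally ignored here.
func claims(gw Gateway, route Route) bool {
	// An explicitly empty (non-nil) list is invalid: no Gateway accepts it.
	if route.ParentRefs != nil && len(route.ParentRefs) == 0 {
		return false
	}
	// parentRefs omitted: only a Gateway marked as default claims the Route.
	if route.ParentRefs == nil {
		return gw.IsDefault != nil && *gw.IsDefault
	}
	// parentRefs present: the Gateway must be named explicitly, whether or
	// not it happens to be the default.
	for _, ref := range route.ParentRefs {
		if ref.Kind == "Gateway" && ref.Name == gw.Name {
			return true
		}
	}
	return false
}

func main() {
	yes := true
	def := Gateway{Name: "my-default-gateway", IsDefault: &yes}
	meshOnly := Route{
		Name:       "mesh-only",
		ParentRefs: []ParentRef{{Kind: "Service", Name: "my-service"}},
	}
	fmt.Println("implicit route claimed:", claims(def, Route{Name: "implicit"}))
	fmt.Println("mesh-only route claimed:", claims(def, meshOnly))
}
```

Keeping the "omitted" and "empty array" cases separate in the sketch mirrors the requirement that an empty array is rejected rather than treated as a request for the default Gateway.
+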
+ The default Gateway MUST use `status.parents` to announce that it has bound
+ the Route, for example:
+
+ ```yaml
+ status:
+   parents:
+   - parentRef:
+       name: my-default-gateway
+       namespace: default
+     controllerName: gateway.networking.k8s.io/some-gateway-controller
+     conditions:
+     - type: Accepted
+       status: "True"
+       lastTransitionTime: "2025-10-01T12:00:00Z"
+       message: "Route is bound to default Gateway"
+ ```
+
+ The default Gateway MUST NOT rewrite the `parentRefs` of a Route using the
+ default Gateway; it MUST leave `parentRefs` unset. This becomes important if
+ the default Gateway changes, or (in some situations) if GitOps tools are in
+ play.
+
+ ##### Enumerating Routes Bound to the Default Gateway
+
+ To enumerate Routes bound to the default Gateway, Ana can look for Routes with
+ no `parentRefs` specified, and then check the `status.parents` of those Routes
+ to see if the Route has been claimed. This will also tell Ana which Gateway is
+ the default, even if she doesn't have RBAC to query Gateway resources
+ directly.
+
+ While this is possible with `kubectl get -o yaml`, it's not exactly a friendly
+ user experience, so adding this functionality to a tool like `gwctl` would be
+ a dramatic improvement. In fact, looking at the `status` of a Route is very
+ much something that we should expect Ana to do often, whether or not default
+ Gateways are in play; `gwctl` or something similar SHOULD be able to show her
+ which Routes are bound to which Gateways in every case, not just with default
+ Gateways.
+
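As a rough illustration of the enumeration described above, a tool could filter Routes that omit `parentRefs` and collect the Gateways that report them as accepted. The types and the `defaultBindings` helper below are hypothetical simplifications, not part of `gwctl` or the Gateway API Go types.

```go
package main

import "fmt"

// ParentStatus and Route are hypothetical, flattened views of a Route's
// status.parents entries; real Routes carry conditions, not a bool.
type ParentStatus struct {
	GatewayName string
	Accepted    bool
}

type Route struct {
	Name          string
	HasParentRefs bool           // false when spec.parentRefs was omitted
	Parents       []ParentStatus // simplified status.parents
}

// defaultBindings maps each Route that omits parentRefs to the Gateways
// that have accepted it, i.e. the Gateways acting as its default.
func defaultBindings(routes []Route) map[string][]string {
	bound := make(map[string][]string)
	for _, r := range routes {
		if r.HasParentRefs {
			continue // explicitly bound Routes are out of scope here
		}
		for _, p := range r.Parents {
			if p.Accepted {
				bound[r.Name] = append(bound[r.Name], p.GatewayName)
			}
		}
	}
	return bound
}

func main() {
	routes := []Route{
		{Name: "my-route", Parents: []ParentStatus{{GatewayName: "my-default-gateway", Accepted: true}}},
		{Name: "explicit-route", HasParentRefs: true, Parents: []ParentStatus{{GatewayName: "other-gateway", Accepted: true}}},
	}
	for name, gateways := range defaultBindings(routes) {
		fmt.Println(name, "is bound by default to", gateways)
	}
}
```

Note that this only needs read access to Routes, which matches the point that Ana may not have RBAC to query Gateway resources.
+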
+ **Open Questions:**
+
+ Should the Gateway also add a `condition` explicitly expressing that the Route
+ has been claimed by the default Gateway, perhaps with `type: DefaultGateway`?
+ This could help tooling like `gwctl` more easily enumerate Routes bound to the
+ default Gateway.
+
+ #### 2. Controlling which Gateway is the Default
+
+ Since Chihiro must be able to control which Gateway is the default, selecting
+ the default Gateway must be an active configuration step taken by Chihiro,
+ rather than any kind of implicit behavior. To that end, the Gateway resource
+ will gain a new field, `spec.isDefault`:
+
+ ```go
+ type GatewaySpec struct {
+ 	// ... other fields ...
+ 	IsDefault *bool `json:"isDefault,omitempty"`
+ }
+ ```
+
+ If `spec.isDefault` is set to `true`, the Gateway MUST claim Routes that have
+ specified no `parentRefs` (subject to the usual Gateway API rules about which
+ Routes may be bound to a Gateway), and it MUST update its own `status` with
+ a `condition` of type `DefaultGateway` and `status` true to indicate that it
+ is the default Gateway, for example:
+
+ ```yaml
+ status:
+   conditions:
+   - type: DefaultGateway
+     status: "True"
+     lastTransitionTime: "2025-10-01T12:00:00Z"
+     message: "Gateway is the default Gateway"
+ ```
+
+ If `spec.isDefault` is not present or is set to `false`, the Gateway MUST NOT
+ claim those Routes and MUST NOT set the `DefaultGateway` condition in its
+ `status`.
+
+ ##### Access to the Default Gateway
+
+ The rules for which Routes may bind to a Gateway do not change for the default
+ Gateway. In particular, if a default Gateway should accept Routes from other
+ namespaces, then it MUST include the appropriate `AllowedRoutes` definition,
+ and without such an `AllowedRoutes`, a default Gateway MUST accept only Routes
+ from its own namespace.
+
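A minimal sketch of the access rule above, assuming a heavily simplified model: the real `AllowedRoutes` is configured per listener and also supports selector-based matching, both of which are omitted here. The `Gateway` type and `mayBind` helper are hypothetical.

```go
package main

import "fmt"

// FromNamespaces is a simplified version of allowedRoutes.namespaces.from;
// only "Same" and "All" are modeled (selectors are omitted).
type FromNamespaces string

const (
	NamespacesFromSame FromNamespaces = "Same"
	NamespacesFromAll  FromNamespaces = "All"
)

// Gateway is a hypothetical, flattened stand-in for the real resource.
type Gateway struct {
	Namespace   string
	IsDefault   bool
	AllowedFrom FromNamespaces // empty means AllowedRoutes was not set
}

// mayBind reports whether a Route in routeNamespace may bind to gw.
// IsDefault is deliberately not consulted: being the default Gateway does
// not change the access rules.
func mayBind(gw Gateway, routeNamespace string) bool {
	if gw.AllowedFrom == NamespacesFromAll {
		return true
	}
	// "Same", or no AllowedRoutes at all: same-namespace Routes only.
	return routeNamespace == gw.Namespace
}

func main() {
	locked := Gateway{Namespace: "infra", IsDefault: true}
	open := Gateway{Namespace: "infra", IsDefault: true, AllowedFrom: NamespacesFromAll}
	fmt.Println("no AllowedRoutes, route in apps:", mayBind(locked, "apps"))
	fmt.Println("from All, route in apps:", mayBind(open, "apps"))
}
```

In practice this means Chihiro would normally pair `spec.isDefault` with an `AllowedRoutes` definition, since a default Gateway that only accepts Routes from its own namespace is of limited use to Ana.
+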
+ ##### Behavior with No Default Gateway
+
+ If no Gateway has `spec.isDefault` set to `true`, then the behavior is exactly
+ the same as for Gateway API 1.3: all Routes MUST specify `parentRefs` in order
+ to function, and no Gateway will claim Routes that do not specify
+ `parentRefs`.
+
+ ##### Deleting the Default Gateway
+
+ Deleting the default Gateway MUST behave the same as deleting any other
+ Gateway: all Routes that were bound to the default Gateway MUST be unbound,
+ and the `Accepted` conditions in the `status` of those Routes SHOULD be
+ removed.
+
+ ##### Multiple Default Gateways
+
+ Support for multiple default Gateways in a cluster is not one of the original
+ goals of this GEP. However, allowing Chihiro to control which Gateway is the
+ default - including being able to switch which Gateway is the default at
+ runtime, without requiring downtime - is a goal.
+
+ Kubernetes itself will not prevent setting `spec.isDefault` to `true` on
+ multiple Gateways in a cluster, and it also doesn't support any atomic swap
+ mechanisms. If we want to enforce only a single default Gateway, the Gateway
+ controllers will have to implement that enforcement logic. There are three
+ possible options here.
+
+ 1. Don't bother with any enforcement logic.
+
+    In this case, a Route with no `parentRefs` specified will be bound to _all_
+    Gateways that have `spec.isDefault` set to `true`. Since Gateway API
+    already allows a Route to be bound to multiple Gateways, and the Route
+    `status` is already designed for it, this should function without
+    difficulty.
+
+ 2. Treat multiple Gateways with `spec.isDefault` set to `true` as if no
+    Gateway has `spec.isDefault` set to `true`.
+
+    If we assume that all Gateway controllers in a cluster can see all the
+    Gateways in the cluster, then detecting that multiple Gateways have
+    `spec.isDefault` set to `true` is relatively straightforward.
+
+    For option 2, every Gateway with `spec.isDefault` set to `true` can simply
+    refuse to accept Routes with no `parentRefs` specified, behaving as if no
+    Gateway has been chosen as the default. Each Gateway would also update its
+    `status` with a `condition` of type `DefaultGateway` and `status` false to
+    indicate that it is not the default Gateway, for example:
+
+    ```yaml
+    status:
+      conditions:
+      - type: DefaultGateway
+        status: "False"
+        lastTransitionTime: "2025-10-01T12:00:00Z"
+        message: "Multiple Gateways are marked as default"
+    ```
+
+ 3. Perform conflict resolution as with Routes.
+
+    In this case, the oldest Gateway with `spec.isDefault` set to `true` will
+    be considered the only default Gateway. That oldest Gateway will accept all
+    Routes with no `parentRefs` specified, while all other Gateways with
+    `spec.isDefault` set to `true` will ignore those Routes.
+
+    The oldest default Gateway will update its `status` to reflect that it is
+    the default Gateway; all other Gateways with `spec.isDefault` set to `true`
+    will update their `status` as in option 2.
+
+ Unfortunately, option 2 will almost certainly cause downtime in any case where
+ Chihiro wants to change the default Gateway:
+
+ - If Chihiro deletes the default Gateway before creating the new one, then all
+   Routes using the default Gateway will be unbound during the time that
+   there's no default Gateway, resulting in errors for any requests using those
+   Routes.
+
+ - If Chihiro creates the new default Gateway before deleting the old one, then
+   all Routes using the default Gateway are still unbound during the time that
+   both Gateways exist.
+
+ Option 3 gives Chihiro a way to change the default Gateway without downtime:
+ when they create the new default Gateway, it will not take effect until the
+ old default Gateway is deleted. However, it doesn't give Chihiro any way to
+ test the Routes through the new default Gateway before deleting the old
+ Gateway.
+
+ Reluctantly, we must conclude that option 1 is the only viable choice.
+ Therefore, Gateways MUST NOT attempt to enforce a single default Gateway, and
+ MUST allow Routes with no `parentRefs` to bind to _all_ Gateways that have
+ `spec.isDefault` set to `true`. This is simplest to implement, it permits
+ zero-downtime changes to the default Gateway, and it allows for testing of the
+ new default Gateway before the old one is deleted.
+
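To make the zero-downtime argument for option 1 concrete, the sketch below walks through a hypothetical swap in three steps and shows which Gateways a Route without `parentRefs` would be bound to at each step. The `Gateway` type and `defaultParents` helper are illustrative assumptions, not proposed API.

```go
package main

import "fmt"

// Gateway is a hypothetical, minimal stand-in for the real resource.
type Gateway struct {
	Name      string
	IsDefault bool
}

// defaultParents returns every Gateway that a Route with no parentRefs
// binds to under option 1: all Gateways marked as default, with no
// attempt to pick a single winner.
func defaultParents(gateways []Gateway) []string {
	var names []string
	for _, gw := range gateways {
		if gw.IsDefault {
			names = append(names, gw.Name)
		}
	}
	return names
}

func main() {
	// Hypothetical rollout: old default only, then both, then new only.
	steps := [][]Gateway{
		{{Name: "old", IsDefault: true}},
		{{Name: "old", IsDefault: true}, {Name: "new", IsDefault: true}},
		{{Name: "new", IsDefault: true}},
	}
	for i, gateways := range steps {
		fmt.Printf("step %d: Route bound to %v\n", i+1, defaultParents(gateways))
	}
}
```

At no step is the Route left without a parent, and during the middle step Chihiro can test traffic through the new Gateway while the old one is still serving.
+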
+ ##### Changes in Functionality
+
+ If Chihiro changes the default Gateway to a different implementation that does
+ not support all the functionality of the previous default Gateway, then the
+ Routes that were bound to the previous default Gateway will no longer function
+ as expected. This is not a new problem: it already exists when Ana changes a
+ Route's `parentRefs`, or when Chihiro changes the implementation of a Gateway
+ that is explicitly specified in a Route's `parentRefs`.
+
+ At present, we do not propose any solution to this problem, other than to note
+ that `gwctl` or similar tools SHOULD be able to show Ana not just the Gateways
+ to which a Route is bound, but also the features supported by those Gateways,
+ to at least help Ana understand if she is trying to use Gateways that don't
+ support a feature that she needs. This is definitely an area for future
+ work, and it is complicated by the fact that Ana may not have access to read
+ Gateway resources in the cluster at all.
+
+ ##### Listeners, ListenerSets, and Merging
+
+ Setting `spec.isDefault` on a Gateway affects which Routes will bind to the
+ Gateway, not where the Gateway listens for traffic. As such, setting
+ `spec.isDefault` MUST NOT alter a Gateway's behavior with respect to
+ Listeners, ListenerSets, or merging.
+
+ In the future, we may want to consider allowing a default ListenerSet rather
+ than only a default Gateway, but that is not in scope for this GEP. Even if it
+ is considered later, the guiding principle SHOULD be that `spec.isDefault`
+ SHOULD NOT affect where a Gateway listens for traffic or whether it can be
+ merged with other Gateways.
+
  ### Gateway For Mesh (East/West)

+ Mesh traffic is defined by using a Service as a `parentRef` rather than a
+ Gateway. As such, there is no case where a default Gateway would be used for
+ mesh traffic.
+
  ## Conformance Details

  #### Feature Names
@@ -204,14 +473,34 @@ not seem like a good choice.

  ### Conformance tests

+ TBD.
+
  ## Alternatives

- A possible alternative API design is to modify the behavior of Listeners or
- ListenerSets; rather than having a "default Gateway", perhaps we would have
- "[default Listeners]". One challenge here is that the Route `status` doesn't
- currently expose information about which Listener is being used, though it
- does show which Gateway is being used.
+ - A possible alternative API design is to modify the behavior of Listeners or
+   ListenerSets; rather than having a "default Gateway", perhaps we would have
+   "[default Listeners]". One challenge here is that the Route `status` doesn't
+   currently expose information about which Listener is being used, though it
+   does show which Gateway is being used.

  [default Listeners]: https://github.com/kubernetes-sigs/gateway-api/pull/3852#discussion_r2149056246

+ - We could define the default Gateway as a Gateway with a magic name, e.g.
+   "default". This doesn't actually make things that much simpler for Ana
+   (she'd still have to specify `parentRefs`), and it raises questions about
+   Chihiro's ability to control which Routes can bind to the default Gateway,
+   as well as how namespacing would work. It's especially unhelpful for Ana
+   if she has to know the namespace of the default Gateway in order to use it.
+
+ - A default Gateway could overwrite a Route's empty `parentRefs` with a
+   non-empty `parentRefs` pointing to the default Gateway. The main challenge
+   with this approach is that once the `parentRefs` are overwritten, it's no
+   longer possible to know that the Route was originally intended to use the
+   default Gateway. Using the `status` to indicate that the Route is bound to
+   the default Gateway instead both preserves Ana's original intent and also
+   makes it possible to change the default Gateway without requiring Ana to
+   recreate all her Routes.
+
  ## References
+
+ TBD.