| 1 | +# GEP-3951: Minimal Out-of-Cluster Gateway API |
| 2 | + |
| 3 | +* Issue: [#3951](https://github.com/kubernetes-sigs/gateway-api/issues/3951) |
| 4 | +* Status: Provisional |
| 5 | + |
| 6 | +See [status definitions](../overview.md#gep-states). |
| 7 | + |
| 8 | +[Chihiro]: https://gateway-api.sigs.k8s.io/concepts/roles-and-personas/#chihiro |
| 9 | +[Ian]: https://gateway-api.sigs.k8s.io/concepts/roles-and-personas/#ian |
| 10 | +[Ana]: https://gateway-api.sigs.k8s.io/concepts/roles-and-personas/#ana |
| 11 | + |
| 12 | +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", |
| 13 | +"SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this |
| 14 | +document are to be interpreted as described in BCP 14 ([RFC8174]) when, and |
| 15 | +only when, they appear in all capitals, as shown here. |
| 16 | + |
| 17 | +[RFC8174]: https://www.rfc-editor.org/rfc/rfc8174 |
| 18 | + |
| 19 | +## User Story |
| 20 | + |
| 21 | +[GEP-3792] defines the rationale |
| 22 | +for allowing out-of-cluster Gateways (OCGs) |
| 23 | +to participate in a |
| 24 | +GAMMA-compliant in-cluster service mesh, |
| 25 | +and the problems that must be solved |
| 26 | +to allow them to do so. |
| 27 | +This GEP defines |
| 28 | +an extremely minimal API |
| 29 | +to permit experimentation |
| 30 | +with OCGs and |
| 31 | +in-cluster mTLS meshes. |
| 32 | + |
| 33 | +Nomenclature, |
| 34 | +background, |
| 35 | +goals, |
| 36 | +non-goals, |
| 37 | +problems that must be solved, |
| 38 | +and some discussion |
| 39 | +of possible solutions to those problems |
| 40 | +are all included |
| 41 | +in [GEP-3792]. |
| 42 | + |
| 43 | +[GEP-3792]: https://gateway-api.sigs.k8s.io/geps/gep-3792/ |
| 44 | + |
| 45 | +## Goals of the Extra-Minimal API |
| 46 | + |
| 47 | +- Allow [Chihiro] and [Ian] |
| 48 | + to configure and operate |
| 49 | + an OCG and |
| 50 | + an in-cluster mTLS mesh |
| 51 | + that know how to work together |
| 52 | + to experiment with OCG support |
| 53 | + in Gateway API. |
| 54 | + |
| 55 | +## Non-Goals |
| 56 | + |
| 57 | +- Support production use of OCGs |
| 58 | + in Gateway API. |
| 59 | + |
| 60 | +- Solve all of the problems |
| 61 | + defined in [GEP-3792]. |
| 62 | + |
| 63 | +This is an **extra-minimal** API. |
| 64 | +Its purpose is to allow **experimentation** |
| 65 | +with OCGs and in-cluster meshes, |
| 66 | +**not** to provide |
| 67 | +a production-ready solution. |
| 68 | + |
| 69 | +Using this API in production |
| 70 | +is **guaranteed** to result |
| 71 | +in anguish, heartbreak, tears, and pain. |
| 72 | + |
| 73 | +## Overview |
| 74 | + |
| 75 | +The extra-minimal OCG API |
| 76 | +solves two of the [GEP-3792] problems |
| 77 | +in a very minimal way. |
| 78 | + |
| 79 | +- It solves the |
| 80 | + [trust problem] |
| 81 | + by extending |
| 82 | + the Mesh and Gateway resources |
| 83 | + to permit specifying |
| 84 | + a _trust bundle_ |
| 85 | + that contains the CA certificates |
| 86 | + that the OCG and the mesh |
| 87 | + will use to trust each other. |
| 88 | + |
| 89 | +- It solves the |
| 90 | + [discovery problem] |
| 91 | + by adding a label selector |
| 92 | + to the Gateway resource |
| 93 | + that indicates which Routes |
| 94 | + are meshed. |
| 95 | + |
| 96 | +- It does not solve the |
| 97 | + [protocol problem] |
| 98 | + or the |
| 99 | + [outbound behavior problem]. |
| 100 | + |
| 101 | +[trust problem]: https://gateway-api.sigs.k8s.io/geps/gep-3792/#1-the-trust-problem |
| 102 | +[protocol problem]: https://gateway-api.sigs.k8s.io/geps/gep-3792/#2-the-protocol-problem |
| 103 | +[discovery problem]: https://gateway-api.sigs.k8s.io/geps/gep-3792/#3-the-discovery-problem |
| 104 | +[outbound behavior problem]: https://gateway-api.sigs.k8s.io/geps/gep-3792/#4-the-outbound-behavior-problem |
| 105 | + |
| 106 | +### Additions to the Mesh Resource |
| 107 | + |
| 108 | +The Mesh resource |
| 109 | +gains an `ocg` stanza |
| 110 | +containing a `trustBundle` field |
| 111 | +that refers to a ConfigMap |
| 112 | +that contains the CA certificate(s) |
| 113 | +that the mesh should trust |
| 114 | +when validating connections |
| 115 | +from the OCG: |
| 116 | + |
| 117 | +```yaml |
| 118 | +... |
| 119 | +spec: |
| 120 | +  ... |
| 121 | +  ocg: |
| 122 | +    trustBundle: |
| 123 | +      name: ocg-trust-bundle |
| 124 | +      namespace: ocg-namespace |
| 125 | +      # Key in ConfigMap; defaults to "ca-bundle.crt" |
| 126 | +      bundleKey: ca-bundle.crt |
| 127 | +``` |
| 128 | + |
| 129 | +### Additions to the Gateway Resource |
| 130 | + |
| 131 | +The Gateway resource |
| 132 | +gains a `mesh` stanza |
| 133 | +containing two fields: |
| 134 | + |
| 135 | +- a `trustBundle` field |
| 136 | + that refers to a ConfigMap |
| 137 | + that contains the CA certificate(s) |
| 138 | + that the OCG should trust |
| 139 | + when validating connections |
| 140 | + from meshed peers |
| 141 | + |
| 142 | +- a `labelSelector` field |
| 143 | + that indicates which Routes |
| 144 | + are meshed. |
| 145 | + |
| 146 | +```yaml |
| 147 | +... |
| 148 | +spec: |
| 149 | +  ... |
| 150 | +  mesh: |
| 151 | +    trustBundle: |
| 152 | +      name: mesh-trust-bundle |
| 153 | +      namespace: mesh-namespace |
| 154 | +      # Key in ConfigMap; defaults to "ca-bundle.crt" |
| 155 | +      bundleKey: ca-bundle.crt |
| 156 | +    labelSelector: |
| 157 | +      matchLabels: |
| 158 | +        mesh: one-mesh-to-mesh-them-all |
| 159 | +``` |
| 160 | + |
| 161 | +### Trust Bundles: Solving the Trust Problem |
| 162 | + |
| 163 | +The trust problem is that |
| 164 | +both the OCG and the mesh |
| 165 | +need to be able to do mTLS verification |
| 166 | +of connections arriving from the other. |
| 167 | +The simplest solution to this problem |
| 168 | +is to add a _trust bundle_ |
| 169 | +to the Gateway resource |
| 170 | +and to the Mesh resource. |
| 171 | + |
| 172 | +- The trust bundle |
| 173 | + in the Gateway resource |
| 174 | + will define the CA certificate(s) |
| 175 | + that the OCG |
| 176 | + should accept as trusted |
| 177 | + when validating connections |
| 178 | + from meshed peers. |
| 179 | + |
| 180 | +- The trust bundle |
| 181 | + in the Mesh resource |
| 182 | + will define the CA certificate(s) |
| 183 | + that the mesh |
| 184 | + should accept as trusted |
| 185 | + when validating connections |
| 186 | + from the OCG. |
| 187 | + |
| 188 | +This is a straightforward way |
| 189 | +to permit each component |
| 190 | +to verify the identity of the other, |
| 191 | +which is a sufficient basis |
| 192 | +for establishing trust when |
| 193 | +mTLS meshes are involved. |
| 194 | + |
| 195 | +#### The `trustBundle` Stanza |
| 196 | + |
| 197 | +The Mesh and Gateway resources |
| 198 | +both use a common `trustBundle` stanza: |
| 199 | + |
| 200 | +```yaml |
| 201 | +trustBundle: |
| 202 | +  name: configmap-name |
| 203 | +  namespace: configmap-namespace |
| 204 | +  # Key in ConfigMap; defaults to "ca-bundle.crt" |
| 205 | +  bundleKey: ca-bundle.crt |
| 206 | +``` |
| 207 | + |
| 208 | +The `name` field is always required. |
| 209 | +The `namespace` field is required |
| 210 | +in the Mesh resource, |
| 211 | +but may be omitted |
| 212 | +in the Gateway resource |
| 213 | +if the ConfigMap is |
| 214 | +in the same namespace |
| 215 | +as the Gateway resource. |
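
As a sketch of the namespace rule above, a Gateway whose trust bundle ConfigMap lives in the Gateway's own namespace could omit `namespace` entirely. The Gateway name, namespace, and ConfigMap name below are hypothetical, the `apiVersion` is an assumption, and other required Gateway fields are elided. `bundleKey` is also left out, so the documented default of `ca-bundle.crt` would apply.

```yaml
# Illustrative fragment only: all names here are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-ocg
  namespace: ocg-namespace
spec:
  ...
  mesh:
    trustBundle:
      # No namespace: the ConfigMap is expected in ocg-namespace,
      # the same namespace as this Gateway.
      # No bundleKey: defaults to "ca-bundle.crt".
      name: mesh-trust-bundle
```

The same omission would not be valid in the Mesh resource, where `namespace` is required.
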
| 216 | + |
| 217 | +The ConfigMap referred to |
| 218 | +by the `trustBundle` stanza |
| 219 | +MUST contain |
| 220 | +a PEM-encoded trust bundle |
| 221 | +in the specified `bundleKey`, |
| 222 | +for example: |
| 223 | + |
| 224 | +```yaml |
| 225 | +apiVersion: v1 |
| 226 | +kind: ConfigMap |
| 227 | +metadata: |
| 228 | +  name: ocg-trust-bundle |
| 229 | +  namespace: ocg-namespace |
| 230 | +data: |
| 231 | +  ca-bundle.crt: |- |
| 232 | +    -----BEGIN CERTIFICATE----- |
| 233 | +    ... (PEM-encoded CA certificate) ... |
| 234 | +    -----END CERTIFICATE----- |
| 235 | +    ... (may be repeated for multiple CA certificates) ... |
| 236 | +``` |
| 237 | + |
| 238 | +The `trustBundle` in either |
| 239 | +the Mesh resource |
| 240 | +or the Gateway resource |
| 241 | +may refer to a ConfigMap |
| 242 | +in any namespace |
| 243 | +to which RBAC permits access. |
| 244 | + |
| 245 | +##### Further Considerations |
| 246 | + |
| 247 | +The OCG and the mesh |
| 248 | +MAY share the same trust bundle, |
| 249 | +but this is not required. |
| 250 | +If they do, |
| 251 | +the Gateway and Mesh resources |
| 252 | +MAY refer to the same ConfigMap; |
| 253 | +if the two resources |
| 254 | +refer to different ConfigMaps instead, |
| 255 | +those ConfigMaps must (of course) |
| 256 | +contain the same CA certificate(s). |
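
A minimal sketch of the shared case, assuming a single ConfigMap named `shared-trust-bundle` in a namespace named `trust-namespace` (both names are hypothetical, and all other fields of each resource are elided):

```yaml
# Hypothetical names throughout; other required fields are elided.
# Mesh resource: CA certificates trusted for connections from the OCG.
kind: Mesh
spec:
  ...
  ocg:
    trustBundle:
      name: shared-trust-bundle
      namespace: trust-namespace
---
# Gateway resource: CA certificates trusted for connections from meshed peers.
kind: Gateway
spec:
  ...
  mesh:
    trustBundle:
      name: shared-trust-bundle
      namespace: trust-namespace
```

Because both stanzas point at the same ConfigMap and key, the OCG and the mesh end up trusting the same set of CA certificates.
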
| 257 | + |
| 258 | +The `trustBundle` fields |
| 259 | +MUST NOT refer to Secrets. |
| 260 | +Since CA certificates are not private, |
| 261 | +they should not be stored in Secrets. |
| 262 | + |
| 263 | +An alternative to adding |
| 264 | +the `trustBundle` stanza |
| 265 | +to both the Mesh and Gateway resources |
| 266 | +would be to define a single trust bundle, |
| 267 | +requiring the OCG and the mesh |
| 268 | +to each use the same CA certificate. |
| 269 | +This would add considerable operational complexity, |
| 270 | +especially in the world of enterprise PKI, |
| 271 | +without any real benefit. |
| 272 | + |
| 273 | +### Label Selectors: Solving the Discovery Problem |
| 274 | + |
| 275 | +The discovery problem is that |
| 276 | +not every workload in the cluster |
| 277 | +is required to be meshed, |
| 278 | +and the OCG needs a way |
| 279 | +to know which Routes are meshed |
| 280 | +since it must ensure that it |
| 281 | +correctly uses mTLS |
| 282 | +for connections to meshed workloads. |
| 283 | + |
| 284 | +In practice, this isn't |
| 285 | +actually a question of _workloads_ |
| 286 | +but of _Routes_: |
| 287 | +the point of interface |
| 288 | +between a Gateway |
| 289 | +and a workload in the cluster |
| 290 | +is not a Pod or a Service, but |
| 291 | +rather a Route. |
| 292 | + |
| 293 | +The extra-minimal API |
| 294 | +solves this problem |
| 295 | +by adding a label selector |
| 296 | +to the Gateway resource |
| 297 | +that indicates which Routes |
| 298 | +are meshed. |
| 299 | +When the OCG connects |
| 300 | +to any Route |
| 301 | +that either directly matches this selector, |
| 302 | +or is in a namespace that matches this selector, |
| 303 | +it MUST use mTLS |
| 304 | +with a certificate |
| 305 | +that is ultimately signed |
| 306 | +by a CA certificate |
| 307 | +in the Mesh resource's `trustBundle`, |
| 308 | +and the OCG MUST validate |
| 309 | +that the peer presents a certificate |
| 310 | +that is ultimately signed |
| 311 | +by a CA certificate |
| 312 | +in the Gateway resource's `trustBundle`. |
| 313 | + |
| 314 | +The label selector |
| 315 | +is a simple mechanism |
| 316 | +(especially if |
| 317 | +the mesh already uses a label |
| 318 | +to indicate which resources are meshed) |
| 319 | +but it still requires an explicit choice, |
| 320 | +rather than relying on |
| 321 | +assumptions about the whole cluster. |
| 322 | +Additionally, it permits |
| 323 | +operating at the namespace level |
| 324 | +or at the Route level. |
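
To illustrate the two levels, here is a sketch of resources that would match the `labelSelector` from the Gateway example earlier (`mesh: one-mesh-to-mesh-them-all`). The Route and namespace names are hypothetical, and the Route's spec is elided.

```yaml
# Route level: this Route carries the label itself,
# so the OCG treats it as meshed.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: meshed-route
  namespace: default
  labels:
    mesh: one-mesh-to-mesh-them-all
spec:
  ...
---
# Namespace level: the namespace carries the label,
# so Routes in this namespace are treated as meshed.
apiVersion: v1
kind: Namespace
metadata:
  name: meshed-namespace
  labels:
    mesh: one-mesh-to-mesh-them-all
```

A Route that neither carries the label nor lives in a labeled namespace would not be treated as meshed under this selector.
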
| 325 | + |
| 326 | +### Other Problems |
| 327 | + |
| 328 | +The extra-minimal API |
| 329 | +does not solve the |
| 330 | +[protocol problem] |
| 331 | +or the |
| 332 | +[outbound behavior problem]. |
| 333 | +Instead, it assumes that |
| 334 | +the OCG and the mesh |
| 335 | +have prearranged |
| 336 | +protocols and behaviors |
| 337 | +that are mutually compatible. |
| 338 | + |
| 339 | +## Graduation Criteria |
| 340 | + |
| 341 | +Since [GEP-3792] mandates |
| 342 | +that any OCG API |
| 343 | +MUST solve all four problems |
| 344 | +defined in [GEP-3792] |
| 345 | +before graduating to standard, |
| 346 | +this GEP |
| 347 | +MUST NOT graduate to standard |
| 348 | +without significant further work. |
| 349 | + |
| 350 | +## Conformance Details |
| 351 | + |
| 352 | +### Feature Names |
| 353 | + |
| 354 | +This GEP will use the feature names |
| 355 | +`GatewayExtraMinimalOCG` and |
| 356 | +`MeshExtraMinimalOCG`. |
| 357 | + |
| 358 | +### Conformance Tests |
| 359 | + |
| 360 | +TBA. |