
Commit de6d664

Enhancements to the BackendTLSPolicy GEP (#3835)
* Enhancements to the BackendTLSPolicy GEP
  Fixes #3516
* Update GEP with new semantics
* Add 'from'
* revert go type changes
* Address some comments
* Revert changes to Go types
* Address Candace's comments
* Revert everything again
* Minimal changes
* Address comments
* Reference system certs
* Address TLS passthrough
* clarify persona
1 parent e1310bb commit de6d664

File tree

2 files changed

+62
-54
lines changed


geps/gep-1897/images/mesh.png

67.5 KB

geps/gep-1897/index.md

Lines changed: 62 additions & 54 deletions
@@ -22,18 +22,15 @@ the service or backend owner wants to validate the clients connecting to it, two
2222
1. The solution must satisfy the following use case: the backend pod has its own
2323
certificate and the gateway implementation client needs to know how to connect to the
2424
backend pod. (Use case #4 in [Gateway API TLS Use Cases](#references))
25-
2. In terms of the Gateway API personas, only the application developer persona applies in this
26-
solution. The application developer should control the gateway to backend TLS settings,
27-
not the cluster operator, as requiring a cluster operator to manage certificate renewals
28-
and revocations would be extremely cumbersome.
29-
3. The solution should consider client certificate settings used in the TLS handshake **from
30-
Gateway to backend**, such as server name indication, trusted certificates,
31-
and CA certificates.
25+
2. In this GEP, only the application developer persona will have control over TLS settings. This does not preclude adding other personas in future GEPs.
26+
3. The solution should consider client TLS settings used in the TLS handshake **from
27+
Gateway to backend**, such as server name indication and trusted CA certificates.
28+
4. Both Gateway and Mesh use cases may be supported, depending on the implementation, and will be covered by features in each case.
3229

3330
## Longer Term Goals
3431

3532
These are worthy goals, but deserve a different GEP for proper attention. This GEP is concerned entirely with the
36-
controlplane, i.e. the hop between gateway and backend.
33+
hop between gateway client and backend.
3734

3835
1. [TCPRoute](../../reference/spec.md#gateway.networking.k8s.io/v1alpha2.TCPRoute) and
3936
[GRPCRoute](../../reference/spec.md#gateway.networking.k8s.io/v1alpha2.GRPCRoute) use cases
@@ -59,7 +56,16 @@ These are worthy goals, but will not be covered by this GEP.
5956
6. Controlling certificates used by more than one workload (#6 in [Gateway API TLS Use Cases](#references))
6057
7. Client certificate settings used in TLS **from external clients to the
6158
Listener** (#7 in [Gateway API TLS Use Cases](#references))
62-
8. Providing a mechanism for the cluster operator to override gateway to backend TLS settings.
59+
8. Service Mesh "mesh transport security".
60+
9. Providing a mechanism for the cluster operator to override gateway to backend TLS settings.
61+
62+
> It is very common for service mesh implementations to implement some form of transparent transport security, whether that is WireGuard, mTLS, or others.
63+
> This is completely orthogonal to the use cases being tackled by this GEP.
64+
> * The "mesh transport security" is something invisible to the user's application, and is simply used to secure communication between components in the mesh.
65+
> * This proposal, instead, explicitly calls for sending TLS **to the user's application**.
66+
> However, this does not mean service meshes are outside of scope for this proposal, merely that only the application-level TLS configuration is in scope.
67+
68+
![](images/mesh.png "Mesh transport")
6369

6470
## Already Solved TLS Use Cases
6571

@@ -83,16 +89,16 @@ Gateway API is missing a mechanism for separately providing the details for the
8389
including (but not limited to):
8490

8591
* intent to use TLS on the backend hop
86-
* client certificate of the gateway
87-
* system certificates to use in the absence of client certificates
92+
* CA certificates to trust
93+
* other properties of the TLS handshake, such as SNI and SAN validation
94+
* client certificate of the gateway (outside of scope for this GEP)
8895

8996
## Purpose - why do we want to do this?
9097

9198
This proposal is _very_ tightly scoped because we have tried and failed to address this well-known
9299
gap in the API specification. The lack of support for this fundamental concept is holding back
93-
Gateway API adoption by users that require a solution to the use case. One of the recurring themes
94-
that has held up the prior art has been interest related to service mesh, and as such this proposal
95-
focuses explicitly on the ingress use case in the initial round. Another reason for the tight scope
100+
Gateway API adoption by users that require a solution to the use case.
101+
Another reason for the tight scope
96102
is that we have been too focused on a generic representation of everything that TLS can do, which
97103
covers too much ground to address in a single GEP.
98104

@@ -150,10 +156,10 @@ Because naming is hard, a new name may be
150156
substituted without blocking acceptance of the content of the API change.
151157

152158
The selection of the applicable Gateway API persona is important in the design of BackendTLSPolicy, because it provides
153-
a way to explicitly describe the _expectations_ of the connection to the application. BackendTLSPolicy is configured
154-
by the application developer Gateway API persona to signal what the application developer _expects_ in connections to
155-
the application, from a TLS perspective. Only the application developer can know what the application expects, so it is
156-
important that this configuration be managed by that persona.
159+
a way to explicitly describe the _expectations_ of the connection to the application.
160+
In this GEP, BackendTLSPolicy will be configured only by the application developer Gateway API persona to tell gateway clients how to connect to
161+
the application, from a TLS perspective.
162+
Future iterations *may* expand this to additionally allow consumer overrides; see [Future plans](#future-plans).
157163

158164
During the course of discussion of this proposal, we did consider allowing the cluster operator persona to have some access
159165
to Gateway cert validation, but as mentioned, BackendTLSPolicy is used primarily to signal what the application
@@ -170,18 +176,14 @@ as a TLS Client:
170176

171177
- An explicit signal that TLS should be used by this connection.
172178
- A hostname the Gateway should use to connect to the backend.
173-
- A reference to one or more certificates to use in the TLS handshake, signed by a CA or self-signed.
174-
- An indication that system certificates may be used.
179+
- A reference to one or more CA certificates (which could include "system certificates") to validate the server's TLS certificates.
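As a rough illustration of how a gateway client might turn these three inputs into a TLS client configuration, here is a minimal Go sketch. The `backendTLS` struct and `clientConfig` function are hypothetical names invented for this example and are not part of the Gateway API types; only the standard library `crypto/tls` and `crypto/x509` calls are real.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// backendTLS is a hypothetical, simplified view of the inputs listed above.
// It is not a Gateway API type.
type backendTLS struct {
	Hostname     string // used for SNI and for verifying the server certificate
	CABundlePEM  []byte // PEM-encoded CA certificates to trust; may be empty
	UseSystemCAs bool   // trust the host's system certificates instead
}

// clientConfig builds a tls.Config for the gateway-to-backend hop.
func clientConfig(p backendTLS) (*tls.Config, error) {
	if p.Hostname == "" {
		return nil, errors.New("hostname is required")
	}
	cfg := &tls.Config{ServerName: p.Hostname, MinVersion: tls.VersionTLS12}
	switch {
	case len(p.CABundlePEM) > 0:
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(p.CABundlePEM) {
			return nil, errors.New("no valid CA certificates in bundle")
		}
		cfg.RootCAs = pool
	case p.UseSystemCAs:
		// Leaving RootCAs nil makes crypto/tls use the host's root CA set.
	default:
		return nil, errors.New("no trust source configured")
	}
	return cfg, nil
}

func main() {
	cfg, err := clientConfig(backendTLS{Hostname: "app.example.com", UseSystemCAs: true})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("SNI:", cfg.ServerName)
}
```

The sketch refuses to build a configuration when no trust source is given, rather than silently skipping verification; that is a design choice of this example, not a requirement stated in the GEP.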
175180

176181
BackendTLSPolicy is defined as a Direct Policy Attachment without defaults or overrides, applied to a Service that
177182
accesses the backend in question, where the BackendTLSPolicy resides in the same namespace as the Service it is
178-
applied to. The BackendTLSPolicy and the Service must reside in the same namespace in order to prevent the
179-
complications involved with sharing trust across namespace boundaries. We chose the Service resource as a target,
183+
applied to. For now, the BackendTLSPolicy and the Service must reside in the same namespace in order to prevent the
184+
complications involved with sharing trust across namespace boundaries (see [Future plans](#future-plans)). We chose the Service resource as a target,
180185
rather than the Route resource, so that we can reuse the same BackendTLSPolicy for all the different Routes that
181186
might point to this Service.
182-
For the use case where certificates are stored in their own namespace, users may create Secrets and use ReferenceGrants
183-
for a BackendTLSPolicy-to-Secret binding. Implementations must respect a ReferenceGrant for cross-namespace Secret
184-
sharing to BackendTLSPolicy, even if they don't for other cross-namespace sharing.
185187

186188
One of the areas of concern for this API is that we need to indicate how and when the API implementations should use the
187189
backend destination certificate authority. This solution proposes, as introduced in
@@ -194,6 +196,8 @@ that is appropriate, such as one of the HTTP error codes: 400 (Bad Request), 401
194196
other signal that makes the failure sufficiently clear to the requester without revealing too much about the transaction,
195197
based on established security requirements.
196198

199+
BackendTLSPolicy applies only to TCP traffic. If a policy explicitly attaches to a UDP port of a Service (that is, the `targetRef` has a `sectionName` specifying a single port or the service has only 1 port), the `Accepted: False` Condition with `Reason: Invalid` MUST be set. If the policy attaches to a mix of TCP and UDP ports, implementations SHOULD include a warning in the `Accepted` condition message (`ancestors.conditions`); the policy will only be effective for the TCP ports.
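The TCP/UDP rule above could be sketched roughly as follows in Go. The `status` type and `evaluate` function are hypothetical and exist only for illustration. The sketch assumes the caller has already resolved which Service ports the policy attaches to, uses `"Accepted"` as the reason string for the accepting case as an assumption, and treats a policy that attaches to several ports that are all UDP as invalid, which the text does not explicitly cover.

```go
package main

import "fmt"

// status is a hypothetical, simplified stand-in for the policy's Accepted condition.
type status struct {
	Accepted bool
	Reason   string
	Message  string
}

// evaluate inspects the protocols ("TCP" or "UDP") of the Service ports the
// policy attaches to. Other protocols are ignored in this sketch.
func evaluate(protocols []string) status {
	tcp, udp := 0, 0
	for _, p := range protocols {
		switch p {
		case "TCP":
			tcp++
		case "UDP":
			udp++
		}
	}
	switch {
	case udp > 0 && tcp == 0:
		// Only UDP ports are targeted: Accepted: False with Reason: Invalid.
		return status{Accepted: false, Reason: "Invalid",
			Message: "BackendTLSPolicy applies only to TCP traffic"}
	case udp > 0:
		// Mixed ports: accept, but warn that UDP ports are unaffected.
		return status{Accepted: true, Reason: "Accepted",
			Message: fmt.Sprintf("warning: %d UDP port(s) ignored; policy is effective only for TCP ports", udp)}
	default:
		return status{Accepted: true, Reason: "Accepted"}
	}
}

func main() {
	fmt.Printf("%+v\n", evaluate([]string{"UDP"}))
	fmt.Printf("%+v\n", evaluate([]string{"TCP", "UDP"}))
}
```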
200+
197201
All policy resources must include `TargetRefs` with the fields specified
198202
in [PolicyTargetReference](https://github.com/kubernetes-sigs/gateway-api/blob/a33a934af9ec6997b34fd9b00d2ecd13d143e48b/apis/v1alpha2/policy_types.go#L24-L41).
199203
In an upcoming [extension](https://github.com/kubernetes-sigs/gateway-api/issues/2147) to TargetRefs, policy resources
@@ -238,35 +242,22 @@ Thus, the following additions would be made to the Gateway API:
238242

239243
## How a client behaves
240244

241-
This table describes the effect that a BackendTLSPolicy has on a Route. There are only two cases where the
242-
BackendTLSPolicy will signal a Route to connect to a backend using TLS, an HTTPRoute with a backend that is targeted
243-
by a BackendTLSPolicy, either with or without listener TLS configured. (There are a few other cases where it may be
244-
possible, but is implementation dependent.)
245-
246-
Every implementation that claims supports for BackendTLSPolicy should document for which Routes it is being implemented.
247-
248-
| Route Type | Gateway Config | Backend is targeted by a BackendTLSPolicy? | Connect to backend with TLS? |
249-
|------------|----------------------------|-----------------------------------------------|-------------------------------|
250-
| HTTPRoute | Listener tls | Yes | **Yes** |
251-
| HTTPRoute | No listener tls | Yes | **Yes** |
252-
| HTTPRoute | Listener tls | No | No |
253-
| HTTPRoute | No listener tls | No | No |
254-
| TLSRoute | Listener Mode: Passthrough | Yes | No |
255-
| TLSRoute | Listener Mode: Terminate | Yes | Implementation-dependent |
256-
| TLSRoute | Listener Mode: Passthrough | No | No |
257-
| TLSRoute | Listener Mode: Terminate | No | No |
258-
| TCPRoute | Listener TLS | Yes | Implementation-dependent |
259-
| TCPRoute | No listener TLS | Yes | Implementation-dependent |
260-
| TCPRoute | Listener TLS | No | No |
261-
| TCPRoute | No listener TLS | No | No |
262-
| UDPRoute | Listener TLS | Yes | No |
263-
| UDPRoute | No listener TLS | Yes | No |
264-
| UDPRoute | Listener TLS | No | No |
265-
| UDPRoute | No listener TLS | No | No |
266-
| GRPCRoute | Listener TLS | Yes | Implementation-dependent |
267-
| GRPCRoute | No Listener TLS | Yes | Implementation-dependent |
268-
| GRPCRoute | Listener TLS | No | No |
269-
| GRPCRoute | No Listener TLS | No | No |
245+
The `BackendTLSPolicy` tells a client "Connect to this service using TLS".
246+
This applies regardless of the type of traffic the gateway client is forwarding.
247+
248+
For instance, the following will all have the gateway client add TLS if the backend is targeted by a BackendTLSPolicy:
249+
250+
* A Gateway accepts traffic on an HTTP listener
251+
* A Gateway accepts and terminates TLS on an HTTPS listener
252+
* A Gateway accepts traffic on a TCP listener
253+
254+
There is no need for a Gateway that accepts traffic with `Mode: Passthrough` to do anything differently here, but implementations MAY choose to treat TLS passthrough as a special case. Implementations that do this SHOULD clearly document their approach if BackendTLSPolicy is treated differently for TLS passthrough.
255+
256+
Note that there are cases where these patterns may result in multiple layers of TLS on a single connection.
257+
There may even be cases where the gateway implementation is unaware of this; for example, when processing TCPRoute traffic, the traffic may or may not be TLS, and the gateway would be unaware.
258+
This is intentional to allow full fidelity of the API, as this is commonly desired for tunneling scenarios.
259+
When users do not want this, they should ensure that the BackendTLSPolicy is not incorrectly applied to traffic that is already TLS.
260+
The [Future Plans](#future-plans) include more controls over the API to make this easier to manage.
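To make the behaviour described in this section concrete, here is a small Go sketch of the decision. All names (`listenerKind`, `originateTLS`, `mayLayerTLS`) are hypothetical and chosen for this example. It assumes an implementation that does not special-case TLS passthrough, which the text permits but does not require.

```go
package main

import "fmt"

// listenerKind is a hypothetical label for how the Gateway accepted the traffic.
type listenerKind string

const (
	listenerHTTP        listenerKind = "HTTP"
	listenerHTTPS       listenerKind = "HTTPS (Terminate)"
	listenerTCP         listenerKind = "TCP"
	listenerPassthrough listenerKind = "TLS (Passthrough)"
)

// originateTLS reports whether the gateway client adds TLS toward the backend.
// The listener kind is deliberately ignored: only the policy matters.
func originateTLS(_ listenerKind, targeted bool) bool {
	return targeted
}

// mayLayerTLS reports whether the forwarded bytes could already be TLS, so
// that adding TLS may produce multiple layers on a single connection.
func mayLayerTLS(l listenerKind, targeted bool) bool {
	return targeted && (l == listenerTCP || l == listenerPassthrough)
}

func main() {
	kinds := []listenerKind{listenerHTTP, listenerHTTPS, listenerTCP, listenerPassthrough}
	for _, l := range kinds {
		fmt.Printf("%-18s tls=%v layered=%v\n", l, originateTLS(l, true), mayLayerTLS(l, true))
	}
}
```

For a TCP listener the sketch can only say that layering is possible, since the gateway cannot tell whether the forwarded bytes are already TLS.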
270261

271262
## Request Flow
272263

@@ -281,6 +272,23 @@ reverse proxy. This is shown as **bolded** additions in step 6 below.
281272
6. Lastly, the reverse proxy **optionally performs a TLS handshake** and forwards the request to one or more objects,
282273
i.e. Service, in the cluster based on backendRefs rules of the HTTPRoute **and the TargetRefs of the BackendTLSPolicy**.
283274

275+
## Future plans
276+
277+
In order to scope this GEP, some changes are deferred to a near-future GEP.
278+
That future GEP is intended to let gateway clients override TLS settings, following previously established patterns of [consumer and producer policies](https://gateway-api.sigs.k8s.io/concepts/glossary/?h=gloss#producer-route).
279+
Additionally, more contextual control over when to apply the policies will be explored, to enable use cases like "apply TLS only from this route" ([issue](https://github.com/kubernetes-sigs/gateway-api/issues/3856)).
280+
281+
While the details of these plans are out of scope for this GEP, it is important to be aware of them so that the immediate-term design remains compatible with what is proposed.
282+
283+
Implementations should plan for the existence of future fields that may be added that will control where the TLS policy applies.
284+
These may include, but are not limited to:
285+
286+
* `spec.targetRefs.namespace`
287+
* `spec.targetRefs.from`
288+
* `spec.mode`
289+
290+
While adding new fields may in some cases be seen as a backwards-compatibility risk, because older implementations will not know to respect them, these fields (or similar ones, should future GEPs decide on new names) are pre-approved to be added in a future release, provided the GEPs that add them are approved in the first place.
291+
284292
## Alternatives
285293
Most alternatives are enumerated in the section "The history of backend TLS". A couple of additional
286294
alternatives are also listed here.
