Skip to content

Commit ec97d7a

Browse files
authored
Design for session persistence - Part II (#4221)
Problem: Users want sessionPersistence for their server communication Solution: Add a detailed approach on how to support sessionPersistence for NGF while being community conformant.
1 parent 8127574 commit ec97d7a

File tree

1 file changed

+171
-3
lines changed

1 file changed

+171
-3
lines changed
Lines changed: 171 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
# Enhancement Proposal-4051: Session Persistence for NGINX Plus and OSS
22

33
- Issue: https://github.com/nginx/nginx-gateway-fabric/issues/4051
4-
- Status: Provisional
4+
- Status: Implementable
55

66
## Summary
77

@@ -10,16 +10,184 @@ Enable NGINX Gateway Fabric to support session persistence for both NGINX Plus a
1010
## Goals
1111

1212
- Extend the Upstream Settings Policy API to allow specifying `ip_hash` load balancing method to support basic session persistence.
13-
- Design the translation of the Gateway API `sessionPersistence` specification, which can be configured on both HTTPRoute and GRPCRoute, into NGINX Plus cookie-based session persistence directives with `secure` and `httpOnly` mode enforced by default.
13+
- Design the translation of the Gateway API `sessionPersistence` specification, which can be configured on both HTTPRoute and GRPCRoute, into NGINX Plus cookie-based session persistence directives.
1414

1515
## Non-Goals
1616

1717
- Describe or implement low-level configuration details for enabling session persistence.
1818
- Extend session persistence support to TLSRoutes or other Layer 4 route types.
19-
- Supporting the `sameSite` cookie directive for NGINX Plus session persistence, which may be considered in the future as the Gateway API `sessionPersistence` specification evolves.
19+
- Supporting the `secure`, `httponly`, `sameSite` cookie directive for NGINX Plus session persistence, which may be considered in the future as the Gateway API `sessionPersistence` specification evolves.
20+
21+
## Introduction
22+
23+
For NGINX OSS, session persistence is enabled by setting `loadBalancingMethod: ip_hash` on UpstreamSettingsPolicy, which adds the `ip_hash` directive to upstreams and provides IP-based affinity.
24+
For NGINX Plus, session persistence defined on `HTTPRouteRule/GRPCRouteRule` is translated into sticky cookie upstream configuration with host-only cookies and a path derived from HTTPRoute matches (or defaulted for GRPCRoutes), so sessions stick to a chosen backend.
25+
26+
### Understanding the NGINX directives
27+
28+
**ip_hash**
29+
30+
The [ip_hash](https://nginx.org/en/docs/http/ngx_http_upstream_module.html#ip_hash) directive enables session persistence by routing requests from the same client IP address to the same upstream server. It uses the client’s IP address as a hash key to determine the target server, ensuring consistent routing for users behind a single IP. If the chosen server becomes unavailable, NGINX automatically selects the next available upstream server.
31+
32+
Syntax:
33+
34+
```bash
35+
ip_hash;
36+
```
37+
38+
**sticky (cookie method)**
39+
40+
The [sticky](https://nginx.org/en/docs/http/ngx_http_upstream_module.html#sticky) directive enables session persistence using a cookie to identify the upstream server handling a client’s session. When configured with the cookie parameter, NGINX sends a cookie token in the client response in a `Set-Cookie` header, allowing the browser to route subsequent requests with that cookie to the same upstream server.
41+
42+
Syntax:
43+
44+
```bash
45+
sticky cookie name [expires=time] [domain=domain] [httponly] [samesite=strict|lax|none|$variable] [secure] [path=path];
46+
```
47+
48+
Key Parameters:
49+
cookie <name> – Defines the session cookie name.
50+
expires=<time> – Sets cookie lifetime; omit to make it session-based. `max` – Special value for expires that sets expiry to `31 Dec 2037 23:55:55 GMT`.
51+
domain=<domain> - Sets the domain for the cookie scope.
52+
path=<path> - Sets the path for the cookie scope.
53+
samesite=[strict|lax|none|$variable] - Sets the sameSite attribute for the cookie.
54+
secure - Sets the `secure` attribute for the cookie.
55+
httponly - Sets the `httponly` attribute for the cookie.
56+
57+
## API, Customer Driven Interfaces, and User Experience
58+
59+
### Session Persistence for NGINX OSS users
60+
61+
In OSS, session persistence is provided by configuring upstreams to use the `ip_hash` load-balancing method. NGINX hashes the client IP to select an upstream server, so requests from the same IP are routed to the same upstream as long as it is available. If that server becomes unavailable, NGINX automatically selects another server in the upstream group. Session affinity quality with `ip_hash` depends on NGINX seeing the real client IP. In environments with external load balancers or proxies, operators must ensure appropriate `real_ip_header/set_real_ip_from` configuration so that `$remote_addr` reflects the end-user address otherwise, stickiness will be determined by the address of the front-end proxy rather than the actual client.
62+
63+
To surface this behavior, UpstreamSettingsPolicy is extended with a load-balancing method field:
64+
65+
```go
66+
// UpstreamSettingsPolicySpec defines the desired state of the UpstreamSettingsPolicy.
67+
type UpstreamSettingsPolicySpec struct {
68+
// ZoneSize is the size of the shared memory zone used by the upstream. This memory zone is used to share
69+
// the upstream configuration between nginx worker processes. The more servers that an upstream has,
70+
// the larger memory zone is required.
71+
// Default: OSS: 512k, Plus: 1m.
72+
// Directive: https://nginx.org/en/docs/http/ngx_http_upstream_module.html#zone
73+
//
74+
// +optional
75+
ZoneSize *Size `json:"zoneSize,omitempty"`
76+
77+
// KeepAlive defines the keep-alive settings.
78+
//
79+
// +optional
80+
KeepAlive *UpstreamKeepAlive `json:"keepAlive,omitempty"`
81+
82+
// LoadBalancingMethod specifies the load balancing algorithm to be used for the upstream.
83+
//
84+
// +optional
85+
// +kubebuilder:default:=random two least_conn
86+
LoadBalancingMethod *LoadBalancingType `json:"loadBalancingMethod,omitempty"`
87+
88+
// TargetRefs identifies API object(s) to apply the policy to.
89+
// Objects must be in the same namespace as the policy.
90+
// Support: Service
91+
//
92+
// TargetRefs must be _distinct_. The `name` field must be unique for all targetRef entries in the UpstreamSettingsPolicy.
93+
//
94+
// +kubebuilder:validation:MinItems=1
95+
// +kubebuilder:validation:MaxItems=16
96+
// +kubebuilder:validation:XValidation:message="TargetRefs Kind must be: Service",rule="self.all(t, t.kind=='Service')"
97+
// +kubebuilder:validation:XValidation:message="TargetRefs Group must be core",rule="self.exists(t, t.group=='') || self.exists(t, t.group=='core')"
98+
// +kubebuilder:validation:XValidation:message="TargetRef Name must be unique",rule="self.all(p1, self.exists_one(p2, p1.name == p2.name))"
99+
//nolint:lll
100+
TargetRefs []gatewayv1alpha2.LocalPolicyTargetReference `json:"targetRefs"`
101+
}
102+
103+
// LoadBalancingType defines supported load balancing methods.
104+
//
105+
// +kubebuilder:validation:Enum=ip_hash;random two least_conn
106+
type LoadBalancingType string
107+
108+
const (
109+
// LoadBalancingTypeIPHash enables IP hash-based load balancing,
110+
// ensuring requests from the same client IP are routed to the same upstream server.
111+
// NGINX directive: https://nginx.org/en/docs/http/ngx_http_upstream_module.html#ip_hash
112+
LoadBalancingTypeIPHash LoadBalancingType = "ip_hash"
113+
114+
// LoadBalancingTypeRandomTwoLeastConnection enables a variation of least-connections
115+
// balancing that randomly selects two servers and forwards traffic to the one with
116+
// fewer active connections.
117+
// NGINX directive least_conn: https://nginx.org/en/docs/http/ngx_http_upstream_module.html#least_conn
118+
// NGINX directive random: https://nginx.org/en/docs/http/ngx_http_upstream_module.html#random
119+
LoadBalancingTypeRandomTwoLeastConnection LoadBalancingType = "random two least_conn"
120+
)
121+
```
122+
123+
Note: `LoadBalancingMethod` is optional and defaults to `random two least_conn` in NGINX Gateway Fabric, even though NGINX itself defaults to `round_robin` load balancing. Adding this optional field is a non-breaking change and does not require a version bump in alignment with the [Kubernetes API compatibility guidelines](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md#on-compatibility).
124+
125+
### Session Persistence for NGINX Plus users
126+
127+
In NGINX Plus, session persistence is implemented with the `sticky` directive.The directive supports cookie, header, and learn modes; this design only discusses the cookie-based method and the rest are out of scope.
128+
Users can configure [sessionPersistence](https://gateway-api.sigs.k8s.io/reference/spec/?h=sessionpersistence#sessionpersistence) on HTTPRouteRule or GRPCRouteRule, and NGINX Gateway Fabric will map that configuration to `sticky cookie` and associated cookie attributes as described below. The current specification for Session Persistence can be found [here](https://gateway-api.sigs.k8s.io/reference/spec/#sessionpersistence).
129+
130+
#### Mapping the Gateway API fields to NGINX directives
131+
132+
| Spec Field | NGINX Directive | Notes / Limitations |
133+
|----------------------------------------|----------------------------|-------------------------------------------------------------------------------------------------------------------------------------|
134+
| `sessionName` | `name` | Direct mapping to `sticky cookie` name. |
135+
| `absoluteTimeout` | `expires` | Only used when `cookieConfig.lifetimeType=Permanent`; not enforced for `Session` cookies. |
136+
| `idleTimeout` | _not supported_ | NGINX does not support idle-based invalidation for sticky cookies. Sessions expire only when the cookie expires or the session ends.|
137+
| `type` | `cookie` | Only cookie-based persistence is supported. If Header is specified, the sessionPersistence spec is ignored and a warning/status message is reported on the route, but the route itself remains valid. |
138+
| `cookieConfig.lifetimeType=Session` | _no `expires` set_ | Session cookies expire when the browser session ends. |
139+
| `cookieConfig.lifetimeType=Permanent` | `expires=<absoluteTimeout>`| Cookie persists until the specified timeout. `absoluteTimeout` is required when `lifetimeType` is `Permanent`. |
140+
| no matching spec field | _no `domain` attribute_ | Cookies are host-only for both `HTTPRoute` and `GRPCRoute`. |
141+
| no matching spec field | `path` | Behavior is described separately for `HTTPRoute` below. |
142+
143+
#### Domain and Path selection for Routes
144+
145+
Cookies use the [domain](https://datatracker.ietf.org/doc/html/rfc6265?#section-5.1.3) and [path](https://datatracker.ietf.org/doc/html/rfc6265?#section-5.1.4) attributes to control when the browser sends them back to the server. Domain limits the cookie to a host (and its subdomains, if set), while path limits it to URLs under a specific path prefix. Together they control where the browser sends the cookie, and therefore where session persistence actually applies.
146+
147+
For **HTTPRoutes**, we do not set the `domain` attribute. Deriving a broader domain (for example, a common suffix across hostnames or a parent domain) would widen the cookie scope to sibling subdomains and increase the risk of cross-host leakage. Since users cannot explicitly configure this field, inferring a shared domain would also be vulnerable to abuse. Leaving domain unset ensures each cookie is scoped to the exact host that issued it.
148+
149+
To determine the cookie `path` for HTTPRoutes, we handle the simple case where there is a single path match as follows:
150+
151+
| Path Value | Path Match Type | Cookie `Path` Value | Cookie Match Expectations |
152+
|-------------------------------------|-----------------|---------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|
153+
| `/hello-exact` | Exact | `/hello-exact` | Cookie header is sent for `/hello-exact` path only. |
154+
| `/hello-prefix` | Prefix | `/hello-prefix` | Cookie header is sent for `/hello-prefix` and any subpath starting with `/hello-prefix` (e.g. `/hello-prefix/foo`). |
155+
| `/hello-regex/[a-zA-Z0-9_-]+$` | Regex | `/hello-regex` | Cookie header is sent for any request whose path starts with `/hello-regex` and matches the regex in the location block (e.g. `/hello-regex/a`, `/hello-regex/abc123`). The regex still determines which requests match the route on the server side. |
156+
157+
When there are multiple path matches that share the same sessionPersistence configuration, we derive a single cookie path by computing the longest common prefix that ends on a path-segment boundary `/`. If no non-empty common prefix on a segment boundary exists, we fall back to `/` which is allowing all paths.
158+
159+
For **GRPCRoutes**, we do not set explicit cookie `domain` or `path` attributes. Leaving `domain` unset keeps cookies host-only, and omitting `path` means the user agent applies its default path derivation. This avoids guessing a cookie scope from gRPC routing metadata. gRPC routing is driven by a combination of listener hostnames, methods, and header matches, none of which map cleanly onto a single stable cookie scope: methods are too granular, hostnames may be broad or wildcarded, and header-based matches are inherently dynamic. Any attempt to derive a `domain` or `path` from this information would likely be ambiguous or over-scoped.
160+
161+
These decisions let HTTPRoute traffic benefit from path-scoped cookies while keeping cookie domain to host-only for both HTTPRoutes and GRPCRoutes to avoid cross-host leakage.
162+
For GRPCRoutes, we only provide basic sessionPersistence because typical gRPC clients do not implement browser-style cookie storage and replay. Cookies are treated as ordinary headers, so applications must handle them explicitly rather than relying on an automatic client-side cookie store.
163+
164+
## Use Cases
165+
166+
This enhancement targets apps that need straightforward session persistence, such as keeping a user on the same backend across multiple requests or supporting stateful services that keep session data in memory. Session persistence keeps a client pinned to one upstream while it’s healthy instead of re-randomizing on each request.
167+
168+
## Testing
169+
170+
There are no existing conformance tests for session persistence, so we will add functional tests to verify end-to-end behavior for both OSS and Plus. For OSS, tests will confirm that `ip_hash` keeps a client pinned to a single upstream while it is healthy. For Plus, tests will verify that `sessionPersistence` produces the expected `sticky cookie` configuration for both HTTPRoute and GRPCRoute and that requests with a valid session cookie are routed consistently to the same upstream.
171+
172+
## Security Considerations
173+
174+
The main security concern is how far session cookies reach. This design keeps cookies host-only by never setting the domain attribute, and for HTTPRoutes it scopes cookies by route path (or `/` when no safe common prefix exists). That limits both cross-host and cross-path leakage and reduces the impact of a compromised cookie.
175+
176+
### Edge Cases
177+
178+
- If an implementation routes through Service IPs, any Gateway-level session persistence must be rejected when Service-level session affinity is enabled. In our case, the data plane routes directly to pod IPs, so Service affinity does not interfere with session persistence between the gateway and backends.
179+
- For traffic-splitting configurations, if cookie-based session persistence is enabled, sessions must remain pinned consistently across the split backends.
180+
181+
### Future work
182+
183+
- Define clear precedence and additional restrictions when SessionPersistence is configured via a separate policy.
184+
- Add support for the `sameSite`, `secure`, `httponly` cookie attribute in a way that remains compliant with the Gateway API specification.
20185

21186
## Useful Links
22187

23188
- Session Persistence [specification](https://gateway-api.sigs.k8s.io/reference/spec/#sessionpersistence).
24189
- Extended Session Persistence [GEP](https://gateway-api.sigs.k8s.io/geps/gep-1619).
25190
- RFC standard for [Set-Cookie](https://datatracker.ietf.org/doc/html/rfc6265#section-4.1) header.
191+
- [Security risks with subdomain](https://blog.stackademic.com/session-security-risks-with-subdomains-2802c56d681f).
192+
- [Cookie Security](https://www.appsecmonkey.com/blog/cookie-security), read the section `Malicious Subdomains`.
193+
- [gRPC Metadata](https://grpc.io/docs/guides/metadata/).

0 commit comments

Comments
 (0)