Releases: apollographql/router
v2.10.1
🐛 Fixes
Enforce feature restrictions for warning-state licenses
The router now enforces license restrictions even when a license is in a warning state. Previously, warning-state licenses could bypass enforcement for restricted features.
If your deployment uses restricted features, the router returns an error instead of continuing to run.
By @aaronArinder in #8768
🧪 Experimental
Add experimental_hoist_orphan_errors to control orphan error path assignment
The GraphQL specification requires that errors include a path pointing to the most specific field where the error occurred. When a subgraph returns entity errors without valid paths, the router's default behavior is its closest attempt at spec compliance: it assigns each error to every matching entity path in the response. This is the correct behavior when subgraphs respond correctly.
However, when a subgraph returns a large number of entity errors without valid paths — for example, 2000 errors for 2000 expected entities — this causes a multiplicative explosion in the error array that can lead to significant memory pressure and out-of-memory kills. The root cause is the subgraph: a spec-compliant subgraph includes correct paths on its entity errors, and fixing the subgraph is the right long-term solution.
The new experimental_hoist_orphan_errors configuration provides an important mitigation while you work toward that fix. When enabled, the router assigns each orphaned error to the nearest non-array ancestor path instead of duplicating it across every entity. This trades spec-precise path assignment for substantially reduced error volume in the response — a conscious trade-off, not a strict improvement.
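To make the trade-off concrete, here is a minimal Python sketch of the two assignment strategies. The function names are hypothetical, and the hoisting rule used here (cut the entity path at its first list index) is an assumption made for illustration. The router's actual rule may differ.

```python
from typing import Union

PathSegment = Union[str, int]
Path = list[PathSegment]


def assign_to_every_entity(errors: list[str], entity_paths: list[Path]) -> list[tuple[str, Path]]:
    """Default behavior as described above: copy each orphan error to every entity path."""
    return [(message, path) for message in errors for path in entity_paths]


def hoist_path(path: Path) -> Path:
    """Assumed hoisting rule: keep only the segments before the first list index."""
    prefix: Path = []
    for segment in path:
        if isinstance(segment, int):
            break
        prefix.append(segment)
    return prefix


def assign_hoisted(errors: list[str], entity_paths: list[Path]) -> list[tuple[str, Path]]:
    """Hoisted behavior: report each orphan error once, at the shared ancestor path."""
    ancestor = hoist_path(entity_paths[0]) if entity_paths else []
    return [(message, ancestor) for message in errors]


# 50 expected entities and 50 orphan errors (a scaled-down version of the 2000 x 2000 case).
entity_paths = [["topProducts", i] for i in range(50)]
errors = [f"entity error {i}" for i in range(50)]

duplicated = assign_to_every_entity(errors, entity_paths)
hoisted = assign_hoisted(errors, entity_paths)
print(len(duplicated), len(hoisted))  # 2500 50
```

The duplicated list grows with the product of errors and entities, while the hoisted list grows only with the number of errors, which is why hoisting reduces volume without imposing a cap.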
To target a specific subgraph:
experimental_hoist_orphan_errors:
  subgraphs:
    my_subgraph:
      enabled: true
To target all subgraphs:
experimental_hoist_orphan_errors:
  all:
    enabled: true
To target all subgraphs except one:
experimental_hoist_orphan_errors:
  all:
    enabled: true
  subgraphs:
    noisy_one:
      enabled: false
Per-subgraph settings override all. Note that this feature reduces the number of propagated errors but doesn't impose a hard cap: if your subgraph returns an extremely large number of errors, the router still processes all of them.
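The precedence rule (per-subgraph settings override all) can be sketched in Python as follows. The function name is hypothetical, and the fallback to disabled when neither level sets a value is an assumption, not documented behavior.

```python
def hoisting_enabled(config: dict, subgraph: str) -> bool:
    """Resolve the effective `enabled` value for one subgraph."""
    per_subgraph = config.get("subgraphs", {}).get(subgraph, {})
    if "enabled" in per_subgraph:
        # A per-subgraph setting always wins over `all`.
        return per_subgraph["enabled"]
    # Assumption: the option is off when nothing is configured.
    return config.get("all", {}).get("enabled", False)


# Mirrors the "all subgraphs except one" configuration.
config = {
    "all": {"enabled": True},
    "subgraphs": {"noisy_one": {"enabled": False}},
}

print(hoisting_enabled(config, "noisy_one"))  # False
print(hoisting_enabled(config, "products"))   # True
```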
You'll likely know if you need this. Use it sparingly, and enable it only if you're affected and have been advised to do so. The behavior of this option is expected to change in a future release.
For full configuration reference and additional examples, see the experimental_hoist_orphan_errors documentation.
By @aaronArinder in #8998
v2.10.1-rc.2
2.10.1-rc.2
v2.10.1-rc.1
2.10.1-rc.1
v2.12.0
🚀 Features
Support Unix domain socket (UDS) communication for coprocessors (Issue #5739)
Many coprocessor deployments run side-by-side with the router, typically on the same host (for example, within the same Kubernetes pod).
This change brings coprocessor communication to parity with subgraphs by adding Unix domain socket (UDS) support. When the router and coprocessor are co-located, communicating over a Unix domain socket bypasses the full TCP/IP network stack and uses shared host memory instead, which can meaningfully reduce latency compared to HTTP.
Add redact_query_validation_errors supergraph config option (PR #8888)
The new redact_query_validation_errors option in the supergraph configuration section replaces all query validation errors with a single generic error:
{
  "message": "invalid query",
  "extensions": {
    "code": "UNKNOWN_ERROR"
  }
}
Support multiple @listSize directives on the same field (PR #8872)
Warning
Multiple @listSize directives on a field only take effect after Federation supports repeatable @listSize in the supergraph schema. Until then, composition continues to expose at most one directive per field. This change makes the router ready for that Federation release.
The router now supports multiple @listSize directives on a single field, enabling more flexible cost estimation when directives from different subgraphs are combined during federation composition.
- The router processes all @listSize directives on a field (stored as Vec<ListSizeDirective> instead of Option<ListSizeDirective>).
- When multiple directives specify assumedSize values, the router uses the maximum value for cost calculation.
- Existing schemas with single directives continue to work exactly as before.
This change prepares the router for federation's upcoming support for repeatable @listSize directives, and maintains full compatibility with current non-repeatable directive schemas.
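As a rough illustration of the maximum-value rule, the following Python sketch picks the effective assumedSize from several directives. The class and function names are hypothetical stand-ins, not the router's Rust types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListSizeDirective:
    """Hypothetical stand-in for one @listSize directive on a field."""
    assumed_size: Optional[int] = None


def effective_assumed_size(directives: list[ListSizeDirective]) -> Optional[int]:
    """Use the largest assumedSize among the directives that specify one."""
    sizes = [d.assumed_size for d in directives if d.assumed_size is not None]
    return max(sizes) if sizes else None


# Two subgraphs disagree on the list size: the larger estimate is used.
print(effective_assumed_size([ListSizeDirective(10), ListSizeDirective(25)]))  # 25
# A single directive behaves as it did before.
print(effective_assumed_size([ListSizeDirective(10)]))  # 10
```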
Add parser recursion and lexical token metrics (PR #8845)
The router now emits two new metrics: apollo.router.operations.recursion for the recursion level reached, and apollo.router.operations.lexical_tokens for the number of lexical tokens in a query.
Support subgraph-level demand control (PR #8829)
Subgraph-level demand control lets you enforce per-subgraph query cost limits in the router, in addition to the existing global cost limit for the whole supergraph. This helps you protect specific backend services that have different capacity or cost profiles from being overwhelmed by expensive operations.
When a subgraph-specific cost limit is exceeded, the router:
- Still runs the rest of the operation, including other subgraphs whose cost is within limits.
- Skips calls to only the over-budget subgraph, and composes the response as if that subgraph had returned null, instead of rejecting the entire query.
Per-subgraph limits apply to the total work for that subgraph in a single operation. For each request, the router tracks the aggregate estimated cost per subgraph across the entire query plan. If the same subgraph is fetched multiple times (for example, through entity lookups, nested fetches, or conditional branches), those costs are summed together and the subgraph's limit is enforced against that total.
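The aggregation described above can be sketched in Python as follows. The function names and the (subgraph, cost) representation of fetches are assumptions made for illustration.

```python
from collections import defaultdict


def cost_per_subgraph(fetches: list[tuple[str, int]]) -> dict[str, int]:
    """Sum the estimated cost of every fetch in the plan, grouped by subgraph."""
    totals: dict[str, int] = defaultdict(int)
    for subgraph, cost in fetches:
        totals[subgraph] += cost
    return dict(totals)


def over_budget(totals: dict[str, int], limits: dict[str, int]) -> set[str]:
    """Return the subgraphs whose summed cost exceeds their own limit."""
    return {name for name, total in totals.items() if name in limits and total > limits[name]}


# The reviews subgraph is fetched twice, so its costs are added together.
fetches = [("products", 4), ("reviews", 3), ("reviews", 2), ("products", 1)]
totals = cost_per_subgraph(fetches)
skipped = over_budget(totals, {"products": 6, "reviews": 4})
print(totals)   # {'products': 5, 'reviews': 5}
print(skipped)  # {'reviews'}
```

Neither reviews fetch exceeds the limit of 4 on its own. The subgraph is over budget only because the limit applies to the total.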
Configuration
demand_control:
  enabled: true
  mode: enforce
  strategy:
    static_estimated:
      max: 10
      list_size: 10
      actual_cost_mode: by_subgraph
      subgraphs: # everything from here down is new (all fields optional)
        all:
          max: 8
          list_size: 10
        subgraphs:
          products:
            max: 6
            # list_size omitted, 10 implied because of all.list_size
          reviews:
            list_size: 50
            # max omitted, 8 implied because of all.max
Example
Consider a topProducts query that fetches a list of products from a products subgraph and then performs an entity lookup for each product in a reviews subgraph. Assume the products cost is 10 and the reviews cost is 5, leading to a total estimated cost of 15 (10 + 5).
Previously, you could only restrict that query via demand_control.static_estimated.max:
- If you set it to 15 or higher, the query executes.
- If you set it below 15, the query is rejected.
Subgraph-level demand control enables much more granular control. In addition to demand_control.static_estimated.max, which operates as before, you can also set per-subgraph limits.
For example, if you set max = 20 and reviews.max = 2, the query passes the aggregate check (15 < 20) and executes on the products subgraph (no limit specified), but doesn't execute against the reviews subgraph (5 > 2). The result is composed as if the reviews subgraph had returned null.
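The worked example can be traced with a short Python sketch. The function names and the dictionary layout are hypothetical. The sketch assumes that a per-subgraph max falls back to all.max when omitted and that a subgraph with no limit at either level is never skipped.

```python
from typing import Optional


def effective_max(subgraph_limits: dict, subgraph: str) -> Optional[int]:
    """Per-subgraph max, falling back to all.max. None means no per-subgraph limit."""
    specific = subgraph_limits.get("subgraphs", {}).get(subgraph, {})
    if "max" in specific:
        return specific["max"]
    return subgraph_limits.get("all", {}).get("max")


def evaluate(global_max: int, subgraph_limits: dict, costs: dict[str, int]) -> dict:
    """Apply the aggregate check first, then the per-subgraph checks."""
    if sum(costs.values()) > global_max:
        return {"rejected": True, "skipped": []}
    skipped = []
    for name, cost in costs.items():
        limit = effective_max(subgraph_limits, name)
        if limit is not None and cost > limit:
            skipped.append(name)
    return {"rejected": False, "skipped": skipped}


costs = {"products": 10, "reviews": 5}
limits = {"subgraphs": {"reviews": {"max": 2}}}

print(evaluate(20, limits, costs))  # {'rejected': False, 'skipped': ['reviews']}
print(evaluate(14, limits, costs))  # {'rejected': True, 'skipped': []}
```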
By @carodewig in #8829
Improve @listSize directive parsing and nested path support (PR #8893)
Demand control cost calculation now supports:
- Array-style parsing for @listSize sizing (for example, list arguments)
- Nested input paths when resolving list size from query arguments
- Nested field paths in the sizedFields argument on @listSize for more accurate cost estimation
These changes are backward compatible with existing schemas and directives.
Add coprocessor hooks for connector request and response stages (PR #8869)
You can now configure a coprocessor hook for the ConnectorRequest and ConnectorResponse stages of the router lifecycle.
coprocessor:
  url: http://localhost:3007
  connector:
    all:
      request:
        uri: true
        headers: true
        body: true
        context: all
        service_name: true
      response:
        headers: true
        body: true
        context: all
        service_name: true
By @andrewmcgivery in #8869
🐛 Fixes
Pass variables to introspection queries (PR #8816)
Introspection queries now receive variables, enabling @include and @skip directives during introspection.
Log warning instead of returning error for non-UTF-8 headers in externalize_header_map (PR #8828)
- The router now emits a warning log with the name of the header instead of returning an error.
- The remaining valid headers are returned, which is more consistent with the router's default behavior when a coprocessor isn't used.
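The new behavior can be approximated in Python as follows. This is a sketch of the idea only: the function below is a hypothetical analogue, not the router's externalize_header_map, and it assumes headers arrive as raw byte values.

```python
import logging

logger = logging.getLogger("externalize_header_map_sketch")


def externalize_headers(headers: dict[str, list[bytes]]) -> dict[str, list[str]]:
    """Decode header values, skipping (and logging) any header that isn't valid UTF-8."""
    externalized: dict[str, list[str]] = {}
    for name, values in headers.items():
        try:
            externalized[name] = [value.decode("utf-8") for value in values]
        except UnicodeDecodeError:
            # Warn with the header name instead of failing the whole request.
            logger.warning("skipping header %s: value is not valid UTF-8", name)
    return externalized


raw = {
    "content-type": [b"application/json"],
    "x-binary": [b"\xff\xfe"],  # not valid UTF-8
}
print(externalize_headers(raw))  # {'content-type': ['application/json']}
```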
By @rohan-b99 in #8828
Place http_client span attributes on the http_request span (PR #8798)
Attributes configured under telemetry.instrumentation.spans.http_client are now added to the http_request span instead of subgraph_request.
Given this config:
telemetry:
  instrumentation:
    spans:
      http_client:
        attributes:
          http.request.header.content-type:
            request_header: "content-type"
          http.response.header.content-type:
            response_header: "content-type"
Both attributes are now placed on the http_request span.
By @rohan-b99 in #8798
Validate ObjectValue variable fields against input type definitions (PR #8821 and PR #8884)
The router now validates individual fields of input object variables against their type definitions. Previously, variable validation checked that the variable itself was present but didn't validate the fields within the object.
Example:
## schema ##
input MessageInput {
  content: String
  author: String
}
type Receipt {
  id: ID!
}
type Query {
  send(message: MessageInput): Receipt
}
## query ##
query($msg: MessageInput) {
  send(message: $msg) {
    id
  }
}
## input variables ##
{
  "msg": {
    "content": "Hello",
    "author": "Me",
    "unknownField": "unknown"
  }
}
This request previously passed validation because the variable msg was present in the input, but the fields of msg weren't validated against the MessageInput type.
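A minimal Python sketch of the added check, limited to detecting fields that the input type doesn't define. The function name is hypothetical, and the router's real validation covers more than unknown field names.

```python
# Field names defined by the MessageInput type in the schema above.
MESSAGE_INPUT_FIELDS = {"content", "author"}


def unknown_fields(value: dict, allowed_fields: set[str]) -> list[str]:
    """Return the keys of an input object that its type definition doesn't declare."""
    return sorted(name for name in value if name not in allowed_fields)


variables = {"msg": {"content": "Hello", "author": "Me", "unknownField": "unknown"}}
problems = unknown_fields(variables["msg"], MESSAGE_INPUT_FIELDS)
print(problems)  # ['unknownField']
```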
Warning
To opt out of this behavior, set the supergraph.strict_variable_validation config option to measure.
Enabled:
supergraph:
  strict_variable_validation: enforce
Disabled:
supergraph:
  strict_variable_validation: measure
By @conwuegb in #8821 and #8884
Increase internal Redis timeout from 5s to 10s (PR #8863)
Because mTLS handshakes can be slow in some environments, the internal Redis timeout is now 10s (previously 5s). The connection "unresponsive" threshold is also increased from 5s to 10s.
By @aaronari...
v2.12.0-rc.1
2.12.0-rc.1
v2.12.0-rc.0
2.12.0-rc.0
v2.12.0-alpha.0
2.12.0-alpha.0
v2.11.0
🚀 Features
Support client awareness metadata via HTTP headers (PR #8503)
Clients can now send library name and version metadata for client awareness and enhanced client awareness using HTTP headers. This provides a consistent transport mechanism instead of splitting values between headers and request.extensions.
By @calvincestari in #8503
Reload OCI artifacts when a tag reference changes (PR #8805)
You can now configure tag-based OCI references in the router. When you use a tag reference such as artifacts.apollographql.com/my-org/my-graph:prod, the router polls and reloads when that tag points to a new artifact.
This also applies to automatically generated variant tags and custom tags.
By @graytonio in #8805
Add memory limit option for cooperative cancellation (PR #8808)
The router now supports a memory_limit option on experimental_cooperative_cancellation to cap memory allocations during query planning. When the memory limit is exceeded, the router:
- In enforce mode, cancels query planning and returns an error to the client.
- In measure mode, records the cancellation outcome in metrics and allows query planning to complete.
The memory limit works alongside the existing timeout option. Whichever limit is reached first triggers cancellation.
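The difference between the two modes can be summarized in a small Python sketch. The function name and the outcome labels are hypothetical and exist only to make the branching explicit.

```python
def planning_outcome(mode: str, limit_exceeded: bool) -> str:
    """Describe what happens to query planning once a limit check has run."""
    if not limit_exceeded:
        return "completed"
    if mode == "enforce":
        # Planning stops and the client receives an error.
        return "cancelled_with_error"
    if mode == "measure":
        # The outcome is recorded in metrics, but planning runs to completion.
        return "completed_and_recorded"
    raise ValueError(f"unknown mode: {mode}")


print(planning_outcome("enforce", True))   # cancelled_with_error
print(planning_outcome("measure", True))   # completed_and_recorded
print(planning_outcome("enforce", False))  # completed
```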
This feature is only available on Unix platforms when the global-allocator feature is enabled and dhat-heap is not enabled.
Example configuration:
supergraph:
  query_planning:
    experimental_cooperative_cancellation:
      enabled: true
      mode: enforce # or "measure" to only record metrics
      memory_limit: 50mb # Supports formats like "50mb", "1gb", "1024kb", etc.
      timeout: 5s # Optional: can be combined with memory_limit
By @rohan-b99 in #8808
Add memory tracking metrics for requests (PR #8717)
The router now emits two histogram metrics to track memory allocation activity during request processing:
- apollo.router.request.memory: Memory activity across the full request lifecycle (including parsing, validation, query planning, and plugins)
- apollo.router.query_planner.memory: Memory activity for query planning work in the compute job thread pool
Each metric includes:
- allocation.type: allocated, deallocated, zeroed, or reallocated
- context: The tracking context name (for example, router.request or query_planning)
This feature is only available on Unix platforms when the global-allocator feature is enabled and dhat-heap is not enabled.
By @rohan-b99 in #8717
🐛 Fixes
Support nullable @key fields in response caching (PR #8767)
Response caching can now use nullable @key fields. Previously, the response caching feature rejected nullable @key fields, which prevented caching in schemas that use them.
When you cache data keyed by nullable fields, keep your cache keys simple and avoid ambiguous null values.
By @aaronArinder in #8767
Return 429 instead of 503 when enforcing a rate limit (PR #8765)
In v2.0.0, the router changed the rate-limiting error from 429 (TOO_MANY_REQUESTS) to 503 (SERVICE_UNAVAILABLE). This change restores 429 to align with the router error documentation.
By @carodewig in #8765
Add status code and error type attributes to http_request spans (PR #8775)
The router now always adds the http.response.status_code attribute to http_request spans (for example, for router -> subgraph requests). The router also conditionally adds error.type for non-success status codes.
By @rohan-b99 in #8775
Report response cache invalidation failures as errors (PR #8813)
The router now returns an error when response cache invalidation fails. Previously, an invalidation attempt could fail without being surfaced as an error.
After you upgrade, you might see an increase in the apollo.router.operations.response_cache.invalidation.error metric.
Reuse response cache Redis connections for identical subgraph configuration (PR #8764)
The response cache now reuses Redis connection pools when subgraph-level configuration resolves to the same Redis configuration as the global all setting. Previously, the router could create redundant Redis connections even when the effective configuration was identical.
Impact: If you configure response caching at both the global and subgraph levels, you should see fewer Redis connections and lower connection overhead.
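One way to picture the reuse is a registry that keys connection pools by their effective configuration. The Python sketch below is an illustration under assumptions: the class, the connect callback, and the configuration keys (urls, timeout_ms) are all hypothetical, and the router's internals may be organized differently.

```python
from typing import Any, Callable


class PoolRegistry:
    """Hands out one pool per distinct effective Redis configuration."""

    def __init__(self, connect: Callable[[dict], Any]) -> None:
        self._connect = connect
        self._pools: dict[tuple, Any] = {}

    def pool_for(self, config: dict) -> Any:
        # Identical settings produce the same key, whatever level they came from.
        key = tuple(sorted(config.items()))
        if key not in self._pools:
            self._pools[key] = self._connect(config)
        return self._pools[key]


created = []


def fake_connect(config: dict) -> object:
    """Stand-in for opening a real connection pool."""
    created.append(config)
    return object()


registry = PoolRegistry(fake_connect)
global_cfg = {"urls": "redis://localhost:6379", "timeout_ms": 500}
products_cfg = dict(global_cfg)  # subgraph config that resolves to the same settings
reviews_cfg = {"urls": "redis://other:6379", "timeout_ms": 500}

shared = registry.pool_for(global_cfg)
products = registry.pool_for(products_cfg)
reviews = registry.pool_for(reviews_cfg)
print(shared is products, shared is reviews, len(created))  # True False 2
```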
Prevent TLS connections from hanging when a handshake stalls (PR #8779)
The router listener loop no longer blocks while waiting for a TLS handshake to complete. Use server.http.tls_handshake_timeout to control how long the router waits before terminating a connection (default: 10s).
By @rohan-b99 in #8779
Emit cardinality overflow metrics for more OpenTelemetry error formats (PR #8740)
The router now emits the apollo.router.telemetry.metrics.cardinality_overflow metric for additional OpenTelemetry cardinality overflow error formats.
Propagate trace context on WebSocket upgrade requests (PR #8739)
The router now injects trace propagation headers into the initial HTTP upgrade request when it opens WebSocket connections to subgraphs. This preserves distributed trace continuity between the router and subgraph services.
Trace propagation happens during the HTTP handshake only. After the WebSocket connection is established, headers cannot be added to individual messages.
Stop query planning compute jobs when the parent task is canceled (PR #8741)
Query planning compute jobs now stop when cooperative cancellation cancels the parent task.
By @rohan-b99 in #8741
Reject invalidation requests with unknown fields (PR #8752)
The response cache invalidation endpoint now rejects request payloads that include unknown fields. When unknown fields are present, the router returns HTTP 400 (Bad Request).
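The strict check can be sketched in Python as follows. The set of allowed field names is a placeholder chosen for illustration and is not taken from the endpoint's actual schema.

```python
from typing import Optional

# Hypothetical field names: the real endpoint defines its own set.
ALLOWED_FIELDS = {"kind", "subgraph"}


def rejection_status(payload: dict) -> Optional[int]:
    """Return 400 when the payload has unknown fields, or None to continue processing."""
    unknown = set(payload) - ALLOWED_FIELDS
    return 400 if unknown else None


print(rejection_status({"kind": "subgraph", "subgraph": "products"}))  # None
print(rejection_status({"kind": "subgraph", "subgrpah": "products"}))  # 400
```

Rejecting unknown fields turns a misspelled key, as in the second call, into an immediate error instead of a request that is silently misread.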
Restore plugin access to SubscriptionTaskParams in execution::Request builders (PR #8771)
Plugins and other external crates can use SubscriptionTaskParams with execution::Request builders again. This restores compatibility for plugin unit tests that construct subscription requests.
By @aaronArinder in #8771
Support JWT tokens with multiple audiences (PR #8780)
When issuers or audiences is included in the router's JWK configuration, the router will check each request's JWT for iss or aud and reject requests with mismatches.
Expected behavior:
- If present, the iss claim must be specified as a string.
  - ✅ The JWK's issuers is empty.
  - ✅ The iss is a string and is present in the JWK's issuers.
  - ✅ The iss is null.
  - ❌ The iss is a string but is not present in the JWK's issuers.
  - ❌ The iss is not a string or null.
- If present, the aud claim can be specified as either a string or an array of strings.
  - ✅ The JWK's audiences is empty.
  - ✅ The aud is a string and is present in the JWK's audiences.
  - ✅ The aud is an array of strings and at least one of those strings is present in the JWK's audiences.
  - ❌ The aud is not a string or array of strings (i.e., null).
Behavior prior to this change:
- If the iss was not null or a string, it was permitted (regardless of its value).
- If the aud was an array, it was rejected (regardless of its value).
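The expected behavior listed above can be written as two small Python predicates. This is one reading of the rules, with hypothetical function names, and it is not the router's implementation.

```python
from typing import Any


def iss_allowed(iss: Any, issuers: set[str]) -> bool:
    """Check the iss claim against the issuers configured for the JWK."""
    if not issuers:
        return True  # no issuers configured: nothing to enforce
    if iss is None:
        return True  # a missing iss claim is accepted
    if isinstance(iss, str):
        return iss in issuers
    return False  # any other type is rejected


def aud_allowed(aud: Any, audiences: set[str]) -> bool:
    """Check the aud claim against the audiences configured for the JWK."""
    if not audiences:
        return True  # no audiences configured: nothing to enforce
    if isinstance(aud, str):
        return aud in audiences
    if isinstance(aud, list) and all(isinstance(item, str) for item in aud):
        # Multiple audiences: one match is enough.
        return any(item in audiences for item in aud)
    return False  # null or any other type is rejected


print(aud_allowed(["web", "mobile"], {"mobile"}))  # True
print(aud_allowed(None, {"mobile"}))               # False
print(iss_allowed(None, {"https://issuer"}))       # True
```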
By @carodewig in #8780
Enforce feature restrictions for warning-state licenses (PR #8768)
The router now enforces license restrictions even when a license is in a warning state. Previously, warning-state licenses could bypass enforcement for restricted features.
If your deployment uses restricted features, the router returns an error instead of continuing to run.
By @aaronArinder in #8768
🛠 Maintenance
Warn at startup when OTEL_EXPORTER_OTLP_ENDPOINT is set (PR #8729)
The ...
v2.11.0-rc.0
2.11.0-rc.0
v2.11.0-abstract.1
2.11.0-abstract.1