feat: support per-route timeout overrides for ingress path rules#2250
feat: support per-route timeout overrides for ingress path rules#2250
Conversation
Changes the additional_api_paths_handled_by_ingress variable from
list(string) to list(object({ paths, timeout_sec? })). Each entry now
creates its own path_rule, and when timeout_sec is set, it adds a
route_action.timeout that overrides the ingress backend default (80s).
Backwards compatible: existing callers just need to wrap their path
lists, e.g.:
# Before
additional_api_paths_handled_by_ingress = ["/v1/foo", "/v1/bar"]
# After
additional_api_paths_handled_by_ingress = [
{ paths = ["/v1/foo", "/v1/bar"] }
]
# Or with a custom timeout
additional_api_paths_handled_by_ingress = [
{ paths = ["/v1/foo"], timeout_sec = 300 },
{ paths = ["/v1/bar"] }
]
PR SummaryMedium Risk Overview Written by Cursor Bugbot for commit 7b8b8d2. This will update automatically on new commits. Configure here. |
| service = google_compute_backend_service.ingress.self_link | ||
|
|
||
| dynamic "route_action" { | ||
| for_each = path_rule.value.timeout_sec != null ? [path_rule.value.timeout_sec] : [] |
There was a problem hiding this comment.
The != null guard allows timeout_sec = 0 to pass through, producing timeout { seconds = 0 }. GCP rejects URL map timeouts of 0 seconds at apply time. Consider adding a variable validation rule:
validation {
condition = alltrue([
for r in var.additional_api_paths_handled_by_ingress :
r.timeout_sec == null || r.timeout_sec > 0
])
error_message = "timeout_sec must be greater than 0 when specified."
}There was a problem hiding this comment.
technically true, but not meaningfully true. gcp will happily provide an error message.
| for_each = path_rule.value.timeout_sec != null ? [path_rule.value.timeout_sec] : [] | ||
|
|
||
| content { | ||
| timeout { |
There was a problem hiding this comment.
The timeout block inside route_action is missing the nanos field. While nanos is optional in the provider (defaults to 0), the GCP API may still reject the configuration for certain provider versions. Explicitly setting nanos = 0 avoids any ambiguity:
timeout {
seconds = route_action.value
nanos = 0
}There was a problem hiding this comment.
technically true, but not meaningfully true. gcp will happily provide an error message.
| variable "additional_api_paths_handled_by_ingress" { | ||
| type = list(string) | ||
| type = list(object({ | ||
| paths = list(string) | ||
| timeout_sec = optional(number) | ||
| })) | ||
| } |
There was a problem hiding this comment.
🔴 The new list(object({paths, timeout_sec})) variable type lacks validation in all three variable files, leaving three input classes unguarded: an entry with paths = [] generates a path_rule with an empty paths list that GCP's URL map API rejects at apply time; timeout_sec = 0 (already flagged in a PR inline comment) passes the null guard but is rejected by GCP; and timeout_sec = 1.5 is accepted by Terraform's number type but the provider's seconds field is int64, causing a plan-time type error or silent truncation. A single validation block in each of the three variable files would address all cases: condition = alltrue([for r in var.additional_api_paths_handled_by_ingress : length(r.paths) > 0 && (r.timeout_sec == null || (r.timeout_sec > 0 && floor(r.timeout_sec) == r.timeout_sec))]).
Extended reasoning...
What the bug is and how it manifests
The PR changes additional_api_paths_handled_by_ingress from list(string) to list(object({ paths = list(string), timeout_sec = optional(number) })). This structural change introduces three new unguarded input classes, none of which are caught by Terraform before reaching the GCP API or provider type system:
-
Empty
pathslist — Terraform'slist(string)type permits zero-element lists, so{ paths = [], timeout_sec = null }is syntactically valid. The dynamicpath_ruleblock atmain.tf:272iterates directly over the variable (for_each = var.additional_api_paths_handled_by_ingress) and emits apath_ruleblock withpaths = []. GCP's URL map API requires at least one path perpath_ruleand will return a 400 error at apply time. -
timeout_sec = 0— The\!= nullguard atmain.tf:280(for_each = path_rule.value.timeout_sec \!= null ? [...] : []) allows zero through, producingtimeout { seconds = 0 }. GCP rejects URL map route timeouts of zero seconds. This was noted in an existing inline comment on the PR but the fix was suggested inmain.tf; the correct place to enforce it is in the variable definitions. -
Fractional
timeout_sec—optional(number)is Terraform's float64. A caller can passtimeout_sec = 1.5. The GCP Terraform provider defines thesecondsfield insideroute_action.timeoutasTypeInt(int64). Depending on the provider version, this either causes a plan-time type coercion error with a confusing message, or silently truncates to 1 second — both unexpected outcomes for the caller.
The specific code path
variables.tf:106-111 (all three files) declares the variable with no validation blocks. At plan time, Terraform accepts any list(object({...})) that satisfies the type constraint. The value flows to main.tf:273 (for_each = var.additional_api_paths_handled_by_ingress), then path_rule.value.paths is emitted as the paths attribute and route_action.value (the float64 timeout_sec) is emitted as seconds.
Why existing code doesn't prevent it
The old code used for_each = length(var.additional_api_paths_handled_by_ingress) > 0 ? [{}] : [] with a flat list(string), meaning an empty outer list produced no path_rule at all — the empty-paths case was structurally impossible. The new per-entry iteration moves the responsibility of ensuring non-empty paths to each individual entry, but no validation block enforces it. Similarly, the timeout_sec sub-field is entirely new; there is no prior guard anywhere.
Step-by-step proof for empty paths
- Caller sets:
additional_api_paths_handled_by_ingress = [{ paths = [], timeout_sec = null }] - Terraform type-checks:
list(object({ paths = list(string), ... }))— ✅ valid (empty list is a validlist(string)) - No validation block exists → no plan-time error
for_each = var.additional_api_paths_handled_by_ingressiterates once withpath_rule.value = { paths = [], timeout_sec = null }- Emitted HCL:
path_rule { paths = [] service = ... } - GCP API call at apply time → 400 Bad Request: "path_rule must have at least one path"
Impact
The empty-paths case produces a confusing apply-time GCP error rather than a clear Terraform validation message, blocking the entire apply. The fractional timeout_sec case can silently truncate (e.g., a 1.5s timeout becomes 1s) without any warning, leading to subtly wrong routing behavior.
How to fix
Add a single validation block to each of the three variable definitions (and remove the separate timeout_sec > 0 suggestion from the inline main.tf comment):
validation {
condition = alltrue([
for r in var.additional_api_paths_handled_by_ingress :
length(r.paths) > 0 &&
(r.timeout_sec == null || (r.timeout_sec > 0 && floor(r.timeout_sec) == r.timeout_sec))
])
error_message = "Each entry must have at least one path, and timeout_sec (when set) must be a positive integer."
}Addressing the refutation on duplicate paths (bug_003)
One verifier argued that cross-entry duplicate paths is not a meaningful regression since the old list(string) also allowed duplicate paths within the single flat path_rule. This is correct — duplicate-path detection across entries is a separate, lower-priority concern and would require nontrivial setproduct/flatten logic. The core bugs here (empty paths and invalid timeout_sec) are independent of this and clearly should be addressed.
There was a problem hiding this comment.
technically true, but not meaningfully true. gcp will happily provide an error message.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| exposedByDefault = false | ||
| [providers.consulCatalog.endpoint] | ||
| address = "${consul_endpoint}" | ||
| token = "${consul_token}" |
There was a problem hiding this comment.
Access logging silently dropped in Traefik config migration
Medium Severity
The original CLI args included --accesslog=true, but the new traefik.toml has no [accessLog] section. Traefik disables HTTP access logging by default, so this migration silently turns off access logs for all ingress traffic. Every other original CLI flag maps to an equivalent TOML setting, but accessLog is the only one missing.
|
|
||
| variable "additional_traefik_arguments" { | ||
| type = list(string) | ||
| default = [] |
There was a problem hiding this comment.
Missing default breaks AWS nomad module consumers
Low Severity
The old additional_traefik_arguments variable in provider-aws/nomad had default = [], making it optional. The replacement traefik_config_files has no default, making it required. Any direct caller of this module that previously relied on the implicit empty default now gets a Terraform error unless they explicitly pass traefik_config_files = {}.


Changes the additional_api_paths_handled_by_ingress variable from list(string) to list(object({ paths, timeout_sec? })). Each entry now creates its own path_rule, and when timeout_sec is set, it adds a route_action.timeout that overrides the ingress backend default (80s).
Also move traefik configuration to a config file, and support extra config files.
Backwards compatible: existing callers just need to wrap their path lists, e.g.:
Before
additional_api_paths_handled_by_ingress = ["/v1/foo", "/v1/bar"]
After
additional_api_paths_handled_by_ingress = [
{ paths = ["/v1/foo", "/v1/bar"] }
]
Or with a custom timeout
additional_api_paths_handled_by_ingress = [
{ paths = ["/v1/foo"], timeout_sec = 300 },
{ paths = ["/v1/bar"] }
]