Description:
The Envoy Gateway controller panics with a nil pointer dereference when processing BackendTrafficPolicies for HTTPRoutes that have cross-namespace backend references without a matching ReferenceGrant.
The invalid reference is correctly detected and logged as an error, but the controller then panics instead of gracefully skipping the route. This causes the gateway-api reconciliation loop to restart repeatedly (~every 5 seconds), which can delay xDS updates to Envoy proxies.
Expected behavior: When an HTTPRoute has an invalid cross-namespace reference, the controller should log the error, skip applying BackendTrafficPolicy features to that route, and continue processing other routes without panicking.
Repro steps:
- Create an HTTPRoute in namespace envoy-gateway-mtls-app that references a Service in namespace default:

  apiVersion: gateway.networking.k8s.io/v1
  kind: HTTPRoute
  metadata:
    name: reflector
    namespace: envoy-gateway-mtls-app
  spec:
    parentRefs:
    - name: gateway-mtls-reflector
      namespace: envoy-gateway-mtls-app
    rules:
    - matches:
      - path:
          type: PathPrefix
          value: "/"
      backendRefs:
      - name: reflector
        namespace: default # cross-namespace reference
        port: 80

- Do NOT create a ReferenceGrant allowing this cross-namespace reference
- Have any BackendTrafficPolicy in the cluster (it doesn't need to target this route; a minimal example follows these steps)
- Observe the controller logs - the panic occurs on every reconciliation
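For the third step, any policy should do. A minimal sketch, assuming the policy targets the Gateway named in the HTTPRoute above (the policy name and the retry setting are arbitrary placeholders):

  apiVersion: gateway.envoyproxy.io/v1alpha1
  kind: BackendTrafficPolicy
  metadata:
    name: any-policy # placeholder name
    namespace: envoy-gateway-mtls-app
  spec:
    targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: gateway-mtls-reflector
    retry:
      numRetries: 2 # arbitrary; any policy feature should trigger the code path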
Environment:
- Envoy Gateway version: v1.6.0
- Kubernetes: GKE 1.31
- Go version (from stack trace): 1.25.3
- Single controller managing multiple Gateways across namespaces
Logs:
First the ReferenceGrant error is logged:
ERROR provider kubernetes/routes.go:269 failed to process BackendRef for HTTPRoute
{"runner": "provider",
"httpRoute": {"name":"reflector","namespace":"envoy-gateway-mtls-app"},
"backendRef": {"group":"","kind":"Service","name":"reflector","namespace":"default","port":80},
"error": "no matching ReferenceGrants found: from HTTPRoute/envoy-gateway-mtls-app to Service/default"}
Then immediately the panic:
ERROR watchable message/watchutil.go:57 observed a panic
{"runner": "gateway-api",
"error": "runtime error: invalid memory address or nil pointer dereference",
"stackTrace": "goroutine 216 [running]:
runtime/debug.Stack()
/opt/hostedtoolcache/go/1.25.3/x64/src/runtime/debug/stack.go:26 +0x5e
github.com/envoyproxy/gateway/internal/message.handleWithCrashRecovery[...].func1()
/home/runner/work/gateway/gateway/internal/message/watchutil.go:58 +0x1fe
panic({0x3552d00?, 0xb615380?})
/opt/hostedtoolcache/go/1.25.3/x64/src/runtime/panic.go:783 +0x132
github.com/envoyproxy/gateway/internal/gatewayapi.(*Translator).applyTrafficFeatureToRoute(...)
/home/runner/work/gateway/gateway/internal/gatewayapi/backendtrafficpolicy.go:765 +0x768
github.com/envoyproxy/gateway/internal/gatewayapi.(*Translator).translateBackendTrafficPolicyForRoute(...)
/home/runner/work/gateway/gateway/internal/gatewayapi/backendtrafficpolicy.go:635 +0x2ca
github.com/envoyproxy/gateway/internal/gatewayapi.(*Translator).processBackendTrafficPolicyForRoute(...)
/home/runner/work/gateway/gateway/internal/gatewayapi/backendtrafficpolicy.go:301 +0xa0b
github.com/envoyproxy/gateway/internal/gatewayapi.(*Translator).ProcessBackendTrafficPolicies(...)
/home/runner/work/gateway/gateway/internal/gatewayapi/backendtrafficpolicy.go:107 +0x197c
github.com/envoyproxy/gateway/internal/gatewayapi.(*Translator).Translate(...)
/home/runner/work/gateway/gateway/internal/gatewayapi/translator.go:284 +0x848
github.com/envoyproxy/gateway/internal/gatewayapi/runner.(*Runner).subscribeAndTranslate.func1(...)
/home/runner/work/gateway/gateway/internal/gatewayapi/runner/runner.go:176 +0x571"}
Workaround: Create a ReferenceGrant to allow the cross-namespace reference, or move the HTTPRoute and Service to the same namespace.
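A sketch of the ReferenceGrant workaround, using the namespaces and Service from the repro (the grant name is arbitrary):

  apiVersion: gateway.networking.k8s.io/v1beta1
  kind: ReferenceGrant
  metadata:
    name: allow-reflector-httproutes # placeholder name
    namespace: default # must live in the target (Service) namespace
  spec:
    from:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: envoy-gateway-mtls-app
    to:
    - group: "" # core API group
      kind: Service
      name: reflector # optional; omit to allow references to all Services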
Observability:
The panic is observable via the watchable_panics_recovered_total metric:
rate(watchable_panics_recovered_total{runner="gateway-api", status="failure"}[5m]) > 0
Note that standard controller-runtime metrics don't capture this panic:
- controller_runtime_reconcile_panics_total stays at 0 (different code path)
- controller_runtime_reconcile_errors_total stays at 0
- Pod does not restart
The watchable_panics_recovered_total metric only increments when reconciliation is triggered (e.g., on resource changes). A cluster can be in a broken steady state with a flat counter if no changes occur.
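To page on this, the query above can be wrapped in an alert rule. A sketch assuming the prometheus-operator PrometheusRule CRD (rule, group, and namespace names are placeholders); note the caveat above still applies, since the alert only fires while reconciliations are actually being triggered:

  apiVersion: monitoring.coreos.com/v1
  kind: PrometheusRule
  metadata:
    name: envoy-gateway-panics # placeholder name
    namespace: monitoring # placeholder namespace
  spec:
    groups:
    - name: envoy-gateway
      rules:
      - alert: EnvoyGatewayTranslationPanic
        # Fires after 5 minutes of sustained panic recoveries in the
        # gateway-api runner; a flat counter (no reconciliations) won't fire.
        expr: rate(watchable_panics_recovered_total{runner="gateway-api", status="failure"}[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Envoy Gateway gateway-api runner is recovering from panics"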