@rajsinghtech
Contributor

@rajsinghtech rajsinghtech commented Nov 16, 2025

Description

Fixes a bug where an HTTPRoute whose parentRefs reference Gateways belonging to different GatewayClasses would have incomplete status conditions.

Problem

When an HTTPRoute references gateways using different GatewayClasses (e.g., private, public, home-ts), the status showed all parentRefs with controllerName set, but only some had actual conditions (Accepted, ResolvedRefs). Others had just the controllerName field with no conditions array.

Example:

spec:
  parentRefs:
  - name: private    # GatewayClass: private
  - name: ts         # GatewayClass: home-ts  
  - name: public     # GatewayClass: public
status:
  parents:
  - controllerName: gateway.envoyproxy.io/gatewayclass-controller
    parentRef:
      name: ts
    # NO conditions field
  - controllerName: gateway.envoyproxy.io/gatewayclass-controller
    parentRef:
      name: private
    # NO conditions field
  - conditions:     # Only this one has conditions
    - type: Accepted
      status: "True"
    controllerName: gateway.envoyproxy.io/gatewayclass-controller
    parentRef:
      name: public

Root Cause

When processing EnvoyExtensionPolicy and SecurityPolicy, the code called GetRouteParentContext() for ALL parentRefs in the route, including those referencing gateways with different GatewayClasses not managed by the current translator instance.

GetRouteParentContext() creates a skeleton RouteParentStatus entry (with just controllerName and parentRef) when called on a parentRef that hasn't been processed yet. Since all GatewayClass instances use the same controller name (gateway.envoyproxy.io/gatewayclass-controller), these skeleton entries persisted without conditions.

Solution

Check if a parentRef context already exists before attempting to apply policy configuration. If it doesn't exist, skip it - this means the parentRef references a gateway not managed by this translator instance.
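
The difference between the two lookups can be sketched in a few lines of Go. This is a minimal, self-contained model, not the Envoy Gateway source: `route`, `parentStatus`, `getOrCreateParent`, `lookupParent`, and `newRoute` are hypothetical names standing in for the route context and the two lookup styles described above.

```go
package main

import "fmt"

const controllerName = "gateway.envoyproxy.io/gatewayclass-controller"

// parentStatus is a hypothetical, simplified stand-in for a RouteParentStatus entry.
type parentStatus struct {
	ControllerName string
	ParentName     string
	Conditions     []string
}

// route is a hypothetical stand-in for a route context that tracks per-parent status.
type route struct {
	parentRefs []string                 // names from spec.parentRefs
	parents    map[string]*parentStatus // status entries created so far
}

// getOrCreateParent models the problematic behavior: a lookup that also
// creates a skeleton entry (no conditions) when none exists yet.
func (r *route) getOrCreateParent(name string) *parentStatus {
	if p, ok := r.parents[name]; ok {
		return p
	}
	p := &parentStatus{ControllerName: controllerName, ParentName: name}
	r.parents[name] = p
	return p
}

// lookupParent models the fixed behavior: a read-only lookup that returns nil
// for parentRefs this translator never processed.
func (r *route) lookupParent(name string) *parentStatus {
	return r.parents[name]
}

// skeletons counts status entries that carry no conditions.
func (r *route) skeletons() int {
	n := 0
	for _, p := range r.parents {
		if len(p.Conditions) == 0 {
			n++
		}
	}
	return n
}

// newRoute builds a route where only "public" was processed by this translator.
func newRoute() *route {
	return &route{
		parentRefs: []string{"private", "ts", "public"},
		parents: map[string]*parentStatus{
			"public": {ControllerName: controllerName, ParentName: "public", Conditions: []string{"Accepted"}},
		},
	}
}

func main() {
	before := newRoute()
	for _, name := range before.parentRefs {
		before.getOrCreateParent(name) // policy processing touches every parentRef
	}

	after := newRoute()
	for _, name := range after.parentRefs {
		if after.lookupParent(name) == nil {
			continue // not managed by this translator: skip
		}
		// policy configuration would be applied here
	}

	fmt.Println("skeleton entries before fix:", before.skeletons()) // 2
	fmt.Println("skeleton entries after fix:", after.skeletons())   // 0
}
```

In this model the two skeleton entries correspond to the `private` and `ts` parents from the example status above.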

Fixes #7537

@rajsinghtech rajsinghtech requested a review from a team as a code owner November 16, 2025 19:12
@rajsinghtech rajsinghtech force-pushed the fix-route-status-multiple-gatewayclass branch from 50ea1e8 to f9767f0 Compare November 16, 2025 19:14
@codecov

codecov bot commented Nov 16, 2025

Codecov Report

❌ Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.16%. Comparing base (c853061) to head (9b97c08).
⚠️ Report is 17 commits behind head on main.

Files with missing lines | Patch % | Lines
internal/gatewayapi/envoyextensionpolicy.go | 33.33% | 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7536      +/-   ##
==========================================
- Coverage   72.29%   71.16%   -1.14%     
==========================================
  Files         231      274      +43     
  Lines       34084    34862     +778     
==========================================
+ Hits        24641    24809     +168     
- Misses       7670     8259     +589     
- Partials     1773     1794      +21     

When processing policies (EnvoyExtensionPolicy, SecurityPolicy), the translator
was calling GetRouteParentContext for ALL parentRefs in a route, even those
referencing gateways with different GatewayClasses not managed by this translator.

GetRouteParentContext creates a skeleton RouteParentStatus entry with just the
controllerName when called on a parentRef that hasn't been processed yet. Since
all GatewayClass instances share the same controller name, these skeleton entries
persisted in status without conditions.

The fix checks if a parentRef context already exists before attempting to apply
policy configuration to it. If the context doesn't exist, it means this parentRef
wasn't processed by this translator and should be skipped.

Signed-off-by: Raj Singh <[email protected]>
@rajsinghtech rajsinghtech force-pushed the fix-route-status-multiple-gatewayclass branch from f9767f0 to 255539c Compare November 16, 2025 20:37
@zirain
Member

zirain commented Nov 17, 2025

can you add a test case for this?

@arkodg
Contributor

arkodg commented Nov 17, 2025

does this fix have any intersection with #7307 (haven't looked closely yet)

@arkodg
Contributor

arkodg commented Nov 18, 2025

thanks @rajsinghtech, I took a real look at this and the changes look good.
Can you add a test yaml under internal/gatewayapi/testdata?

Also, is a change needed in BTP (BackendTrafficPolicy) as well?

cc @y-rabie @zhaohuabing

@rajsinghtech
Contributor Author

Will add the test shortly, wasn't sure if I captured the full bug fix.

@arkodg
Contributor

arkodg commented Nov 18, 2025

yeah, codex says this needs some changes in BTP

> - internal/gatewayapi/backendtrafficpolicy.go:284 – The described bug isn’t fully addressed. processBackendTrafficPolicyForRoute still calls GetRouteParentContext(targetedRoute, p, …) for every parentRef, so a BackendTrafficPolicy targeting an HTTPRoute that references multiple GatewayClasses will continue to create skeleton RouteParentStatus entries for the Gateways owned by other controllers.
> To make the fix complete you need the same guard you added elsewhere: fetch the context via targetedRoute.GetRouteParentContext(p), skip when it’s nil, and make sure the later loops that walk parentRefCtxs handle missing contexts. Otherwise the status bloat/problem remains whenever a BackendTrafficPolicy is present.
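
The guard described in the quoted review could look roughly like the following. This is only an illustrative sketch under stated assumptions: `routeContext`, `parentContext`, `existingParentContext`, and `collectParentContexts` are hypothetical names, and the real `processBackendTrafficPolicyForRoute` has a different shape. The point is that the list of contexts is built from read-only lookups, so later loops only ever see contexts this translator created.

```go
package main

import "fmt"

// parentContext is a hypothetical stand-in for a per-parentRef route context.
type parentContext struct {
	ParentName string
}

// routeContext is a hypothetical stand-in for a translated route.
type routeContext struct {
	parentRefs []string
	processed  map[string]*parentContext // contexts created while translating the route
}

// existingParentContext is a read-only lookup; it never creates an entry.
func (r *routeContext) existingParentContext(name string) *parentContext {
	return r.processed[name]
}

// collectParentContexts gathers only the contexts this translator already
// created, so later loops never see a nil or skeleton context.
func collectParentContexts(r *routeContext) []*parentContext {
	ctxs := make([]*parentContext, 0, len(r.parentRefs))
	for _, name := range r.parentRefs {
		ctx := r.existingParentContext(name)
		if ctx == nil {
			continue // parentRef belongs to a Gateway handled elsewhere
		}
		ctxs = append(ctxs, ctx)
	}
	return ctxs
}

func main() {
	r := &routeContext{
		parentRefs: []string{"private", "ts", "public"},
		processed:  map[string]*parentContext{"public": {ParentName: "public"}},
	}
	for _, ctx := range collectParentContexts(r) {
		fmt.Println("applying policy to parent:", ctx.ParentName)
	}
	fmt.Println("status entries:", len(r.processed))
}
```

Filtering at collection time keeps the nil check in one place instead of repeating it in every loop that walks the collected contexts.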

@zhaohuabing
Member

> does this fix have any intersection with #7307 (haven't looked closely yet)

No, #7307 addresses another issue: it aggregates status across all GatewayClasses and performs a single consolidated update at the end, ensuring we can both preserve valid parent statuses and correctly remove those that no longer exist.
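
One possible reading of that aggregation step, sketched in Go purely for illustration. Nothing here is taken from #7307 itself: `parentStatus`, `aggregate`, and the per-class map are hypothetical, and the sketch assumes each parent is reported by exactly one GatewayClass.

```go
package main

import "fmt"

// parentStatus is a hypothetical, simplified parent status entry.
type parentStatus struct {
	ParentName string
	Conditions []string
}

// aggregate merges the partial results produced per GatewayClass into one
// list and keeps only parents that are still present in the route spec.
// Assumption: each parent is reported by exactly one GatewayClass.
func aggregate(specParents []string, perClass map[string][]parentStatus) []parentStatus {
	byParent := make(map[string]parentStatus)
	for _, statuses := range perClass {
		for _, s := range statuses {
			byParent[s.ParentName] = s
		}
	}
	merged := make([]parentStatus, 0, len(specParents))
	for _, name := range specParents { // spec order keeps the output deterministic
		if s, ok := byParent[name]; ok {
			merged = append(merged, s)
		}
	}
	return merged
}

func main() {
	spec := []string{"private", "ts", "public"}
	perClass := map[string][]parentStatus{
		"private": {{ParentName: "private", Conditions: []string{"Accepted"}}},
		"home-ts": {{ParentName: "ts", Conditions: []string{"Accepted"}}},
		"public": {
			{ParentName: "public", Conditions: []string{"Accepted"}},
			{ParentName: "removed", Conditions: []string{"Accepted"}}, // no longer in spec
		},
	}
	// A single consolidated result would then be written in one status update.
	for _, s := range aggregate(spec, perClass) {
		fmt.Println(s.ParentName, s.Conditions)
	}
}
```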

The same issue exists in BackendTrafficPolicy route processing - calling
GetRouteParentContext for all parentRefs creates skeleton status entries.

Apply the same fix: check if parentRef context exists before adding to list.

Signed-off-by: Raj Singh <[email protected]>
@rajsinghtech
Contributor Author

Added fix for BackendTrafficPolicy as well in commit 9b97c08.

Regarding tests: The existing test framework () simulates a single translator instance with a hardcoded GatewayClassName. The bug only manifests when multiple translators (one per GatewayClass) process the same HTTPRoute, which isn't directly testable in the current unit test structure.

However, the fix is straightforward:

  • Before: Called GetRouteParentContext() which creates skeleton status entries
  • After: Call route.GetRouteParentContext() which returns nil if the parentRef wasn't processed by this translator

The change prevents skeleton entries from being created when policies are applied to routes with multiple GatewayClasses.

@arkodg
Contributor

arkodg commented Nov 19, 2025

> Added fix for BackendTrafficPolicy as well in commit 9b97c08.
>
> Regarding tests: The existing test framework () simulates a single translator instance with a hardcoded GatewayClassName. The bug only manifests when multiple translators (one per GatewayClass) process the same HTTPRoute, which isn't directly testable in the current unit test structure.
>
> However, the fix is straightforward:
>
>   • Before: Called GetRouteParentContext() which creates skeleton status entries
>   • After: Call route.GetRouteParentContext() which returns nil if the parentRef wasn't processed by this translator
>
> The change prevents skeleton entries from being created when policies are applied to routes with multiple GatewayClasses.

You could achieve something similar in a test by adding a parentRef that's not part of the input. The route status will show a failure, but the before / after policy status should also look different.

@y-rabie
Contributor

y-rabie commented Nov 19, 2025

@arkodg I think tests in https://github.com/envoyproxy/gateway/pull/7558/files#diff-c7323da3d9dfb378a30bb7b5a00c13ab9255ac810e2adfd9ec6dec6c9ef88a67R60-R116 cover this? The sequence is:

  • Have a spec parentRef of ours and confirm it's in the status.
  • Inject a status parentRef of another controller's, and confirm it lives alongside ours in the status (our controller doesn't remove it).
  • Remove our spec parentRef and add the other controller's spec parentRef.
  • Confirm that the status only contains the other controller's parentRef (we didn't remove it and we didn't add another one).
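
The sequence above can be modeled with a small merge helper. This is a sketch of the expected behavior only, not the tests from #7558: `parentStatus`, `mergeParents`, `names`, and the controller and gateway names are all hypothetical.

```go
package main

import "fmt"

// parentStatus is a hypothetical, simplified parent status entry.
type parentStatus struct {
	ControllerName string
	ParentName     string
}

// mergeParents replaces the entries owned by ourController with ours and
// leaves entries owned by other controllers untouched.
func mergeParents(existing, ours []parentStatus, ourController string) []parentStatus {
	merged := make([]parentStatus, 0, len(existing)+len(ours))
	for _, p := range existing {
		if p.ControllerName != ourController {
			merged = append(merged, p)
		}
	}
	return append(merged, ours...)
}

// names renders entries as "controller/parent" for printing.
func names(ps []parentStatus) []string {
	out := make([]string, 0, len(ps))
	for _, p := range ps {
		out = append(out, p.ControllerName+"/"+p.ParentName)
	}
	return out
}

func main() {
	const ours, other = "ours", "other"
	ourEntry := parentStatus{ControllerName: ours, ParentName: "gw-a"}

	// Step 1: our spec parentRef shows up in the status.
	status := mergeParents(nil, []parentStatus{ourEntry}, ours)
	fmt.Println(names(status)) // [ours/gw-a]

	// Step 2: another controller's entry is injected and survives our update.
	status = append(status, parentStatus{ControllerName: other, ParentName: "gw-b"})
	status = mergeParents(status, []parentStatus{ourEntry}, ours)
	fmt.Println(names(status)) // [other/gw-b ours/gw-a]

	// Steps 3 and 4: our parentRef leaves the spec and the other controller's
	// parentRef is added. We compute no entries of our own, so only the other
	// controller's entry remains and no skeleton is added for it.
	status = mergeParents(status, nil, ours)
	fmt.Println(names(status)) // [other/gw-b]
}
```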

If they do, then I suggest we merge this PR without tests and use the tests in the other one, to avoid merge conflicts and duplicated tests.

Contributor

@arkodg arkodg left a comment

LGTM thanks

@arkodg arkodg requested review from a team November 19, 2025 22:23
@zhaohuabing zhaohuabing merged commit ff13742 into envoyproxy:main Nov 20, 2025
32 checks passed
zhaohuabing pushed a commit to zhaohuabing/gateway that referenced this pull request Dec 5, 2025
…es (envoyproxy#7536)

(cherry picked from commit ff13742)
Signed-off-by: Huabing Zhao <[email protected]>
jukie pushed a commit to jukie/gateway that referenced this pull request Dec 5, 2025
…es (envoyproxy#7536)

Signed-off-by: jukie <[email protected]>
jukie added a commit that referenced this pull request Dec 5, 2025
* fix(xds-server): clear snapshot on stream close (#6618)

* fix(xds-server): clear snapshot on stream close

Signed-off-by: Zachary Vacura <[email protected]>

* check if there are other active connections before clearing the snapshot

Signed-off-by: Zachary Vacura <[email protected]>
Signed-off-by: jukie <[email protected]>

* fix: oidc authentication endpoint was overwritten by discovered value (#7460)

fix: oidc authentication endpoint was overridden by discovered value

Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: jukie <[email protected]>

* ci: add script to free disk space (#7534)

* feat: free disk space

Signed-off-by: Shreemaan Abhishek <[email protected]>

* lint

Signed-off-by: Shreemaan Abhishek <[email protected]>

* cleanup

Signed-off-by: Shreemaan Abhishek <[email protected]>

* make target and tools/hack

Signed-off-by: Shreemaan Abhishek <[email protected]>

* lint

Signed-off-by: Shreemaan Abhishek <[email protected]>

* modular action

Signed-off-by: Shreemaan Abhishek <[email protected]>

---------

Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: jukie <[email protected]>

* treat too many addresses as programmed (#7542)

Signed-off-by: cong <[email protected]>
Signed-off-by: jukie <[email protected]>

* feat: reclaim space in release pipeline (#7587)

Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: jukie <[email protected]>

* chore: bump golang.org/x/crypto (#7588)

* chore: bump golang.org/x/crypto

Signed-off-by: zirain <[email protected]>

* fix gen

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>
Signed-off-by: jukie <[email protected]>

* findOwningGateway should return controller based on linked GatewayClass (#7611)

* fix: filter Gateway by controller in findOwningGateway

Prevent cross-controller Gateway mutations by validating GatewayClass

Signed-off-by: Sudipto Baral <[email protected]>
Signed-off-by: jukie <[email protected]>

* fix: use default when namespace is unset (#7612)

* fix: use default when namespace is unset

Signed-off-by: zirain <[email protected]>

* fix

Signed-off-by: zirain <[email protected]>

* fix test

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>
Signed-off-by: jukie <[email protected]>

* fix: prevent skeleton route status entries for unmanaged GatewayClasses (#7536)

Signed-off-by: jukie <[email protected]>

* lint

Signed-off-by: jukie <[email protected]>

---------

Signed-off-by: Zachary Vacura <[email protected]>
Signed-off-by: jukie <[email protected]>
Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: cong <[email protected]>
Signed-off-by: zirain <[email protected]>
Signed-off-by: Sudipto Baral <[email protected]>
Signed-off-by: Raj Singh <[email protected]>
Co-authored-by: Zach Vacura <[email protected]>
Co-authored-by: Huabing (Robin) Zhao <[email protected]>
Co-authored-by: shreealt <[email protected]>
Co-authored-by: 聪 <[email protected]>
Co-authored-by: zirain <[email protected]>
Co-authored-by: Sudipto Baral <[email protected]>
Co-authored-by: Raj Singh <[email protected]>
zhaohuabing added a commit that referenced this pull request Dec 5, 2025
* fix: oidc authentication endpoint was overwritten by discovered value (#7460)

fix: oidc authentication endpoint was overridden by discovered value

Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing (Robin) Zhao <[email protected]>
(cherry picked from commit 50dcb15)
Signed-off-by: Huabing Zhao <[email protected]>

* fix: do not return 500 for all requests when part of BackendRefs are invalid (#7488)

* do not return 500 for all requests when part of BackendRefs are invalid

Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing (Robin) Zhao <[email protected]>
(cherry picked from commit 2899416)
Signed-off-by: Huabing Zhao <[email protected]>

* fix: prevent skeleton route status entries for unmanaged GatewayClasses (#7536)

(cherry picked from commit ff13742)
Signed-off-by: Huabing Zhao <[email protected]>

* treat too many addresses as programmed (#7542)

Signed-off-by: cong <[email protected]>
(cherry picked from commit 7cb5f72)
Signed-off-by: Huabing Zhao <[email protected]>

* benchmark: fix cpu sampling (#7581)

use fixed duration for cpu rate

Signed-off-by: Huabing Zhao <[email protected]>
(cherry picked from commit 536486f)
Signed-off-by: Huabing Zhao <[email protected]>

* chore: bump golang.org/x/crypto (#7588)

* chore: bump golang.org/x/crypto

Signed-off-by: zirain <[email protected]>

* fix gen

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>
(cherry picked from commit 70fa59a)
Signed-off-by: Huabing Zhao <[email protected]>

* findOwningGateway should return controller based on linked GatewayClass (#7611)

* fix: filter Gateway by controller in findOwningGateway

Prevent cross-controller Gateway mutations by validating GatewayClass

Signed-off-by: Sudipto Baral <[email protected]>
(cherry picked from commit ba8e0e2)
Signed-off-by: Huabing Zhao <[email protected]>

* fix: use default when namespace is unset (#7612)

* fix: use default when namespace is unset

Signed-off-by: zirain <[email protected]>

* fix

Signed-off-by: zirain <[email protected]>

* fix test

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>
(cherry picked from commit be2cc73)
Signed-off-by: Huabing Zhao <[email protected]>

* bump Gateway API v1.4.1 (#7653)

Signed-off-by: zirain <[email protected]>
(cherry picked from commit 0fa26d7)
Signed-off-by: Huabing Zhao <[email protected]>

* update release note

Signed-off-by: Huabing Zhao <[email protected]>

* fix gen check

Signed-off-by: Huabing Zhao <[email protected]>

* ci: add script to free disk space (#7534)

* feat: free disk space

Signed-off-by: Shreemaan Abhishek <[email protected]>

* lint

Signed-off-by: Shreemaan Abhishek <[email protected]>

* cleanup

Signed-off-by: Shreemaan Abhishek <[email protected]>

* make target and tools/hack

Signed-off-by: Shreemaan Abhishek <[email protected]>

* lint

Signed-off-by: Shreemaan Abhishek <[email protected]>

* modular action

Signed-off-by: Shreemaan Abhishek <[email protected]>

---------

Signed-off-by: Shreemaan Abhishek <[email protected]>
(cherry picked from commit 4312f38)
Signed-off-by: Huabing Zhao <[email protected]>

---------

Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Raj Singh <[email protected]>
Signed-off-by: cong <[email protected]>
Signed-off-by: zirain <[email protected]>
Signed-off-by: Sudipto Baral <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Co-authored-by: Raj Singh <[email protected]>
Co-authored-by: 聪 <[email protected]>
Co-authored-by: zirain <[email protected]>
Co-authored-by: Sudipto Baral <[email protected]>
Co-authored-by: shreealt <[email protected]>
Development

Successfully merging this pull request may close these issues.

HTTPRoute status missing conditions for parentRefs with different GatewayClasses
