@salonichf5 salonichf5 commented Sep 18, 2025

Proposed changes

Write a clear and concise description that helps reviewers understand the purpose and impact of your changes. Use the
following format:

Problem: When nginx reload fails, all routes are marked invalid

Solution: Remove adding GatewayNotProgrammed condition to routes since the error should only be reflected on Gateway Listener conditions
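The intent of this change can be sketched as follows. This is a minimal Python illustration, not NGF's actual Go implementation: `Condition`, `Status`, and `apply_reload_result` are hypothetical names, and statuses are assumed to be plain lists of conditions. It shows a reload failure being recorded only as a `Programmed: False` condition on the Gateway, while route conditions are left as they were.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    type: str
    status: str
    reason: str
    message: str = ""


@dataclass
class Status:
    conditions: list = field(default_factory=list)

    def set_condition(self, cond):
        # Replace any existing condition of the same type, otherwise append.
        self.conditions = [c for c in self.conditions if c.type != cond.type]
        self.conditions.append(cond)

    def get(self, cond_type):
        return next((c for c in self.conditions if c.type == cond_type), None)


def apply_reload_result(gateway, routes, reload_error):
    """Hypothetical helper: record the reload outcome on the Gateway only."""
    if reload_error is None:
        gateway.set_condition(Condition("Programmed", "True", "Programmed"))
    else:
        message = (
            "The Gateway is not programmed due to a failure to reload "
            f"nginx with the configuration: {reload_error}"
        )
        gateway.set_condition(Condition("Programmed", "False", "Invalid", message))
    # Routes are intentionally not modified: a reload failure says nothing
    # about whether an individual route is valid.
    return gateway, routes
```

With this shape, a route that was `Accepted: True` before a failed reload is still `Accepted: True` afterwards, and the failure is visible only on the Gateway status.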

Testing: manual tests and unit tests

Tested with the external name svc example, with one route attached to an invalid SnippetsFilter and one route attached to a valid SnippetsFilter.

Both routes have the Accepted condition set to True, but the Gateway is not programmed for either listener.

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: external-gateway
spec:
  gatewayClassName: nginx
  infrastructure:
    parametersRef:
      group: gateway.nginx.org
      kind: NginxProxy
      name: external-nginx-config
  listeners:
  - name: http
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: All
  - name: tls
    port: 443
    protocol: TLS
    tls:
      mode: Passthrough
    allowedRoutes:
      namespaces:
        from: All
---
apiVersion: gateway.nginx.org/v1alpha1
kind: SnippetsFilter
metadata:
  name: rate-limiting-sf
spec:
  snippets:
    - context: http
      value: limit_req_zone;

Gateway condition

    Last Transition Time:  2025-09-18T17:57:53Z
    Message:               Gateway is accepted
    Observed Generation:   1
    Reason:                Accepted
    Status:                True
    Type:                  Accepted
    Last Transition Time:  2025-09-18T17:57:53Z
    Message:               ParametersRef resource is resolved
    Observed Generation:   1
    Reason:                ResolvedRefs
    Status:                True
    Type:                  ResolvedRefs
    Last Transition Time:  2025-09-18T17:57:53Z
    Message:               The Gateway is not programmed due to a failure to reload nginx with the configuration: msg: Config apply failed, rolling back config; error: failed to parse config invalid number of arguments in "limit_req_zone" directive in /etc/nginx/includes/SnippetsFilter_http_default_rate-limiting-sf.conf:1
    Observed Generation:   1
    Reason:                Invalid
    Status:                False
    Type:                  Programmed

NOTE: The attached route reports Accepted: True, but its configuration is not applied because the reload failed.
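The state described here can also be checked programmatically. The sketch below assumes the status objects have already been loaded as Python dicts (for example from JSON output of `kubectl get`); the helper name `condition_status` and the trimmed sample data are illustrative and not taken from the PR.

```python
def condition_status(status, cond_type):
    """Return the status string of the named condition, or None if absent."""
    for cond in status.get("conditions", []):
        if cond.get("type") == cond_type:
            return cond.get("status")
    return None


# Trimmed sample mirroring the Gateway conditions shown above.
gateway_status = {
    "conditions": [
        {"type": "Accepted", "status": "True", "reason": "Accepted"},
        {"type": "ResolvedRefs", "status": "True", "reason": "ResolvedRefs"},
        {"type": "Programmed", "status": "False", "reason": "Invalid"},
    ]
}

# Illustrative route parent status: the route stays accepted.
route_parent_status = {
    "conditions": [
        {"type": "Accepted", "status": "True", "reason": "Accepted"},
    ]
}

gateway_programmed = condition_status(gateway_status, "Programmed")
route_accepted = condition_status(route_parent_status, "Accepted")
```

Here `gateway_programmed` is `"False"` while `route_accepted` is `"True"`, which is the combination this PR is meant to produce after a failed reload.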

Please focus on (optional): If you have any specific areas where you would like reviewers to focus their attention or provide
specific feedback, add them here.

Closes #3866

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that all unit tests pass after adding my changes
  • I have updated necessary documentation
  • I have rebased my branch onto main
  • I will ensure my PR is targeting the main branch and pulling from my branch from my own fork

Release notes

If this PR introduces a change that affects users and needs to be mentioned in the release notes,
please add a brief note that summarizes the change.

Fixed an issue where a failed configuration reload caused all HTTPRoutes to be marked as invalid (Accepted: false). This led to external-dns removing DNS records even though the configuration had been rolled back. Routes now retain their valid state during reload failures, preventing unnecessary DNS disruptions.
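To illustrate why the route status matters to downstream tools, here is a minimal sketch of a consumer that publishes hostnames only for routes whose parent status reports `Accepted: True`. This is an assumption about how a DNS controller might behave, not external-dns's actual implementation; `hostnames_to_publish` and the hostname `cafe.example.com` are hypothetical.

```python
def hostnames_to_publish(routes):
    """Collect hostnames from routes that at least one parent has accepted."""
    published = set()
    for route in routes:
        parents = route.get("status", {}).get("parents", [])
        accepted = any(
            cond.get("type") == "Accepted" and cond.get("status") == "True"
            for parent in parents
            for cond in parent.get("conditions", [])
        )
        if accepted:
            published.update(route.get("spec", {}).get("hostnames", []))
    return published


def make_route(accepted_status):
    # Hypothetical HTTPRoute, reduced to the fields this sketch reads.
    return {
        "spec": {"hostnames": ["cafe.example.com"]},
        "status": {
            "parents": [
                {"conditions": [{"type": "Accepted", "status": accepted_status}]}
            ]
        },
    }


# Before the fix: a failed reload marked the route as not accepted.
before_fix = hostnames_to_publish([make_route("False")])
# After the fix: the route keeps Accepted: True during a reload failure.
after_fix = hostnames_to_publish([make_route("True")])
```

Under these assumptions, `before_fix` is empty, so such a consumer would drop the record, while `after_fix` still contains the hostname.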

@github-actions bot added the bug (Something isn't working) and tests (Pull requests that update tests) labels on Sep 18, 2025

codecov bot commented Sep 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.85%. Comparing base (36788a1) to head (847d031).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3936      +/-   ##
==========================================
+ Coverage   86.82%   86.85%   +0.02%     
==========================================
  Files         128      128              
  Lines       16575    16559      -16     
  Branches       62       62              
==========================================
- Hits        14392    14382      -10     
+ Misses       2004     1998       -6     
  Partials      179      179              

@salonichf5

I verified this with the external name svc example: running nslookup for the external name service from a pod shows the entries still exist when NGF has failed to reload.

kubectl run -it dns-debug --image=busybox --restart=Never -- nslookup httpbin.default.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10:53

httpbin.default.svc.cluster.local canonical name = httpbin.org
Name: httpbin.org
Address: 13.222.46.84
Name: httpbin.org
Address: 34.238.12.187

httpbin.default.svc.cluster.local canonical name = httpbin.org

@salonichf5 salonichf5 marked this pull request as ready for review September 18, 2025 18:52
@salonichf5 salonichf5 requested a review from a team as a code owner September 18, 2025 18:52
@salonichf5

Will update the compatibility docs for this as well

@bjee19 bjee19 left a comment

nice!

@salonichf5 salonichf5 enabled auto-merge (squash) September 18, 2025 20:49
@salonichf5 salonichf5 merged commit 698a369 into main Sep 18, 2025
44 checks passed
@salonichf5 salonichf5 deleted the bug/httproutes branch September 18, 2025 21:19
@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in NGINX Gateway Fabric Sep 18, 2025
salonichf5 added a commit that referenced this pull request Sep 18, 2025
Problem: When nginx reload fails, all routes are marked invalid

Solution: Remove adding GatewayNotProgrammed condition to routes since the error should only be reflected on Gateway Listener conditions
salonichf5 added a commit that referenced this pull request Sep 19, 2025
…3938)

Problem: When nginx reload fails, all routes are marked invalid

Solution: Remove adding GatewayNotProgrammed condition to routes since the error should only be reflected on Gateway Listener conditions

Labels

bug (Something isn't working), release-notes, tests (Pull requests that update tests)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

All HTTPRoutes marked as invalid when config apply fails

3 participants