shaneutt
Member

@shaneutt shaneutt commented Oct 6, 2025

What type of PR is this?
/kind gep

What this PR does / why we need it:

This takes the first provisional step of proposing firewall support for Gateway API, which has been very popular as per engagement in #3614.

Which issue(s) this PR fixes:

This supports, but does not resolve #3614.

Does this PR introduce a user-facing change?:

NONE

@shaneutt shaneutt added the kind/gep PRs related to Gateway Enhancement Proposal(GEP) label Oct 6, 2025
@k8s-ci-robot k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label Oct 6, 2025
@k8s-ci-robot k8s-ci-robot requested review from candita and kflynn October 6, 2025 14:43
@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 6, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shaneutt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 6, 2025
@shaneutt shaneutt requested a review from rikatz October 6, 2025 14:44
@rikatz
Member

rikatz commented Oct 6, 2025

/assign

@shaneutt shaneutt changed the title from "docs: add provisional GEP for Gateway Firewall Support" to "docs: add provisional GEP for Gateway API Firewall Support" Oct 6, 2025
@shaneutt shaneutt force-pushed the provisional-firewall branch from 480dd54 to 1f798e2 Compare October 6, 2025 16:27
@shaneutt shaneutt requested a review from fzipi October 6, 2025 16:28
Comment on lines 38 to 57
* Enable attaching firewall engines to a `Gateway`
* Enable `Gateway`-level firewall rule enforcement
* Enable `HTTPRoute`-level firewall rule enforcement
* Enable simple IP allow/deny lists
* Provide documentation and best practices for implementations which describe
how firewall engines and rules can best be integrated into a Gateway API
implementation.

I'm wondering what exactly is in scope here. That is, what is a firewall engine? Is this WAF-like functionality?

I assume that "attaching firewall engines to a Gateway" leaves the default of said engine to the implementation?

I also assume that "Enable simple IP allow/deny lists" means that, at minimum, every firewall engine will support some kind of IP allow/deny list.

Member Author

@shaneutt shaneutt Oct 6, 2025

> I'm wondering what exactly is in scope here. That is, what is a firewall engine? Is this WAF-like functionality?

Yes. I added a definition of "Firewall Engine". The engine itself is not in scope, but providing the extension points to integrate it is.

> I assume that "attaching firewall engines to a Gateway" leaves the default of said engine to the implementation?

For now, I think it's a yes. Alternatively, we could decide that we want to take a deeper and more active role in defining the engines and their rules themselves, as opposed to just providing the extension points where they can be added. I'm open to suggestions about adding more scope in this way, but I would need to see pretty strong support from stakeholders that this is what they want.

> I also assume that "Enable simple IP allow/deny lists" means that, at minimum, every firewall engine will support some kind of IP allow/deny list.

This was called out as a specific example due to feedback from interested folks, but no, I don't think we can say definitively that any extension must support this. I've removed this as a goal, since it's covered and further implied by the user stories.
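
For readers unfamiliar with the term, a simple IP allow/deny list check might look roughly like the following minimal Python sketch. The function name and the precedence rule (deny entries win, and an empty allow list permits everything that is not denied) are assumptions made purely for illustration; the GEP does not define them.

```python
import ipaddress


def is_allowed(client_ip, allow, deny):
    """Return True if client_ip passes a simple allow/deny check.

    Hypothetical semantics, not taken from the GEP:
    - any matching deny entry rejects the client;
    - an empty allow list means "allow everything not denied".
    """
    addr = ipaddress.ip_address(client_ip)
    deny_nets = [ipaddress.ip_network(cidr) for cidr in deny]
    allow_nets = [ipaddress.ip_network(cidr) for cidr in allow]

    # Deny entries take precedence over allow entries.
    if any(addr in net for net in deny_nets):
        return False
    # With no allow list configured, everything not denied is accepted.
    if not allow_nets:
        return True
    return any(addr in net for net in allow_nets)
```

For example, `is_allowed("10.1.2.3", ["10.0.0.0/8"], ["10.1.0.0/16"])` is rejected under these assumed semantics because the deny entry matches first.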

update for new threats over time.
* As a cluster operator I want to be able to block traffic to gateways from
specific geographical regions, or only allow specific regions.

Member Author

I would actually like to wait on this one, as I fully expect it to be contentious if we were to add any specific allowances for it in our specification. I think it would be best handled as a separate, follow-up iteration so that it can get its own focus and discussion.

@shaneutt shaneutt force-pushed the provisional-firewall branch 3 times, most recently from e4c206a to 4165994 Compare October 6, 2025 20:26
@shaneutt shaneutt force-pushed the provisional-firewall branch from 4165994 to 108977d Compare October 6, 2025 20:48
@shaneutt shaneutt requested a review from fzipi October 6, 2025 20:49
@shaneutt shaneutt force-pushed the provisional-firewall branch from 108977d to 629e1d8 Compare October 7, 2025 11:22
@shaneutt shaneutt requested a review from rikatz October 7, 2025 11:24
@shaneutt shaneutt force-pushed the provisional-firewall branch from 629e1d8 to 079f8ed Compare October 7, 2025 11:59

### Definitions

* "Firewall Engine" - A processor of request payloads and applies rulesets to

QQ: I'm not 100% familiar with the naming used, but does "payloads" in this context include everything from the corresponding layer? E.g. layer7/http would include headers and not only body, right?

Member Author

Yes, correct.

@shaneutt shaneutt requested a review from fzipi October 7, 2025 14:46
`Gateway` as a sidecar, integrated natively as part of the `Gateway`, or
deployed in front of the `Gateway` as part of the networking path.

### User Stories
Contributor

I am concerned about the ability to actually fulfill the goals in this GEP. Just from reading the title, I was worried about running into the lowest-common-denominator problem: that there is almost no overlap between each firewall configuration surface, so we end up with an API that is not useful (or, we end up debating the API surface indefinitely).

Gateway API actually does already have a solution for this, with policy attachment. I suspect you may disagree, but when we are faced with a problem set that doesn't have much overlap, policy attachment of implementation-specific policies is a much better experience for users than attempting to make an API that doesn't work.

After reading the GEP, my concerns are far greater, though. This set of goals is impossible to reasonably tackle. While the title is "Firewall", in typical products the feature set here actually spans 3-4 API surfaces: WAF, Authorization (typically separate from WAF!!), rate limiting, auditing, DLP. And I saw in the AI WG, LLM guardrails was also something of interest as part of this effort (which, again, differs from traditional WAF). I am very worried we are biting off way too much work with this effort and it will make it impossible to proceed.

Just as an anecdote, even just rate limiting is ridiculously complex to design an API around. I suspect it would be harder than BackendTLSPolicy was...

Member Author

> I am concerned about the ability to actually fulfill the goals in this GEP. Just from reading the title, I was worried about running into the lowest-common-denominator problem: that there is almost no overlap between each firewall configuration surface, so we end up with an API that is not useful (or, we end up debating the API surface indefinitely).

I am anticipating that we won't actually try to add significant API surface for this; see this conversation for some context.

In any case, I'm open to the possibility that this GEP moves to Withdrawn if we can't find the right stakeholders and a common surface.

> Gateway API actually does already have a solution for this, with policy attachment. I suspect you may disagree, but when we are faced with a problem set that doesn't have much overlap, policy attachment of implementation-specific policies is a much better experience for users than attempting to make an API that doesn't work.

This is getting into the "How?" we do things, which we're not ready for yet.

> After reading the GEP, my concerns are far greater, though. This set of goals is impossible to reasonably tackle. While the title is "Firewall", in typical products the feature set here actually spans 3-4 API surfaces: WAF, Authorization (typically separate from WAF!!), rate limiting, auditing, DLP. And I saw in the AI WG, LLM guardrails was also something of interest as part of this effort (which, again, differs from traditional WAF). I am very worried we are biting off way too much work with this effort and it will make it impossible to proceed.

I will make sure you are considered a stakeholder for reviews, and that your concerns are incorporated 👍

> Just as an anecdote, even just rate limiting is ridiculously complex to design an API around. I suspect it would be harder than BackendTLSPolicy was...

Agreed.

This is getting into the "How?", which we will get stuck on if we discuss this now. The important thing for this iteration is to align on the motivation and goals at a high level. If we do that, and then we move to the implementation details and we simply cannot produce something that's effective within a reasonable scope and is supported by multiple stakeholders, it is OK to consider this Withdrawn and keep it for posterity so that the community knows we looked into it, and what our reasons were for not continuing.

Contributor

@mikemorris mikemorris Oct 7, 2025

> I am anticipating that we won't actually try to add significant API surface for this

In this case, I'm also wondering how this provides value that is not already available through HTTPRoute extensionRef custom filters or policy attachment, or what is missing from the current extension points to support this use case?

Something like first-class support for OWASP CRS configuration rules to provide WAF functionality in Gateway API might feel contentious and a scope stretch, but would at least feel more aligned with standardizing functionality within the Gateway API specification.

This does feel like it could be a valuable proprietary feature or product positioning for a Gateway API implementation, but I don't really see a purpose-built extension point enabling this as appropriate for the spec, when existing generic extension points may be sufficient (and potentially support the same or related use cases, such as using WASM modules to mutate or drop requests or responses as @jcchavezs mentioned in #4148 (comment)).

Member Author

> In this case, I'm also wondering how this provides value that is not already available through HTTPRoute extensionRef custom filters

Filters could be viable; we need to decide. This is an important part of the exercise of this GEP. There could be challenges. One is the "should" language around the order in which filters are processed, which is not conducive to security systems.
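
To make the ordering concern concrete, here is a minimal Python sketch of a filter chain. Everything in it is hypothetical (the `x-debug` header, the filter names, and the dictionary-based request); it is not Gateway API code, and it only illustrates why a security check whose position in the chain is not guaranteed can be bypassed.

```python
def strip_debug_header(request):
    # Mutating filter: drops the (hypothetical) x-debug header.
    headers = {k: v for k, v in request["headers"].items() if k != "x-debug"}
    return {**request, "headers": headers}


def firewall(request):
    # Inspecting filter: rejects (returns None) any request carrying x-debug.
    if "x-debug" in request["headers"]:
        return None
    return request


def run_chain(filters, request):
    # Apply filters in the given order; stop as soon as one rejects.
    for f in filters:
        request = f(request)
        if request is None:
            return None
    return request


req = {"path": "/", "headers": {"x-debug": "1"}}

# Firewall first: the suspicious request is rejected.
blocked = run_chain([firewall, strip_debug_header], req)

# Mutation first: the firewall never sees the header, so the request passes.
passed = run_chain([strip_debug_header, firewall], req)
```

Here `blocked` is `None` while `passed` is the rewritten request, so the same two filters yield opposite security outcomes depending only on their order.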

> or policy attachment

Policy Attachment could be viable; we need to decide. This too is an important part of the exercise. There will be challenges, as it comes with many caveats. Ordering policies is one thing that can be challenging to get right, among others.

We are getting into implementation details, however.

I want to be extra clear that if we get to the "How?" and there is no consensus, I am not shy about slapping a Withdrawn on it and providing a written explanation as to why. That way people who are looking at least know that we've tried, what we've tried, and what the difficulties are.

If possible, however, at a bare minimum I would like this effort to result in a memorandum that provides some guidance to implementations that want to integrate firewalls with their Gateways, as I see this as a pretty common desire from users.

Successfully merging this pull request may close these issues.

GEP: Firewall
8 participants