Skip to content

Conversation

@qmonnet
Copy link
Member

@qmonnet qmonnet commented Dec 19, 2025

The new flow-filter crate introduced in this commit contains a new pipeline stage for validating that a packet matches an existing peering connection, as defined in the configuration provided by the user. All packets that do not have a source IP, port and destination IP, port corresponding to existing, valid connections between the prefixes in exposed lists of peerings, get dropped.

This allows us to enforce that traffic matches the peering rule:

  • Generically, whether or not the appropriate rules are set
  • For stateless NAT, which would just let the packet go un-NATed if there was no NAT rule to be found, so far
  • In the future, for stateful NAT, allowing us to remove some checks in the stateful NAT code.

Because the new stage gets the whole context about the authorized connections (as per the peering configuration), it can also replace the destination VPC lookup, to avoid redundant structures and lookups.

In the future, we could also easily adjust and use the stage to mark whether packets should be processed for stateless or stateful NAT, which would allow us to drop the "NAT exempt list" hack used for stateful NAT.

PR is not ready to be merged because some points need discussion, but it's ready for review and comments.

@qmonnet qmonnet added this to the GW R2 milestone Dec 19, 2025
@qmonnet qmonnet requested a review from mvachhar December 19, 2025 17:54
@qmonnet qmonnet self-assigned this Dec 19, 2025
@qmonnet qmonnet added the dont-merge Do not merge this Pull Request label Dec 19, 2025
@qmonnet qmonnet requested a review from a team as a code owner December 19, 2025 17:54
@qmonnet qmonnet added the area/nat Related to Network Address Translation (NAT) label Dec 19, 2025
@qmonnet qmonnet force-pushed the pr/qmonnet/flow-filter branch 3 times, most recently from 0e29c5a to 999f163 Compare January 5, 2026 15:16
@qmonnet qmonnet added the ci:-upgrade Disable VLAB upgrade tests label Jan 6, 2026
Copy link
Contributor

@mvachhar mvachhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really nice. When can we remove the dst_vpcd_lookup stage?

@qmonnet
Copy link
Member Author

qmonnet commented Jan 6, 2026

This looks really nice. When can we remove the dst_vpcd_lookup stage?

There's no real point in keeping the two. I can add a commit to this PR to remove it, if you like the changes.

@Fredi-raspall Fredi-raspall self-requested a review January 9, 2026 12:46
Add two methods insert_public_ip() and public_ips_mut() to VpcExpose in
order to manipulate "public IPs" for the expose, that is, the exposed
prefix if NAT is not in use (or empty), or the translated prefixes
otherwise. These methods will be used in a follow-up commit.

Signed-off-by: Quentin Monnet <[email protected]>
We'll have a merge() method for several objects in the next commit, and
this method will return the merged version regardless of the order of
the two objects. In preparation for this, let's rename the current
"merge()" methods as "extend_right()", to avoid name conflicts and to
make it more explicit that self is lower than the object we merge into
it.

Signed-off-by: Quentin Monnet <[email protected]>
Add a merge() method to trait IpRangeWithPorts and implement it for
various objects implementing the trait (PrefixWithPorts,
PrefixWithOptionalPorts), or not implementing it (Prefix, PortRange).

This method will be used in follow-up work to manipulate lists of
prefixes.

Signed-off-by: Quentin Monnet <[email protected]>
This trie is used with IP prefixes associated with port ranges, to
determine what prefix + optional port range a given tuple (IP address,
port) belongs to. We need such data structures in a few locations in the
code for NAT or destination VPC lookup, so we should as well make it
generic so that we can optimise or replace it in an easy fashion, if
needed.

Values for the trie must implement trait ValueWithAssociatedRanges for
the lookups to succeed.

Signed-off-by: Quentin Monnet <[email protected]>
@qmonnet qmonnet force-pushed the pr/qmonnet/flow-filter branch from 999f163 to 65e69b0 Compare January 10, 2026 02:02
Rather than implementing longest-prefix-match lookups for IP prefixes
with associated port ranges, reuse the generic struct that we recently
introduced in the flow-filter crate. There should be no functional
change.

Signed-off-by: Quentin Monnet <[email protected]>
@qmonnet qmonnet force-pushed the pr/qmonnet/flow-filter branch from 65e69b0 to e131863 Compare January 10, 2026 02:12
@qmonnet
Copy link
Member Author

qmonnet commented Jan 10, 2026

So I have reworked the new pipeline stage. Here's what we're looking at:

  • The logic for the lookup hasn't changed
  • The structure of the context table has been mostly re-written. There was a bug in the way we built the table, this has been fixed.
  • We get more tests.
  • We get proper support for finding the right destination VPC when either 1) we're using a destination IP/port that falls within a non-overlapping section of an overlapping prefix, or 2) we have an overlapping prefix but can disambiguate using the source IP/port for the packet.
  • We have a new commit to remove the legacy dst_vpcd_lookup stage
  • Consequence of the rewrite, I haven't spent too much time decoupling the different phases of the new stage into separate commits (the phases are a bit more intricate than in the previous version), so the whole stage + overlap handling + new dst_vpcd lookup now come in a single commit
  • The code, however, should be much clearer overall

THIS PR DOES NOT ACCOUNT FOR TODAY'S DISCUSSION ON THE SHARED IPS. The related adjustments will come on top of this (can be the same PR, although it's maybe easier have them in a new one). But given that I have this code ready with support for destination VPC lookup for some edge cases, I think it's worth committing this code to the Git history, even if we remove some of the processing right after.

@qmonnet qmonnet requested a review from mvachhar January 10, 2026 02:13
qmonnet and others added 3 commits January 10, 2026 02:22
The new flow-filter crate introduced in this commit contains a new
pipeline stage that addresses multiple purposes:

- It validates that a packet matches an existing peering connection
- It retrieves the destination VPC for the packet
- It improves prefix overlap handling, when compared to the existing
  destination VPC lookup stage

Flow validation
---------------

This flow-filter stage validates that a packet matches an existing
peering connection, as defined in the configuration provided by the
user. All packets that do not have a source IP, port and destination IP,
port corresponding to existing, valid connections between the prefixes
in exposed lists of peerings, get dropped.

This allows us to enforce that traffic matches the peering rule:

- Generically, whether or not the appropriate rules are set
- For stateless NAT, which would just let the packet go un-NATed if
  there was no NAT rule to be found, so far
- For stateful NAT, allowing us to remove the dirty implementation of
  similar checks in the stateful NAT code (in future work).

Destination VPC lookup
----------------------

Since filtering flow involves retrieving the source VPC, source IP/port
and destination IP/port, then looking in a table, we've got everything
we need to reimplement a destination VPC lookup stage. It's trivial to
store the destination VPC in the context table, and to attach it to the
packet's metadata on successful lookups. Doing so, we replicate the
function of the dst_vpcd_lookup stage: the legacy, redundant stage will
be removed in a follow-up commit. This will make one stage fewer in the
pipeline, and will improve accuracy, as described in the next section.

Improved resiliency for "shared IPs"
------------------------------------

When looking up for the destination VPC for a packet, we need a special
treatment when IP/port ranges are shared across peerings. In the legacy
dst_vpcd_lookup stage, we already had the checks in place to make sure
that the lookup stage does not return an answer when there are multiple
possible matching destination VPCs (based on source VPC and destination
address).

However, the restriction in that stage is too strong: when we have
overlapping, but distinct IP ranges shared across peerings, we can tell
the destination VPC for some portion of the ranges.

Consider the following case, for example:

    VPC A        <->        VPC B        <->         VPC C
    1.0.0.0/24     2.0.0.0/24   2.0.0.0/24           1.0.0.0/24

In this case, we just can't tell, for packets coming from VPC B, what
the destination VPC is (unless there's some stateful NAT session in
place, but that's beyond the destination VPC lookup stage's scope).

But if, instead, we have:

    VPC A        <->        VPC B        <->         VPC C
    1.0.0.0/23     2.0.0.0/24   2.0.0.0/24           1.0.0.0/24

Then for packets going to 1.0.1.0/24, we know the destination is VPC A.
Before this commit, the destination VPC lookup would not account for the
unambiguous portion of prefixes.

Similarly, in a configuration such as this one:

    VPC A        <->        VPC B        <->         VPC C
    1.0.0.0/24     5.0.0.0/24   2.0.0.0/24           1.0.0.0/24

Then we can find the relevant destination VPC for packets coming from B,
because even if the destination prefixes are the same, we can look at
the source address too, and find which (non-overlapping, in our example)
prefix it belongs to. Before this commit, the destination VPC lookup
would never account for the source address.

This commit does not change the legacy stage, but the flow-filter stage
is able to identify the correct destination VPC in both situations.

Unit tests for the new pipeline stage will come in a follow-up commit.

Signed-off-by: Quentin Monnet <[email protected]>
Add unit tests to test the various modules in the recent flow-filter
crate.

Many of these were generated by Claude, with manual control and
adjustments. Some were added without Claude's intervention.

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Quentin Monnet <[email protected]>
Now that we've transitioned to the new, more powerful/accurate
flow-filter stage for looking up the destination VPC discriminant for
packets, remove the legacy, redundant dst_vpcd_lookup stage.

Signed-off-by: Quentin Monnet <[email protected]>
@qmonnet qmonnet force-pushed the pr/qmonnet/flow-filter branch from e131863 to 4dc9978 Compare January 10, 2026 02:22
@qmonnet qmonnet removed the dont-merge Do not merge this Pull Request label Jan 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/nat Related to Network Address Translation (NAT) ci:-upgrade Disable VLAB upgrade tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants