
Support multiple UDP source ports (multiport)#768

Open
wadey wants to merge 22 commits into master from multiport

Conversation

Member

@wadey wadey commented Oct 17, 2022

The goal of this work is to send packets between two hosts using more than one 5-tuple. When running on networks like AWS, where the underlying network driver and overlay fabric make routing, load balancing, and failover decisions based on the flow hash, this enables more than one flow between pairs of hosts.

Multiport spreads outgoing UDP packets across multiple UDP source ports, which allows Nebula to work around issues on the underlay network. Some example issues this could work around:

  • UDP rate limits applied on a per-flow basis.
  • Partial underlay network failure in which some flows work and some don't.

Agreement is done during the handshake to decide whether multiport mode will be used for a given tunnel (one side must have tx_enabled set, the other side must have rx_enabled set).
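The agreement above can be sketched as a simple predicate. This is an illustrative Python sketch, not Nebula's actual (Go) internals; the function and parameter names are assumptions:

```python
def use_multiport_tx(local_tx_enabled: bool, remote_rx_enabled: bool) -> bool:
    """Hypothetical sketch of the per-tunnel agreement: multiport transmit
    is only used when the local side has tx_enabled set AND the remote side
    advertised rx_enabled during the handshake."""
    return local_tx_enabled and remote_rx_enabled
```

Each direction is agreed independently, so a tunnel can be multiport in one direction and single-port in the other.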

NOTE: You cannot use multiport on a host if you are relying on UDP hole punching to get through a NAT or firewall.

NOTE: Linux only (uses raw sockets to send). Also currently only works with IPv4 underlay network remotes.

This is implemented by opening a raw socket and sending packets with a source port that is based on a hash of the overlay source/destination ports. For ICMP and Nebula metadata packets, we use a random source port.
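The port-selection scheme can be sketched as follows. This is an illustrative Python sketch under assumptions: the hash function (CRC32 here) and the constants are placeholders, not what the PR actually uses; only the shape of the mapping (same overlay flow, same underlay source port, in the configured range) reflects the description above:

```python
import zlib

LISTEN_PORT = 4242   # illustrative listen.port
TX_PORTS = 100       # illustrative tx_ports

def multiport_source_port(overlay_src_port: int, overlay_dst_port: int) -> int:
    """Map an overlay flow's ports onto one of TX_PORTS UDP source ports.

    Packets of the same overlay flow always hash to the same underlay
    source port in [LISTEN_PORT, LISTEN_PORT + TX_PORTS).
    """
    h = zlib.crc32(overlay_src_port.to_bytes(2, "big") +
                   overlay_dst_port.to_bytes(2, "big"))
    return LISTEN_PORT + (h % TX_PORTS)
```

Because the mapping is deterministic, each overlay flow stays pinned to one underlay 5-tuple, which keeps packets of a flow in order on the underlay.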

Example configuration:

multiport:
  # This host supports sending via multiple UDP ports.
  tx_enabled: false

  # This host supports receiving packets sent from multiple UDP ports.
  rx_enabled: false

  # How many UDP ports to use when sending. Source ports range from
  # listen.port up to (but not including) listen.port + tx_ports.
  tx_ports: 100

  # NOTE: All of your hosts must be running a version of Nebula that supports
  # multiport if you want to enable this feature. Older versions of Nebula
  # will be confused by these multiport handshakes.
  #
  # If handshakes are not getting a response, attempt to transmit handshakes
  # using random UDP source ports (to get around partial underlay network
  # failures).
  tx_handshake: false

  # How many unanswered handshakes we should send before we attempt to
  # send multiport handshakes.
  tx_handshake_delay: 2
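The tx_handshake / tx_handshake_delay fallback described in the config comments can be sketched like this. An illustrative Python sketch only; the function name is an assumption, and whether retries draw from the multiport range or any ephemeral port is also an assumption:

```python
import random

TX_HANDSHAKE = True       # config: multiport.tx_handshake
TX_HANDSHAKE_DELAY = 2    # config: multiport.tx_handshake_delay
LISTEN_PORT = 4242        # illustrative listen.port
TX_PORTS = 100            # illustrative tx_ports

def handshake_source_port(attempts_without_response: int) -> int:
    """The first tx_handshake_delay attempts use the normal listen port;
    once that many handshakes have gone unanswered, later retries pick a
    random multiport source port to route around partial underlay failures."""
    if TX_HANDSHAKE and attempts_without_response >= TX_HANDSHAKE_DELAY:
        return LISTEN_PORT + random.randrange(TX_PORTS)
    return LISTEN_PORT
```

With tx_handshake_delay: 2, attempts 0 and 1 go out on listen.port as usual, and only subsequent retries spread across random source ports.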

@wadey wadey added this to the v1.7.0 milestone Oct 17, 2022
@wadey wadey mentioned this pull request Oct 17, 2022
@brad-defined
Collaborator

Branch now has conflicts, needs updating

@wadey wadey added needs-defined-net-review Review needed from a Defined Networking team member needs-slack-review Needs review from a Slack team member labels Mar 13, 2023
@nbrownus nbrownus modified the milestones: v1.7.0, v1.8.0 Apr 3, 2023
@nbrownus nbrownus modified the milestones: v1.8.0, v1.9.0 Oct 30, 2023
@nbrownus nbrownus modified the milestones: v1.9.0, v1.10.0 Apr 22, 2024
@dioss-Machiel
Contributor

I thought of the following use cases where hashing does not seem to be ideal, but I might be missing something and hashing may just be better in general?

Suppose I have an HTTP stream from Node 1 to Node 2. It appears it will always get hashed to the same UDP port, so if that UDP flow is rate limited, the HTTP stream will also be limited. I would assume round-robin over the UDP source ports could perform better here?

In another case, when some UDP ports end up being (temporarily) blocked, round-robin also seems to perform better, since retransmissions will likely be sent via a different UDP port (assuming you are running TCP on top).
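The round-robin alternative proposed above can be sketched for contrast with the hash-based scheme. A hypothetical Python sketch only, not anything in the PR; the names and constants are illustrative:

```python
import itertools

LISTEN_PORT = 4242   # illustrative listen.port
TX_PORTS = 4         # illustrative tx_ports

# Successive packets rotate over the whole port range, so a single overlay
# flow is spread across every underlay flow instead of being pinned to one
# by the hash -- at the cost of possible packet reordering on the underlay.
_next_port = itertools.cycle(range(LISTEN_PORT, LISTEN_PORT + TX_PORTS))

def round_robin_source_port() -> int:
    return next(_next_port)
```

The trade-off is that hashing preserves per-flow packet ordering on the underlay, while round-robin spreads a single flow's packets (and any per-flow rate limit or blockage) across all ports.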

@rawdigits
Collaborator

rawdigits commented Aug 16, 2025

Just wanted to +1 this. I suspected that Verizon was throttling individual UDP streams, and this proved that to be true. I went from 500mbit to 1gbit by using 4x multiport. IMO we should consider mainlining this if it isn't a heavy burden.

edit:
Just adding some 24-hour data here to back up the discussion (disregard the weirdness/drops, which were unrelated to the actual change; the steady states before and after are the important bit).

For context, this is a long-running data backup between two Synology hosts that are 1000 miles apart, over the public internet.

[image: 24-hour throughput graph]

@wadey
Member Author

wadey commented Sep 15, 2025

Main issues to consider before merging I think:

  • It breaks down badly if you have a NAT or firewall. This has bitten us a few times at Slack because it’s hard to diagnose (handshakes work, but then only a percentage of packets flow afterwards). We could either just warn folks to deal with this, or maybe hole punch with all flows (but then also punchy with all flows?).

  • I think it opens up more chances for race condition bugs, since there are now multiple routines happening per hostinfo. Not bad, we just need to track down all the edge cases.

  • Maybe a simpler implementation when the number of flows is small. The raw socket method is overkill unless you are doing hundreds or thousands of flows; when there are single digits or tens of flows, we should maybe provide a simpler implementation that just opens a socket for each. The way we hack in the overhead for the raw socket in this PR is the main reason I didn’t want to merge it yet. It’s very gross, and I want to find a better way, like an interface for how out buffers are managed that can deal with the differences between normal and raw sockets more easily.

@nbrownus nbrownus modified the milestones: v1.10.0, backlog Oct 8, 2025