@amastbaum
Contributor

@amastbaum amastbaum commented Nov 10, 2025

What?

Added default gateway support to the routing table reachability check
https://redmine.mellanox.com/issues/4659780

Summary by CodeRabbit

  • New Features

    • Detect IPoIB (InfiniBand) network interfaces.
    • Option to control whether default-gateway routes are considered during route checks.
  • Improvements

    • Device addresses now indicate allowance of default-gateway routing for non-IPoIB paths.
    • Reachability checks respect link-layer specifics and honor the new default-gateway option.
  • Bug Fixes

    • Corrected route parsing and destination initialization to avoid incorrect handling and errors.

@coderabbitai

coderabbitai bot commented Nov 10, 2025

Walkthrough

Adds an allow_default_gw control to netlink route lookups and propagates rtm_dst_len into route parsing; fixes the memset sizing for the route-rule destination and initializes that destination only when a destination address is present, rolling back the new rule on failure; adds ucs_netif_is_ipoib(); introduces a TCP device address flag that allows default-gateway matching; updates all callers to pass allow_default_gw.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Netlink routing infrastructure: src/ucs/sys/netlink.h, src/ucs/sys/netlink.c | Added allow_default_gw to ucs_netlink_route_info_t; changed ucs_netlink_route_exists signature to accept allow_default_gw; propagate rtm_dst_len into parsing; skip default-gateway routes when allow_default_gw is false; fix memset sizing for new_rule->dest and only initialize it when dst_in_addr is non-NULL with rollback on failure. |
| IPoIB interface detection: src/ucs/sys/sys.h, src/ucs/sys/sys.c | Added public API ucs_netif_is_ipoib(const char *if_name) using SIOCGIFHWADDR/ARPHRD_INFINIBAND; added #include <ucs/sys/sock.h>. |
| TCP device/address flags: src/uct/tcp/tcp.h | Added UCT_TCP_DEVICE_ADDR_FLAG_ALLOW_DEFAULT_GW to uct_tcp_device_addr_flags_t enumeration. |
| TCP iface routing behavior: src/uct/tcp/tcp_iface.c | Include ucs/sys/sys.h; set ALLOW_DEFAULT_GW flag for non-IPoIB devices in device address; derive allow_default_gw from device_addr flags and pass it to ucs_netlink_route_exists when checking reachability. |
| IB iface RoCE routing check: src/uct/ib/base/ib_iface.c | Calls to ucs_netlink_route_exists updated to pass the new allow_default_gw argument (1) for RoCE routability checks. |

Sequence Diagram(s)

sequenceDiagram
    participant TCP as TCP iface init
    participant Sys as ucs_netif_is_ipoib
    participant Netlink as ucs_netlink_route_exists
    participant Parser as route parser

    TCP->>Sys: ucs_netif_is_ipoib(if_name)
    alt IPoIB
        Sys-->>TCP: true
        TCP->>Netlink: ucs_netlink_route_exists(..., allow_default_gw=0)
    else not IPoIB
        Sys-->>TCP: false
        TCP->>Netlink: ucs_netlink_route_exists(..., allow_default_gw=1)
    end

    Netlink->>Parser: parse route attrs (rtm_dst_len)
    alt route is default GW and allow_default_gw==0
        Parser-->>Netlink: skip route (trace)
    else
        Parser-->>Netlink: include route
    end
    Netlink-->>TCP: route exists result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Inspect propagation and validation of rtm_dst_len in parsing callbacks.
  • Verify the corrected memset sizing and the conditional initialization/rollback for new_rule->dest.
  • Review ucs_netif_is_ipoib ioctl usage, return values, and logging.
  • Confirm all callers updated for the new ucs_netlink_route_exists signature and consistent allow_default_gw semantics.

Poem

🐰 I hopped the routes both wide and thin,
I checked for gateways, skipped the default din,
Found InfiniBand beneath the skin,
A tiny flag, a careful spin,
The network hummed — my rabbit grin!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Added default gateway support to the routing table reachability check' accurately and concisely describes the main changes across the pull request, which introduce default gateway handling throughout the routing validation logic. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb97742 and fbcadb4.

📒 Files selected for processing (1)
  • src/ucs/sys/netlink.c (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/ucs/sys/netlink.c (1)
src/ucs/sys/sock.c (1)
  • ucs_sockaddr_set_inet_addr (732-746)
🔇 Additional comments (8)
src/ucs/sys/netlink.c (8)

32-33: LGTM! Clear field addition for default gateway control.

The new allow_default_gw field is well-documented and integrates cleanly with the existing structure.


192-195: Correct validation logic for default vs. non-default routes.

The validation properly distinguishes between default routes (prefix length 0, no destination required) and non-default routes (must have a destination address). The logic correctly allows dst_in_addr to be NULL only when rtm_dst_len == 0.


237-245: Proper initialization with correct error rollback.

The conditional initialization correctly handles both default routes (where dst_in_addr is NULL) and non-default routes. The ucs_array_pop_back rollback on failure ensures proper cleanup. The memset sizing is now correct.
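
The append-then-rollback pattern described here can be sketched roughly as follows. This is an illustration only, not the PR's code: it assumes a fixed-capacity table in place of UCX's ucs_array, and every name in it (rule_sketch_t, rule_table_sketch_t, set_dest_sketch, add_rule_sketch) is hypothetical.

```c
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

#define RULES_MAX_SKETCH 8

/* Hypothetical route rule; the real structure in netlink.c may differ. */
typedef struct {
    struct sockaddr_storage dest;       /* left zeroed for default routes */
    int                     if_index;
    unsigned char           prefix_len; /* 0 marks a default route */
} rule_sketch_t;

typedef struct {
    rule_sketch_t rules[RULES_MAX_SKETCH];
    size_t        count;
} rule_table_sketch_t;

/* Fill dest from a raw address; fails for unsupported address families. */
static int set_dest_sketch(struct sockaddr_storage *dest, int af,
                           const void *addr)
{
    if (af == AF_INET) {
        struct sockaddr_in *sin = (struct sockaddr_in *)dest;
        sin->sin_family = AF_INET;
        memcpy(&sin->sin_addr, addr, sizeof(sin->sin_addr));
        return 0;
    }
    if (af == AF_INET6) {
        struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)dest;
        sin6->sin6_family = AF_INET6;
        memcpy(&sin6->sin6_addr, addr, sizeof(sin6->sin6_addr));
        return 0;
    }
    return -1;
}

/* Append a rule; dst_addr may be NULL for a default route. */
static int add_rule_sketch(rule_table_sketch_t *t, int af,
                           const void *dst_addr, int if_index,
                           unsigned char prefix_len)
{
    rule_sketch_t *rule;

    if (t->count == RULES_MAX_SKETCH) {
        return -1;
    }

    rule = &t->rules[t->count++];             /* append first */
    memset(&rule->dest, 0, sizeof(rule->dest));

    if ((dst_addr != NULL) &&
        (set_dest_sketch(&rule->dest, af, dst_addr) != 0)) {
        t->count--;                           /* roll back the append */
        return -1;
    }

    rule->if_index   = if_index;
    rule->prefix_len = prefix_len;            /* always set, 0 for default */
    return 0;
}
```

The point of the rollback is that a failed destination conversion never leaves a half-initialized rule in the table for later lookups to read.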


247-247: Essential fix: subnet_prefix_len now initialized for all routes.

The unconditional assignment ensures subnet_prefix_len is always initialized, including for default routes (where it will be 0). This addresses the critical issue from past reviews and enables deterministic default gateway detection at line 268.


267-272: Correct default gateway filtering logic.

The implementation properly identifies default routes by checking subnet_prefix_len == 0 and skips them when not allowed. The trace logging aids debugging. This is the core feature that prevents inappropriate default gateway matching for cross-fabric scenarios (e.g., IPoIB).


284-285: Function signature correctly extended with new parameter.

The allow_default_gw parameter enables callers to control default gateway matching behavior, which is essential for the PR's goal of supporting default routes selectively based on network fabric compatibility.


304-307: Complete struct initialization.

All fields, including the new allow_default_gw, are properly initialized before the route lookup. The data flow from parameter to struct to lookup logic is clean and correct.


214-215: Correct propagation of prefix length for validation.

Passing rt_msg->rtm_dst_len to the route info extraction function enables proper validation of default vs. non-default routes, completing the data flow needed for correct route parsing.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/ucs/sys/netlink.c (1)

285-319: API compatibility shim for header change (if adopted).

If you keep the header refactor suggested earlier, add these impl shims:

-int ucs_netlink_route_exists(int if_index, const struct sockaddr *sa_remote,
-                             int allow_default_gw)
+int ucs_netlink_route_exists_ex(int if_index, const struct sockaddr *sa_remote,
+                                int allow_default_gw)
 {
     ...
     return info.found;
 }
 
+int ucs_netlink_route_exists(int if_index, const struct sockaddr *sa_remote)
+{
+    return ucs_netlink_route_exists_ex(if_index, sa_remote, 1);
+}

Also update in-tree call sites that pass the 3rd argument to use _ex.

src/uct/tcp/tcp_iface.c (1)

213-279: Don’t base allow_default_gw solely on the remote flag; include local link layer.

If the local NIC is IPoIB but the remote peer is older and didn’t set the flag, we’ll incorrectly allow default GW and may accept unroutable peers. Compute the flag using both local interface and remote flags.

Apply this diff:

-    /* Default gateway is not relevant for IPoIB interfaces */
-    allow_default_gw = !(tcp_dev_addr->flags &
-                         UCT_TCP_DEVICE_ADDR_FLAG_LINK_LAYER_IB);
+    /* Default gateway is not relevant for IPoIB on either side */
+    {
+        int local_is_ipoib = ucs_netif_is_ipoib(iface->if_name);
+        int remote_is_ipoib = !!(tcp_dev_addr->flags &
+                                 UCT_TCP_DEVICE_ADDR_FLAG_LINK_LAYER_IB);
+        allow_default_gw = !(local_is_ipoib || remote_is_ipoib);
+    }
 
-    if (!ucs_netlink_route_exists(ndev_index,
-                                  (const struct sockaddr *)&remote_addr,
-                                  allow_default_gw)) {
+    if (!ucs_netlink_route_exists_ex(ndev_index,
+                                     (const struct sockaddr *)&remote_addr,
+                                     allow_default_gw)) {
         ...
     }

If you keep the existing 3‑arg API name, drop the _ex rename in the snippet accordingly.

🧹 Nitpick comments (2)
src/ucs/sys/netlink.c (1)

177-197: Unused parameter in ucs_netlink_get_route_info.

rtm_dst_len is not used in this function; either drop it from the signature or add a comment explaining why it’s threaded through.

src/ucs/sys/sys.c (1)

181-194: IPoIB detection implementation LGTM; consider portability comment.

Works on Linux via SIOCGIFHWADDR + ARPHRD_INFINIBAND. Add a brief comment noting Linux dependency to align with header guard suggestion.
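
For readers unfamiliar with the mechanism, a minimal Linux-only sketch of detecting an InfiniBand link layer via SIOCGIFHWADDR might look like the following. It is not the PR's implementation (which, per the code graph above, goes through ucs_netif_ioctl); the function name netif_is_ipoib_sketch and the choice to treat every error as "not IPoIB" are assumptions made for illustration.

```c
#define _DEFAULT_SOURCE /* expose struct ifreq members on glibc */
#include <net/if.h>
#include <net/if_arp.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if the interface reports an InfiniBand hardware address type,
 * 0 otherwise. Errors (no such interface, socket failure) also return 0,
 * which is a simplification chosen for this sketch.
 */
static int netif_is_ipoib_sketch(const char *if_name)
{
    struct ifreq ifr;
    int fd, ret;

    /* Any socket works as an ioctl handle; a UDP socket is the usual pick. */
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return 0;
    }

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, if_name, IFNAMSIZ - 1);

    /* The kernel fills ifr_hwaddr.sa_family with an ARPHRD_* value. */
    ret = ioctl(fd, SIOCGIFHWADDR, &ifr);
    close(fd);
    if (ret < 0) {
        return 0;
    }

    return ifr.ifr_hwaddr.sa_family == ARPHRD_INFINIBAND;
}
```

Because SIOCGIFHWADDR and the ARPHRD_* constants are Linux-specific, a portable build would need a guard or a stub around code like this.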

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14219b1 and aa29ae9.

📒 Files selected for processing (7)
  • src/ucs/sys/netlink.c (8 hunks)
  • src/ucs/sys/netlink.h (1 hunks)
  • src/ucs/sys/sys.c (2 hunks)
  • src/ucs/sys/sys.h (1 hunks)
  • src/uct/ib/base/ib_iface.c (2 hunks)
  • src/uct/tcp/tcp.h (1 hunks)
  • src/uct/tcp/tcp_iface.c (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
src/ucs/sys/sys.h (1)
src/ucs/sys/sys.c (1)
  • ucs_netif_is_ipoib (181-194)
src/ucs/sys/sys.c (1)
src/ucs/sys/sock.c (1)
  • ucs_netif_ioctl (75-101)
src/uct/ib/base/ib_iface.c (1)
src/ucs/sys/netlink.c (1)
  • ucs_netlink_ethernet_device_route_exists (315-319)
src/ucs/sys/netlink.c (1)
src/ucs/sys/sock.c (2)
  • ucs_sockaddr_set_inet_addr (732-746)
  • ucs_sockaddr_is_same_subnet (969-999)
src/ucs/sys/netlink.h (1)
src/ucs/sys/netlink.c (2)
  • ucs_netlink_route_exists (285-313)
  • ucs_netlink_ethernet_device_route_exists (315-319)
src/uct/tcp/tcp_iface.c (2)
src/ucs/sys/sys.c (1)
  • ucs_netif_is_ipoib (181-194)
src/ucs/sys/netlink.c (1)
  • ucs_netlink_route_exists (285-313)
🔇 Additional comments (4)
src/uct/tcp/tcp.h (1)

298-303: New LINK_LAYER_IB flag looks fine; confirm cross-version behavior.

Peers running older UCX will ignore unknown bits. Here, reachability depends on this bit in the remote address. If the remote is older (doesn’t set the bit) but either side is actually IPoIB, we may wrongly allow default-GW. See tcp_iface.c fix suggested.
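
The cross-version concern comes down to flag polarity. The sketch below contrasts a "negative" flag (set when the link layer is IB, as in this revision) with a "positive" flag such as the ALLOW_DEFAULT_GW flag named in the walkthrough. The bit positions and helper names are hypothetical, and the thread does not state why the polarity changed between revisions.

```c
#include <stdint.h>

/* Hypothetical bit values; the real enum in tcp.h may differ. */
enum {
    SKETCH_FLAG_LINK_LAYER_IB    = 1u << 1, /* set by IPoIB senders */
    SKETCH_FLAG_ALLOW_DEFAULT_GW = 1u << 2  /* set by non-IPoIB senders */
};

/* Negative polarity: default GW is allowed unless the peer says "IB". */
static int allow_gw_negative_sketch(uint8_t remote_flags)
{
    return !(remote_flags & SKETCH_FLAG_LINK_LAYER_IB);
}

/* Positive polarity: default GW is allowed only if the peer opts in. */
static int allow_gw_positive_sketch(uint8_t remote_flags)
{
    return !!(remote_flags & SKETCH_FLAG_ALLOW_DEFAULT_GW);
}
```

An older peer that sets neither bit is treated as "default GW allowed" under the negative scheme and as "default GW not allowed" under the positive one, so the positive scheme fails closed when the remote side predates the flag.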

src/uct/ib/base/ib_iface.c (1)

713-726: Switch to ethernet-only route check LGTM.

Semantics preserved and clearer intent via ucs_netlink_ethernet_device_route_exists.

src/uct/tcp/tcp_iface.c (2)

142-145: Marking IPoIB in device address LGTM.

Flagging UCT_TCP_DEVICE_ADDR_FLAG_LINK_LAYER_IB during packing is correct and minimal.


276-279: Verification complete: no legacy 2-arg calls found.

All calls to ucs_netlink_route_exists in the codebase use the 3-argument version. The function definition at src/ucs/sys/netlink.h:60-61 confirms the current signature requires three arguments. The two existing call sites—src/uct/tcp/tcp_iface.c:276 and src/ucs/sys/netlink.c:318—both pass the correct number of arguments.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/ucs/sys/netlink.c (1)

192-194: Consider refining validation for default routes.

For default routes (rtm_dst_len == 0), dst_in_addr is extracted but never used (lines 236-242 skip address assignment). The validation could be more precise by only requiring dst_in_addr for non-default routes.

Apply this diff to make validation conditional:

-    if ((*if_index_p == -1) || (*dst_in_addr == NULL)) {
+    if (*if_index_p == -1) {
+        return UCS_ERR_INVALID_PARAM;
+    }
+
+    if ((rtm_dst_len != 0) && (*dst_in_addr == NULL)) {
         return UCS_ERR_INVALID_PARAM;
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa29ae9 and 37f3128.

📒 Files selected for processing (1)
  • src/ucs/sys/netlink.c (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/ucs/sys/netlink.c (1)
src/ucs/sys/sock.c (2)
  • ucs_sockaddr_set_inet_addr (732-746)
  • ucs_sockaddr_is_same_subnet (969-999)
🔇 Additional comments (5)
src/ucs/sys/netlink.c (5)

32-33: LGTM! Clear field addition.

The allow_default_gw field is well-named with a clear comment explaining its purpose for controlling default gateway route matching.


210-212: LGTM! Correct parameter passing.

Passing rt_msg->rtm_dst_len enables the function to distinguish default routes (dst_len == 0) from normal routes.


234-244: Excellent fixes addressing past review comments!

Line 234 now correctly uses sizeof(new_rule->dest) instead of the previously buggy sizeof(sizeof(new_rule->dest)).

Line 244 now unconditionally sets subnet_prefix_len = rt_msg->rtm_dst_len, ensuring it's always initialized (0 for default routes, eliminating the nondeterministic read flagged in past reviews).

Lines 236-242 correctly skip address assignment for default routes while preserving the error path for non-default routes.
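
To see why the sizeof(sizeof(...)) form was a real bug and not a style issue, the following sketch counts how many bytes each variant actually clears. It assumes the destination is a struct sockaddr_storage, which is a guess about new_rule->dest; the helper name zeroed_bytes_sketch is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Fill a sockaddr_storage with 0xff, clear clear_len bytes from the start,
 * and report how many bytes ended up zero.
 */
static size_t zeroed_bytes_sketch(size_t clear_len)
{
    struct sockaddr_storage dest;
    const unsigned char *bytes = (const unsigned char *)&dest;
    size_t i, zeroed = 0;

    memset(&dest, 0xff, sizeof(dest));
    memset(&dest, 0, clear_len);

    for (i = 0; i < sizeof(dest); i++) {
        zeroed += (bytes[i] == 0);
    }
    return zeroed;
}
```

Calling it with sizeof(sizeof(dest)) clears only sizeof(size_t) bytes, because the inner sizeof yields a size_t and the outer one measures that type. Calling it with sizeof(dest) clears the whole structure.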


253-283: LGTM! Default gateway filtering logic is correct.

The implementation properly identifies default gateways (subnet_prefix_len == 0), skips them when allow_default_gw is false with a clear trace message, and uses correct short-circuit evaluation for matching (default GW matches immediately, otherwise check subnet).
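
The matching order described here can be sketched for IPv4 as follows. This is a simplified illustration and not the code under review: the rule layout, the IPv4-only subnet check, and the names route_rule_sketch_t, same_subnet_v4_sketch and route_exists_sketch are all assumptions.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed route entry. */
typedef struct {
    uint32_t dest;       /* network byte order; unused if prefix_len == 0 */
    unsigned prefix_len; /* 0 marks a default route */
    int      if_index;
} route_rule_sketch_t;

/* Compare the leading prefix_len bits of two IPv4 addresses. */
static int same_subnet_v4_sketch(uint32_t a, uint32_t b, unsigned prefix_len)
{
    uint32_t mask;

    if ((prefix_len == 0) || (prefix_len > 32)) {
        return 0; /* default routes are handled by the caller */
    }

    mask = htonl(0xffffffffu << (32 - prefix_len));
    return (a & mask) == (b & mask);
}

/* Returns 1 if some rule on if_index can reach remote, 0 otherwise. */
static int route_exists_sketch(const route_rule_sketch_t *rules, size_t count,
                               int if_index, uint32_t remote,
                               int allow_default_gw)
{
    size_t i;

    for (i = 0; i < count; i++) {
        int is_default = (rules[i].prefix_len == 0);

        if (rules[i].if_index != if_index) {
            continue;
        }

        if (is_default && !allow_default_gw) {
            continue; /* skip default gateway when the caller forbids it */
        }

        /* Short-circuit: a permitted default route matches any remote. */
        if (is_default ||
            same_subnet_v4_sketch(rules[i].dest, remote,
                                  rules[i].prefix_len)) {
            return 1;
        }
    }

    return 0;
}
```

With one 10.0.0.0/24 rule and one default rule on the same interface, a remote of 192.168.1.1 is reachable only when allow_default_gw is nonzero, while 10.0.0.5 is reachable either way.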


285-319: LGTM! Clean API extension.

The signature change adds necessary allow_default_gw control. The single call site in src/uct/tcp/tcp_iface.c:276 has been properly updated with the new parameter, and the new ucs_netlink_ethernet_device_route_exists helper provides a clear ethernet-specific convenience function.

@amastbaum amastbaum requested a review from shasson5 November 10, 2025 13:38
@amastbaum amastbaum requested a review from shasson5 November 11, 2025 12:51
@amastbaum amastbaum force-pushed the add_default_gateway_support branch from 960656e to d2f227e Compare November 17, 2025 12:19
@amastbaum amastbaum requested a review from gleon99 November 17, 2025 21:23
gleon99
gleon99 previously approved these changes Nov 18, 2025
Contributor

@gleon99 gleon99 left a comment

@yosefe yosefe enabled auto-merge (squash) November 19, 2025 11:55
@yosefe yosefe merged commit 03245a7 into openucx:master Nov 19, 2025
148 checks passed
gleon99 pushed a commit to gleon99/ucx that referenced this pull request Nov 20, 2025
brminich pushed a commit that referenced this pull request Nov 21, 2025
…1019)

UCS/SYS: Added default gateway support to the routing table reachability check (#11000)

Co-authored-by: amastbaum <[email protected]>

4 participants