Skip to content

Conversation

@QxBytes
Copy link
Contributor

@QxBytes QxBytes commented Aug 1, 2025

Reason for Change:

CNS currently programs snat iptables rules only on NNC update/create. This PR makes it so that it also does so during cns bootup. Additionally, CNS will now attempt to "reconcile" the iptables on the node to what it expects. This allows us to "rollback" future updates to CNS iptables rules to match the expected state each time CNS reboots or restarts.

It is currently possible for CNI to add iptables rules to the node using chain "SWIFT". We want to jump to the cns-controlled chain "SWIFT-POSTROUTING" before we jump to the chain "SWIFT". The first piece of logic is to migrate to that:

Case 1: CNI already added a jump to SWIFT chain in nat POSTROUTING chain and it is at a higher priority than any SWIFT-POSTROUTING jump we may have added: We find the position of the SWIFT jump rule, and insert our SWIFT-POSTROUTING jump at that position, pushing the SWIFT jump rule to a lower priority

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT
jump to SWIFT-POSTROUTING

becomes

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING
jump to SWIFT

or

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT

becomes

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING
jump to SWIFT

Case 2: The SWIFT-POSTROUTING jump rule exists before the SWIFT jump rule, or exists without any SWIFT jump rule: We do not modify the nat POSTROUTING chain

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING
jump to SWIFT

or

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING

stays the same

Case 3: The nat POSTROUTING chain does not contain a SWIFT jump rule nor a SWIFT-POSTROUTING rule: We append the SWIFT-POSTROUTING jump rule to the nat POSTROUTING chain

POSTROUTING
jump to KUBE POSTROUTING
...

becomes

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING

Now, packets will go to the nat SWIFT-POSTROUTING chain first. We then take a look at the SWIFT-POSTROUTING chain.
Case 1: The nat SWIFT-POSTROUTING chain does not contain every single rule we expect exactly, or the # of rules in the SWIFT-POSTROUTING chain does not match how many rules we expect: Flush the chain and reinstall the iptables rules we expect.

Case 2: The nat SWIFT-POSTROUTING chain contains every single rule we expect exactly, and the # of rules matches how many rules we expect: We do not modify or flush the SWIFT-POSTROUTING chain

Issue Fixed:

Requirements:

Notes:

  • Check iptables insert of bounds just in case
  • When we list iptables rules, the first rule is always -N ... for custom chains or -P ... for built in chains (not shown in iptables -nvL -t nat). Feel free to check-- the library calls iptables -t <table> -S <chain> under the hood.
  • When we list iptables rules, the rules use -A syntax when listed no matter how they were added (via insert, append, etc.)
  • IPTables is 1-indexed, so insert at position 1 will be the first rule in the chain-- so when using iptables -nvL -t nat the first rule that shows up in a chain is at index 1.

@QxBytes QxBytes self-assigned this Aug 1, 2025
@QxBytes QxBytes added the cns Related to CNS. label Aug 1, 2025
@QxBytes QxBytes requested a review from Copilot August 1, 2025 23:54

This comment was marked as outdated.

@QxBytes QxBytes requested a review from Copilot August 2, 2025 00:01

This comment was marked as outdated.

@QxBytes QxBytes force-pushed the alew/add-cns-reconciliation branch from a00472e to f4af45b Compare August 2, 2025 00:09
@QxBytes QxBytes requested a review from Copilot August 2, 2025 00:09
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds iptables reconciliation functionality to CNS, enabling it to restore iptables rules to the expected state during startup and handle migration from legacy SWIFT chains to SWIFT-POSTROUTING chains. The changes ensure CNS can recover from external iptables modifications and properly sequence rule execution.

  • Adds iptables reconciliation during CNS bootup to restore expected SNAT rules
  • Implements migration logic to transition from legacy SWIFT chain to SWIFT-POSTROUTING chain with proper priority ordering
  • Enhances the iptables mock implementation to support new operations needed for testing

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
cns/restserver/util.go Adds reconciliation logic during CNS startup to reprogram SNAT rules for all containers
cns/restserver/restserver.go Extends iptables interface with List, ClearChain, and Delete methods
cns/restserver/internalapi_linux.go Implements comprehensive iptables reconciliation and migration logic from SWIFT to SWIFT-POSTROUTING
cns/restserver/internalapi_linux_test.go Adds extensive tests covering migration scenarios and rule reconciliation
cns/fakes/iptablesfake.go Enhances mock implementation with proper iptables simulation and tracking
Comments suppressed due to low confidence (1)

cns/fakes/iptablesfake.go:192

  • [nitpick] The method name 'ClearChainCallCount' is ambiguous - it could mean either getting the count or clearing the count. Consider renaming to 'GetClearChainCallCount' to clearly indicate it's a getter method.
func (c *IPTablesMock) ClearChainCallCount() int {

@QxBytes QxBytes marked this pull request as ready for review August 4, 2025 16:36
@QxBytes QxBytes requested a review from a team as a code owner August 4, 2025 16:36
@QxBytes QxBytes requested a review from santhoshmprabhu August 4, 2025 16:36
@QxBytes QxBytes requested a review from rbtr August 4, 2025 17:22
santhoshmprabhu
santhoshmprabhu previously approved these changes Aug 4, 2025
@santhoshmprabhu
Copy link
Contributor

I think we could consider eliminating that blip from clearing the chain, but not a big deal.

@rbtr
Copy link
Collaborator

rbtr commented Aug 6, 2025

/azp run Azure Container Networking PR

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@QxBytes QxBytes enabled auto-merge August 7, 2025 04:20
@QxBytes QxBytes added this pull request to the merge queue Aug 7, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Aug 7, 2025
@paulyufan2 paulyufan2 added this pull request to the merge queue Aug 7, 2025
Merged via the queue into master with commit d3c4985 Aug 7, 2025
14 of 15 checks passed
@paulyufan2 paulyufan2 deleted the alew/add-cns-reconciliation branch August 7, 2025 17:17
NihaNallappagari pushed a commit to NihaNallappagari/azure-container-networking that referenced this pull request Sep 4, 2025
* add iptables reconciliation to cns

* add test to check if test case flushes chain

* rename chain variable to reflect its value

* address linter

* nolint logger usage for now

* address lll linter

* remove code from startup as reconcile nnc always runs on cns startup
sivakami-projects pushed a commit that referenced this pull request Oct 23, 2025
* add iptables reconciliation to cns

* add test to check if test case flushes chain

* rename chain variable to reflect its value

* address linter

* nolint logger usage for now

* address lll linter

* remove code from startup as reconcile nnc always runs on cns startup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cns Related to CNS.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants