@champtar
Contributor

@champtar champtar commented Oct 23, 2025

  • portmap: fix CHECK for nftables backend

  • portmap: ensure nftables backend only intercepts local traffic

    portmap iptables backend uses -m addrtype --dst-type LOCAL
    and a common chain (CNI-HOSTPORT-DNAT) for both hostPort and hostIP/hostPort.

    Before this commit, nftables backend was using 2 separate chains,
    hostip_hostports and hostports. The goal was to avoid using
    fib daddr type local before we jump to hostip_hostports,
    but this is a behavior change compared to iptables backend,
    and a security issue (hostIP: 1.1.1.1 / hostPort: 53).
    Also while switching from input to prerouting hook, we forgot to
    add the fib lookup for hostports, rendering the nftables backend half broken.
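
To make the hostIP: 1.1.1.1 / hostPort: 53 concern concrete, here is a minimal toy model in Go. It is not plugin code: the `rule` type, the `intercepts` helper, and the host address 192.0.2.10 are all hypothetical, and `requireLocal` only stands in for a `fib daddr type local` guard evaluated before the DNAT rule.

```go
package main

import (
	"fmt"
	"net/netip"
)

// rule is a hypothetical hostIP/hostPort DNAT rule (toy model, not plugin code).
type rule struct {
	hostIP   string
	hostPort uint16
}

// isLocal reports whether dst is one of the host's own addresses.
func isLocal(dst netip.Addr, local []string) bool {
	for _, s := range local {
		if netip.MustParseAddr(s) == dst {
			return true
		}
	}
	return false
}

// intercepts reports whether the rule would DNAT a packet sent to dst:port.
// requireLocal models a `fib daddr type local` guard placed before the rule.
func intercepts(r rule, dst string, port uint16, local []string, requireLocal bool) bool {
	d := netip.MustParseAddr(dst)
	if requireLocal && !isLocal(d, local) {
		return false
	}
	return netip.MustParseAddr(r.hostIP) == d && r.hostPort == port
}

func main() {
	local := []string{"192.0.2.10"} // assumed host address
	r := rule{hostIP: "1.1.1.1", hostPort: 53}

	// A packet routed through the node towards 1.1.1.1:53 is not local traffic.
	fmt.Println("without guard:", intercepts(r, "1.1.1.1", 53, local, false))
	fmt.Println("with guard:   ", intercepts(r, "1.1.1.1", 53, local, true))
}
```

In this model, the rule matches forwarded traffic only when the guard is absent, which is the behavior difference from the iptables backend described above.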

    To allow transparent upgrades and avoid running the fib lookup twice,
    we use an intermediate chain (hostports_all)

    chain hostports_all {
        jump hostip_hostports
        jump hostports
    }
    

    Long-term we want to remove hostip_hostports,
    so all new rules are created in the hostports chain.

    We can't use implicit chains (jump { jump hostip_hostports; jump hostports })
    as it's not supported by knftables.Fake yet.
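
As a rough illustration of the layout described above, the following Go sketch renders the chains as nft-style text. It deliberately avoids the knftables API and only builds strings. The table name `cni_hostport`, the `ip` family, the `prerouting` base chain and its hook line are assumptions made for the example, not taken from the plugin source.

```go
package main

import (
	"fmt"
	"strings"
)

// chain is a toy description of an nftables chain.
type chain struct {
	name  string
	spec  string // base-chain hook line; empty for regular chains
	rules []string
}

// layout returns the chain structure described in the PR text:
// one fib lookup, then a jump into hostports_all, which fans out
// to the legacy chain and the chain that receives all new rules.
func layout() []chain {
	return []chain{
		{
			name:  "prerouting", // assumed base chain name
			spec:  "type nat hook prerouting priority -100;",
			rules: []string{"fib daddr type local jump hostports_all"},
		},
		{name: "hostports_all", rules: []string{"jump hostip_hostports", "jump hostports"}},
		{name: "hostip_hostports"}, // legacy chain, kept for upgrades
		{name: "hostports"},        // all new rules go here
	}
}

// render prints the chains in nft-like syntax.
func render(family, table string, chains []chain) string {
	var b strings.Builder
	fmt.Fprintf(&b, "table %s %s {\n", family, table)
	for _, c := range chains {
		fmt.Fprintf(&b, "\tchain %s {\n", c.name)
		if c.spec != "" {
			fmt.Fprintf(&b, "\t\t%s\n", c.spec)
		}
		for _, r := range c.rules {
			fmt.Fprintf(&b, "\t\t%s\n", r)
		}
		b.WriteString("\t}\n")
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	// Table name and family are assumptions for this sketch.
	fmt.Print(render("ip", "cni_hostport", layout()))
}
```

The point of the intermediate chain in this sketch is that the fib lookup appears exactly once, in front of both port chains.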

Fixes 9296c5f
Fixes 01a94e1

Fixes #1209

@champtar champtar force-pushed the fix-portmap_nftables branch 2 times, most recently from a50bdb6 to 409f4a2 Compare October 23, 2025 04:00
@champtar champtar force-pushed the fix-portmap_nftables branch from 409f4a2 to c8c1a68 Compare November 4, 2025 02:47
@champtar champtar marked this pull request as ready for review November 4, 2025 02:58
@champtar
Contributor Author

champtar commented Nov 4, 2025

@danwinship @squeed @s1061123 ready for review / tests

@champtar
Contributor Author

champtar commented Nov 4, 2025

@agusdallalba if you want to test

Contributor

@danwinship danwinship left a comment

yeah, seems right

@champtar champtar force-pushed the fix-portmap_nftables branch from c8c1a68 to e990380 Compare November 4, 2025 19:59
@danwinship
Contributor

It seems like it would be simpler to just always create and jump to the hostip_hostports chain, even if it's eventually going to end up always being empty everywhere.

Though if we're keeping the chain, it would probably make more sense to keep the rules split between the two chains still too, for semantic reasons even if it's no longer logically necessary.

@champtar
Contributor Author

champtar commented Nov 4, 2025

The cost of a jump might be negligible; I honestly have no idea.
We can also keep both chains, add new rules only in hostports, and in a few releases jump only to hostports and remove the hostip_hostports chain.

@squeed @s1061123 any other opinions?

@s1061123
Contributor

s1061123 commented Nov 5, 2025

For now, I agree with @danwinship's comment to keep the old chain, but we should mention in a code comment that the old one is no longer actually used. We won't always look back at previous PR discussions, so this information should be kept in the code; otherwise we may forget this discussion...

@champtar champtar force-pushed the fix-portmap_nftables branch 3 times, most recently from 47d9fb3 to 5682acb Compare November 5, 2025 16:28
@champtar
Contributor Author

champtar commented Nov 5, 2025

Simpler version pushed: we keep both chains, create new rules in hostports only, added comments, and also fixed CHECK

@champtar champtar requested a review from danwinship November 5, 2025 16:29
hostIPHostPorts++
}
} else {
hostPorts++
Contributor

@danwinship danwinship Nov 5, 2025

OK, suppose you have a pod with hostIP hostports. They get created in hostip_hostports by the old plugin. Then you upgrade your plugins. Then the CRI does a CHECK. With this code, the CHECK would return an error. So then, presumably, the CRI will ADD again, and then forwardPorts will duplicate the hostIP rules in the hostports chain, without deleting the ones from the hostip_hostports chain...

That's not awful though; the extra rules are redundant but not incorrect. And it would be a little bit annoying to make checkPorts try to deal with accepting either the old layout or the new.

(Or would CRI be expected to do a DEL before the ADD anyway?)

Contributor Author

100% agree with the analysis, let's leave it like that :)
To encounter this bug, you would have to:

  • use the nftables backend
  • use hostIP
  • use cri-o (containerd doesn't use CHECK if I remember correctly)
  • use IPv4 only
  • not drain your node before the update!

Member

> (Or would CRI be expected to do a DEL before the ADD anyway?)

Yes, you must always do a DEL before ADD.
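
The upgrade scenario discussed in this thread can be sketched with a small in-memory model. This is only an illustration: `ruleset`, `check`, `add`, and `del` are hypothetical stand-ins for the plugin's CHECK, ADD, and DEL handling, and the model assumes that DEL removes a pod's rules from both chains.

```go
package main

import "fmt"

// rule is a toy hostport rule tagged with the pod that owns it.
type rule struct{ pod, match string }

// ruleset maps a chain name to its rules (toy model, not nftables state).
type ruleset map[string][]rule

// count returns how many rules the pod owns in one chain.
func count(rs ruleset, chain, pod string) int {
	n := 0
	for _, r := range rs[chain] {
		if r.pod == pod {
			n++
		}
	}
	return n
}

// total returns how many rules the pod owns across all chains.
func total(rs ruleset, pod string) int {
	n := 0
	for name := range rs {
		n += count(rs, name, pod)
	}
	return n
}

// check models a CHECK that only looks at the hostports chain.
func check(rs ruleset, pod string, want int) error {
	if got := count(rs, "hostports", pod); got != want {
		return fmt.Errorf("pod %s: found %d rules in hostports, want %d", pod, got, want)
	}
	return nil
}

// add models an ADD that creates every new rule in hostports.
func add(rs ruleset, pod string, matches ...string) {
	for _, m := range matches {
		rs["hostports"] = append(rs["hostports"], rule{pod: pod, match: m})
	}
}

// del models a DEL; assumption: it removes the pod's rules from every chain.
func del(rs ruleset, pod string) {
	for name, rules := range rs {
		var kept []rule
		for _, r := range rules {
			if r.pod != pod {
				kept = append(kept, r)
			}
		}
		rs[name] = kept
	}
}

func main() {
	m := "ip daddr 192.0.2.10 tcp dport 8080" // illustrative match only
	// State left behind by the old plugin: the rule lives in hostip_hostports.
	rs := ruleset{"hostip_hostports": {{pod: "pod-a", match: m}}}

	fmt.Println("CHECK after upgrade:", check(rs, "pod-a", 1))
	add(rs, "pod-a", m) // ADD without DEL duplicates the rule
	fmt.Println("rules after ADD only:", total(rs, "pod-a"))

	del(rs, "pod-a") // DEL before ADD leaves a single copy
	add(rs, "pod-a", m)
	fmt.Println("rules after DEL+ADD:", total(rs, "pod-a"))
}
```

Under these assumptions, an ADD without a preceding DEL leaves a redundant copy in the old chain, while DEL followed by ADD ends with a single rule in hostports.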

@danwinship
Contributor

lgtm
(I don't have merge rights here though.)

Fixes 01a94e1

Signed-off-by: Etienne Champetier <[email protected]>
portmap iptables backend uses `-m addrtype --dst-type LOCAL`
and a common chain (CNI-HOSTPORT-DNAT) for both hostPort and hostIP/hostPort.

Before this commit, nftables backend was using 2 separate chains,
`hostip_hostports` and `hostports`. The goal was to avoid using
`fib daddr type local` before we jump to `hostip_hostports`,
but this is a behavior change compared to iptables backend,
and a security issue (hostIP: 1.1.1.1 / hostPort: 53).
Also while switching from input to prerouting hook, we forgot to
add the fib lookup for `hostports`, rendering the nftables backend half broken.

To allow transparent upgrades and avoid running the fib lookup twice,
we use an intermediate chain (`hostports_all`)
```
chain hostports_all {
    jump hostip_hostports
    jump hostports
}
```

Long-term we want to remove `hostip_hostports`,
so all new rules are created in the `hostports` chain.

We can't use implicit chains (`jump { jump hostip_hostports; jump hostports }`)
as it's not supported by knftables.Fake yet.

Fixes 9296c5f
Fixes 01a94e1

Signed-off-by: Etienne Champetier <[email protected]>
@champtar champtar force-pushed the fix-portmap_nftables branch from 5682acb to 853b7e8 Compare November 8, 2025 21:50
@champtar
Contributor Author

champtar commented Nov 9, 2025

@squeed @s1061123 ping

@s1061123
Contributor

s1061123 commented Nov 9, 2025

I couldn't find the time to review this week...
@LionelJouin PTAL?

@squeed squeed requested a review from mlguerrero12 November 10, 2025 15:17
@mlguerrero12 mlguerrero12 merged commit 9b3772e into containernetworking:main Nov 13, 2025
6 checks passed
@champtar champtar deleted the fix-portmap_nftables branch November 17, 2025 13:50
Development

Successfully merging this pull request may close these issues.

portmap plugin nftables backend intercepts non local traffic

5 participants