Commit 2837112

Merge pull request moby#50355 from robmry/nftablesdoc
Add "nftablesdoc"
2 parents c47a4ab + 7790528 commit 2837112

20 files changed: 2010 additions, 0 deletions
Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
## nftables for a new Daemon

When the daemon starts, it creates two tables, `ip docker-bridges` and
`ip6 docker-bridges`, for IPv4 and IPv6 rules respectively. Each table contains
some base chains and empty verdict maps. Rules for the default bridge network
are then added.

    table ip docker-bridges {
        map filter-forward-in-jumps {
            type ifname : verdict
            elements = { "docker0" : jump filter-forward-in__docker0 }
        }

        map filter-forward-out-jumps {
            type ifname : verdict
            elements = { "docker0" : jump filter-forward-out__docker0 }
        }

        map nat-postrouting-in-jumps {
            type ifname : verdict
            elements = { "docker0" : jump nat-postrouting-in__docker0 }
        }

        map nat-postrouting-out-jumps {
            type ifname : verdict
            elements = { "docker0" : jump nat-postrouting-out__docker0 }
        }

        chain filter-FORWARD {
            type filter hook forward priority filter; policy accept;
            oifname vmap @filter-forward-in-jumps
            iifname vmap @filter-forward-out-jumps
        }

        chain nat-OUTPUT {
            type nat hook output priority -100; policy accept;
            ip daddr != 127.0.0.0/8 fib daddr type local counter jump nat-prerouting-and-output
        }

        chain nat-POSTROUTING {
            type nat hook postrouting priority srcnat; policy accept;
            iifname vmap @nat-postrouting-out-jumps
            oifname vmap @nat-postrouting-in-jumps
        }

        chain nat-PREROUTING {
            type nat hook prerouting priority dstnat; policy accept;
            fib daddr type local counter jump nat-prerouting-and-output
        }

        chain nat-prerouting-and-output {
        }

        chain raw-PREROUTING {
            type filter hook prerouting priority raw; policy accept;
        }

        chain filter-forward-in__docker0 {
            ct state established,related counter accept
            iifname "docker0" counter accept comment "ICC"
            counter drop comment "UNPUBLISHED PORT DROP"
        }

        chain filter-forward-out__docker0 {
            ct state established,related counter accept
            counter accept comment "OUTGOING"
        }

        chain nat-postrouting-in__docker0 {
        }

        chain nat-postrouting-out__docker0 {
            oifname != "docker0" ip saddr 172.17.0.0/16 counter masquerade comment "MASQUERADE"
        }
    }


#### filter-FORWARD

Chain `filter-FORWARD` is a base chain, with type `filter` and hook `forward`.
_So, it's equivalent to the iptables built-in chain `FORWARD` in the `filter`
table._ It's initialised with two rules that use the output and input
interface names as keys in verdict maps:

    chain filter-FORWARD {
        type filter hook forward priority filter; policy accept;
        oifname vmap @filter-forward-in-jumps
        iifname vmap @filter-forward-out-jumps
    }


The verdict maps are populated with one element per bridge network, each
jumping to a chain containing rules for that bridge. For packets that
aren't going to or from a Docker bridge device, the lookup finds no element
in the verdict map, so those packets need no further processing by this
base chain.

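The dispatch performed by the two `vmap` rules can be pictured as dictionary lookups keyed by interface name. The Python sketch below is only a model of that behaviour, not how the kernel or the daemon implements it. The function name `candidate_chains` and the interface names `eth0` and `eth1` are made up for illustration.

```python
# Illustrative model only: verdict maps as plain dicts keyed by interface name.
FILTER_FORWARD_IN_JUMPS = {"docker0": "filter-forward-in__docker0"}
FILTER_FORWARD_OUT_JUMPS = {"docker0": "filter-forward-out__docker0"}


def candidate_chains(iifname, oifname):
    """Return the per-network chains filter-FORWARD could jump to, in rule order.

    Simplification: a real chain stops at the first accept or drop verdict,
    so the second jump only happens if the first chain issues no verdict.
    """
    chains = []
    # Rule 1: oifname vmap @filter-forward-in-jumps
    if oifname in FILTER_FORWARD_IN_JUMPS:
        chains.append(FILTER_FORWARD_IN_JUMPS[oifname])
    # Rule 2: iifname vmap @filter-forward-out-jumps
    if iifname in FILTER_FORWARD_OUT_JUMPS:
        chains.append(FILTER_FORWARD_OUT_JUMPS[iifname])
    return chains


print(candidate_chains("eth0", "docker0"))  # packet heading into the bridge
print(candidate_chains("docker0", "eth0"))  # packet leaving the bridge
print(candidate_chains("eth0", "eth1"))     # unrelated traffic: no jumps
```

Traffic between two non-Docker interfaces matches neither map, so in this model it falls through to the base chain's policy.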
The filter-FORWARD chain's policy shown above is `accept`. However:

- For IPv4, the policy is `drop` if the sysctl `net.ipv4.ip_forward` was not
  set to `1` and the daemon enabled it itself when an IPv4-enabled bridge
  network was created.
- For IPv6, the same applies, but for the sysctls
  `/proc/sys/net/ipv6/conf/default/forwarding` and
  `/proc/sys/net/ipv6/conf/all/forwarding`.

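As a rough illustration of the IPv4 case, the sketch below reads a forwarding sysctl from procfs and maps the result to a chain policy. The helper names `forwarding_enabled` and `filter_forward_policy` are hypothetical, and the model covers only the rule stated above. It does not attempt to model how the two IPv6 sysctls are combined.

```python
from pathlib import Path

# procfs location of the IPv4 forwarding sysctl on Linux.
IPV4_FORWARD = Path("/proc/sys/net/ipv4/ip_forward")


def forwarding_enabled(sysctl_file: Path) -> bool:
    """Return True if the sysctl file currently holds '1'."""
    return sysctl_file.read_text().strip() == "1"


def filter_forward_policy(enabled_before_daemon: bool) -> str:
    """Hypothetical helper modelling the IPv4 rule described above.

    If forwarding was already enabled, the policy stays 'accept'.
    If the daemon had to enable it itself, the policy is 'drop'.
    """
    return "accept" if enabled_before_daemon else "drop"
```

On a Linux host, `filter_forward_policy(forwarding_enabled(IPV4_FORWARD))` gives the policy this model predicts, provided the sysctl is read before the daemon has changed it.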
#### Per-network filter-FORWARD rules

Chains added for the default bridge network are named after the base chain
hook they're called from and the network's bridge device, for example
`filter-forward-in__docker0`.

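The naming pattern can be written down as two small string helpers. This is inferred from the names visible in the ruleset listing, not taken from the daemon's source, and the function names are hypothetical.

```python
def per_network_chain(table_type: str, hook: str, direction: str, bridge: str) -> str:
    """Build a per-network chain name such as 'filter-forward-in__docker0'."""
    return f"{table_type}-{hook}-{direction}__{bridge}"


def jump_map(table_type: str, hook: str, direction: str) -> str:
    """Build the name of the verdict map that jumps to those chains."""
    return f"{table_type}-{hook}-{direction}-jumps"


print(per_network_chain("filter", "forward", "in", "docker0"))
print(jump_map("nat", "postrouting", "out"))
```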
Packets processed by `filter-forward-in__*` will be delivered to the bridge
network if accepted. For docker0, the chain is:

    chain filter-forward-in__docker0 {
        ct state established,related counter accept
        iifname "docker0" counter accept comment "ICC"
        counter drop comment "UNPUBLISHED PORT DROP"
    }


The rules are:
- conntrack accept for established flows. _Note that accept only applies to the
  base chain; accepted packets may still be processed by other base chains
  registered with the same hook._
- accept packets originating within the network, because inter-container
  communication (ICC) is enabled.
- drop any other packets, because there are no containers in the network
  with published ports. _This means there is no dependency on the filter-FORWARD
  chain's default policy. Even if it is `accept`, packets will be dropped unless
  container ports/protocols are published._

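The three rules are evaluated in order and the first one that matches decides the verdict. A minimal Python model of that first-match behaviour, with packets reduced to a dict holding only the two fields these rules inspect (the `evaluate` helper and the packet representation are assumptions made for illustration):

```python
# Each rule: (match predicate, verdict, comment). The first matching rule wins.
FILTER_FORWARD_IN_DOCKER0 = [
    (lambda p: p["ct_state"] in ("established", "related"), "accept", ""),
    (lambda p: p["iifname"] == "docker0", "accept", "ICC"),
    (lambda p: True, "drop", "UNPUBLISHED PORT DROP"),
]


def evaluate(chain, packet):
    """Return (verdict, comment) from the first rule that matches the packet."""
    for matches, verdict, comment in chain:
        if matches(packet):
            return verdict, comment
    return None, ""  # no rule matched: evaluation falls off the end of the chain


# A new connection arriving from outside the bridge hits the final drop rule.
print(evaluate(FILTER_FORWARD_IN_DOCKER0, {"ct_state": "new", "iifname": "eth0"}))
# A new connection between containers on docker0 is accepted by the ICC rule.
print(evaluate(FILTER_FORWARD_IN_DOCKER0, {"ct_state": "new", "iifname": "docker0"}))
```

Because the last rule matches everything, the chain always issues a verdict, which is why the base chain's policy is never reached for packets bound for docker0.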
Packets processed by `filter-forward-out__*` originate from the bridge network:

    chain filter-forward-out__docker0 {
        ct state established,related counter accept
        counter accept comment "OUTGOING"
    }


The rules in docker0's chain are:
- conntrack accept for established flows.
- an accept rule, because containers in this network have access to external
  networks.

#### nat-POSTROUTING

Like the filter-FORWARD chain, nat-POSTROUTING uses verdict maps to jump to
per-network chains for packets to and from the network.

    chain nat-POSTROUTING {
        type nat hook postrouting priority srcnat; policy accept;
        iifname vmap @nat-postrouting-out-jumps
        oifname vmap @nat-postrouting-in-jumps
    }


#### Per-network nat-POSTROUTING rules

In docker0's nat-postrouting chains, there's a single masquerade rule for packets
leaving the network:

    chain nat-postrouting-in__docker0 {
    }

    chain nat-postrouting-out__docker0 {
        oifname != "docker0" ip saddr 172.17.0.0/16 counter masquerade comment "MASQUERADE"
    }

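The match in the masquerade rule combines an interface test with a source-subnet test. A small Python sketch of the same predicate, using the standard `ipaddress` module (the function name `should_masquerade` and the sample addresses are illustrative only):

```python
import ipaddress

# Subnet of the default bridge network, as shown in the rule above.
DOCKER0_SUBNET = ipaddress.ip_network("172.17.0.0/16")


def should_masquerade(oifname: str, saddr: str) -> bool:
    """Mirror: oifname != "docker0" ip saddr 172.17.0.0/16 ... masquerade."""
    return oifname != "docker0" and ipaddress.ip_address(saddr) in DOCKER0_SUBNET


print(should_masquerade("eth0", "172.17.0.2"))     # container traffic leaving the host
print(should_masquerade("docker0", "172.17.0.2"))  # traffic staying on the bridge
```

Traffic that stays on docker0 keeps its source address. Only packets from the bridge subnet that leave through another interface are masqueraded.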
Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
## Containers on user-defined --internal networks

These are the rules for two containers on different `--internal` networks, with and
without inter-container communication (ICC).

Equivalent to:

    docker network create \
      -o com.docker.network.bridge.name=bridgeICC \
      --internal \
      --subnet 192.0.2.0/24 --gateway 192.0.2.1 bridge1
    docker run --network bridge1 --name c1 busybox

    docker network create \
      -o com.docker.network.bridge.name=bridgeNoICC \
      -o com.docker.network.bridge.enable_icc=false \
      --internal \
      --subnet 198.51.100.0/24 --gateway 198.51.100.1 bridge2
    docker run --network bridge2 --name c2 busybox

Most rules are the same as for the network with [external access][0]:

<details>
<summary>Full table ...</summary>

    table ip docker-bridges {
        map filter-forward-in-jumps {
            type ifname : verdict
            elements = { "docker0" : jump filter-forward-in__docker0,
                         "bridgeICC" : jump filter-forward-in__bridgeICC,
                         "bridgeNoICC" : jump filter-forward-in__bridgeNoICC }
        }

        map filter-forward-out-jumps {
            type ifname : verdict
            elements = { "docker0" : jump filter-forward-out__docker0,
                         "bridgeICC" : jump filter-forward-out__bridgeICC,
                         "bridgeNoICC" : jump filter-forward-out__bridgeNoICC }
        }

        map nat-postrouting-in-jumps {
            type ifname : verdict
            elements = { "docker0" : jump nat-postrouting-in__docker0,
                         "bridgeICC" : jump nat-postrouting-in__bridgeICC,
                         "bridgeNoICC" : jump nat-postrouting-in__bridgeNoICC }
        }

        map nat-postrouting-out-jumps {
            type ifname : verdict
            elements = { "docker0" : jump nat-postrouting-out__docker0,
                         "bridgeICC" : jump nat-postrouting-out__bridgeICC,
                         "bridgeNoICC" : jump nat-postrouting-out__bridgeNoICC }
        }

        chain filter-FORWARD {
            type filter hook forward priority filter; policy accept;
            oifname vmap @filter-forward-in-jumps
            iifname vmap @filter-forward-out-jumps
        }

        chain nat-OUTPUT {
            type nat hook output priority -100; policy accept;
            ip daddr != 127.0.0.0/8 fib daddr type local counter jump nat-prerouting-and-output
        }

        chain nat-POSTROUTING {
            type nat hook postrouting priority srcnat; policy accept;
            iifname vmap @nat-postrouting-out-jumps
            oifname vmap @nat-postrouting-in-jumps
        }

        chain nat-PREROUTING {
            type nat hook prerouting priority dstnat; policy accept;
            fib daddr type local counter jump nat-prerouting-and-output
        }

        chain nat-prerouting-and-output {
        }

        chain raw-PREROUTING {
            type filter hook prerouting priority raw; policy accept;
        }

        chain filter-forward-in__docker0 {
            ct state established,related counter accept
            iifname "docker0" counter accept comment "ICC"
            counter drop comment "UNPUBLISHED PORT DROP"
        }

        chain filter-forward-out__docker0 {
            ct state established,related counter accept
            counter accept comment "OUTGOING"
        }

        chain nat-postrouting-in__docker0 {
        }

        chain nat-postrouting-out__docker0 {
            oifname != "docker0" ip saddr 172.17.0.0/16 counter masquerade comment "MASQUERADE"
        }

        chain filter-forward-in__bridgeICC {
            ct state established,related counter accept
            iifname != "bridgeICC" counter drop comment "INTERNAL NETWORK INGRESS"
            counter accept comment "ICC"
        }

        chain filter-forward-out__bridgeICC {
            ct state established,related counter accept
            oifname != "bridgeICC" counter drop comment "INTERNAL NETWORK EGRESS"
        }

        chain nat-postrouting-in__bridgeICC {
        }

        chain nat-postrouting-out__bridgeICC {
        }

        chain filter-forward-in__bridgeNoICC {
            ct state established,related counter accept
            iifname != "bridgeNoICC" counter drop comment "INTERNAL NETWORK INGRESS"
            counter drop comment "ICC"
        }

        chain filter-forward-out__bridgeNoICC {
            ct state established,related counter accept
            oifname != "bridgeNoICC" counter drop comment "INTERNAL NETWORK EGRESS"
        }

        chain nat-postrouting-in__bridgeNoICC {
        }

        chain nat-postrouting-out__bridgeNoICC {
        }
    }


</details>

Each filter-forward-in chain has a rule that drops packets originating outside
the network. With ICC disabled, the final verdict is also `drop` rather than
`accept`:

    chain filter-forward-in__bridgeICC {
        ct state established,related counter accept
        iifname != "bridgeICC" counter drop comment "INTERNAL NETWORK INGRESS"
        counter accept comment "ICC"
    }

    chain filter-forward-in__bridgeNoICC {
        ct state established,related counter accept
        iifname != "bridgeNoICC" counter drop comment "INTERNAL NETWORK INGRESS"
        counter drop comment "ICC"
    }


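The two chains differ only in their last rule. A short Python model makes the comparison explicit. The function `internal_forward_in` and its parameters are hypothetical names chosen for this sketch, and packets are reduced to their conntrack state and input interface.

```python
def internal_forward_in(bridge: str, icc_enabled: bool, ct_state: str, iifname: str) -> str:
    """Model the filter-forward-in chain of an --internal network."""
    # ct state established,related counter accept
    if ct_state in ("established", "related"):
        return "accept"
    # iifname != "<bridge>" counter drop comment "INTERNAL NETWORK INGRESS"
    if iifname != bridge:
        return "drop"
    # Final rule, commented "ICC": accept when ICC is enabled, drop otherwise.
    return "accept" if icc_enabled else "drop"


# New connection between two containers on the same internal network.
print(internal_forward_in("bridgeICC", True, "new", "bridgeICC"))
print(internal_forward_in("bridgeNoICC", False, "new", "bridgeNoICC"))
# New connection from outside the network is dropped in both cases.
print(internal_forward_in("bridgeICC", True, "new", "eth0"))
```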
The nat-postrouting-out chains have no masquerade rules:

    chain nat-postrouting-out__bridgeICC {
    }

    chain nat-postrouting-out__bridgeNoICC {
    }


[0]: usernet-portmap.md