### `nftables` proxy mode {#proxy-mode-nftables}

{{< feature-state feature_gate_name="NFTablesProxyMode" >}}

_This proxy mode is only available on Linux nodes, and requires kernel
5.13 or later._

In this mode, kube-proxy configures packet forwarding rules using the
nftables API of the kernel netfilter subsystem. For each endpoint, it
installs nftables rules which, by default, select a backend Pod at
random.

The nftables API is the successor to the iptables API and is designed
to provide better performance and scalability than iptables. The
`nftables` proxy mode is able to process changes to service endpoints
faster and more efficiently than the `iptables` mode, and is also able
to more efficiently process packets in the kernel (though this only
becomes noticeable in clusters with tens of thousands of services).

As of Kubernetes {{< skew currentVersion >}}, the `nftables` mode is
still relatively new, and may not be compatible with all network
plugins; consult the documentation for your network plugin.

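The proxy mode is usually selected through kube-proxy's configuration file. As a minimal sketch (assuming the `kubeproxy.config.k8s.io/v1alpha1` API group and the `mode` field; check the kube-proxy configuration reference for your Kubernetes version), selecting this mode might look like:

```yaml
# Illustrative KubeProxyConfiguration fragment selecting the nftables
# proxy mode. Field names assume the v1alpha1 kube-proxy config API.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "nftables"
```

Passing this file to kube-proxy via `--config` is generally preferred over individual command-line flags, since the config file covers all of the tunables discussed below.
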
#### Migrating from `iptables` mode to `nftables`

Users who want to switch from the default `iptables` mode to the
`nftables` mode should be aware that some features work slightly
differently in the `nftables` mode:

- **NodePort interfaces**: In `iptables` mode, by default,
  [NodePort services](/docs/concepts/services-networking/service/#type-nodeport)
  are reachable on all local IP addresses. This is usually not what
  users want, so the `nftables` mode defaults to
  `--nodeport-addresses primary`, meaning NodePort services are only
  reachable on the node's primary IPv4 and/or IPv6 addresses. You can
  override this by specifying an explicit value for that option:
  e.g., `--nodeport-addresses 0.0.0.0/0` to listen on all (local)
  IPv4 IPs.

- **NodePort services on `127.0.0.1`**: In `iptables` mode, if the
  `--nodeport-addresses` range includes `127.0.0.1` (and the
  `--iptables-localhost-nodeports false` option is not passed), then
  NodePort services are reachable even on "localhost" (`127.0.0.1`).
  In `nftables` mode (and `ipvs` mode), this will not work. If you
  are not sure if you are depending on this functionality, you can
  check kube-proxy's
  `iptables_localhost_nodeports_accepted_packets_total` metric; if it
  is non-zero, that means that some client has connected to a NodePort
  service via `127.0.0.1`.

- **NodePort interaction with firewalls**: The `iptables` mode of
  kube-proxy tries to be compatible with overly-aggressive firewalls;
  for each NodePort service, it will add rules to accept inbound
  traffic on that port, in case that traffic would otherwise be
  blocked by a firewall. This approach will not work with firewalls
  based on nftables, so kube-proxy's `nftables` mode does not do
  anything here; if you have a local firewall, you must ensure that
  it is properly configured to allow Kubernetes traffic through
  (e.g., by allowing inbound traffic on the entire NodePort range).

- **Conntrack bug workarounds**: Linux kernels prior to 6.1 have a
  bug that can result in long-lived TCP connections to service IPs
  being closed with the error "Connection reset by peer". The
  `iptables` mode of kube-proxy installs a workaround for this bug,
  but this workaround was later found to cause other problems in some
  clusters. The `nftables` mode does not install any workaround by
  default, but you can check kube-proxy's
  `iptables_ct_state_invalid_dropped_packets_total` metric to see if
  your cluster is depending on the workaround, and if so, you can run
  kube-proxy with the option `--conntrack-tcp-be-liberal` to work
  around the problem in `nftables` mode.

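The flag-based settings above also have counterparts in kube-proxy's configuration file. As a hedged sketch (the field names `nodePortAddresses` and `conntrack.tcpBeLiberal`, and the special `"primary"` value, are assumptions from the `v1alpha1` kube-proxy config API and may not exist in older releases; verify against your version's reference), the migration-related options discussed above might be expressed as:

```yaml
# Illustrative fragment only; field availability varies by release.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "nftables"
# Counterpart of "--nodeport-addresses primary" (the nftables-mode default).
nodePortAddresses: ["primary"]
conntrack:
  # Counterpart of "--conntrack-tcp-be-liberal": retain the workaround
  # behavior for the pre-6.1-kernel connection-reset bug.
  tcpBeLiberal: true
```
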
### `kernelspace` proxy mode {#proxy-mode-kernelspace}