
Commit 5170f4a

danwinship and Tim Bannister committed
Update iptables perf discussion for 1.27
Co-authored-by: Tim Bannister <[email protected]>
1 parent e2526aa commit 5170f4a

1 file changed: +28 -32 lines changed


content/en/docs/reference/networking/virtual-ips.md

Lines changed: 28 additions & 32 deletions
@@ -131,6 +131,26 @@ iptables:
 ...
 ```
 
+##### Performance optimization for `iptables` mode {#minimize-iptables-restore}
+
+{{< feature-state for_k8s_version="v1.27" state="beta" >}}
+
+In Kubernetes {{< skew currentVersion >}} the kube-proxy defaults to a minimal approach
+to `iptables-restore` operations, only making updates where Services or EndpointSlices have
+actually changed. This is a performance optimization.
+The original implementation updated all the rules for all Services on every sync; this
+sometimes led to performance issues (update lag) in large clusters.
+
+If you are not running kube-proxy from Kubernetes {{< skew currentVersion >}}, check
+the behavior and associated advice for the version that you are actually running.
+
+If you were previously overriding `minSyncPeriod`, you should try
+removing that override and letting kube-proxy use the default value
+(`1s`) or at least a smaller value than you were using before upgrading.
+You can select the legacy behavior by disabling the `MinimizeIPTablesRestore`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
+(you should not need to).
+
 ##### `minSyncPeriod`
 
 The `minSyncPeriod` parameter sets the minimum duration between
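The section added above is configuration advice for kube-proxy. As a rough sketch that is not part of this commit, a `KubeProxyConfiguration` following that advice (leaving `minSyncPeriod` at its default, and disabling the feature gate only if you really need the legacy behavior) might look like the following; field names come from the `kubeproxy.config.k8s.io/v1alpha1` API, and the values shown are illustrative:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "iptables"
iptables:
  # Default value; remove any larger override carried over from
  # earlier releases, as the added text recommends.
  minSyncPeriod: 1s
featureGates:
  # Only set this to select the legacy full-restore behavior;
  # the added text notes you should not need to.
  MinimizeIPTablesRestore: false
```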
@@ -142,7 +162,7 @@ things change in a small time period. For example, if you have a
 Service backed by a {{< glossary_tooltip term_id="deployment" text="Deployment" >}}
 with 100 pods, and you delete the
 Deployment, then with `minSyncPeriod: 0s`, kube-proxy would end up
-removing the Service's Endpoints from the iptables rules one by one,
+removing the Service's endpoints from the iptables rules one by one,
 for a total of 100 updates. With a larger `minSyncPeriod`, multiple
 Pod deletion events would get aggregated
 together, so kube-proxy might
@@ -154,20 +174,19 @@ The larger the value of `minSyncPeriod`, the more work that can be
 aggregated, but the downside is that each individual change may end up
 waiting up to the full `minSyncPeriod` before being processed, meaning
 that the iptables rules spend more time being out-of-sync with the
-current apiserver state.
+current API server state.
 
-The default value of `1s` is a good compromise for small and medium
-clusters. In large clusters, it may be necessary to set it to a larger
-value. (Especially, if kube-proxy's
-`sync_proxy_rules_duration_seconds` metric indicates an average
-time much larger than 1 second, then bumping up `minSyncPeriod` may
-make updates more efficient.)
+The default value of `1s` should work well in most clusters, but in very
+large clusters it may be necessary to set it to a larger value.
+Especially, if kube-proxy's `sync_proxy_rules_duration_seconds` metric
+indicates an average time much larger than 1 second, then bumping up
+`minSyncPeriod` may make updates more efficient.
 
 ##### `syncPeriod`
 
 The `syncPeriod` parameter controls a handful of synchronization
 operations that are not directly related to changes in individual
-Services and Endpoints. In particular, it controls how quickly
+Services and EndpointSlices. In particular, it controls how quickly
 kube-proxy notices if an external component has interfered with
 kube-proxy's iptables rules. In large clusters, kube-proxy also only
 performs certain cleanup operations once every `syncPeriod` to avoid
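For the large-cluster tuning described in this hunk, a hedged sketch of raising `minSyncPeriod` (same `KubeProxyConfiguration` API as above; the durations shown are illustrative rather than values taken from the commit) could be:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "iptables"
iptables:
  # Illustrative: raise minSyncPeriod above the 1s default only if the
  # sync_proxy_rules_duration_seconds metric averages well over a second.
  minSyncPeriod: 10s
  # Keep syncPeriod moderate; the updated text advises against very
  # large values such as 1h.
  syncPeriod: 30s
```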
@@ -178,29 +197,6 @@ impact on performance, but in the past, it was sometimes useful to set
 it to a very large value (eg, `1h`). This is no longer recommended,
 and is likely to hurt functionality more than it improves performance.
 
-##### Experimental performance improvements {#minimize-iptables-restore}
-
-{{< feature-state for_k8s_version="v1.26" state="alpha" >}}
-
-In Kubernetes 1.26, some new performance improvements were made to the
-iptables proxy mode, but they are not enabled by default (and should
-probably not be enabled in production clusters yet). To try them out,
-enable the `MinimizeIPTablesRestore` [feature
-gate](/docs/reference/command-line-tools-reference/feature-gates/) for
-kube-proxy with `--feature-gates=MinimizeIPTablesRestore=true,…`.
-
-If you enable that feature gate and
-you were previously overriding
-`minSyncPeriod`, you should try removing that override and letting
-kube-proxy use the default value (`1s`) or at least a smaller value
-than you were using before.
-
-If you notice kube-proxy's
-`sync_proxy_rules_iptables_restore_failures_total` or
-`sync_proxy_rules_iptables_partial_restore_failures_total` metrics
-increasing after enabling this feature, that likely indicates you are
-encountering bugs in the feature and you should file a bug report.
-
 ### IPVS proxy mode {#proxy-mode-ipvs}
 
 In `ipvs` mode, kube-proxy watches Kubernetes Services and EndpointSlices,


Comments (0)