Commit a298911

Merge pull request #3690 from danwinship/kep-3178-to-beta
KEP 3178: update for beta
2 parents 8c132a3 + 1a27d4f commit a298911

3 files changed: +51 -14 lines changed

keps/prod-readiness/sig-network/3178.yaml

Lines changed: 2 additions & 0 deletions
@@ -4,3 +4,5 @@
 kep-number: 3178
 alpha:
   approver: "@johnbelamaric"
+beta:
+  approver: "@johnbelamaric"

keps/sig-network/3178-iptables-cleanup/README.md

Lines changed: 47 additions & 12 deletions
@@ -552,19 +552,47 @@ This section must be completed when targeting beta to a release.
 
 The most likely cause of a rollout failure would be a third-party
 component that depended on one of the no-longer-existing IPTables
-chains. It is impossible to predict exactly how this third-party
-component would fail in this case, but it would likely impact already
-running workloads.
+chains; most likely this would be a CNI plugin (either the default
+network plugin or a chained plugin) or some other networking-related
+component (NetworkPolicy implementation, service mesh, etc).
+
+It is impossible to predict exactly how this third-party component
+would fail in this case, but it would likely impact already running
+workloads.
 
 ###### What specific metrics should inform a rollback?
 
-Any failures would be the result of third-party components being
-incompatible with the change, so no core Kubernetes metrics are likely
-to be relevant.
+If the default network plugin (or plugin chain) depends on the missing
+iptables chains, it is possible that all `CNI_ADD` calls would fail
+and it would become impossible to start new pods, in which case
+kubelet's `started_pods_errors_total` would start to climb. However,
+"impossible to start new pods" would likely be noticed quickly without
+metrics anyway...
+
+For the most part, since failures would likely be in third-party
+components, it would be the metrics of those third-party components
+that would be relevant to diagnosing the problem. Since the problem is
+likely to manifest in the form of iptables calls failing because they
+reference non-existent chains, a metric for "number of iptables
+errors" or "time since last successful iptables update" might be
+useful in diagnosing problems related to this feature. (However, it is
+also quite possible that the third-party components in question would
+have no relevant metrics, and errors would be exposed only via log
+messages.)
 
 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
 
-TBD
+Yes.
+
+When considering only core kubernetes components, and only
+alpha-or-later releases, upgrade/downgrade and enablement/disablement
+create no additional complications beyond clean installs; kube-proxy
+simply doesn't care about the additional rules that kubelet may or may
+not be creating any more.
+
+Upgrades from pre-alpha to beta-or-later or downgrades from
+beta-or-later to pre-alpha are not supported, and for this reason we
+waited 2 releases after alpha to go to beta.
 
 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
 
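
As a rough illustration of the rollback signal described in the hunk above, the following Python sketch compares two scrapes of a kubelet metrics endpoint and reports whether the pod-start error counter increased. It assumes the counter is exposed in Prometheus text format under the full name `kubelet_started_pods_errors_total` (the diff gives only `started_pods_errors_total`). The function names and any sample scrape text are hypothetical and not part of the KEP.

```python
"""Sketch: detect a climbing pod-start error counter between two scrapes."""

# Assumed full metric name; the KEP text only says `started_pods_errors_total`.
METRIC = "kubelet_started_pods_errors_total"


def counter_total(exposition: str, metric: str = METRIC) -> float:
    """Sum every sample of `metric` found in Prometheus text-format output."""
    total = 0.0
    for raw in exposition.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # name{labels} value [timestamp]
            name = line.split("{", 1)[0]
            rest = line.rsplit("}", 1)[1]
        else:
            # name value [timestamp]
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if name == metric and fields:
            total += float(fields[0])
    return total


def errors_climbing(before: str, after: str, metric: str = METRIC,
                    threshold: float = 0.0) -> bool:
    """Return True if the counter grew by more than `threshold` between scrapes."""
    return counter_total(after, metric) - counter_total(before, metric) > threshold
```

A caller would fetch the two scrapes some interval apart (how the kubelet endpoint is reached and authenticated is cluster-specific and left out here) and treat a `True` result as a prompt to inspect CNI plugin logs, not as proof that this feature is the cause.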

@@ -579,16 +607,14 @@ This section must be completed when targeting beta to a release.
 
 ###### How can an operator determine if the feature is in use by workloads?
 
-There is no simple way to do this because if the feature is working
-correctly there will be no difference in externally-visible behavior.
-(The generated iptables rules will be different, but the _effect_ of
-the generated iptables rules will be the same.)
+The feature is not "used by workloads"; when enabled, it is always in
+effect and affects the cluster as a whole.
 
 ###### How can someone using this feature know that it is working for their instance?
 
 - [X] Other (treat as last resort)
 
-- Details: As above, the feature is not supposed to have any
+- Details: The feature is not supposed to have any
 externally-visible effect. If anything is not working, it is
 likely to be a third-party component, so it is impossible to say
 what a failure might look like.
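
Since the failure mode discussed above is a third-party component referencing chains that no longer exist, one hedged way for an operator to check a node is to list the chains declared in `iptables-save` output and compare them against whatever chains a component is known to reference. The Python sketch below does only the parsing and comparison. The chain names in any example input are placeholders, as this diff does not name the affected chains, and the function names are hypothetical.

```python
"""Sketch: find chains a component expects that are absent from iptables-save output."""


def chains_by_table(iptables_save_output: str) -> dict:
    """Map each table name to the set of chain names it declares.

    Relies on the iptables-save layout: `*table` opens a table,
    `:CHAIN POLICY [packets:bytes]` declares a chain, `COMMIT` closes the table.
    """
    tables = {}
    current = None
    for raw in iptables_save_output.splitlines():
        line = raw.strip()
        if line.startswith("*"):
            current = line[1:]
            tables[current] = set()
        elif line.startswith(":") and current is not None:
            parts = line[1:].split()
            if parts:
                tables[current].add(parts[0])
        elif line == "COMMIT":
            current = None
    return tables


def missing_chains(iptables_save_output: str, required) -> list:
    """Return the sorted (table, chain) pairs from `required` that are not declared."""
    tables = chains_by_table(iptables_save_output)
    return sorted(
        (table, chain)
        for table, chain in required
        if chain not in tables.get(table, set())
    )
```

The `required` list has to come from the third-party component's own documentation or source; an empty result means only that the named chains exist, not that the component is otherwise compatible.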
@@ -643,6 +669,10 @@ No
 
 No
 
+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+No
+
 ### Troubleshooting
 
 <!--
@@ -684,7 +714,12 @@ Major milestones might include:
 
 - Initial proposal: 2022-01-23
 - Updated: 2022-03-27, 2022-04-29
+- Merged as `implementable`: 2022-06-10
 - Updated: 2022-07-26 (feature gate rename)
+- Alpha release (1.25): 2022-08-23
+- [Blog post about upcoming changes]: 2022-09-07
+
+[Blog post about upcoming changes]: https://kubernetes.io/blog/2022/09/07/iptables-chains-not-api/
 
 ## Drawbacks
 

keps/sig-network/3178-iptables-cleanup/kep.yaml

Lines changed: 2 additions & 2 deletions
@@ -16,12 +16,12 @@ see-also:
 - "/keps/sig-node/2221-remove-dockershim"
 
 # The target maturity stage in the current dev cycle for this KEP.
-stage: alpha
+stage: beta
 
 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.25"
+latest-milestone: "v1.27"
 
 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
