@@ -552,19 +552,47 @@ This section must be completed when targeting beta to a release.
The most likely cause of a rollout failure would be a third-party
component that depended on one of the no-longer-existing IPTables
- chains. It is impossible to predict exactly how this third-party
- component would fail in this case, but it would likely impact already
- running workloads.
+ chains; most likely this would be a CNI plugin (either the default
+ network plugin or a chained plugin) or some other networking-related
+ component (NetworkPolicy implementation, service mesh, etc.).
+
+ It is impossible to predict exactly how this third-party component
+ would fail in this case, but it would likely impact already running
+ workloads.

###### What specific metrics should inform a rollback?

- Any failures would be the result of third-party components being
- incompatible with the change, so no core Kubernetes metrics are likely
- to be relevant.
+ If the default network plugin (or plugin chain) depends on the missing
+ iptables chains, it is possible that all `CNI_ADD` calls would fail
+ and it would become impossible to start new pods, in which case
+ kubelet's `started_pods_errors_total` would start to climb. However,
+ "impossible to start new pods" would likely be noticed quickly without
+ metrics anyway...
+
+ For the most part, since failures would likely be in third-party
+ components, it would be the metrics of those third-party components
+ that would be relevant to diagnosing the problem. Since the problem is
+ likely to manifest in the form of iptables calls failing because they
+ reference non-existent chains, a metric for "number of iptables
+ errors" or "time since last successful iptables update" might be
+ useful in diagnosing problems related to this feature. (However, it is
+ also quite possible that the third-party components in question would
+ have no relevant metrics, and errors would be exposed only via log
+ messages.)
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

- TBD
+ Yes.
+
+ When considering only core Kubernetes components, and only
+ alpha-or-later releases, upgrade/downgrade and enablement/disablement
+ create no additional complications beyond clean installs; kube-proxy
+ simply doesn't care about the additional rules that kubelet may or may
+ not be creating any more.
+
+ Upgrades from pre-alpha to beta-or-later or downgrades from
+ beta-or-later to pre-alpha are not supported, and for this reason we
+ waited two releases after alpha to go to beta.

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
@@ -579,16 +607,14 @@ This section must be completed when targeting beta to a release.
###### How can an operator determine if the feature is in use by workloads?

- There is no simple way to do this because if the feature is working
- correctly there will be no difference in externally-visible behavior.
- (The generated iptables rules will be different, but the _effect_ of
- the generated iptables rules will be the same.)
+ The feature is not "used by workloads"; when enabled, it is always in
+ effect and affects the cluster as a whole.

###### How can someone using this feature know that it is working for their instance?

  - [X] Other (treat as last resort)

- - Details: As above, the feature is not supposed to have any
+ - Details: The feature is not supposed to have any
  externally-visible effect. If anything is not working, it is
  likely to be a third-party component, so it is impossible to say
  what a failure might look like.
@@ -643,6 +669,10 @@

No

+ ###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+ No
+
### Troubleshooting

<!--
@@ -684,7 +714,12 @@ Major milestones might include:

  - Initial proposal: 2022-01-23
  - Updated: 2022-03-27, 2022-04-29
+ - Merged as `implementable`: 2022-06-10
  - Updated: 2022-07-26 (feature gate rename)
+ - Alpha release (1.25): 2022-08-23
+ - [Blog post about upcoming changes]: 2022-09-07
+
+ [Blog post about upcoming changes]: https://kubernetes.io/blog/2022/09/07/iptables-chains-not-api/

## Drawbacks