keps/sig-windows/5100-windows-dsr-and-overlay-support/README.md
Lines changed: 50 additions & 7 deletions
@@ -478,6 +478,11 @@ enhancement:
 cluster required to make on upgrade, in order to make use of the enhancement?
 -->

+For DSR, `--enable-dsr=true` must be passed as a kube-proxy command line switch to enable the functionality.
+This means that the upgrade/downgrade strategy is to redeploy kube-proxy with the appropriate configuration.
+
+For overlay networking mode, the entire cluster must be configured for overlay networking, so it is not possible to upgrade or downgrade this functionality on a per-node basis.
+
 ### Version Skew Strategy

 <!--
@@ -493,6 +498,8 @@ enhancement:
 CRI or CNI may require updating that component before the kubelet.
 -->

+N/A - As long as all nodes are configured for overlay networking mode, no version skew strategy is required since networking APIs are not changing.
+
 ## Production Readiness Review Questionnaire

 <!--
@@ -535,15 +542,31 @@ well as the [existing list] of feature gates.
-- [ ] Feature gate (also fill in values in `kep.yaml`)
-  - Feature gate name:
-  - Components depending on the feature gate:
-- [ ] Other
+For DSR support:
+
+- [x] Feature gate (also fill in values in `kep.yaml`)
+  - Feature gate name: WinDSR
+  - Components depending on the feature gate: kube-proxy
+- [x] Other
+  - Describe the mechanism: DSR is enabled by passing `--enable-dsr=true` as a command line switch to the Windows kube-proxy.
+  - Will enabling / disabling the feature require downtime of the control
+    plane? No.
+  - Will enabling / disabling the feature require downtime or reprovisioning
+    of a node? Yes, there will be a brief period during which network traffic might not be routed correctly while kube-proxy restarts.
+    Kube-proxy rules will be re-synced with or without DSR support when kube-proxy starts up.
+    Nodes that handle network traffic should be drained before toggling DSR support.
+
+For overlay networking mode:
+
+- [x] Feature gate (also fill in values in `kep.yaml`)
+  - Feature gate name: WinOverlay
+  - Components depending on the feature gate: kube-proxy
+- [x] Other
   - Describe the mechanism:
   - Will enabling / disabling the feature require downtime of the control
-    plane?
+    plane? Yes and no. The HNS network used by kube-proxy must be re-created with the correct type before kube-proxy starts, which can disrupt network traffic. In addition, all nodes in a cluster must use the same network type, so it is not possible to switch between overlay and bridge networking on a per-node basis.
   - Will enabling / disabling the feature require downtime or reprovisioning
-    of a node?
+    of a node? See above.

 ###### Does enabling the feature change any default behavior?

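As a rough illustration of the DSR mechanism described in the hunk above, the following Python sketch assembles the kube-proxy switches for a node with DSR on or off. The helper name `kube_proxy_dsr_args` is hypothetical and not part of kube-proxy. The `--enable-dsr` switch and the `WinDSR` gate name come from the questionnaire answers, and passing the gate explicitly through `--feature-gates` is an assumption about how a deployment might be configured.

```python
from typing import List


def kube_proxy_dsr_args(enable_dsr: bool) -> List[str]:
    """Return the kube-proxy switches that toggle DSR (hypothetical helper)."""
    # The switch named in the questionnaire: --enable-dsr=true / --enable-dsr=false.
    args = ["--enable-dsr=" + ("true" if enable_dsr else "false")]
    if enable_dsr:
        # The questionnaire names WinDSR as the gate kube-proxy depends on;
        # setting it explicitly here is an assumption for illustration.
        args.append("--feature-gates=WinDSR=true")
    return args


if __name__ == "__main__":
    print(" ".join(["kube-proxy"] + kube_proxy_dsr_args(True)))
```

Because the switches only take effect when kube-proxy restarts, a deployment tool would presumably drain the node first, as the answer above recommends, and then redeploy kube-proxy with the new argument list.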
@@ -554,7 +577,7 @@ automations, so be extremely careful here.

 No.
 For DSR, `--enable-dsr=true` must be passed as a kube-proxy command line switch to enable the functionality.
-For Overlay networking mode, the
+For overlay networking support, behavior changes only occur if the HNS network used by kube-proxy is of type `Overlay`, which would only be done intentionally as part of joining nodes to a cluster.

 ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

@@ -569,8 +592,14 @@ feature.
 NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
 -->

+For DSR, yes. DSR can be disabled by passing `--enable-dsr=false` as a kube-proxy command line switch and restarting kube-proxy.
+
+For overlay, no. All nodes in a cluster must use the same network type, so it is not possible to switch between overlay and bridge networking on a per-node basis.
+
 ###### What happens if we reenable the feature if it was previously rolled back?

+For DSR, kube-proxy should resync HNS rules and start using DSR again.
+
 ###### Are there any tests for feature enablement/disablement?

 <!--
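The constraint stated in the hunk above, that every node in a cluster must use the same network type, can be pictured as a simple consistency check. This is a minimal Python sketch under stated assumptions: the function name `network_type_mismatches` and the input mapping of node name to HNS network type are hypothetical, and how those types would be collected from each node is outside the scope of the sketch. The type string `L2Bridge` is assumed here as the bridge-mode counterpart to the `Overlay` type named in the KEP.

```python
from collections import Counter
from typing import Dict


def network_type_mismatches(node_network_types: Dict[str, str]) -> Dict[str, str]:
    """Return the nodes whose network type differs from the most common one.

    node_network_types maps a node name to its HNS network type
    (hypothetical input; gathering it is not shown here).
    """
    if not node_network_types:
        return {}
    # Treat the most common type as the cluster-wide setting.
    majority, _ = Counter(node_network_types.values()).most_common(1)[0]
    return {
        name: net_type
        for name, net_type in node_network_types.items()
        if net_type != majority
    }


if __name__ == "__main__":
    cluster = {"win-1": "Overlay", "win-2": "Overlay", "win-3": "L2Bridge"}
    print(network_type_mismatches(cluster))
```

An empty result means the cluster is uniform. Any entry in the result points at a node that would need to be reconfigured before it could carry traffic alongside the others.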
@@ -586,6 +615,10 @@ You can take a look at one potential example of such test in:
+For overlay, no, because the feature requires the cluster to be configured for overlay networking mode and cannot be enabled on a per-node basis.
+
+For DSR, no, but they can be added.
+
 ### Rollout, Upgrade and Rollback Planning

 <!--
@@ -604,13 +637,19 @@ rollout. Similarly, consider large clusters and how enablement/disablement
 will rollout across nodes.
 -->

+For DSR, a rollout or rollback should not fail. Nodes in a cluster can operate with DSR enabled or disabled on a per-node basis.
+
+For overlay networking mode support, a rollout can fail if the CNI configuration for the node and the kube-proxy configuration are not in sync. This would cause nodes to never go into the Ready state.
+
 ###### What specific metrics should inform a rollback?

 <!--
 What signals should users be paying attention to when the feature is young
 that might indicate a serious problem?
 -->

+Node ready state should be monitored to ensure nodes join the cluster and are properly configured to start running pods.
+
 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

 <!--
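The rollback signal named in the hunk above, node ready state, could be checked along the following lines. This Python sketch assumes its input is the `items` list from `kubectl get nodes -o json`, where each node carries `status.conditions` entries with a `type` and a `status` field. The function name `not_ready_nodes` is hypothetical, and the sketch is illustrative rather than the monitoring the KEP authors have in mind.

```python
from typing import Any, Dict, List


def not_ready_nodes(nodes: List[Dict[str, Any]]) -> List[str]:
    """Return the names of nodes that do not report a Ready=True condition."""
    names = []
    for node in nodes:
        conditions = node.get("status", {}).get("conditions", [])
        # A node counts as ready only if its Ready condition is exactly "True";
        # "False", "Unknown", or a missing condition all count as not ready.
        is_ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not is_ready:
            names.append(node["metadata"]["name"])
    return names


if __name__ == "__main__":
    sample = [
        {"metadata": {"name": "win-1"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "win-2"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]
    print(not_ready_nodes(sample))
```

Nodes that stay in the returned list after a rollout would be consistent with the failure mode described above, where the CNI and kube-proxy configurations are out of sync.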
@@ -619,12 +658,16 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
 are missing a bunch of machinery and tooling and can't do that now.
 -->

+For DSR support, yes, manual verification was done to ensure that DSR can be enabled and disabled on a node.
+
 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

 <!--
 Even if applying deprecation policies, they may still surprise some users.