Commit 2ee2040

PRR updates
1 parent 71b1fd1 commit 2ee2040

File tree

1 file changed

+50
-7
lines changed
  • keps/sig-windows/5100-windows-dsr-and-overlay-support

keps/sig-windows/5100-windows-dsr-and-overlay-support/README.md

Lines changed: 50 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -478,6 +478,11 @@ enhancement:
478478
cluster required to make on upgrade, in order to make use of the enhancement?
479479
-->
480480

481+
For DSR, `--enable-dsr=true` must be passed as a kube-proxy command line switch to enable the functionality.
482+
This means that the upgrade/downgrade strategy is to redeploy kube-proxy with the appropriate configuration.
483+
484+
For overlay networking mode, the entire cluster must be configured for overlay networking, so it is not possible to upgrade or downgrade this functionality on a per-node basis.
485+
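The DSR redeploy step described above can be sketched as a small helper that builds the kube-proxy argument list for either state. This is only an illustration: the `WinDSR` gate and the `--enable-dsr` switch come from this KEP, while the `kube-proxy.exe` binary name, the helper name `kube_proxy_args`, and the choice to leave the feature gate on when DSR is switched off are assumptions.

```python
from __future__ import annotations


def kube_proxy_args(enable_dsr: bool) -> list[str]:
    """Build a kube-proxy argument list with DSR toggled on or off.

    Hypothetical helper: a real deployment would also pass its usual
    kube-proxy flags (kubeconfig, proxy mode, and so on).
    """
    value = "true" if enable_dsr else "false"
    return [
        "kube-proxy.exe",                 # assumed binary name on Windows nodes
        "--feature-gates=WinDSR=true",    # feature gate named in this KEP (assumed left on)
        f"--enable-dsr={value}",          # command line switch named in this KEP
    ]


# Upgrade: redeploy kube-proxy with DSR on. Downgrade: redeploy with it off.
print(" ".join(kube_proxy_args(True)))
print(" ".join(kube_proxy_args(False)))
```

Keeping the toggle in one place makes it easier to redeploy kube-proxy with a known configuration when rolling DSR forward or back.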
481486
### Version Skew Strategy
482487

483488
<!--
@@ -493,6 +498,8 @@ enhancement:
493498
CRI or CNI may require updating that component before the kubelet.
494499
-->
495500

501+
N/A - As long as all nodes are configured for overlay networking mode, no version skew strategy is required since networking APIs are not changing.
502+
496503
## Production Readiness Review Questionnaire
497504

498505
<!--
@@ -535,15 +542,31 @@ well as the [existing list] of feature gates.
535542
[existing list]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
536543
-->
537544

538-
- [ ] Feature gate (also fill in values in `kep.yaml`)
539-
- Feature gate name:
540-
- Components depending on the feature gate:
541-
- [ ] Other
545+
For DSR support:
546+
547+
- [x] Feature gate (also fill in values in `kep.yaml`)
548+
- Feature gate name: WinDSR
549+
- Components depending on the feature gate: kube-proxy
550+
- [x] Other
551+
- Describe the mechanism: DSR is enabled by passing `--enable-dsr=true` as a command line switch to the Windows kube-proxy.
552+
- Will enabling / disabling the feature require downtime of the control
553+
plane? No.
554+
- Will enabling / disabling the feature require downtime or reprovisioning
555+
of a node? Yes, there will be a slight period where network traffic might not be routed correctly while kube-proxy is restarted.
556+
Kube-proxy rules will be re-synced with/without DSR support when kube-proxy is starting up.
557+
Nodes that handle network traffic should be drained before toggling DSR support.
558+
559+
For overlay networking mode:
560+
561+
- [x] Feature gate (also fill in values in `kep.yaml`)
562+
- Feature gate name: WinOverlay
563+
- Components depending on the feature gate: kube-proxy
564+
- [x] Other
542565
- Describe the mechanism:
543566
- Will enabling / disabling the feature require downtime of the control
544-
plane?
567+
plane? Yes and no. The HNS network used by kube-proxy must be re-created with the correct type before kube-proxy starts, which can disrupt network traffic. In addition, all nodes in a cluster must use the same network type, so it is not possible to switch between overlay and bridge networking on a per-node basis.
545568
- Will enabling / disabling the feature require downtime or reprovisioning
546-
of a node?
569+
of a node? See above.
547570

548571
###### Does enabling the feature change any default behavior?
549572

@@ -554,7 +577,7 @@ automations, so be extremely careful here.
554577

555578
No.
556579
For DSR, `--enable-dsr=true` must be passed as a kube-proxy command line switch to enable the functionality.
557-
For Overlay networking mode, the
580+
For overlay networking support, behavior changes only occur if the HNS network used by kube-proxy is of type `Overlay`, which would only be done intentionally as part of joining nodes to a cluster.
558581

559582
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
560583

@@ -569,8 +592,14 @@ feature.
569592
NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
570593
-->
571594

595+
For DSR, yes, DSR can be disabled by passing `--enable-dsr=false` as a kube-proxy command line switch and restarting kube-proxy.
596+
597+
For overlay, no. Overlay networking mode cannot be disabled on a per-node basis: all nodes in a cluster must use the same network type, so individual nodes cannot switch between overlay and bridge networking.
598+
572599
###### What happens if we reenable the feature if it was previously rolled back?
573600

601+
For DSR, kube-proxy should resync HNS rules and start using DSR again.
602+
574603
###### Are there any tests for feature enablement/disablement?
575604

576605
<!--
@@ -586,6 +615,10 @@ You can take a look at one potential example of such test in:
586615
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
587616
-->
588617

618+
For overlay, no, because the feature requires the cluster to be configured for overlay networking mode and cannot be enabled on a per-node basis.
619+
620+
For DSR, no, but they can be added.
621+
589622
### Rollout, Upgrade and Rollback Planning
590623

591624
<!--
@@ -604,13 +637,19 @@ rollout. Similarly, consider large clusters and how enablement/disablement
604637
will rollout across nodes.
605638
-->
606639

640+
For DSR, a rollout or rollback should not fail. Nodes can operate with DSR enabled or disabled per node in a cluster.
641+
642+
For overlay networking mode support, a rollout can fail if the CNI configuration for the node and kube-proxy configuration are not in sync. This would cause nodes to never go into the Ready state.
643+
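The overlay failure mode above, where the CNI configuration and the kube-proxy configuration drift apart, can be illustrated with a pre-rollout check. This is a hedged sketch: the data layout, the field names `cni_network_type` and `kube_proxy_network_type`, and the node names are hypothetical, and how the values are collected from each node is left out.

```python
def mismatched_nodes(nodes):
    """Return the names of nodes whose CNI and kube-proxy network types differ.

    `nodes` maps a node name to a dict with two hypothetical keys describing
    the network type each component was configured for.
    """
    return sorted(
        name
        for name, cfg in nodes.items()
        if cfg["cni_network_type"] != cfg["kube_proxy_network_type"]
    )


# Illustrative inventory; "Overlay" and "L2Bridge" are HNS network types.
inventory = {
    "win-node-1": {"cni_network_type": "Overlay", "kube_proxy_network_type": "Overlay"},
    "win-node-2": {"cni_network_type": "Overlay", "kube_proxy_network_type": "L2Bridge"},
}

for name in mismatched_nodes(inventory):
    print(f"{name}: CNI and kube-proxy network types are out of sync")
```

Running a check like this before joining nodes would surface the misconfiguration earlier than waiting for a node that never becomes Ready.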
607644
###### What specific metrics should inform a rollback?
608645

609646
<!--
610647
What signals should users be paying attention to when the feature is young
611648
that might indicate a serious problem?
612649
-->
613650

651+
Node ready state should be monitored to ensure nodes join the cluster and are properly configured to start running pods.
652+
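As a rough illustration of watching node ready state, the sketch below scans a node list shaped like the output of `kubectl get nodes -o json` and reports nodes without a `Ready` condition of status `True`. The function name `not_ready_nodes` and the sample data are hypothetical; a production setup would more likely alert on this through its existing monitoring stack.

```python
def not_ready_nodes(node_list):
    """Return the names of nodes that do not report Ready=True.

    `node_list` is a dict shaped like `kubectl get nodes -o json` output:
    items[].metadata.name and items[].status.conditions[].
    """
    names = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            names.append(node["metadata"]["name"])
    return names


# Illustrative sample: one healthy node and one that never became Ready.
sample = {
    "items": [
        {"metadata": {"name": "win-node-1"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "win-node-2"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]
}

print(not_ready_nodes(sample))
```

A non-empty result shortly after a rollout is the kind of signal that would justify rolling back.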
614653
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
615654

616655
<!--
@@ -619,12 +658,16 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
619658
are missing a bunch of machinery and tooling and can't do that now.
620659
-->
621660

661+
For DSR support, yes. Manual verification was done to ensure that DSR can be enabled and disabled on a node.
662+
622663
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
623664

624665
<!--
625666
Even if applying deprecation policies, they may still surprise some users.
626667
-->
627668

669+
No
670+
628671
### Monitoring Requirements
629672

630673
<!--
