Commit 0d00fd4

KEP-753: add PRR answers for beta

Signed-off-by: Matthias Bertschy <[email protected]>
1 parent 8669330

2 files changed: +171 −30 lines changed


keps/prod-readiness/sig-node/753.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
 kep-number: 753
 alpha:
   approver: "@wojtek-t"
+beta:
+  approver: "@wojtek-t"

keps/sig-node/753-sidecar-containers/README.md

Lines changed: 169 additions & 30 deletions
@@ -130,6 +130,7 @@ tags, and then generate with `hack/update-toc.sh`.
 - [Graduation Criteria](#graduation-criteria)
   - [Alpha](#alpha)
   - [Beta](#beta)
+  - [GA](#ga)
 - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
   - [Upgrade strategy](#upgrade-strategy)
   - [Downgrade strategy](#downgrade-strategy)
@@ -1102,7 +1103,7 @@ For Beta and GA, add links to added tests together with links to k8s-triage for
 https://storage.googleapis.com/k8s-triage/index.html
 -->

-- <test>: <link to test coverage> FIXME to fill
+No integration tests are planned. We'll cover this with e2e_node tests.

 ##### e2e tests

@@ -1116,7 +1117,8 @@ https://storage.googleapis.com/k8s-triage/index.html
 We expect no non-infra related flakes in the last month as a GA graduation criteria.
 -->

-- <test>: <link to test coverage> FIXME to fill
+- Test failures: https://storage.googleapis.com/k8s-triage/index.html?pr=1&test=SidecarContainers
+- All related tests can be filtered with the `SidecarContainers` keyword

 ##### Ready state of a sidecar container is properly used to create/delete endpoints

@@ -1128,16 +1130,20 @@ The sidecar container can expose ports and can be used to handle external traffi

 ##### Pod lifecycle scenarios without sidecar containers

-TBD: describe test cases that can be affected by introducing sidecar containers.
+- Init containers should start after the previous init container has completed
+- Regular containers should start in parallel after all init containers have completed
+- Restart behavior of the init containers

 ##### Pod lifecycle scenarios with sidecar containers

-TBD: describe test cases to test with sidecar containers.
+- Init containers should start after the previous restartable init container has started
+- Regular containers should start in parallel after all regular init containers have completed and all restartable init containers have started
+- Restartable init containers should always restart

 ##### Kubelet restart test cases

-TBD: describe test cases on how Pod with sidecar containers behaves when kubelet
-was restarted
+- It should restart the containers in the right order after a node reboot
+- It should not restart any completed init containers after a kubelet restart

 ##### API server is down: failure to update containers status during initialization

@@ -1254,13 +1260,15 @@ in back-to-back releases.
 #### Beta

 - Implement proper termination ordering.
-- Provide defaults for `restartPolicy` field on init containers, `nil` is not
-  an ideal default long term.
-- Allow to apply security policies on all containers in `initContainers`
-  collection. Example may be disabling `kubectl exec` on containers in
-  `initContainers` collection.
+- Allow sidecar containers to restart during the shutdown of the Pod.
+- Add tests with feature activation and deactivation (see [Feature Enablement and Rollback](#feature-enablement-and-rollback)).

+#### GA

+- Enable `livenessProbe` during the shutdown of the Pod for sidecar containers that restart.
+- Allow applying security policies to all containers in the `initContainers`
+  collection, for example disabling `kubectl exec` on containers in the
+  `initContainers` collection.

 ### Upgrade / Downgrade Strategy

@@ -1298,13 +1306,19 @@ Pods that has already been created will stay being scheduled after the downgrade
 not be rejected by control
 plane nor by kubelet. Both will treat the sidecar container as an Init container.
 This may render the Pod unusable as it will stuck in initialization forever -
-sidecar container are never exiting. We will document this behavior for Alpha release.
-Promoting feature to Beta we will revisit the situation. If we will see this as
-a major issue, we will need to wait for 3 releases so kubelet will have the logic
+sidecar containers never exit.
+This behavior has been documented for the Alpha release, but we don't see it as a
+major issue requiring us to wait for 3 releases until the kubelet has the logic
 to reject such Pods when the feature gate is disabled to keep Downgrade safe.

-**Note**, For the control plane and kubelet we will implement logic to reject Pods
-with sidecar containers when feature gate got turned off.
+**Note**: we have implemented logic in the
+[kubelet](https://github.com/kubernetes/kubernetes/blob/f19b62fc0914b38941922afefd1e34eb55f87ee7/pkg/kubelet/lifecycle/predicate.go#L78-L91)
+to reject Pods with sidecar containers when the feature gate is turned off.
+For the control plane,
+[kube-apiserver](https://github.com/kubernetes/kubernetes/blob/f19b62fc0914b38941922afefd1e34eb55f87ee7/pkg/api/pod/util.go#L554-L560)
+drops the field (if it wasn't set before) and
+[kube-scheduler](https://github.com/kubernetes/kubernetes/blob/f19b62fc0914b38941922afefd1e34eb55f87ee7/pkg/scheduler/framework/plugins/noderesources/fit.go#L256-L262)
+keeps pods with the field set unschedulable.
 See [Upgrade/downgrade testing](#upgradedowngrade-testing) section.

 Workloads will have to be deleted and recreated with the old way of handling
@@ -1452,7 +1466,7 @@ You can take a look at one potential example of such test in:
 https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
 -->

-See [Upgrade/downgrade testing](#upgradedowngrade-testing).
+Not yet, but it is planned/required to add them before graduation to Beta.

 ### Rollout, Upgrade and Rollback Planning

@@ -1472,13 +1486,45 @@ rollout. Similarly, consider large clusters and how enablement/disablement
 will rollout across nodes.
 -->

+Rollout could fail for multiple reasons:
+
+- webhooks that are not recompiled with the new field will strip it out
+- a bug in the resource calculation or CPU reservation logic could render the Pod unschedulable
+- a bug in the kubelet affecting the pod lifecycle could cause the Pod to be stuck in initialization
+
+However, we have tried to maintain high unit test coverage to ensure we catch these.
+
+Rollback can fail if a Pod with sidecars is scheduled on a node where the feature
+is disabled.
+In that case the Pod will be rejected by the kubelet and will be stuck in the Pending state.
+Therefore, we advise first disabling the feature gate on the control plane and then
+proceeding with the nodes.
+
+Running workloads are not impacted.
+
+Pods with sidecars might take a long time to exit and exceed the terminationGracePeriodSeconds (TGPS); a new
+event should be added in Beta to help administrators diagnose this issue.
+Rather than rolling back the feature, they should work on the graceful termination
+of their main containers to ensure sidecars have enough time to be notified
+and exit on their own.

 ###### What specific metrics should inform a rollback?

 <!--
 What signals should users be paying attention to when the feature is young
 that might indicate a serious problem?
 -->

+- [X] Metrics
+  - Metric name: kubelet_started_containers_errors_total
+    - Type: Counter
+    - Labels: code, container_type (should be `init_container`)
+  - Components exposing the metric: `kubelet-metrics`
+  - Symptoms: a high number of errors indicates that the kubelet is unable to start the sidecar containers
+- [X] Events
+  - Event name: TBD
+  - Symptoms: a high number of events indicates that the TGPS has been exceeded and sidecars have not been terminated gracefully

 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

 <!--
@@ -1487,12 +1533,41 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
 are missing a bunch of machinery and tooling and can't do that now.
 -->

+Upgrade->downgrade->upgrade testing was done manually using the following steps:
+
+Kubelet specific:
+1. Deploy k8s 1.29-alpha
+2. Enable the `SidecarContainers` feature gate on the control plane and kubelet
+3. Deploy a Pod with sidecar containers using a Deployment
+4. Disable the `SidecarContainers` feature gate on the kubelet (requires a restart)
+5. Drain the node
+6. Pod is rejected by the kubelet
+7. Enable the `SidecarContainers` feature gate on the kubelet (requires a restart)
+8. Pod is scheduled and works as expected
+
+Control plane specific:
+1. Deploy k8s 1.29-alpha
+2. Enable the `SidecarContainers` feature gate on the control plane and kubelet
+3. Deploy a Pod with sidecar containers using a Deployment
+4. Disable the `SidecarContainers` feature gate on the control plane
+5. Delete the Pod
+6. Pod is created without the new field - init containers are not recognized as sidecars and block the Pod in initialization
+7. Modify the Deployment by moving the sidecar containers to the regular containers section
+8. Pod is scheduled and works (without the sidecar support)
+9. Enable the `SidecarContainers` feature gate on the control plane
+10. Delete the Pod
+11. Pod is scheduled and works (without the sidecar support)
+12. Modify the Deployment by moving the sidecar containers to the init containers section
+13. Pod is scheduled and works (with the sidecar support)

 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

 <!--
 Even if applying deprecation policies, they may still surprise some users.
 -->

+No.

 ### Monitoring Requirements

 <!--
@@ -1510,6 +1585,8 @@ checking if there are objects with field X set) may be a last resort. Avoid
 logs or events for this purpose.
 -->

+By checking if `.spec.initContainers[i].restartPolicy` is set to `OnFailure` or `Always`.

 ###### How can someone using this feature know that it is working for their instance?

 <!--
@@ -1521,13 +1598,17 @@ and operation of this feature.
 Recall that end users cannot usually observe component logs or access metrics.
 -->

-- [ ] Events
-  - Event Reason:
-- [ ] API .status
-  - Condition name:
-  - Other field:
-- [ ] Other (treat as last resort)
-  - Details:
+End users can check components that are using the new feature, such as Istio, to see if istio-proxy runs as a sidecar container:
+
+```
+$ kubectl get pod -o "custom-columns="\
+"NAME:.metadata.name,"\
+"INIT:.spec.initContainers[*].name,"\
+"CONTAINERS:.spec.containers[*].name"
+
+NAME                     INIT                     CONTAINERS
+sleep-7656cf8794-8fhdk   istio-init,istio-proxy   sleep
+```

 ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

@@ -1546,18 +1627,33 @@ These goals will help you determine what you need to measure (SLIs) in the next
 question.
 -->

+- the number of running containers should not change by more than 10% throughout the day,
+  as measured by the number of running containers at the beginning and end of the day
+- the error rate for containers of type `init_container` should be less than 1%,
+  as measured by the number of errors divided by the total number of `init_container` containers
+- the number of events indicating that TGPS has been exceeded should be less than 10 per day,
+  as measured by the number of events logged in the kubelet log
+- 99% of jobs with sidecars should complete successfully,
+  as measured by the number of jobs that complete successfully divided by the total number of jobs with sidecars

 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

 <!--
 Pick one more of these and delete the rest.
 -->

-- [ ] Metrics
-  - Metric name:
-  - [Optional] Aggregation method:
-  - Components exposing the metric:
-- [ ] Other (treat as last resort)
-  - Details:
+- [X] Metrics
+  - Metric name: kubelet_running_containers
+    - Type: Gauge
+    - Labels: container_state
+  - Components exposing the metric: `kubelet-metrics`
+  - Metric name: kubelet_started_containers_errors_total
+    - Type: Counter
+    - Labels: code, container_type (should be `init_container`)
+  - Components exposing the metric: `kubelet-metrics`
+- [X] Events
+  - Event name: TBD
+  - Should not appear unless TGPS is exceeded and sidecars are terminated

 ###### Are there any missing metrics that would be useful to have to improve observability of this feature?

@@ -1566,6 +1662,8 @@ Describe the metrics themselves and the reasons why they weren't added (e.g., co
 implementation difficulties, etc.).
 -->

+No.

 ### Dependencies

 <!--
@@ -1589,6 +1687,8 @@ and creating new ones, as well as about cluster-level services (e.g. DNS):
 - Impact of its degraded performance or high-error rates on the feature:
 -->

+No.

 ### Scalability

 <!--
@@ -1616,6 +1716,8 @@ Focusing mostly on:
 heartbeats, leader election, etc.)
 -->

+No.

 ###### Will enabling / using this feature result in introducing new API types?

 <!--
@@ -1625,6 +1727,8 @@ Describe them, providing:
 - Supported number of objects per namespace (for namespace-scoped objects)
 -->

+No.

 ###### Will enabling / using this feature result in any new calls to the cloud provider?

 <!--
@@ -1633,6 +1737,8 @@ Describe them, providing:
 - Estimated increase:
 -->

+No.

 ###### Will enabling / using this feature result in increasing size or count of the existing API objects?

 <!--
@@ -1642,6 +1748,8 @@ Describe them, providing:
 - Estimated amount of new objects: (e.g., new Object X for every existing Pod)
 -->

+No.

 ###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

 <!--
@@ -1653,6 +1761,10 @@ Think about adding additional work or introducing new steps in between
 [existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
 -->

+Graceful Pod termination might take longer with sidecars since their exit sequence starts after the
+last main container has stopped.
+The impact should be negligible because the TGPS is enforced in all cases.

 ###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?

 <!--
@@ -1665,6 +1777,8 @@ This through this both in small and large cases, again with respect to the
 [supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
 -->

+No.

 ###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

 <!--
@@ -1677,6 +1791,10 @@ Are there any tests that were run/should be run to understand performance charac
 and validate the declared limits?
 -->

+No, since the KEP only enables a new way to run containers, as sidecars instead of regular containers.
+Resource consumption can even be lower, since various tricks using emptyDir volumes to perform synchronization
+(as with istio-proxy) are no longer needed.

 ### Troubleshooting

 <!--
@@ -1692,6 +1810,8 @@ details). For now, we leave it here.

 ###### How does this feature react if the API server and/or etcd is unavailable?

+Nothing changes compared to the current kubelet behavior.

 ###### What are other known failure modes?

 <!--
@@ -1707,8 +1827,27 @@ For each of them, fill in the following information by copying the below templat
 - Testing: Are there any tests for failure mode? If not, describe why.
 -->

+- Main containers don't exit within TGPS, leading to sidecars being terminated
+  - Detection: high number of events indicating TGPS has been exceeded
+  - Mitigations: ensure timely termination of main containers
+  - Diagnostics: Events
+  - Testing: TBD
+- Main container or sidecar uses a preStop hook consuming TGPS, leading to the remaining sidecars being terminated
+  - Detection: high number of events indicating TGPS has been exceeded
+  - Mitigations: ensure preStop hooks are not delaying termination
+  - Diagnostics: Events
+  - Testing: TBD
+- Sidecar container uses a preStop hook that makes the container exit during Pod shutdown; the sidecar is restarted, leading
+  to a CrashLoopBackOff
+  - Detection: sidecar in CrashLoopBackOff during termination
+  - Mitigations: ensure preStop hooks do not make the container exit, document best practices
+  - Diagnostics: Events
+  - Testing: TBD

 ###### What steps should be taken if SLOs are not being met to determine the problem?

+None.

 ## Implementation History

 <!--
