- Renamed release checklist item: 1.19tbd -> 1.19 as hugepages e2e tests
have been implemented in 1.19 time frame
- Updated release checklist and implementation history for release 1.22
- Updated kep.yaml
- Added graduation criteria and test plan for HugePageStorageMediumSize
- Added PRR questionnaire for HugePageStorageMediumSize
- Promote existing HugePages E2E tests to conformance
+
+## Production Readiness Review Questionnaire for HugePageStorageMediumSize
+### Monitoring requirements
+
+* **How can an operator determine if the feature is in use by workloads?**
+An operator can inspect pod specs for `hugepages-<size>` resource limits and emptyDir
+mounts with `medium: HugePages-<size>`, as described in the Kubernetes
+documentation at https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages
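As an illustration of the check described above, here is a minimal Python sketch that scans a pod manifest (already loaded into a plain dict) for `hugepages-<size>` limits and size-specific emptyDir media. The function name `hugepage_usage` and the `example_pod` manifest are hypothetical and not part of the KEP; a real audit would read manifests from the API server.

```python
import re

# Resource names look like "hugepages-2Mi"; sized emptyDir media look like "HugePages-2Mi".
HUGEPAGE_RESOURCE = re.compile(r"^hugepages-(.+)$")
HUGEPAGE_MEDIUM = re.compile(r"^HugePages-(.+)$")


def hugepage_usage(pod):
    """Return the huge page sizes referenced by a pod manifest (hypothetical helper)."""
    spec = pod.get("spec") or {}

    # Collect sizes from container resource limits such as "hugepages-2Mi".
    limit_sizes = set()
    for container in spec.get("containers") or []:
        limits = (container.get("resources") or {}).get("limits") or {}
        for name in limits:
            match = HUGEPAGE_RESOURCE.match(name)
            if match:
                limit_sizes.add(match.group(1))

    # Collect sizes from emptyDir volumes with a size-specific medium.
    medium_sizes = set()
    for volume in spec.get("volumes") or []:
        medium = (volume.get("emptyDir") or {}).get("medium") or ""
        match = HUGEPAGE_MEDIUM.match(medium)
        if match:
            medium_sizes.add(match.group(1))

    return {"limits": sorted(limit_sizes), "media": sorted(medium_sizes)}


# Hypothetical manifest used only for illustration.
example_pod = {
    "spec": {
        "containers": [
            {
                "name": "app",
                "resources": {"limits": {"hugepages-2Mi": "100Mi", "memory": "100Mi"}},
            }
        ],
        "volumes": [{"name": "hugepage", "emptyDir": {"medium": "HugePages-2Mi"}}],
    }
}

print(hugepage_usage(example_pod))  # {'limits': ['2Mi'], 'media': ['2Mi']}
```

The sketch deliberately ignores the unsized `HugePages` medium, since only the size-specific form is covered by this questionnaire.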
+
+* **What are the SLIs (Service Level Indicators) an operator can use to determine
+the health of the service?**
+- [ ] Metrics
+  - Metric name:
+  `kube_pod_resource_request` and `kube_pod_resource_limit` for `hugepages-<size>` resources indicate usage.
+  - Components exposing the metric: kube-scheduler
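To make the metric-based check concrete, the following Python sketch filters Prometheus text-format samples for `kube_pod_resource_request` and `kube_pod_resource_limit` entries whose `resource` label starts with `hugepages-`, then sums them per metric and resource. The helper name `hugepage_samples`, the `SAMPLE` text, and the exact label set shown are assumptions for illustration; check the output of your own kube-scheduler before relying on them.

```python
import re
from collections import defaultdict

# Matches one sample line of the two pod resource metrics, e.g.
#   kube_pod_resource_request{...labels...} 1.048576e+08
SAMPLE_LINE = re.compile(r'^(kube_pod_resource_(?:request|limit))\{(.*)\}\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def hugepage_samples(metrics_text):
    """Sum hugepages-<size> values per (metric, resource) pair (hypothetical helper)."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match:
            continue  # skip comments, blank lines, and unrelated metrics
        name, raw_labels, value = match.groups()
        labels = dict(LABEL.findall(raw_labels))
        resource = labels.get("resource", "")
        if resource.startswith("hugepages-"):
            totals[(name, resource)] += float(value)
    return dict(totals)


# Hypothetical scrape output; label names and values are assumed for illustration.
SAMPLE = """\
# TYPE kube_pod_resource_request gauge
kube_pod_resource_request{namespace="default",pod="app",resource="hugepages-2Mi",unit="bytes"} 1.048576e+08
kube_pod_resource_request{namespace="default",pod="app",resource="memory",unit="bytes"} 2.097152e+08
kube_pod_resource_limit{namespace="default",pod="app",resource="hugepages-2Mi",unit="bytes"} 1.048576e+08
"""

for key, total in sorted(hugepage_samples(SAMPLE).items()):
    print(key, total)
```

A non-empty result suggests that at least one scheduled pod requests or limits huge pages of the listed size.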
+
+Workload performance can be measured with existing system metrics provided by Kubernetes components and by tools such as [node_exporter](https://github.com/prometheus/node_exporter).
+
+* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
+
+These will be set individually by application developers. This feature allows them to tune the performance of their workloads. See, for example, [Linux Huge Pages and virtual memory (VM) tuning](https://blog.yannickjaquier.com/linux/linux-hugepages-and-virtual-memory-vm-tuning.html).
+
+* **Are there any missing metrics that would be useful to have to improve observability
+of this feature?**
+No.
+
+### Dependencies
+
+* **Does this feature depend on any specific services running in the cluster?**
+No.
+
+### Scalability
+
+* **Will enabling / using this feature result in any new API calls?**
+No.
+
+* **Will enabling / using this feature result in introducing new API types?**
+No.
+
+* **Will enabling / using this feature result in any new calls to the cloud
+provider?**
+No.
+
+* **Will enabling / using this feature result in increasing size or count of
+the existing API objects?**
+No.
+
+* **Will enabling / using this feature result in increasing time taken by any
+operations covered by [existing SLIs/SLOs]?**
+No.
+
+* **Will enabling / using this feature result in non-negligible increase of
+resource usage (CPU, RAM, disk, IO, ...) in any components?**
+No.
+
+### Troubleshooting
+
+* **How does this feature react if the API server and/or etcd is unavailable?**
+No impact.
+
+* **What are other known failure modes?**
+Not applicable.
+
+* **What steps should be taken if SLOs are not being met to determine the problem?**
+A cluster admin can tune the HugePages requests allocated to a workload by changing the available sizes, fall back to the default HugePages configuration, or disable HugePages for the workload entirely.
+
 ## Implementation History
 
 ### Version 1.8
@@ -565,9 +648,14 @@ using the feature without issue.
 
 Extending of huge pages feature to support container isolation of huge pages and multiple sizes of huge pages.
 
-### Version 1.19[TBD]
+### Version 1.19
+
+Extending of huge pages test suite of E2E tests and cri-tools for enhancements after GA.
+
+### Version 1.22
 
-Extending of huge pages test suit of E2E tests and cri-tools for enhancements after GA.
+GA support of multiple huge page sizes proposed based on feedback from
+user community using the feature without issue.
 
 ## Release Signoff Checklist
 - [x] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)