@@ -99,6 +99,9 @@ tags, and then generate with `hack/update-toc.sh`.
99
99
- [Integration tests](#integration-tests)
100
100
- [e2e tests](#e2e-tests)
101
101
- [Graduation Criteria](#graduation-criteria)
102
+ - [Alpha](#alpha)
103
+ - [Beta](#beta)
104
+ - [GA](#ga)
102
105
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
103
106
- [Version Skew Strategy](#version-skew-strategy)
104
107
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
@@ -428,6 +431,9 @@ extending the production code to implement this enhancement.
428
431
429
432
- `<package>`: `<date>` - `<test coverage>`
430
433
434
+ - Unit tests to ensure that the metrics output meets expectations.
435
+ - Unit tests to ensure that the metrics deletion is functioning properly.
436
+
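The two unit-test goals above (metrics output and metrics deletion) could be exercised roughly as follows. This is only a sketch: `metricsStore`, its methods, and the informer name `pods` are hypothetical stand-ins invented for illustration and are not the client-go metrics API. Only the metric names `lists_total` and `watches_total` come from this proposal.

```go
package main

import (
	"fmt"
	"sync"
)

// metricsStore is a hypothetical in-memory registry of per-informer counters.
// It is NOT the client-go API; it only illustrates what the tests would check.
type metricsStore struct {
	mu       sync.Mutex
	counters map[string]map[string]int // informer name -> metric name -> value
}

func newMetricsStore() *metricsStore {
	return &metricsStore{counters: make(map[string]map[string]int)}
}

// Inc increments one counter (e.g. "lists_total") for the named informer.
func (s *metricsStore) Inc(informer, metric string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.counters[informer] == nil {
		s.counters[informer] = make(map[string]int)
	}
	s.counters[informer][metric]++
}

// Get returns the counter value and whether the series exists.
func (s *metricsStore) Get(informer, metric string) (int, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.counters[informer][metric]
	return v, ok
}

// Delete drops every series for the informer, as would happen when it stops.
func (s *metricsStore) Delete(informer string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.counters, informer)
}

// Len reports how many informers still have series registered.
func (s *metricsStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.counters)
}

func main() {
	s := newMetricsStore()
	s.Inc("pods", "lists_total")
	s.Inc("pods", "watches_total")
	s.Inc("pods", "watches_total")

	v, _ := s.Get("pods", "watches_total")
	fmt.Println("watches_total:", v)

	// Deleting the informer's series is what keeps memory from accumulating.
	s.Delete("pods")
	_, ok := s.Get("pods", "watches_total")
	fmt.Println("present after delete:", ok, "informers:", s.Len())
}
```

A real test would assert on the values exposed through the component's metrics endpoint, but the shape is the same: record, read back, delete, and confirm nothing is left behind.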
431
437
##### Integration tests
432
438
433
439
<!--
@@ -529,6 +535,21 @@ in back-to-back releases.
529
535
- Deprecate the flag
530
536
-->
531
537
538
+ #### Alpha
539
+
540
+ - Feature implemented behind the `InformerMetrics` feature gate
541
+ - Add integration and unit tests that verify functionality and confirm that the change introduces no memory
542
+ leak into existing behavior
543
+
544
+ #### Beta
545
+
546
+ - Gather feedback from developers and surveys
547
+ - Address feedback and add tests as needed
548
+
549
+ #### GA
550
+
551
+ - The decision on GA will be based on beta feedback
552
+
532
553
### Upgrade / Downgrade Strategy
533
554
534
555
<!--
@@ -543,6 +564,8 @@ enhancement:
543
564
cluster required to make on upgrade, in order to make use of the enhancement?
544
565
-->
545
566
567
+ N/A
568
+
546
569
### Version Skew Strategy
547
570
548
571
<!--
@@ -602,16 +625,10 @@ well as the [existing list] of feature gates.
602
625
[existing list]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
603
626
-->
604
627
605
- - [ ] Feature gate (also fill in values in `kep.yaml`)
628
+ - [x] Feature gate (also fill in values in `kep.yaml`)
606
629
- Feature gate name: InformerMetrics
607
630
- Components depending on the feature gate:
608
631
- components via client-go library
609
- - [ ] Other
610
- - Describe the mechanism:
611
- - Will enabling / disabling the feature require downtime of the control
612
- plane?
613
- - Will enabling / disabling the feature require downtime or reprovisioning
614
- of a node?
615
632
616
633
###### Does enabling the feature change any default behavior?
617
634
@@ -655,7 +672,7 @@ You can take a look at one potential example of such test in:
655
672
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
656
673
-->
657
674
658
- For now, there is no tests for feature enablement/disablement. The unit tests will be added.
675
+ There are currently no tests for feature enablement/disablement. Unit and integration tests will be added.
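An enablement/disablement test could look roughly like the sketch below. The `gate` map and the `reflectorMetrics` recorder are hypothetical and exist only for illustration. Real code would consult the component's feature gate machinery. Only the gate name `InformerMetrics` comes from this proposal.

```go
package main

import "fmt"

// gate is a hypothetical feature-gate lookup backed by a plain map.
type gate map[string]bool

// Enabled reports whether the named gate is switched on.
func (g gate) Enabled(name string) bool { return g[name] }

// reflectorMetrics is a hypothetical recorder that is a no-op while the
// InformerMetrics gate is disabled.
type reflectorMetrics struct {
	gates gate
	lists int // stands in for the lists_total counter
}

// ObserveList records one list call, but only when the gate is enabled.
func (m *reflectorMetrics) ObserveList() {
	if !m.gates.Enabled("InformerMetrics") {
		return
	}
	m.lists++
}

// listsAfter performs n list observations under the given gate setting and
// returns the resulting counter value.
func listsAfter(enabled bool, n int) int {
	m := &reflectorMetrics{gates: gate{"InformerMetrics": enabled}}
	for i := 0; i < n; i++ {
		m.ObserveList()
	}
	return m.lists
}

func main() {
	fmt.Println("gate off:", listsAfter(false, 3))
	fmt.Println("gate on: ", listsAfter(true, 3))
}
```

The property worth asserting is symmetric: with the gate off nothing is recorded, and with it on every observation is counted.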
659
676
660
677
### Rollout, Upgrade and Rollback Planning
661
678
@@ -675,13 +692,17 @@ rollout. Similarly, consider large clusters and how enablement/disablement
675
692
will rollout across nodes.
676
693
-->
677
694
695
+ This feature has no impact on rollout or rollback, and no impact on running workloads.
696
+
678
697
###### What specific metrics should inform a rollback?
679
698
680
699
<!--
681
700
What signals should users be paying attention to when the feature is young
682
701
that might indicate a serious problem?
683
702
-->
684
703
704
+ Continuous growth in the memory used by these metrics, to the point that it consumes a significant amount of memory.
705
+
685
706
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
686
707
687
708
<!--
@@ -690,12 +711,16 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
690
711
are missing a bunch of machinery and tooling and can't do that now.
691
712
-->
692
713
714
+ Not yet. This can be tested during the alpha release.
715
+
693
716
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
694
717
695
718
<!--
696
719
Even if applying deprecation policies, they may still surprise some users.
697
720
-->
698
721
722
+ This feature does not deprecate or remove any features/APIs/fields/flags/etc.
723
+
699
724
### Monitoring Requirements
700
725
701
726
<!--
@@ -713,6 +738,8 @@ checking if there are objects with field X set) may be a last resort. Avoid
713
738
logs or events for this purpose.
714
739
-->
715
740
741
+ - [x] Informer / Reflector metrics (e.g., `lists_total`, `watches_total`) returned by the operator are populated
742
+
716
743
###### How can someone using this feature know that it is working for their instance?
717
744
718
745
<!--
@@ -724,13 +751,13 @@ and operation of this feature.
724
751
Recall that end users cannot usually observe component logs or access metrics.
725
752
-->
726
753
727
- - [ ] Events
728
- - Event Reason:
729
- - [ ] API .status
730
- - Condition name:
731
- - Other field:
732
- - [ ] Other (treat as last resort)
754
+ - [X] Other (treat as last resort)
733
755
- Details:
756
+ - The following metrics are available when `InformerMetrics` is enabled:
757
+ - lists_total
758
+ - watches_total
759
+ - last_resource_version
760
+ - etc.
734
761
735
762
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
736
763
@@ -749,18 +776,19 @@ These goals will help you determine what you need to measure (SLIs) in the next
749
776
question.
750
777
-->
751
778
779
+ Enabling the feature gate increases memory usage, but that usage should not grow continuously.
780
+ Memory consumed by informerMetrics / eventHandlerMetrics / reflectorMetrics should remain stable.
781
+
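One way to turn the "memory stays stable" objective into a check is sketched below. The `stable` helper, the warm-up window, and the 10% tolerance are assumptions chosen for illustration, not values from this proposal. The samples are synthetic. In practice they might be heap readings taken from Go's `runtime.ReadMemStats` or a pprof heap profile.

```go
package main

import "fmt"

// stable reports whether every sample after the warm-up window stays at or
// below the first post-warm-up sample plus the given tolerance (a fraction,
// e.g. 0.10 for 10%). It returns false when there is not enough data.
func stable(samples []uint64, warmup int, tolerance float64) bool {
	if warmup < 0 || warmup >= len(samples) {
		return false
	}
	limit := float64(samples[warmup]) * (1 + tolerance)
	for _, s := range samples[warmup+1:] {
		if float64(s) > limit {
			return false
		}
	}
	return true
}

func main() {
	// Synthetic heap readings in MiB: usage rises at start-up, then plateaus.
	plateau := []uint64{10, 40, 52, 50, 51, 53, 50}
	// Synthetic readings that keep climbing, as a leak would.
	leak := []uint64{10, 40, 52, 60, 70, 85, 100}

	fmt.Println("plateau stable:", stable(plateau, 2, 0.10))
	fmt.Println("leak stable:   ", stable(leak, 2, 0.10))
}
```

The check only bounds growth, which matches the concern stated here. A drop in usage after garbage collection is not treated as a failure.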
752
782
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
753
783
754
784
<!--
755
785
Pick one more of these and delete the rest.
756
786
-->
757
787
758
- - [ ] Metrics
788
+ - [x] Metrics
759
789
- Metric name: Memory usage
760
790
- [Optional] Aggregation method:
761
791
- Components exposing the metric: Operating System/golang pprof
762
- - [ ] Other (treat as last resort)
763
- - Details:
764
792
765
793
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
766
794
@@ -769,6 +797,8 @@ Describe the metrics themselves and the reasons why they weren't added (e.g., co
769
797
implementation difficulties, etc.).
770
798
-->
771
799
800
+ Not at the moment.
801
+
772
802
### Dependencies
773
803
774
804
<!--