+- [How can an operator determine if the feature is in use by workloads?](#how-can-an-operator-determine-if-the-feature-is-in-use-by-workloads)
+- [What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?](#what-are-the-slis-service-level-indicators-an-operator-can-use-to-determine-the-health-of-the-service)
+- [Metrics](#metrics)
+- [Dependencies](#dependencies)
+- [Does this feature depend on any specific services running in the cluster?](#does-this-feature-depend-on-any-specific-services-running-in-the-cluster)
+- [For GA, this section is required: approvers should be able to confirm the previous answers based on experience in the field.](#for-ga-this-section-is-required-approvers-should-be-able-to-confirm-the-previous-answers-based-on-experience-in-the-field)
+- [Will enabling / using this feature result in any new API calls? Describe them, providing:](#will-enabling--using-this-feature-result-in-any-new-api-calls-describe-them-providing)
+- [Troubleshooting](#troubleshooting)
+- [How does this feature react if the API server and/or etcd is unavailable?](#how-does-this-feature-react-if-the-api-server-andor-etcd-is-unavailable)
+- [What are other known failure modes?](#what-are-other-known-failure-modes)
 1. [Metrics Validation and Verification](https://github.com/kubernetes/enhancements/blob/77a84d2d55b5802a615f3fe98e7e7c9bd26c9efc/keps/sig-instrumentation/1209-metrics-stability/20190605-metrics-validation-and-verification.md)
+1. [Metrics Stability to Beta](https://github.com/kubernetes/enhancements/blob/77a84d2d55b5802a615f3fe98e7e7c9bd26c9efc/keps/sig-instrumentation/1209-metrics-stability/20191028-metrics-stability-to-beta.md)

 This document is not net new and ties the four together in order to document the lifecycle of this feature.
[Metrics Validation and Verification#Motivation]: keps/sig-instrumentation/1209-metrics-stability/20190605-metrics-validation-and-verification.md#motivation
-[Metrics Stability to Beta#Motivation]: keps/sig-instrumentation/1209-metrics-stability/20191028-metrics-stability-to-beta.md#motivation
[Metrics Validation and Verification#Proposal]: keps/sig-instrumentation/1209-metrics-stability/20190605-metrics-validation-and-verification.md#proposal
-[Metrics Stability to Beta#Proposal]: keps/sig-instrumentation/1209-metrics-stability/20191028-metrics-stability-to-beta.md#proposal
 1. [Metrics Validation and Verification#Graduation Criteria]
 1. [Metrics Stability to Beta#Graduation Criteria]

-[Metrics Validation and Verification#Graduation Criteria]: keps/sig-instrumentation/1209-metrics-stability/20190605-metrics-validation-and-verification.md#graduation-criteria
-[Metrics Stability to Beta#Graduation Criteria]: keps/sig-instrumentation/1209-metrics-stability/20191028-metrics-stability-to-beta.md#graduation-criteria
+[Metrics Validation and Verification#Graduation Criteria]: 20190605-metrics-validation-and-verification.md#graduation-criteria
+[Metrics Stability to Beta#Graduation Criteria]: 20191028-metrics-stability-to-beta.md#graduation-criteria

 #### Beta -> GA Graduation

-- Select stable metrics from control plane components
-- Implement the ability to turn off individual metrics (see [here](keps/sig-instrumentation/1209-metrics-stability/20191028-metrics-stability-to-beta.md#non-goals))
-
-**For non-optional features moving to GA, the graduation criteria must include
-[Deprecation of modified metrics from metrics overhaul KEP](keps/sig-instrumentation/1209-metrics-stability/20190605-metrics-stability-migration.md#deprecation-of-modified-metrics-from-metrics-overhaul-kep)
-[Deprecation of modified metrics from metrics overhaul KEP](20190605-metrics-stability-migration.md#deprecation-of-modified-metrics-from-metrics-overhaul-kep)
N/A - this KEP predates PRR. @logicalhan to fill this in later if desired.
+#### How can this feature be enabled / disabled in a live cluster?
+
+The metrics stability framework adds developer tooling around commit pipelines and is not a user-facing feature per se. The part that is user-facing is the annotation on metrics with a stability level.
+
+This framework intends to increase reliability in control-plane management and so features in the metrics stability framework tend to 'fix' aspects of dev processes which lead to downstream breakages.
+
+Rollout, Upgrade and Rollback Planning
+This section must be completed when targeting beta graduation to a release.
+
+N/A, this isn't a feature per se.
+
+#### What specific metrics should inform a rollback?
+
+N/A
+
+### Monitoring Requirements
+
+#### How can an operator determine if the feature is in use by workloads?
+
+N/A
+
+#### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
+
+N/A
+
+#### Metrics
+
+The stability framework applies to all metrics which originate directly from the control-plane.
+
+### Dependencies
+
+This section must be completed when targeting beta graduation to a release.
+
+#### Does this feature depend on any specific services running in the cluster?
+
+N/A
+
+#### For GA, this section is required: approvers should be able to confirm the previous answers based on experience in the field.
+
+#### Will enabling / using this feature result in any new API calls? Describe them, providing:
+
+No.
+
+### Troubleshooting
+
+#### How does this feature react if the API server and/or etcd is unavailable?
+
+N/A (but if the component isn't available, no metrics are being scraped).
+
+#### What are other known failure modes?
+
+At worst, this thing can clog the commit pipeline (since it is effectively a conformance test for ensuring metric stability guarantees). In that case, we can simply turn off the verification and validation mechanism (i.e. the `hack/verify_generated_stable_metrics.sh` script) which effectively puts us back to where we were before the framework. Note that this basically allows developers to commit breaking changes to metrics and violate guarantees though.
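Editor's sketch (not part of the commit): the failure mode above treats the verification step as a conformance-style check that blocks commits when a stable metric changes. The Python below is a minimal illustration of that idea only. It is not how `hack/verify_generated_stable_metrics.sh` is implemented; the `StableMetric` record, the `find_violations` and `verify` functions, the `enabled` switch, and the metric name `example_requests_total` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StableMetric:
    """Hypothetical record for one metric annotated as stable."""
    name: str
    metric_type: str         # e.g. "Counter", "Gauge", "Histogram"
    labels: Tuple[str, ...]  # label names; compared as a set


def find_violations(golden: List[StableMetric],
                    current: List[StableMetric]) -> List[str]:
    """Compare a checked-in golden list against the current metric list."""
    golden_by_name: Dict[str, StableMetric] = {m.name: m for m in golden}
    current_by_name: Dict[str, StableMetric] = {m.name: m for m in current}
    violations: List[str] = []

    # Anything recorded as stable must still exist with the same shape.
    for name, old in sorted(golden_by_name.items()):
        new = current_by_name.get(name)
        if new is None:
            violations.append(f"{name}: stable metric was removed")
            continue
        if new.metric_type != old.metric_type:
            violations.append(
                f"{name}: type changed from {old.metric_type} to {new.metric_type}")
        if set(new.labels) != set(old.labels):
            violations.append(
                f"{name}: labels changed from {sorted(old.labels)} to {sorted(new.labels)}")

    # A new stable metric means the golden list needs to be regenerated.
    for name in sorted(set(current_by_name) - set(golden_by_name)):
        violations.append(f"{name}: new stable metric is missing from the golden list")

    return violations


def verify(golden: List[StableMetric],
           current: List[StableMetric],
           enabled: bool = True) -> bool:
    """Return True when the commit may proceed.

    With enabled=False the check always passes, which mirrors the escape
    hatch described above: the pipeline is unblocked, but breaking changes
    to stable metrics are no longer caught.
    """
    if not enabled:
        return True
    return not find_violations(golden, current)
```

In this sketch, dropping a label from a stable metric makes `verify` return `False`, while passing `enabled=False` lets the same change through unchecked, which is the trade-off the paragraph above warns about.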

 ## Implementation History

@@ -165,7 +229,7 @@ See:
 1. [Metrics Validation and Verification#Implementation History]
 1. [Metrics Stability to Beta#Implementation History]