keps/sig-instrumentation/2305-metrics-cardinality-enforcement/README.md
Lines changed: 10 additions & 18 deletions
@@ -270,6 +270,7 @@ This would then be interpreted by our machinery as this:
 For `Alpha`, unit test to verify that the metric label will be set to "unexpected" if the metric encounters label values outside our explicit allowlist of values.
 ### Graduation Criteria
 For `Alpha`, the allowlist of metrics can be configured via the exposed flag and the unit test is passed.
+For `Beta`, the allowlist can be configured from a input file(e.g. yaml file).
 ### Upgrade / Downgrade strategy
 N/A
 ### Version Skew Strategy
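The behavior that the `Alpha` unit test in this hunk is meant to verify can be sketched roughly as follows. This is a minimal illustration, not the code in `k8s.io/component-base`: the `labelAllowlist` type, the `constrain` method, and the `unexpectedValue` constant are hypothetical names, and passing values through unchanged for labels that have no allowlist is an assumption.

```go
package main

import "fmt"

// unexpectedValue is the placeholder value named in the KEP text.
const unexpectedValue = "unexpected"

// labelAllowlist maps a label name to the set of values it may take.
// Hypothetical type for illustration only.
type labelAllowlist map[string]map[string]struct{}

// constrain returns value unchanged when the label has no allowlist or the
// value is listed, and the "unexpected" placeholder otherwise.
func (a labelAllowlist) constrain(label, value string) string {
	allowed, ok := a[label]
	if !ok {
		return value // assumption: labels without an allowlist are not rewritten
	}
	if _, ok := allowed[value]; ok {
		return value
	}
	return unexpectedValue
}

func main() {
	allow := labelAllowlist{
		"code": {"200": {}, "404": {}, "500": {}},
	}
	fmt.Println(allow.constrain("code", "200")) // listed value is kept
	fmt.Println(allow.constrain("code", "418")) // unlisted value is replaced
	fmt.Println(allow.constrain("verb", "GET")) // label without an allowlist
}
```

Collapsing every unlisted value into one placeholder bounds the number of time series per label at the allowlist size plus one, which is the cardinality limit the KEP is after.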
@@ -281,52 +282,43 @@ N/A
 _This section must be completed when targeting alpha to a release._
 
 * **How can this feature be enabled / disabled in a live cluster?**
-  - [ ] Feature gate (also fill in values in `kep.yaml`)
-    - Feature gate name:
-    - Components depending on the feature gate:
-  - [x] Other
-    - Describe the mechanism:
-      New flag will be used to config the allowlist of label values for a metric.
-      This flag will become standard flag for all k8s components and will be added to
-      `k8s.io/component-base`.
-    - Will enabling / disabling the feature require downtime of the control
-      plane? Yes, the components need to restart with flag enabled.
-    - Will enabling / disabling the feature require downtime or reprovisioning
-      of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled).
-      Yes, the components need to restart with flag enabled.
+  - [x] Feature gate (also fill in values in `kep.yaml`)
+    - Feature gate name: MetricCardinalityEnforcement
+    - Components depending on the feature gate: All components that emit metrics
 
 * **Does enabling the feature change any default behavior?**
 Any change of default behavior may be surprising to users or break existing
 automations, so be extremely careful here.
-Using this feature requires restarting the component with the flag enabled. Once enabled, the metric label will be set to "unexpected" if the metric encounters label values outside our explicit allowlist of values.
+Using this feature requires restarting the component with the allowlist flag enabled. Once enabled, the metric label will be set to "unexpected" if the metric encounters label values outside our explicit allowlist of values.
 
 * **Can the feature be disabled once it has been enabled (i.e. can we roll back
 the enablement)?**
 Also set `disable-supported` to `true` or `false` in `kep.yaml`.
 Describe the consequences on existing workloads (e.g., if this is a runtime
 feature, can it break the existing applications?).
-Yes, restarting the component without the allowlist flag will basically disable this feature.
+Yes, disabling the feature gate can revert it back to existing behavior
 
 * **What happens if we reenable the feature if it was previously rolled back?**
 The enable-disable-enable process will not cause problem. But it may be problematic during the rolled back period with the unbounded metrics value.
 
 * **Are there any tests for feature enablement/disablement?**
-No.
+Using unit tests to cover the combination cases w/wo feature and w/wo allowlist.
+
 ### Rollout, Upgrade and Rollback Planning
 
 _This section must be completed when targeting beta graduation to a release._
 
 * **How can a rollout fail? Can it impact already running workloads?**
 Try to be as paranoid as possible - e.g., what if some components will restart
 mid-rollout?
-Using this feature requires restarting the component with the flag enabled.
+Using this feature requires restarting the component with the allowlist flag enabled.
 * **What specific metrics should inform a rollback?**
 None.
 * **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
 Describe manual testing that was done and the outcomes.
 Longer term, we may want to require automated upgrade/rollback tests, but we
 are missing a bunch of machinery and tooling and can't do that now.
-No.
+In alpha, we can do some manual tests on enable/disable allowlist flag and enable/disable feature gate.
 * **Is the rollout accompanied by any deprecations and/or removals of features, APIs,
 fields of API types, flags, etc.?**
 A component metric flag for ingesting allowlist to be added.
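The four combinations this diff proposes to unit test (feature gate on or off, allowlist present or absent) could be enumerated along these lines. This is only a sketch under stated assumptions: `effectiveValue` is a hypothetical helper, not a function from `k8s.io/component-base`, and it assumes that values are rewritten only when the gate is enabled and an allowlist is supplied, which is how the answers above describe enabling and rolling back.

```go
package main

import "fmt"

// effectiveValue returns the label value a metric would record.
// Assumption: values are rewritten only when the gate is on AND an
// allowlist is configured; every other combination keeps the raw value.
func effectiveValue(gateEnabled bool, allowlist map[string]struct{}, value string) string {
	if !gateEnabled || len(allowlist) == 0 {
		return value
	}
	if _, ok := allowlist[value]; ok {
		return value
	}
	return "unexpected"
}

func main() {
	allow := map[string]struct{}{"200": {}}
	// One entry per gate/allowlist combination.
	cases := []struct {
		gate      bool
		allowlist map[string]struct{}
	}{
		{false, nil},
		{false, allow},
		{true, nil},
		{true, allow},
	}
	for _, c := range cases {
		got := effectiveValue(c.gate, c.allowlist, "418")
		fmt.Printf("gate=%t allowlist=%t -> %s\n", c.gate, c.allowlist != nil, got)
	}
}
```

Under these assumptions only the last combination rewrites the unlisted value, so a table-driven unit test over the same four cases would cover both the enablement path and the rollback path.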