Commit ddcc483

author
Han Kang
committed
update stability KEP for beta
1 parent c76fc10 commit ddcc483

File tree

1 file changed

+48
-24
lines changed
  • keps/sig-instrumentation/3498-extending-stability

keps/sig-instrumentation/3498-extending-stability/README.md

@@ -128,20 +128,20 @@ checklist items _must_ be updated for the enhancement to be released.
128128

129129
Items marked with (R) are required *prior to targeting to a milestone / release*.
130130

131-
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
132-
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
133-
- [ ] (R) Design details are appropriately documented
134-
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
135-
- [ ] e2e Tests for all Beta API Operations (endpoints)
131+
- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
132+
- [X] (R) KEP approvers have approved the KEP status as `implementable`
133+
- [X] (R) Design details are appropriately documented
134+
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
135+
- [X] e2e Tests for all Beta API Operations (endpoints)
136136
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
137137
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
138138
- [ ] (R) Graduation criteria is in place
139139
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
140-
- [ ] (R) Production readiness review completed
141-
- [ ] (R) Production readiness review approved
142-
- [ ] "Implementation History" section is up-to-date for milestone
143-
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
144-
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
140+
- [X] (R) Production readiness review completed
141+
- [X] (R) Production readiness review approved
142+
- [X] "Implementation History" section is up-to-date for milestone
143+
- [X] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
144+
- [X] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
145145

146146
<!--
147147
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
@@ -185,6 +185,7 @@ Additionally we propose forced upgrades of metrics stability classes in the simi
185185
### Risks and Mitigations
186186

187187
The primary risk is that these changes break our existing (and working) metrics infrastructure. The mitigation should be straightforward, i.e., roll back the changes to the metrics framework.
188+
188189
## Design Details
189190

190191
Our plan is to add functionality to our static analysis framework which is hosted in the main `k8s/k8s` repo, under `test/instrumentation`. Specifically, we will need to support:
@@ -203,6 +204,32 @@ We will not attempt to parse metrics which:
203204

204205
As an aside, much of this work has already been done, but is stashed in a local repo.
205206

207+
### Semantic of Stability Levels
208+
209+
#### Internal Metrics
210+
211+
`Internal` metrics have no stability guarantees and are **not** parseable by the static analysis framework. As such, `Internal` metrics will NOT be included in metric auto-documentation.
212+
213+
#### Alpha Metrics
214+
215+
Alpha metrics have no stability guarantees but are parseable by the static analysis framework. As such, `Alpha` metrics will be included in metric auto-documentation.
216+
217+
#### Beta Metrics
218+
219+
`Beta` metrics have *some* stability guarantees. Specifically, we guarantee that:
220+
221+
- `Beta` metrics will not be removed without first being explicitly deprecated. After deprecation, the metric will be removed in 4 months or 1 release.
222+
- Furthermore, `Beta` metrics are guaranteed to be **forward compatible** in respect to alerts and queries which may be written against them. By "forward compatible", we mean that queries and alerts which are written against the metric and its labels will continue to work in the future. We ensure forward compatibility by ensuring that **labels can only be added**, *and not removed*, from `Beta` metrics.
223+
- `Beta` metrics will be included in metric auto-documentation
224+
225+
#### Stable Metrics
226+
227+
`Stable` metrics have stability guarantees. Specifically, we guarantee that:
228+
229+
- `Stable` metrics will not be removed without first being explicitly deprecated. After deprecation, the metric will be removed in 12 months or 3 releases.
230+
- Furthermore, `Stable` metrics are guaranteed to **not change** in respect to labels. This means labels can neither be added nor removed from a `Stable` metric.
231+
- `Stable` metrics will be included in metric auto-documentation
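
The label and documentation rules above can be sketched as a small check. This is an illustrative sketch only, not the actual static analysis code under `test/instrumentation`; the names `Stability`, `label_change_allowed`, and `in_auto_documentation` are hypothetical and exist purely to restate the guarantees described in this section.

```python
from enum import Enum


class Stability(Enum):
    """Hypothetical enum mirroring the four stability classes in this KEP."""
    INTERNAL = "INTERNAL"
    ALPHA = "ALPHA"
    BETA = "BETA"
    STABLE = "STABLE"


def label_change_allowed(stability, old_labels, new_labels):
    """Return True if a label change honors the metric's stability class."""
    old, new = set(old_labels), set(new_labels)
    if stability is Stability.STABLE:
        # Stable: labels can neither be added nor removed.
        return old == new
    if stability is Stability.BETA:
        # Beta: labels may be added, but never removed.
        return old <= new
    # Internal and Alpha metrics carry no stability guarantees.
    return True


def in_auto_documentation(stability):
    """Internal metrics are not parsed, so they are left out of auto-documentation."""
    return stability is not Stability.INTERNAL
```

For example, adding a `code` label to a `Beta` metric that already has `verb` passes the check, while the same change to a `Stable` metric does not. A query that selects on `verb` keeps working in the `Beta` case, which is the forward-compatibility property described above.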
232+
206233
### Test Plan
207234

208235
We have static analysis testing for stable metrics, we will extend our test coverage
@@ -218,12 +245,12 @@ We already have thorough testing for the stability framework which has been GA f
218245

219246
##### Unit tests
220247

221-
[ ] parsing variables
222-
[ ] multi-line strings
223-
[ ] evaluating buckets
224-
[ ] buckets which are defined via variables and consts
225-
[ ] evaluation of simple consts
226-
[ ] evaluation of simple variables
248+
[X] parsing variables
249+
[X] multi-line strings
250+
[X] evaluating buckets
251+
[X] buckets which are defined via variables and consts
252+
[X] evaluation of simple consts
253+
[X] evaluation of simple variables
227254

228255
- `test/instrumentation`: `09/20/2022` - `full coverage of existing stability framework`
229256

@@ -245,11 +272,9 @@ The statis analysis tooling runs in a precommit pipeline and is therefore exempt
245272

246273
#### Beta
247274

248-
- All instances of `Alpha` metrics will be converted to `Internal`
249-
- Kubernetes metrics framework will be enhanced to support marking `Alpha` and `Beta` metrics with a date. The semantics of this are yet to be determined. This date will be used to statically determine whether or not that metric should be decrepated automatically or promoted.
250-
- Kubernetes metrics framework will be enhanced with a script to auto-deprecate metrics which have passed their window of existence as an `Alpha` or `Beta` metric
251-
- We will determine the semantics for `Alpha` and `Beta` metrics
252-
- The `beta` stage for this framework will be a few releases. During this time, we will evaluate the utility and the ergonomics of the framework, making adjustments as necessary
275+
- Kubernetes metrics framework will be enhanced to support marking `Alpha` and `Beta` metrics with release version. The semantics of this are yet to be determined. This version will be used to statically determine whether or not that metric should be deprecated automatically or promoted.
276+
277+
For the beta version of this KEP, we begin permitting metrics to be promoted to the `Beta` stability class.
253278

254279
#### GA
255280

@@ -311,19 +336,18 @@ This should not affect upgrade/rollback paths.
311336

312337
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
313338

314-
`Alpha` metrics will be recategorized as `Internal`.
339+
No.
315340

316341
### Monitoring Requirements
317342

318343
###### How can an operator determine if the feature is in use by workloads?
319344

320-
You can determine this by seeing if workloads depend on any Kubernetes control-plane metrics. If they do, they are using this feature.
345+
Dependence on any Kubernetes control-plane metrics implies that they are using this feature.
321346

322347
###### How can someone using this feature know that it is working for their instance?
323348

324349
They will be able to see metrics.
325350

326-
327351
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
328352

329353
This tooling runs in precommit. It does not affect runtime SLOs.
