Commit a2bfe35

graduate suspend job to stable
1 parent 6ec5481 commit a2bfe35

3 files changed

+26
-70
lines changed

keps/prod-readiness/sig-apps/2232.yaml

Lines changed: 2 additions & 0 deletions
@@ -3,3 +3,5 @@ alpha:
33
approver: "@wojtek-t"
44
beta:
55
approver: "@wojtek-t"
6+
stable:
7+
approver: "@wojtek-t"

keps/sig-apps/2232-suspend-jobs/README.md

Lines changed: 22 additions & 68 deletions
@@ -6,25 +6,25 @@ To get started with this template:
66
- [x] **Pick a hosting SIG.**
77
Make sure that the problem space is something the SIG is interested in taking
88
up. KEPs should not be checked in without a sponsoring SIG.
9-
- [ ] **Create an issue in kubernetes/enhancements**
9+
- [x] **Create an issue in kubernetes/enhancements**
1010
When filing an enhancement tracking issue, please make sure to complete all
1111
fields in that template. One of the fields asks for a link to the KEP. You
1212
can leave that blank until this KEP is filed, and then go back to the
1313
enhancement and add the link.
14-
- [ ] **Make a copy of this template directory.**
14+
- [x] **Make a copy of this template directory.**
1515
Copy this template into the owning SIG's directory and name it
1616
`NNNN-short-descriptive-title`, where `NNNN` is the issue number (with no
1717
leading-zero padding) assigned to your enhancement above.
18-
- [ ] **Fill out as much of the kep.yaml file as you can.**
18+
- [x] **Fill out as much of the kep.yaml file as you can.**
1919
At minimum, you should fill in the "Title", "Authors", "Owning-sig",
2020
"Status", and date-related fields.
21-
- [ ] **Fill out this file as best you can.**
21+
- [x] **Fill out this file as best you can.**
2222
At minimum, you should fill in the "Summary" and "Motivation" sections.
2323
These should be easy if you've preflighted the idea of the KEP with the
2424
appropriate SIG(s).
25-
- [ ] **Create a PR for this KEP.**
25+
- [x] **Create a PR for this KEP.**
2626
Assign it to people in the SIG who are sponsoring this process.
27-
- [ ] **Merge early and iterate.**
27+
- [x] **Merge early and iterate.**
2828
Avoid getting hung up on specific details and instead aim to get the goals of
2929
the KEP clarified and merged quickly. The best way to do this is to just
3030
start with the high-level sections and fill out details incrementally in
@@ -104,8 +104,8 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
104104
- [x] (R) Production readiness review completed
105105
- [x] Production readiness review approved
106106
- [x] "Implementation History" section is up-to-date for milestone
107-
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
108-
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
107+
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
108+
- [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
109109

110110
[kubernetes.io]: https://kubernetes.io/
111111
[kubernetes/enhancements]: https://git.k8s.io/enhancements
@@ -225,6 +225,9 @@ period is honoured. Pods terminated this way are considered a failure and the
225225
controller does not count terminated Pods towards completions. This behaviour
226226
is similar to decreasing the Job's parallelism to zero.
227227

228+
Completed pods before suspension will count towards completion after the job is unsuspended.
229+
For example, for jobs with `completionMode: Indexed`; successfully completed indexes will not run again.
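The resume behaviour described in these added lines can be illustrated with a small model. This is only a sketch, not the Job controller's code: the function name `indexes_to_run` and the sample values are hypothetical, and it assumes an Indexed Job whose successfully completed indexes are already known.

```python
def indexes_to_run(completions: int, succeeded: set[int]) -> list[int]:
    """Return the completion indexes that still need a Pod after the Job resumes.

    Indexes that succeeded before suspension are kept. Pods that were
    terminated by the suspension never succeeded, so their indexes are
    still missing from `succeeded` and will run again.
    """
    return [i for i in range(completions) if i not in succeeded]


# Hypothetical Job with 5 completions; indexes 0 and 2 finished before suspension.
succeeded_before_suspend = {0, 2}
remaining = indexes_to_run(5, succeeded_before_suspend)
print(remaining)  # [1, 3, 4]
```

In this model, suspending and resuming the Job does not change `succeeded`, which is why the completed indexes are not scheduled a second time.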
230+
228231
Similar to existing [JobConditionType](https://github.com/kubernetes/kubernetes/blob/c98f6bf30890f2c5826067ae50cfc36958106e68/staging/src/k8s.io/api/batch/v1/types.go#L167)s
229232
"Complete" and "Failed", we propose adding a new condition type called
230233
"Suspended" as a part of the Job's status as follows:
@@ -291,67 +294,12 @@ Unit, integration, and end-to-end tests will be added to test that:
291294
* We're confident that no further semantical changes will be needed to achieve the goals of the KEP
292295
* All known functional bugs have been fixed
293296

294-
<!--
295-
**Note:** *Not required until targeted at a release.*
296-
297-
Define graduation milestones.
298-
299-
These may be defined in terms of API maturity, or as something else. The KEP
300-
should keep this high-level with a focus on what signals will be looked at to
301-
determine graduation.
302-
303-
Consider the following in developing the graduation criteria for this enhancement:
304-
- [Maturity levels (`alpha`, `beta`, `stable`)][maturity-levels]
305-
- [Deprecation policy][deprecation-policy]
306-
307-
Clearly define what graduation means by either linking to the [API doc
308-
definition](https://kubernetes.io/docs/concepts/overview/kubernetes-api/#api-versioning)
309-
or by redefining what graduation means.
310-
311-
In general we try to use the same stages (alpha, beta, GA), regardless of how the
312-
functionality is accessed.
313-
314-
[maturity-levels]: https://git.k8s.io/community/contributors/devel/sig-architecture/api_changes.md#alpha-beta-and-stable-versions
315-
[deprecation-policy]: https://kubernetes.io/docs/reference/using-api/deprecation-policy/
316-
317-
Below are some examples to consider, in addition to the aforementioned [maturity levels][maturity-levels].
318-
319-
#### Alpha -> Beta Graduation
320-
321-
- Gather feedback from developers and surveys
322-
- Complete features A, B, C
323-
- Tests are in Testgrid and linked in KEP
324-
325-
#### Beta -> GA Graduation
326-
327-
- N examples of real-world usage
328-
- N installs
329-
- More rigorous forms of testing—e.g., downgrade tests and scalability tests
330-
- Allowing time for feedback
331-
332-
**Note:** Generally we also wait at least two releases between beta and
333-
GA/stable, because there's no opportunity for user feedback, or even bug reports,
334-
in back-to-back releases.
335-
336-
#### Removing a Deprecated Flag
337-
338-
- Announce deprecation and support policy of the existing flag
339-
- Two versions passed since introducing the functionality that deprecates the flag (to address version skew)
340-
- Address feedback on usage/changed behavior, provided on GitHub issues
341-
- Deprecate the flag
342-
343-
**For non-optional features moving to GA, the graduation criteria must include
344-
[conformance tests].**
345-
346-
[conformance tests]: https://git.k8s.io/community/contributors/devel/sig-architecture/conformance-tests.md
347-
-->
348-
349297
### Upgrade / Downgrade Strategy
350298

351299
Upgrading from 1.20 and below will not change the behaviour of how Jobs work.
352300

353301
To make use of this feature, the `SuspendJob` feature gate must be explicitly
354-
enabled on the API server and the controller manager and the `suspend` field
302+
enabled on the api-server and kube-controller-manager and the `suspend` field
355303
must be explicitly set in the Job spec.
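As a rough illustration of setting the `suspend` field, the sketch below builds a minimal Job manifest as a Python dictionary and prints it as JSON. The helper name `suspended_job_manifest`, the Job name `demo`, and the `busybox` image are placeholders chosen for this example, not values from the KEP. On clusters where the feature is still gated, the components named above would also need `SuspendJob=true` in their `--feature-gates` flag.

```python
import json


def suspended_job_manifest(name: str, image: str) -> dict:
    """Build a minimal batch/v1 Job that is created in the suspended state."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # No Pods are created while this field is true.
            "suspend": True,
            "template": {
                "spec": {
                    "containers": [{"name": name, "image": image}],
                    "restartPolicy": "Never",
                }
            },
        },
    }


manifest = suspended_job_manifest("demo", "busybox")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and submitted like any other manifest. Setting `spec.suspend` back to `false` later resumes the Job.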
356304

357305
### Version Skew Strategy
@@ -414,10 +362,7 @@ _This section must be completed when targeting beta graduation to a release._
414362
While the above list isn't exhaustive, they're signals in favour of rollbacks.
415363

416364
* **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path
417-
tested?** <!-- I'll answer this after implementation.
418-
Describe manual testing that was done and the outcomes.
419-
Longer term, we may want to require automated upgrade/rollback tests, but we
420-
are missing a bunch of machinery and tooling and can't do that now. -->
365+
tested? yes manually tested successfully.
421366

422367
* **Is the rollout accompanied by any deprecations and/or removals of features,
423368
APIs, fields of API types, flags, etc.?** No.
@@ -431,6 +376,14 @@ _This section must be completed when targeting beta graduation to a release._
431376
Job can also be used to determine whether a Job is using the feature (look
432377
for a condition of type "Suspended").
433378

379+
* **How can someone using this feature know that it is working for their instance?**
380+
- [x] Events
381+
- Event Reason: Suspended
382+
- The message includes the job name.
383+
- [x] API .status
384+
- Condition name: Suspended
385+
- Other field:
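The `.status` check listed above could be automated along the following lines. This is a sketch under stated assumptions: the Job object has already been fetched and decoded into a dictionary (for example from JSON output), the helper name `is_suspended` is hypothetical, and the sample object is made up for illustration.

```python
def is_suspended(job: dict) -> bool:
    """Report whether a Job object carries a true "Suspended" condition."""
    conditions = job.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Suspended" and c.get("status") == "True"
        for c in conditions
    )


# Made-up Job object, shaped like the API's JSON representation.
sample_job = {
    "kind": "Job",
    "spec": {"suspend": True},
    "status": {
        "conditions": [
            {"type": "Suspended", "status": "True"},
        ]
    },
}
print(is_suspended(sample_job))  # True
```

Condition statuses are strings in the Kubernetes API, so the comparison is against `"True"` rather than a boolean.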
386+
434387
* **What are the SLIs (Service Level Indicators) an operator can use to
435388
determine the health of the service?**
436389
- [x] Metrics
@@ -532,6 +485,7 @@ _This section must be completed when targeting beta graduation to a release._
532485
2021-02-01: Initial KEP merged, alpha targeted for 1.21
533486
2021-03-08: Implementation merged in 1.21 with feature gate disabled by default
534487
2021-04-22: KEP updated for beta graduation in 1.22
488+
2022-01-18: KEP updated for GA graduation in 1.24
535489

536490
## Drawbacks
537491

keps/sig-apps/2232-suspend-jobs/kep.yaml

Lines changed: 2 additions & 2 deletions
@@ -15,12 +15,12 @@ prr-approvers:
1515
- "@wojtek-t"
1616

1717
# The target maturity stage in the current dev cycle for this KEP.
18-
stage: beta
18+
stage: stable
1919

2020
# The most recent milestone for which work toward delivery of this KEP has been
2121
# done. This can be the current (upcoming) milestone, if it is being actively
2222
# worked on.
23-
latest-milestone: "v1.22"
23+
latest-milestone: "v1.24"
2424

2525
# The milestone at which this feature was, or is targeted to be, at each stage.
2626
milestone:
