Commit a4b5f5c

Merge pull request kubernetes#3482 from alculquicondor/job-tracking-ga
KEP-2307: Graduate JobTrackingWithFinalizers to GA
2 parents b1e464a + e5689ba commit a4b5f5c

3 files changed: +141 -46 lines changed

keps/prod-readiness/sig-apps/2307.yaml

Lines changed: 2 additions & 0 deletions
@@ -3,3 +3,5 @@ alpha:
 approver: "@wojtek-t"
 beta:
 approver: "@wojtek-t"
+stable:
+approver: "@wojtek-t"

keps/sig-apps/2307-job-tracking-without-lingering-pods/README.md

Lines changed: 136 additions & 43 deletions
@@ -12,18 +12,26 @@
 - [New API calls](#new-api-calls)
 - [Bigger Job status](#bigger-job-status)
 - [Unprotected Job status endpoint](#unprotected-job-status-endpoint)
+- [Jobs with legacy tracking](#jobs-with-legacy-tracking)
 - [Design Details](#design-details)
 - [API changes](#api-changes)
 - [Algorithm](#algorithm)
 - [Simplified algorithm for Indexed Jobs](#simplified-algorithm-for-indexed-jobs)
 - [Deleted Pods](#deleted-pods)
 - [Deleted Jobs](#deleted-jobs)
 - [Pod adoption](#pod-adoption)
+- [Monitoring Pods with finalizers](#monitoring-pods-with-finalizers)
 - [Test Plan](#test-plan)
+- [Prerequisite testing updates](#prerequisite-testing-updates)
+- [Unit tests](#unit-tests)
+- [Integration tests](#integration-tests)
+- [E2E test:](#e2e-test)
+- [Load test:](#load-test)
 - [Graduation Criteria](#graduation-criteria)
 - [Alpha](#alpha)
 - [Alpha -> Beta Graduation](#alpha---beta-graduation)
 - [Beta -> GA Graduation](#beta---ga-graduation)
+- [Deprecation](#deprecation)
 - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
 - [Version Skew Strategy](#version-skew-strategy)
 - [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
@@ -68,11 +76,15 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 - [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
 - [x] (R) KEP approvers have approved the KEP status as `implementable`
 - [x] (R) Design details are appropriately documented
-- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
+- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+- [x] e2e Tests for all Beta API Operations (endpoints)
+- [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+- [x] (R) Minimum Two Week Window for GA e2e tests to prove flake free
 - [x] (R) Graduation criteria is in place
+- [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [x] (R) Production readiness review completed
 - [x] (R) Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
+- [x] "Implementation History" section is up-to-date for milestone
 - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
 - [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

@@ -161,6 +173,25 @@ Changes in the status not produced by the Job controller in
 kube-controller-manager could affect the Job tracking. Cluster administrators
 should make sure to protect the Job status endpoint via RBAC.

+#### Jobs with legacy tracking
+
+Starting in 1.27, the job controller will ignore the annotation `batch.kubernetes.io/job-completion`
+and will start tracking every Job with finalizers.
+This means that terminated pods without finalizers will be ignored and
+replacement pods might be created (with finalizers). This behavior is similar
+to:
+- Having a low terminated pods threshold in the Pod GC or
+- Losing pods because of node upgrades.
+
+The impact should be minimal for the following reasons:
+- During 1.26, all new Jobs will be tracked with finalizers, as the feature
+cannot be disabled.
+- Most clusters would also have the feature enabled in 1.25, giving extra
+time for jobs to terminate.
+
+In other words, in most clusters Jobs will have 2 releases to terminate
+before getting their pods recreated.
+
 ## Design Details

 ### API changes
@@ -307,21 +338,63 @@ finalizer.
 The job controller adds the finalizer in the same patch request that modifies
 the owner reference.

+## Monitoring Pods with finalizers
+
+Starting in 1.26, the metric `job_terminated_pod_tracking_finalizer` is a gauge
+that tracks the number of terminated pods (`.status.phase=(Succeeded|Failed)`)
+that currently have a job tracking finalizer.
+
+The job controller tracks this metric in its event handlers.
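
Reviewer note: the value that the `job_terminated_pod_tracking_finalizer` gauge is described as reporting can be illustrated with a small sketch. This is not the controller's code: the KEP says the controller maintains the gauge in its event handlers, whereas the sketch below simply recounts from a list of pod objects shaped like `kubectl get pods -o json` items. The finalizer name `batch.kubernetes.io/job-tracking` and the helper name are assumptions made for illustration; neither is spelled out in this diff.

```python
# Assumed finalizer name (not stated in this diff); adjust if your cluster differs.
TRACKING_FINALIZER = "batch.kubernetes.io/job-tracking"
# Terminated phases, as given in the KEP text: .status.phase=(Succeeded|Failed)
TERMINATED_PHASES = {"Succeeded", "Failed"}


def terminated_pods_with_finalizer(pods):
    """Count pods that are terminated but still carry the tracking finalizer."""
    count = 0
    for pod in pods:
        phase = pod.get("status", {}).get("phase")
        finalizers = pod.get("metadata", {}).get("finalizers") or []
        if phase in TERMINATED_PHASES and TRACKING_FINALIZER in finalizers:
            count += 1
    return count


# Hypothetical sample data for illustration only.
sample_pods = [
    {"metadata": {"name": "a", "finalizers": [TRACKING_FINALIZER]},
     "status": {"phase": "Succeeded"}},  # terminated, finalizer present: counted
    {"metadata": {"name": "b", "finalizers": [TRACKING_FINALIZER]},
     "status": {"phase": "Running"}},    # still running: not counted
    {"metadata": {"name": "c"},
     "status": {"phase": "Failed"}},     # terminated, no finalizer: not counted
]

print(terminated_pods_with_finalizer(sample_pods))
```

A count that stays above zero over time is the condition the KEP flags as "increases steadily" in its rollout-failure signals.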
+
 ### Test Plan

-- Unit tests:
+[x] I/we understand the owners of the involved components may require updates to
+existing tests to make this code solid enough prior to committing the changes necessary
+to implement this enhancement.
+
+##### Prerequisite testing updates
+
+Already fulfilled at alpha and beta stages.
+
+##### Unit tests
+
 - Job sync with feature gate enabled.
 - Removal of finalizers when feature gate is disabled.
-- Tracking of terminating Pods.
-- Integration tests:
-- Job tracking with feature enabled.
-- Tracking of terminating Pods.
-- Transition from feature enabled to disabled and enabled again.
-- Clean up finalizers of Orphan Pods.
+- Tracking of terminating Pods for NonIndexed and Indexed Jobs.
+
+Coverage:
+
+- `pkg/controller/job`: 2022-08-06 - 90%
+- `pkg/apis/batch/validation`: 2022-08-06 - 96%
+- `pkg/apis/batch/v1`: 2022-08-06 - 85.2%
+- `pkg/registry/batch/job`: 2022-08-06 - 79.7%
+
+##### Integration tests
+
+Almost the entire [test suite](https://storage.googleapis.com/k8s-triage/index.html?job=ci-kubernetes-integration&test=test%2Fintegration%2Fjob) runs with finalizers.
+
+- Job tracking with feature enabled: `TestNonParallelJob`, `TestParallelJob`, `TestParallelJobParallelism`, `TestIndexedJob`, `TestJobFailedWithInterrupts`.
+- Transition from feature enabled to disabled and enabled again: `TestDisableJobTrackingWithFinalizers`.
+- Clean up finalizers of Orphan Pods `TestOrphanPodsFinalizersClearedWithGC`
 - Tracking Jobs with big number of Pods, making sure the status is eventually
-consistent.
-- E2E test:
-- Job tracking with feature enabled.
+consistent (`TestParallelJobWithCompletions`, `TestFinalizersClearedWhenBackoffLimitExceeded`)
+
+Exceptions:
+
+- Test orphan pods are cleared when TrackingWithFinalizers is disabled: `TestOrphanPodsFinalizersClearedWithFeatureDisabled`.
+- Test suspend jobs (finalizers to be enabled).
+- Test mutable scheduling directives (finalizers to be enabled).
+
+##### E2E test:
+
+[Every E2E](https://testgrid.k8s.io/sig-testing-canaries#ci-kubernetes-coverage-e2e-gci-gce&width=20&include-filter-by-regex=%5C%5Bsig-apps%5C%5D%20Job)
+test is affected. The feature didn't require new tests, as it doesn't add
+new endpoints or new functionality.
+
+##### Load test:
+
+A [clusterloader2 test](https://github.com/kubernetes/perf-tests/blob/master/clusterloader2/testing/batch/config.yaml)
+for jobs with multiple sizes.

 ### Graduation Criteria

@@ -346,20 +419,27 @@ the owner reference.
 #### Beta -> GA Graduation

-- Remove legacy tracking and the use of `batch.kubernetes.io/job-completion` as
-an annotation. This is possible assuming:
-- After the feature was enabled for some time, the Job controller is already
-tracking most Jobs using finalizers.
-- For the remaining Jobs, the Job controller already accounted most of the
-finished Pods in the status. The controller adds a tracking finalizer to
-any running Pod that doesn't have it. The controller might leave some
-finished Pods unaccounted, if they finish before the controller has a chance
-to add a finalizer. This is acceptable as it's no worse that the current
-behavior were the controller doesn't account for Pods removed by garbage
-collection or other means.
 - Job E2E tests graduate to conformance.
 - Job tracking scales to 10^5 completions per Job processed within an order of
 minutes.
+- Write blog post about the feature and the future deprecation plans.
+
+#### Deprecation
+
+In 1.26:
+
+- Declare deprecation of annotation `batch.kubernetes.io/job-completion` in
+[documentation](https://kubernetes.io/docs/reference/labels-annotations-taints/#batch-kubernetes-io-job-tracking).
+- Lock `JobTrackingWithFinalizers` to true.
+
+In 1.27:
+
+- Remove legacy tracking code.
+- Ignore annotation `batch.kubernetes.io/job-completion` and stop adding it.
+Mark the annotation as legacy in the documentation.
+
+In 1.28:
+- Remove feature gate `JobTrackingWithFinalizers`.

 ### Upgrade / Downgrade Strategy

@@ -448,6 +528,7 @@ No implications to node runtime.
 duration than previous versions of the job controller due to the new API
 calls.
 - Stale `job_sync_total` or `job_finished_total`.
+- The metric `job_terminated_pod_tracking_finalizer` increases steadily.

 #### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

@@ -472,7 +553,7 @@ The flow was completed successfully with all the stated verifications.
 #### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

-No.
+Yes, see [Deprecation](#deprecation) for the full plan.

 ### Monitoring Requirements

@@ -504,14 +585,17 @@ The flow was completed successfully with all the stated verifications.
 - [x] Metrics
 - Metric name: `job_sync_duration_seconds`
-- [Optional] Aggregation method:
-- Components exposing the metric: `kube-controller-manager`
+- [Optional] Aggregation method:
+- Components exposing the metric: `kube-controller-manager`
+- Metric name: `job_terminated_pod_tracking_finalizer`
+- [Optional] Aggregation method:
+- Components exposing the metric: `kube-controller-manager`

 #### Are there any missing metrics that would be useful to have to improve observability of this feature?

-- A label in `job_sync_total` for the type of Job tracking. This label would
-have to be removed when we graduate the feature to GA, adding operational
-burden.
+- A label in `job_sync_total` for the type of Job tracking. We decided not to
+add this label because it would have to be removed on GA graduation, adding
+operational burden.

 ### Dependencies

@@ -570,20 +654,28 @@ The flow was completed successfully with all the stated verifications.
 #### What are other known failure modes?

-TBD from user feedback. No know failures modes so far.
-
-<!--
-For each of them, fill in the following information by copying the below template:
-- [Failure mode brief description]
-- Detection: How can it be detected via metrics? Stated another way:
-how can an operator troubleshoot without logging into a master or worker node?
-- Mitigations: What can be done to stop the bleeding, especially for already
-running user workloads?
-- Diagnostics: What are the useful log messages and their required logging
-levels that could help debug the issue?
-Not required until feature graduated to beta.
-- Testing: Are there any tests for failure mode? If not, describe why.
--->
+- Terminated pods are stuck with finalizers
+- Detection:
+- Before 1.26: Observe the behavior in pods.
+- After 1.26: Based on metric `job_terminated_pod_tracking_finalizer`
+- Mitigations:
+Before 1.26, disable `JobTrackingWithFinalizers`.
+- Diagnostics:
+The job controller reports errors updating the Job status and/or patching
+Pods.
+There were some bugs that would cause this (examples:
+[#109485](https://github.com/kubernetes/kubernetes/issues/109485),
+[#111646](https://github.com/kubernetes/kubernetes/pull/111646)).
+In newer versions, this can still happen if there is a buggy webhook
+that prevents pod updates to remove finalizers.
+- Testing: Discovered bugs are covered by unit and integration tests.
+- Job pods might be recreated upon upgrade to 1.27.
+- Detection:
+In 1.26, there are non-finished jobs without annotation `batch.kubernetes.io/job-completion`.
+- Mitigation:
+- Keep `JobTrackingWithFinalizers` feature gate enabled in 1.25. This
+minimizes the chances of having legacy jobs before upgrading to 1.27.
+- Wait for Jobs without `batch.kubernetes.io/job-completion` to finish.
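
Reviewer note: the detection step for the second failure mode (unfinished Jobs that lack the annotation) could be scripted roughly as follows. This is a sketch under stated assumptions, not tooling from the KEP: it takes Job objects shaped like the `items` of `kubectl get jobs -o json`, treats a Job as finished when it has a `Complete` or `Failed` condition with status `"True"`, and uses the annotation name exactly as written in this diff. The function names are hypothetical.

```python
# Annotation name copied as written in this diff.
TRACKING_ANNOTATION = "batch.kubernetes.io/job-completion"


def is_finished(job):
    """A Job counts as finished once a Complete or Failed condition is True."""
    for cond in job.get("status", {}).get("conditions") or []:
        if cond.get("type") in ("Complete", "Failed") and cond.get("status") == "True":
            return True
    return False


def legacy_unfinished_jobs(jobs):
    """Return names of unfinished Jobs that do not carry the annotation."""
    names = []
    for job in jobs:
        metadata = job.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        if TRACKING_ANNOTATION not in annotations and not is_finished(job):
            names.append(metadata.get("name"))
    return names


# Hypothetical sample data for illustration only.
sample_jobs = [
    {"metadata": {"name": "legacy-running"}, "status": {}},
    {"metadata": {"name": "tracked-running",
                  "annotations": {TRACKING_ANNOTATION: ""}}, "status": {}},
    {"metadata": {"name": "legacy-done"},
     "status": {"conditions": [{"type": "Complete", "status": "True"}]}},
]

print(legacy_unfinished_jobs(sample_jobs))
```

An empty result before the upgrade to 1.27 would suggest that no Job is at risk of having its pods recreated for this reason.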

 #### What steps should be taken if SLOs are not being met to determine the problem?

@@ -608,6 +700,7 @@ The flow was completed successfully with all the stated verifications.
 - 2021-08-18: PRR completed and graduation to beta proposed.
 - 2021-10-14: Added details for Upgrade->Downgrade->Upgrade manual test.
 - 2021-10-21: Add link to testgrid.
+- 2022-08-29: Add GA and deprecation notes.

 ## Drawbacks

keps/sig-apps/2307-job-tracking-without-lingering-pods/kep.yaml

Lines changed: 3 additions & 3 deletions
@@ -18,13 +18,13 @@ stage: beta
 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.23"
+latest-milestone: "v1.26"

 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
 alpha: "v1.22"
 beta: "v1.23"
-stable: "v1.25"
+stable: "v1.26"

 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
@@ -33,7 +33,7 @@ feature-gates:
 components:
 - kube-apiserver
 - kube-controller-manager
-disable-supported: true
+disable-supported: false

 # The following PRR answers are required at beta release
 metrics:
