@@ -68,8 +76,12 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
68
76
- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
69
77
- [x] (R) KEP approvers have approved the KEP status as `implementable`
70
78
- [x] (R) Design details are appropriately documented
71
-
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
79
+
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
80
+
- [x] e2e Tests for all Beta API Operations (endpoints)
81
+
- [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
82
+
- [x] (R) Minimum two-week window for GA e2e tests to prove flake-free
72
83
- [x] (R) Graduation criteria is in place
84
+
- [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
73
85
- [x] (R) Production readiness review completed
74
86
- [x] (R) Production readiness review approved
75
87
- [ ] "Implementation History" section is up-to-date for milestone
@@ -307,21 +319,89 @@ finalizer.
307
319
The job controller adds the finalizer in the same patch request that modifies
308
320
the owner reference.
309
321
322
+
## Monitoring Pods with finalizers
323
+
324
+
Starting in 1.26, the metric `job_pod_tracking_finalizer` is a gauge that
325
+
tracks the number of pods that currently have a job tracking finalizer.
326
+
327
+
The metric increments when the job controller observes a pod being created or adopted,
328
+
and decrements when the job controller observes an update that removes the
329
+
finalizer or a pod deletion.
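
The bookkeeping described above can be sketched as a small event-driven counter. This is an illustrative model only, not the job controller's code: the class `FinalizerGauge`, its handler names, and the plain-dict pod representation are all hypothetical, and the finalizer name is copied from the migration section of this document. One assumption worth noting is that a deletion only decrements the gauge if the last observed state of the pod still carried the finalizer, so a pod is never counted down twice.

```python
# Finalizer name as written in this document; treat it as a placeholder.
FINALIZER = "batch.kubernetes.io/job-completion"


def has_finalizer(pod):
    """Return True if the (dict-based, hypothetical) pod carries the finalizer."""
    return FINALIZER in pod.get("finalizers", [])


class FinalizerGauge:
    """Toy stand-in for a gauge metric; not a real metrics client."""

    def __init__(self):
        self.value = 0

    def on_add(self, pod):
        # A pod observed as created (or adopted) with the finalizer is tracked.
        if has_finalizer(pod):
            self.value += 1

    def on_update(self, old, new):
        # An update that removes the finalizer stops tracking the pod;
        # an update that adds it (for example on adoption) starts tracking it.
        if has_finalizer(old) and not has_finalizer(new):
            self.value -= 1
        elif has_finalizer(new) and not has_finalizer(old):
            self.value += 1

    def on_delete(self, pod):
        # Assumption: only decrement if the finalizer was still present,
        # to avoid a double decrement after a removal update.
        if has_finalizer(pod):
            self.value -= 1
```

With this model, a pod that is added with the finalizer and later updated to drop it leaves the gauge at zero, and the subsequent deletion event does not change it further.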
330
+
331
+
## Migrating Jobs with legacy tracking
332
+
333
+
Starting in 1.26, when the feature gate `MigrateJobLegacyTracking` is enabled,
334
+
the job controller migrates jobs with legacy tracking to tracking with finalizers.
335
+
336
+
If a Job doesn't have the annotation `batch.kubernetes.io/job-completion`, it
337
+
means that the Job is not currently tracked with finalizers. The job controller starts
338
+
the following migration process:
339
+
1. Add the finalizer `batch.kubernetes.io/job-completion` to all pods with
340
+
`.status.phase=(Pending/Running)`.
341
+
2. Ignore pods with `.status.phase=(Succeeded/Failed)` that don't have the `batch.kubernetes.io/job-completion` finalizer.
342
+
They are considered to be already counted in `.status.(failed/succeeded)`.
343
+
3. Add the annotation `batch.kubernetes.io/job-completion`.
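
As a rough illustration of the three steps above, the following sketch applies them to plain dictionaries standing in for a Job and its Pods. It is not the controller's implementation: the function `migrate_job`, the dict layout, and the empty annotation value are assumptions made for the example, and the key `batch.kubernetes.io/job-completion` is used exactly as this section names it for both the finalizer and the annotation.

```python
# Key used for both the pod finalizer and the Job annotation, as named above.
TRACKING_KEY = "batch.kubernetes.io/job-completion"


def migrate_job(job, pods):
    """Migrate one Job (hypothetical dict model) to tracking with finalizers.

    Returns True if a migration was performed, False if the Job already
    carries the tracking annotation.
    """
    if TRACKING_KEY in job["annotations"]:
        return False  # already tracked with finalizers; nothing to do

    for pod in pods:
        # Step 1: pods that have not finished get the finalizer.
        if pod["phase"] in ("Pending", "Running"):
            if TRACKING_KEY not in pod["finalizers"]:
                pod["finalizers"].append(TRACKING_KEY)
        # Step 2: finished pods without the finalizer are left untouched;
        # they are assumed to be counted in the Job status already.

    # Step 3: mark the Job as tracked. The empty value is an assumption.
    job["annotations"][TRACKING_KEY] = ""
    return True
```

Adding the annotation last means a Job whose migration is interrupted is simply picked up again on the next sync, since the check at the top still sees it as untracked.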
344
+
345
+
This might lead to extra pods being created, but this is acceptable because:
346
+
347
+
- After the `JobTrackingWithFinalizers` feature has been enabled for some time, the
348
+
Job controller is already tracking most Jobs using finalizers.
349
+
- For the remaining Jobs, the Job controller has already accounted for most of the
350
+
finished Pods in the status. The controller might leave some
351
+
finished Pods unaccounted, if they finish before the controller has a chance
352
+
to add a finalizer. This situation is no worse than legacy tracking,
353
+
where the controller doesn't account for Pods removed by garbage collection or
354
+
other means.
355
+
310
356
### Test Plan
311
357
312
-
- Unit tests:
358
+
[x] I/we understand the owners of the involved components may require updates to
359
+
existing tests to make this code solid enough prior to committing the changes necessary
360
+
to implement this enhancement.
361
+
362
+
##### Prerequisite testing updates
363
+
364
+
Already fulfilled at alpha and beta stages.
365
+
366
+
##### Unit tests
367
+
313
368
- Job sync with feature gate enabled.
314
369
- Removal of finalizers when feature gate is disabled.
315
-
- Tracking of terminating Pods.
316
-
- Integration tests:
317
-
- Job tracking with feature enabled.
318
-
- Tracking of terminating Pods.
319
-
- Transition from feature enabled to disabled and enabled again.
320
-
- Clean up finalizers of Orphan Pods.
370
+
- Tracking of terminating Pods for NonIndexed and Indexed Jobs.
371
+
372
+
Coverage:
373
+
374
+
- `pkg/controller/job`: 2022-08-06 - 90%
375
+
- `pkg/apis/batch/validation`: 2022-08-06 - 96%
376
+
- `pkg/apis/batch/v1`: 2022-08-06 - 85.2%
377
+
- `pkg/registry/batch/job`: 2022-08-06 - 79.7%
378
+
379
+
##### Integration tests
380
+
381
+
Almost the entire [test suite](https://storage.googleapis.com/k8s-triage/index.html?job=ci-kubernetes-integration&test=test%2Fintegration%2Fjob) runs with finalizers.