@@ -321,21 +321,27 @@ the owner reference.
## Monitoring Pods with finalizers

- Starting in 1.26, the metric `job_pod_tracking_finalizer` is a gauge that
- tracks the number of pods that currently have a job tracking finalizer.
+ Starting in 1.26, the metric `job_terminated_pod_tracking_finalizer` is a gauge
+ that tracks the number of terminated pods (`.status.phase=(Succeeded|Failed)`)
+ that currently have a job tracking finalizer.

- The metric increments when the job controller observes a pod created or adopted,
- and decrements when the job controller observes an update that removes the
- finalizer or a pod deletion.
+ The job controller tracks this metric in its event handlers.

## Migrating Jobs with legacy tracking

- Starting in 1.26, when the feature gate `MigrateJobLegacyTracking` is enabled,
- the job controller migrates jobs with legacy tracking to tracking with finalizers.
+ Once `JobTrackingWithFinalizers` graduates to stable, Jobs that start in a
+ Kubernetes version where `JobTrackingWithFinalizers` is disabled need to be
+ migrated to the new tracking. This migration mechanism will be initially guarded
+ by the feature gate `MigrateJobLegacyTracking`, starting in 1.26,
+ enabled by default.
+
+ When the feature gate `MigrateJobLegacyTracking` is enabled, the job controller
+ migrates jobs with legacy tracking to tracking with finalizers as described
+ below:

If a Job doesn't have the annotation `batch.kubernetes.io/job-completion`, it
- means that is not currently tracked with finalizers. The job controller starts
- the following migration process:
+ means that the Job is not currently tracked with finalizers. The job controller
+ starts the following migration process:
1. Add the finalizer `batch.kubernetes.io/job-completion` to all pods with
`.status.phase=(Pending/Running)`.
2. Ignore pods with `.status.phase=(Succeeded/Failed)` that don't have the `batch.kubernetes.io/job-completion` finalizer.
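
Review note: the two migration steps visible in this hunk can be sketched in Go roughly as follows. This is a minimal illustration, not the job controller's code. The `Pod` struct and the `hasFinalizer` and `migratePods` helpers are hypothetical simplifications, and the finalizer name is the one quoted in the text above.

```go
package main

import "fmt"

// jobCompletionFinalizer is the finalizer name quoted in the migration text.
const jobCompletionFinalizer = "batch.kubernetes.io/job-completion"

// Pod is a hypothetical, simplified stand-in for a Kubernetes Pod object.
type Pod struct {
	Name       string
	Phase      string // for example "Pending", "Running", "Succeeded", "Failed"
	Finalizers []string
}

// hasFinalizer reports whether the pod carries the job tracking finalizer.
func hasFinalizer(p Pod) bool {
	for _, f := range p.Finalizers {
		if f == jobCompletionFinalizer {
			return true
		}
	}
	return false
}

// migratePods adds the finalizer to Pending and Running pods (step 1) and
// leaves every other pod untouched (step 2). It returns the names of the
// pods it modified.
func migratePods(pods []Pod) []string {
	var patched []string
	for i := range pods {
		p := &pods[i]
		switch p.Phase {
		case "Pending", "Running":
			if !hasFinalizer(*p) {
				p.Finalizers = append(p.Finalizers, jobCompletionFinalizer)
				patched = append(patched, p.Name)
			}
		default:
			// Finished pods without the finalizer are ignored.
		}
	}
	return patched
}

func main() {
	pods := []Pod{
		{Name: "a", Phase: "Running"},
		{Name: "b", Phase: "Succeeded"},
		{Name: "c", Phase: "Pending", Finalizers: []string{jobCompletionFinalizer}},
	}
	fmt.Println(migratePods(pods))
}
```

In this sketch only pod `a` is modified: pod `b` has already finished, and pod `c` already carries the finalizer. A real controller would issue API patches rather than mutate an in-memory slice.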
@@ -349,7 +355,7 @@ This might lead to extra pods being created, but this is acceptable because:
- For the remaining Jobs, the Job controller already accounted for most of the
finished Pods in the status. The controller might leave some
finished Pods unaccounted, if they finish before the controller has a chance
- to add a finalizer. This situation is no worse that the legacy tracking
+ to add a finalizer. This situation is no worse than the legacy tracking
where the controller doesn't account for Pods removed by garbage collection or
other means.
@@ -427,7 +433,7 @@ for jobs with multiple sizes.
#### Beta -> GA Graduation

- [Migrate existing Jobs to tracking with finalizers](#migrating-jobs-with-legacy-tracking)
- under feature gate `MigrateJobLegacyTracking`, disabled by default.
+ under feature gate `MigrateJobLegacyTracking`, enabled by default.
- Job E2E tests graduate to conformance.
- Job tracking scales to 10^5 completions per Job processed within an order of
minutes.
@@ -539,7 +545,7 @@ No implications to node runtime.
duration than previous versions of the job controller due to the new API
calls.
- Stale `job_sync_total` or `job_finished_total`.
- - The metric `job_pod_tracking_finalizer` doesn't decrease when pods finish.
+ - The metric `job_terminated_pod_tracking_finalizer` increases steadily.

#### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
@@ -598,7 +604,7 @@ Yes, see [Deprecation](#deprecation) for the full plan.
- Metric name: `job_sync_duration_seconds`
- [Optional] Aggregation method:
- Components exposing the metric: `kube-controller-manager`
- - Metric name: `job_pod_tracking_finalizer`
+ - Metric name: `job_terminated_pod_tracking_finalizer`
- [Optional] Aggregation method:
- Components exposing the metric: `kube-controller-manager`
@@ -668,7 +674,7 @@ Yes, see [Deprecation](#deprecation) for the full plan.
- Terminated pods are stuck with finalizers
- Detection:
- Before 1.26: Observe the behavior in pods.
- - After 1.26: Based on metric `job_pod_tracking_finalizer`
+ - After 1.26: Based on metric `job_terminated_pod_tracking_finalizer`
- Mitigations:
Before 1.26, disable `JobTrackingWithFinalizers`.
- Diagnostics:
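
Review note: to make the detection signal concrete, the value that the renamed gauge is described as tracking can be recomputed from a pod list as in the Go sketch below. This is only an illustration of the definition (terminated pods that still hold the finalizer). It is not how the controller maintains the metric, which the diff says happens in its event handlers. The `Pod` struct and `terminatedWithFinalizer` are hypothetical, and the finalizer name is assumed to be the one quoted in the migration section.

```go
package main

import "fmt"

// trackingFinalizer is assumed to be the finalizer named in the migration
// section of this document.
const trackingFinalizer = "batch.kubernetes.io/job-completion"

// Pod is a hypothetical, simplified stand-in for a Kubernetes Pod object.
type Pod struct {
	Phase      string
	Finalizers []string
}

// terminatedWithFinalizer counts pods whose phase is Succeeded or Failed and
// that still carry the tracking finalizer.
func terminatedWithFinalizer(pods []Pod) int {
	count := 0
	for _, p := range pods {
		if p.Phase != "Succeeded" && p.Phase != "Failed" {
			continue // only terminated pods are counted
		}
		for _, f := range p.Finalizers {
			if f == trackingFinalizer {
				count++
				break
			}
		}
	}
	return count
}

func main() {
	pods := []Pod{
		{Phase: "Running", Finalizers: []string{trackingFinalizer}},
		{Phase: "Succeeded", Finalizers: []string{trackingFinalizer}},
		{Phase: "Failed"},
	}
	fmt.Println(terminatedWithFinalizer(pods))
}
```

A count that keeps growing across repeated snapshots would correspond to the "increases steadily" symptom listed under the rollback signals.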