@@ -449,10 +449,10 @@ kubectl get -o yaml job job-backoff-limit-per-index-example
```

The Job controller adds the `FailureTarget` Job condition to trigger
- [Job termination and cleanup](#job-termination-and-cleanup). The
- `Failed` condition has the same values for `reason` and `message` as the
- `FailureTarget` Job condition, but is added to the Job at the moment all Pods
- are terminated; for details see [Termination of Job pods](#termination-of-job-pods).
+ [Job termination and cleanup](#job-termination-and-cleanup). When all of the
+ Job Pods are terminated, the Job controller adds the `Failed` condition
+ with the same values for `reason` and `message` as the `FailureTarget` Job
+ condition. For details, see [Termination of Job Pods](#termination-of-job-pods).

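For example, you could watch the Job's conditions to observe this sequence. The following is a minimal sketch using the example Job from this section; the exact `reason` and `message` values depend on how the Job failed.

```shell
# Print each condition's type, reason, and message. Once the Job failure is
# determined you should see a FailureTarget condition; after all Pods terminate,
# a Failed condition with the same reason and message is added.
kubectl get job job-backoff-limit-per-index-example \
  -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
```
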
Additionally, you may want to use the per-index backoff along with a
[pod failure policy](#pod-failure-policy). When using
@@ -679,9 +679,9 @@ and `.spec.backoffLimit` result in a permanent Job failure that requires manual
A Job has two possible terminal states, each of which has a corresponding Job
condition:
* Succeeded: Job condition `Complete`
- * Failed: Job condition `Failed`.
+ * Failed: Job condition `Failed`

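As a quick way to see which terminal condition (if any) a Job has reached, you could list the conditions that are currently true; `myjob` below is a placeholder name.

```shell
# List the condition types that are currently "True" on the Job; a finished Job
# shows "Complete" (success) or "Failed" (failure).
kubectl get job myjob \
  -o jsonpath='{range .status.conditions[?(@.status=="True")]}{.type}{"\n"}{end}'
```
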
- The possible reasons for a Job failure:
+ Jobs fail for the following reasons:
- The number of Pod failures exceeded the specified `.spec.backoffLimit` in the Job
  specification. For details, see [Pod backoff failure policy](#pod-backoff-failure-policy).
- The Job runtime exceeded the specified `.spec.activeDeadlineSeconds`
@@ -693,23 +693,23 @@ The possible reasons for a Job failure:
  action. For details about how Pod failure policy rules might affect failure
  evaluation, see [Pod failure policy](#pod-failure-policy).

- The possible reasons for a Job success:
+ Jobs succeed for the following reasons:
- The number of succeeded Pods reached the specified `.spec.completions`
- The criteria specified in `.spec.successPolicy` are met. For details, see
  [Success policy](#success-policy).

In Kubernetes v1.31 and later the Job controller delays the addition of the
- terminal conditions, `Failed` or `Succeeded`, until all pods are terminated.
+ terminal conditions, `Failed` or `Complete`, until all of the Job Pods are terminated.

- {{< note >}}
- In Kubernetes v1.30 and earlier, Job terminal conditions were added when the Job
- termination process is triggered, and all Pod finalizers are removed, but some
- pods may still remain running/terminating at that point in time.
+ In Kubernetes v1.30 and earlier, the Job controller added the `Complete` or the
+ `Failed` Job terminal conditions as soon as the Job termination process was
+ triggered and all Pod finalizers were removed. However, some Pods would still
+ be running or terminating at the moment that the terminal condition was added.

- The change of the behavior is activated by enablement of the `JobManagedBy` or
- `JobPodReplacementPolicy` (enabled by default)
+ In Kubernetes v1.31 and later, the controller only adds the Job terminal conditions
+ _after_ all of the Pods are terminated. You can enable this behavior by using the
+ `JobManagedBy` or the `JobPodReplacementPolicy` (enabled by default)
[feature gates](/docs/reference/command-line-tools-reference/feature-gates/).
- {{< /note >}}

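As an illustration only, feature gates are typically set through the `--feature-gates` flag on the control plane components; the sketch below assumes you can adjust the kube-apiserver and kube-controller-manager invocations, which depends on how your cluster is deployed.

```shell
# Illustrates the flag syntax only; whether you need to set these gates
# explicitly depends on your Kubernetes version, and all other required
# flags for these components are omitted here.
kube-apiserver --feature-gates=JobManagedBy=true,JobPodReplacementPolicy=true
kube-controller-manager --feature-gates=JobManagedBy=true,JobPodReplacementPolicy=true
```
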
### Termination of Job pods

@@ -727,13 +727,16 @@ You can use the `FailureTarget` or the `SuccessCriteriaMet` condition to evaluat
whether the Job has failed or succeeded without having to wait for the controller
to add a terminal condition.

- {{< note >}}
- For example, you can use the `FailureTarget` condition to quickly decide whether
- to create a replacement Job, but it could result in Pods from the failing and
- replacement Jobs running at the same time for a while. Thus, if your cluster
- capacity is limited, you may prefer to wait for the `Failed` condition before
- creating the replacement Job.
- {{< /note >}}
+ For example, you might want to decide when to create a replacement Job
+ that replaces a failed Job. If you replace the failed Job when the `FailureTarget`
+ condition appears, your replacement Job runs sooner, but could result in Pods
+ from the failed and the replacement Job running at the same time, using
+ extra compute resources.
+
+ Alternatively, if your cluster has limited resource capacity, you could choose to
+ wait until the `Failed` condition appears on the Job, which would delay your
+ replacement Job but would ensure that you conserve resources by waiting
+ until all of the failed Pods are removed.

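A minimal sketch of both approaches, assuming a failed Job named `myjob` and a replacement manifest in `replacement-job.yaml` (both placeholders):

```shell
# Option 1: react as soon as the failure is determined; some Pods of the failed
# Job may still be terminating while the replacement runs.
kubectl wait --for=condition=FailureTarget job/myjob --timeout=10m
kubectl create -f replacement-job.yaml

# Option 2: conserve capacity by waiting until every Pod of the failed Job has
# terminated before creating the replacement.
kubectl wait --for=condition=Failed job/myjob --timeout=10m
kubectl create -f replacement-job.yaml
```
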
## Clean up finished jobs automatically
