@@ -880,11 +880,16 @@ not added to the pod for a long enough time (for example 2 minutes).

  #### Marking pods as Failed

- When matching a failed pod against Job pod failure policy it is important that
+ When matching a failed pod against Job pod failure policy, it is important that
  the pod is actually in the terminal phase (`Failed`), to ensure their state is
  not modified while Job controller matches them against the pod failure policy.

- However, there are scenarios in which a pod gets stuck in a non-terminal phase,
+ Additionally, it is necessary to avoid the creation of a replacement Pod if the
+ previously created Pod becomes terminating (has a `deletionTimestamp` but is
+ not `Failed` nor `Succeeded` yet), or we might create replacement Pods that
+ wouldn't be created if the pod failure policy was applied against the terminated Pod.
+
+ There are scenarios in which a pod gets stuck in a non-terminal phase,
  but is doomed to be failed, as it is terminating (has `deletionTimestamp` set, also
  known as the `DELETING` state, see:
  [The API Object Lifecycle](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/object-lifecycle.md)).
@@ -1433,7 +1438,11 @@ type PodFailurePolicyRule struct {
  	OnPodConditions []PodFailurePolicyOnPodConditionsPattern
  }

- // PodFailurePolicy describes how failed pods influence the backoffLimit.
+ // podFailurePolicy describes how failed pods are accounted. In particular,
+ // how they influence the backoffLimit.
+ // When using podFailurePolicy, terminating Pods (have a `deletionTimestamp`)
+ // are not immediately replaced and don't count as failed until they reach
+ // a terminal phase (`Failed` or `Succeeded`).
  type PodFailurePolicy struct {
  	// A list of pod failure policy rules. The rules are evaluated in order.
  	// Once a rule matches a Pod failure, the remaining of the rules are ignored.
@@ -1507,9 +1516,18 @@ spec:
  ### Evaluation

  We use the `syncJob` function of the Job controller to evaluate the specified
- `podFailurePolicy` rules against the failed pods. It is only the first rule with
- matching requirements which is applied as the rules are evaluated in order. If
- the pod failure does not match any of the specified rules, then default
+ `podFailurePolicy` rules against the failed pods.
+
+ Since terminating Pods (have `deletionTimestamp` and are not `Failed` or
+ `Succeeded`) don't have an exit code yet and might actually succeed, the
+ controller will not evaluate them against the `podFailurePolicy`.
+ The job controller will also not create a replacement Pod until they reach the
+ `Failed` phase. This behavior is the same as
+ [`podReplacementPolicy: Failed`](../3939-allow-replacement-when-fully-terminated/).
+
+ When evaluating Failed Pods against the `podFailurePolicy`, it is only the first
+ rule with matching requirements which is applied as the rules are evaluated in order.
+ If the pod failure does not match any of the specified rules, then default
  handling of failed pods applies.

  If we limit this feature to use `onExitCodes` only when `restartPolicy=Never`
@@ -1708,14 +1726,17 @@ Below are some examples to consider, in addition to the aforementioned [maturity
  [SSA](https://kubernetes.io/docs/reference/using-api/server-side-apply/) client.
  - The feature flag enabled by default

- Second iteration:
+ Second iteration (1.27):
  - Extend Kubelet to mark as failed pending terminating pods (see: [Marking pods as Failed](#marking-pods-as-failed)).
  - Extend the feature documentation to explain transitioning of pending and
    terminating pods into `Failed` phase.

  Third iteration (1.28):
  - Add `DisruptionTarget` condition for pods which are preempted by Kubelet to make room for critical pods.
    Also, backport this fix to 1.26 and 1.27 release branches, and update the user-facing documentation to reflect this change.
+ - Avoid creation of replacement Pods for terminating Pods until they reach
+   the terminal phase. Update user-facing documentation.
+   Might be considered for backport to 1.27.

  #### GA
