Skip to content

Commit a7bfe21

Browse files
kannon92mimowoalculquicondorsoltysh
authored
KEP-3939: Consider terminating pods in job controller (#3940)
* terminating pods should be included as active * add job and deploy api to include terminating pods as active * notes * update kep with some questions * kep updates * misspell * add pointer bool for replicas * add feature toggles * Update keps/sig-apps/3939-include-terminating-pods-as-active/README.md Co-authored-by: Michał Woźniak <[email protected]> * update pr comments * Update keps/sig-apps/3939-include-terminating-pods-as-active/README.md Co-authored-by: Aldo Culquicondor <[email protected]> * Update keps/sig-apps/3939-include-terminating-pods-as-active/README.md Co-authored-by: Aldo Culquicondor <[email protected]> * pr updates * remove deployment from kep * update toc * pr comments * pr comments * adding kep readiness and cleaning up * update toc * rename feature toggle and update comment * update kep with comments * update prod-readiness kep * update kep * rename kep * typo fix * pr comments * filling in more of the KEP * Apply suggestions from code review Co-authored-by: Aldo Culquicondor <[email protected]> * update pr comments * update with pr comments * add feedback from pr * update with pr comments * update with latest pr feedback * update with a e2e test * update kep feedback * update dependencies * update dependencies * remove terminating from risks * add metric note * update with metric comments * Clarifying interactions with other KEPs * Always count number of terminating pods * rename field and default for PodFailurePolicy * Update PRR * PRR second pass * Update keps/sig-apps/3329-retriable-and-non-retriable-failures/README.md Co-authored-by: Michał Woźniak <[email protected]> * Update keps/sig-apps/3329-retriable-and-non-retriable-failures/README.md Co-authored-by: Michał Woźniak <[email protected]> * Update keps/sig-apps/3329-retriable-and-non-retriable-failures/README.md Co-authored-by: Maciej Szulik <[email protected]> --------- Co-authored-by: Michał Woźniak <[email protected]> Co-authored-by: Aldo Culquicondor <[email protected]> Co-authored-by: Aldo Culquicondor <[email protected]> Co-authored-by: Maciej Szulik <[email protected]>
1 parent 087bd4f commit a7bfe21

File tree

5 files changed

+996
-7
lines changed

5 files changed

+996
-7
lines changed
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
kep-number: 3939
2+
alpha:
3+
approver: "@wojtek-t"

keps/sig-apps/3329-retriable-and-non-retriable-failures/README.md

Lines changed: 28 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -880,11 +880,16 @@ not added to the pod for a long enough time (for example 2 minutes).
880880

881881
#### Marking pods as Failed
882882

883-
When matching a failed pod against Job pod failure policy it is important that
883+
When matching a failed pod against Job pod failure policy, it is important that
884884
the pod is actually in the terminal phase (`Failed`), to ensure their state is
885885
not modified while Job controller matches them against the pod failure policy.
886886

887-
However, there are scenarios in which a pod gets stuck in a non-terminal phase,
887+
Additionally, it is necessary to avoid the creation of a replacement Pod if the
888+
previously created Pod becomes terminating (has a `deletionTimestamp` but is
889+
not `Failed` nor `Succeeded` yet), or we might create replacement Pods that
890+
wouldn't be created if the pod failure policy was applied against the terminated Pod.
891+
892+
There are scenarios in which a pod gets stuck in a non-terminal phase,
888893
but is doomed to be failed, as it is terminating (has `deletionTimestamp` set, also
889894
known as the `DELETING` state, see:
890895
[The API Object Lifecycle](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/object-lifecycle.md)).
@@ -1433,7 +1438,11 @@ type PodFailurePolicyRule struct {
14331438
OnPodConditions []PodFailurePolicyOnPodConditionsPattern
14341439
}
14351440
1436-
// PodFailurePolicy describes how failed pods influence the backoffLimit.
1441+
// podFailurePolicy describes how failed pods are accounted. In particular,
1442+
// how they influence the backoffLimit.
1443+
// When using podFailurePolicy, terminating Pods (have a `deletionTimestamp`)
1444+
// are not immediately replaced and don't count as failed until they reach
1445+
// a terminal phase (`Failed` or `Succeeded`).
14371446
type PodFailurePolicy struct {
14381447
// A list of pod failure policy rules. The rules are evaluated in order.
14391448
// Once a rule matches a Pod failure, the remaining of the rules are ignored.
@@ -1507,9 +1516,18 @@ spec:
15071516
### Evaluation
15081517
15091518
We use the `syncJob` function of the Job controller to evaluate the specified
1510-
`podFailurePolicy` rules against the failed pods. It is only the first rule with
1511-
matching requirements which is applied as the rules are evaluated in order. If
1512-
the pod failure does not match any of the specified rules, then default
1519+
`podFailurePolicy` rules against the failed pods.
1520+
1521+
Since terminating Pods (have `deletionTimestamp` and are not `Failed` or
1522+
`Succeeded`) don't have an exit code yet and might actually succeed, the
1523+
controller will not evaluate them against the `podFailurePolicy`.
1524+
The job controller will also not create a replacement Pod until they reach the
1525+
`Failed` phase. This behavior is the same as
1526+
[`podReplacementPolicy: Failed`](../3939-allow-replacement-when-fully-terminated/).
1527+
1528+
When evaluating Failed Pods against the `podFailurePolicy`, it is only the first
1529+
rule with matching requirements which is applied as the rules are evaluated in order.
1530+
If the pod failure does not match any of the specified rules, then default
15131531
handling of failed pods applies.
15141532

15151533
If we limit this feature to use `onExitCodes` only when `restartPolicy=Never`
@@ -1708,14 +1726,17 @@ Below are some examples to consider, in addition to the aforementioned [maturity
17081726
[SSA](https://kubernetes.io/docs/reference/using-api/server-side-apply/) client.
17091727
- The feature flag enabled by default
17101728

1711-
Second iteration:
1729+
Second iteration (1.27):
17121730
- Extend Kubelet to mark as failed pending terminating pods (see: [Marking pods as Failed](#marking-pods-as-failed)).
17131731
- Extend the feature documentation to explain transitioning of pending and
17141732
terminating pods into `Failed` phase.
17151733

17161734
Third iteration (1.28):
17171735
- Add `DisruptionTarget` condition for pods which are preempted by Kubelet to make room for critical pods.
17181736
Also, backport this fix to 1.26 and 1.27 release branches, and update the user-facing documentation to reflect this change.
1737+
- Avoid creation of replacement Pods for terminating Pods until they reach
1738+
the terminal phase. Update user-facing documentation.
1739+
Might be considered for backport to 1.27.
17191740

17201741
#### GA
17211742

keps/sig-apps/3329-retriable-and-non-retriable-failures/kep.yaml

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,9 @@ approvers:
1818
- "@deads2k"
1919
- "@dchen1107"
2020

21+
see-also:
22+
- "/keps/sig-apps/3939-allow-replacement-when-fully-terminated"
23+
2124
# The target maturity stage in the current dev cycle for this KEP.
2225
stage: beta
2326

0 commit comments

Comments
 (0)