Commit 6e87729

Merge pull request kubernetes#3452 from mimowo/retriable-exit-codes-additional-update2
Updates to KEP-3329 "Retriable and non-retriable Pod failures for Jobs"
2 parents 0bec369 + 0c92bec commit 6e87729

File tree

1 file changed (+58, −12 lines)
  • keps/sig-apps/3329-retriable-and-non-retriable-failures


keps/sig-apps/3329-retriable-and-non-retriable-failures/README.md

Lines changed: 58 additions & 12 deletions
@@ -107,6 +107,7 @@ tags, and then generate with `hack/update-toc.sh`.
 - [Evolving condition types](#evolving-condition-types)
 - [Design Details](#design-details)
 - [New PodConditions](#new-podconditions)
+- [Interim FailureTarget condition](#interim-failuretarget-condition)
 - [JobSpec API](#jobspec-api)
 - [Evaluation](#evaluation)
 - [Test Plan](#test-plan)
@@ -169,7 +170,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [x] (R) Production readiness review completed
 - [x] (R) Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
+- [x] "Implementation History" section is up-to-date for milestone
 - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
 - [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

@@ -674,7 +675,8 @@ The Pod status (which includes the `conditions` field and the container exit
 codes) could be lost if the failed pod is garbage collected.
 
 Losing Pod's status before it is interpreted by Job Controller can be prevented
-by using the feature of [job tracking with finalizers](https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/).
+by using the feature of [job tracking with finalizers](https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/)
+(see more in the design details section: [Interim FailureTarget condition](#interim-failuretarget-condition)).
 
 #### Evolving condition types
 
@@ -739,13 +741,33 @@ pod delete requests are issued to modify the code to also append a meaningful
 condition with dedicated `Type`, `Reason` and `Message` fields based on the
 invocation context.
 
+### Interim FailureTarget condition
+
+There is a risk of losing the Pod status information due to PodGC, which could
+prevent the Job Controller from reacting to a pod failure with respect to the configured
+pod failure policy rules (see also: [Garbage collected pods](#garbage-collected-pods)).
+
+In order to make sure all pods are checked against the rules we require the
+feature of [job tracking with finalizers](https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/)
+to be enabled.
+
+Additionally, before we actually remove the finalizers from the pods
+(allowing them to be deleted by PodGC) we record the determined job failure
+message (if any rule with the `FailJob` action matched) in an interim job condition, called
+`FailureTarget`. Once the pod finalizers are removed we update the job status
+with the final `Failed` job condition. This strategy eliminates a possible
+race condition in which we could lose the information about the job failure if
+the Job Controller crashed between removing the pod finalizers and updating the final
+`Failed` condition in the job status.
+
 ### JobSpec API
 
 We extend the Job API in order to allow to apply different actions depending
 on the conditions associated with the pod failure.
 
 ```golang
 // PodFailurePolicyAction specifies how a Pod failure is handled.
+// +enum
 type PodFailurePolicyAction string
 
 const (
@@ -764,6 +786,7 @@ const (
 	PodFailurePolicyActionCount PodFailurePolicyAction = "Count"
 )
 
+// +enum
 type PodFailurePolicyOnExitCodesOperator string
 
 const (
@@ -780,38 +803,47 @@ const (
 type PodFailurePolicyOnExitCodesRequirement struct {
 	// Restricts the check for exit codes to the container with the
 	// specified name. When null, the rule applies to all containers.
+	// When specified, it should match one of the container or initContainer
+	// names in the pod template.
 	// +optional
 	ContainerName *string
 
 	// Represents the relationship between the container exit code(s) and the
 	// specified values. Containers completed with success (exit code 0) are
-	// excluded from the requirement check.Possible values are:
+	// excluded from the requirement check. Possible values are:
 	// - In: the requirement is satisfied if at least one container exit code
 	//   (might be multiple if there are multiple containers not restricted
 	//   by the 'containerName' field) is in the set of specified values.
 	// - NotIn: the requirement is satisfied if at least one container exit code
 	//   (might be multiple if there are multiple containers not restricted
 	//   by the 'containerName' field) is not in the set of specified values.
+	// Additional values are considered to be added in the future. Clients should
+	// react to an unknown operator by assuming the requirement is not satisfied.
 	Operator PodFailurePolicyOnExitCodesOperator
 
 	// Specifies the set of values. Each returned container exit code (might be
 	// multiple in case of multiple containers) is checked against this set of
-	// values with respect to the operator. Value '0' cannot be used for the In
-	// operator.
+	// values with respect to the operator. The list of values must be ordered
+	// and must not contain duplicates. Value '0' cannot be used for the In operator.
+	// At least one element is required. At most 255 elements are allowed.
 	// +listType=set
 	Values []int32
 }
 
 // PodFailurePolicyOnPodConditionsPattern describes a pattern for matching
 // an actual pod condition type.
 type PodFailurePolicyOnPodConditionsPattern struct {
-	// Specifies the required Pod condition type. The pattern matches a pod condition
-	// if the specified type equals the pod condition type.
+	// Specifies the required Pod condition type. To match a pod condition
+	// it is required that the specified type equals the pod condition type.
 	Type api.PodConditionType
+	// Specifies the required Pod condition status. To match a pod condition
+	// it is required that the specified status equals the pod condition status.
+	// Defaults to True.
+	Status api.ConditionStatus
 }
 
 // PodFailurePolicyRule describes how a pod failure is handled when the requirements are met.
-// Only one of OnExitCodes and onPodConditions can be used in each rule.
+// One of OnExitCodes and onPodConditions, but not both, can be used in each rule.
 type PodFailurePolicyRule struct {
 	// Specifies the action taken on a pod failure when the requirements are satisfied.
 	// Possible values are:
@@ -821,6 +853,8 @@ type PodFailurePolicyRule struct {
 	//   incremented and a replacement pod is created.
 	// - Count: indicates that the pod is handled in the default way - the
 	//   counter towards the .backoffLimit is incremented.
+	// Additional values are considered to be added in the future. Clients should
+	// react to an unknown action by skipping the rule.
 	Action PodFailurePolicyAction
 
 	// Represents the requirement on the container exit codes.
@@ -829,7 +863,7 @@ type PodFailurePolicyRule struct {
 
 	// Represents the requirement on the pod conditions. The requirement is represented
 	// as a list of pod condition patterns. The requirement is satisfied if at
-	// least pattern matches an actual pod condition.
+	// least one pattern matches an actual pod condition. At most 20 elements are allowed.
 	// +listType=atomic
 	OnPodConditions []PodFailurePolicyOnPodConditionsPattern
 }
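Editor's note: to make the In/NotIn semantics in the hunks above concrete, here is a minimal self-contained Go sketch of how an exit-code requirement could be evaluated against a failed pod's container exit codes. This is not the actual Job controller code; the type and function names (`OnExitCodesOperator`, `matchesExitCodes`) are illustrative.

```go
package main

import "fmt"

type OnExitCodesOperator string

const (
	OpIn    OnExitCodesOperator = "In"
	OpNotIn OnExitCodesOperator = "NotIn"
)

// matchesExitCodes reports whether the exit-code requirement is satisfied.
// Exit code 0 (success) is excluded from the check, mirroring the API comment.
func matchesExitCodes(op OnExitCodesOperator, values []int32, exitCodes []int32) bool {
	set := map[int32]bool{}
	for _, v := range values {
		set[v] = true
	}
	for _, code := range exitCodes {
		if code == 0 { // successful containers are ignored
			continue
		}
		switch op {
		case OpIn:
			if set[code] {
				return true // at least one exit code is in the set
			}
		case OpNotIn:
			if !set[code] {
				return true // at least one exit code is outside the set
			}
		default:
			// Unknown operator: assume the requirement is not
			// satisfied, as the API comment instructs clients to do.
			return false
		}
	}
	return false
}

func main() {
	// Two containers exited: one with 0 (success, ignored), one with 42.
	fmt.Println(matchesExitCodes(OpIn, []int32{42}, []int32{0, 42}))    // true
	fmt.Println(matchesExitCodes(OpNotIn, []int32{42}, []int32{0, 42})) // false
}
```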
@@ -840,7 +874,7 @@ type PodFailurePolicy struct {
 	// Once a rule matches a Pod failure, the remaining of the rules are ignored.
 	// When no rule matches the Pod failure, the default handling applies - the
 	// counter of pod failures is incremented and it is checked against
-	// the backoffLimit.
+	// the backoffLimit. At most 20 elements are allowed.
 	// +listType=atomic
 	Rules []PodFailurePolicyRule
 }
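Editor's note: the `OnPodConditions` requirement above is satisfied when at least one pattern matches an actual pod condition on both type and status. A minimal illustrative sketch (simplified stand-in types, not the real API):

```go
package main

import "fmt"

// Simplified stand-ins for the API types (illustrative only).
type PodCondition struct {
	Type, Status string
}

type Pattern struct {
	Type, Status string
}

// matchesAny reports whether at least one pattern matches an actual
// pod condition; a pattern matches when both type and status are equal.
func matchesAny(patterns []Pattern, conditions []PodCondition) bool {
	for _, p := range patterns {
		for _, c := range conditions {
			if p.Type == c.Type && p.Status == c.Status {
				return true
			}
		}
	}
	return false
}

func main() {
	conds := []PodCondition{{Type: "DisruptionTarget", Status: "True"}}
	fmt.Println(matchesAny([]Pattern{{Type: "DisruptionTarget", Status: "True"}}, conds)) // true
}
```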
@@ -854,7 +888,7 @@ type JobSpec struct {
 	// If empty, the default behaviour applies - the counter of failed pods,
 	// represented by the jobs's .status.failed field, is incremented and it is
 	// checked against the backoffLimit. This field cannot be used in combination
-	// with restartPolicy=OnFailure.
+	// with .spec.podTemplate.spec.restartPolicy=OnFailure.
 	//
 	// This field is alpha-level. To use this field, you must enable the
 	// `JobPodFailurePolicy` feature gate (disabled by default).
@@ -1083,7 +1117,9 @@ Below are some examples to consider, in addition to the aforementioned [maturity
 - Re-evaluate introduction of a generic opinionated condition type
   indicating that a pod should be retried (see: [Evolving condition types](#evolving-condition-types))
 - Simplify the code in job controller responsible for detection of failed pods
-  based on the fix for pods stuck in the running phase (see: [Marking pods as Failed](marking-pods-as-failed))
+  based on the fix for pods stuck in the running phase (see: [Marking pods as Failed](#marking-pods-as-failed)).
+  Also, introduce a worker in the disruption controller to mark pods
+  stuck in the pending phase, with deletionTimestamp set, as failed.
 - Commonize the code for appending pod conditions between components
 - Do not update the pod disruption condition (with type=`DisruptionTarget`) if
   it is already present with `status=True`
@@ -1545,6 +1581,16 @@ For each of them, fill in the following information by copying the below templat
 
 ## Implementation History
 
+- 2022-06-23: Initial KEP merged
+- 2022-07-12: Preparatory PR "Refactor gc_controller to do not use the deletePod stub" merged
+- 2022-07-14: Preparatory PR "Refactor taint_manager to do not use getPod and getNode stubs" merged
+- 2022-07-20: Preparatory PR "Add integration test for podgc" merged
+- 2022-07-28: KEP updates merged
+- 2022-08-01: Additional KEP updates merged
+- 2022-08-02: PR "Append new pod conditions when deleting pods to indicate the reason for pod deletion" merged
+- 2022-08-02: PR "Add worker to clean up stale DisruptionTarget condition" merged
+- 2022-08-04: PR "Support handling of pod failures with respect to the configured rules" merged
+
 <!--
 Major milestones in the lifecycle of a KEP should be tracked in this section.
 Major milestones might include:
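Editor's note: the interim `FailureTarget` flow added by this commit can be summarized as a crash-safe ordering sketch, assuming simplified stand-in types (this is not the real controller code): the failure message is persisted before the pod finalizers are removed, so a controller crash between the two steps cannot lose it.

```go
package main

import "fmt"

// Illustrative, simplified types; not the real Job controller.
type Condition struct {
	Type, Message string
}

type JobStatus struct {
	Conditions []Condition
}

// failJob sketches the ordering: (1) record the interim FailureTarget
// condition, (2) remove pod finalizers, (3) record the final Failed
// condition. If the controller crashes after step 2, the failure
// message still survives in the FailureTarget condition.
func failJob(status *JobStatus, msg string, removeFinalizers func()) {
	status.Conditions = append(status.Conditions, Condition{"FailureTarget", msg})
	removeFinalizers() // pods may now be garbage collected by PodGC
	status.Conditions = append(status.Conditions, Condition{"Failed", msg})
}

func main() {
	var st JobStatus
	failJob(&st, "rule with action FailJob matched", func() { fmt.Println("finalizers removed") })
	for _, c := range st.Conditions {
		fmt.Println(c.Type)
	}
}
```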
