@@ -28,6 +28,9 @@ You should already be familiar with the basic use of [Job](/docs/concepts/worklo
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
+ Ensure that the [feature gates](/docs/reference/command-line-tools-reference/feature-gates/)
+ `PodDisruptionConditions` and `JobPodFailurePolicy` are both enabled in your cluster.
+
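+ How the gates are enabled depends on how your cluster is provisioned. The
+ snippet below is a minimal sketch, assuming a local test cluster created with
+ minikube (other provisioning tools expose equivalent settings); skip it if the
+ gates are already enabled.
+
+ ```sh
+ # Illustrative only: start a local test cluster with both feature gates enabled.
+ minikube start --feature-gates=PodDisruptionConditions=true,JobPodFailurePolicy=true
+
+ # One way to double-check the flags the API server was started with:
+ kubectl -n kube-system get pod -l component=kube-apiserver \
+   -o jsonpath='{.items[0].spec.containers[0].command}' | tr ',' '\n' | grep feature-gates
+ ```
+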
## Using Pod failure policy to avoid unnecessary Pod retries
With the following example, you can learn how to use Pod failure policy to
@@ -129,6 +132,114 @@ kubectl delete jobs/job-pod-failure-policy-ignore
The cluster automatically cleans up the Pods.
+ ## Using Pod failure policy to avoid unnecessary Pod retries based on custom Pod Conditions
+
+ With the following example, you can learn how to use Pod failure policy to
+ avoid unnecessary Pod restarts based on custom Pod Conditions.
+
+ {{< note >}}
+ The example below works since Kubernetes v1.27, as it relies on deleted pods in
+ the `Pending` phase transitioning to a terminal phase
+ (see [Pod Phase](/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase)).
+ {{< /note >}}
+
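+ The Job manifest for this example is included in step 1 below; the part the rest
+ of this task depends on is a `podFailurePolicy` rule with the `FailJob` action
+ that matches the custom `ConfigIssue` condition added in step 3. As a minimal
+ sketch, with illustrative values for everything except that rule, such a Job
+ could look like the following (written to a scratch file, mirroring the heredoc
+ style used later in this task):
+
+ ```sh
+ # Illustrative only; the authoritative manifest is the one included in step 1.
+ cat <<'EOF' > job-pod-failure-policy-config-issue-sketch.yaml
+ apiVersion: batch/v1
+ kind: Job
+ metadata:
+   name: job-pod-failure-policy-config-issue
+ spec:
+   backoffLimit: 6            # illustrative
+   template:
+     spec:
+       restartPolicy: Never   # Pod failure policy requires restartPolicy: Never
+       containers:
+       - name: main
+         image: "non-existing-repo/non-existing-image:example"
+   podFailurePolicy:
+     rules:
+     - action: FailJob        # terminate the whole Job instead of retrying the Pod
+       onPodConditions:
+       - type: ConfigIssue    # the custom condition added in step 3
+ EOF
+ ```
+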
+ 1. First, create a Job based on the config:
+
+    {{< codenew file="/controllers/job-pod-failure-policy-config-issue.yaml" >}}
+
+    by running:
+
+    ```sh
+    kubectl create -f job-pod-failure-policy-config-issue.yaml
+    ```
+
+    Note that the image is misconfigured, as it does not exist.
+
+ 2. Inspect the status of the job's Pods by running:
+
+    ```sh
+    kubectl get pods -l job-name=job-pod-failure-policy-config-issue -o yaml
+    ```
+
+    You will see output similar to this:
+
+    ```yaml
+    containerStatuses:
+    - image: non-existing-repo/non-existing-image:example
+      ...
+      state:
+        waiting:
+          message: Back-off pulling image "non-existing-repo/non-existing-image:example"
+          reason: ImagePullBackOff
+    ...
+    phase: Pending
+    ```
+
+    Note that the pod remains in the `Pending` phase as it fails to pull the
+    misconfigured image. This could, in principle, be a transient issue and the
+    image could get pulled. However, in this case, the image does not exist, so
+    we indicate this fact using a custom condition.
+
+ 3. Add the custom condition. First, prepare the patch by running:
+
+    ```sh
+    cat <<EOF > patch.yaml
+    status:
+      conditions:
+      - type: ConfigIssue
+        status: "True"
+        reason: "NonExistingImage"
+        lastTransitionTime: "$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
+    EOF
+    ```
+
+    Second, select one of the pods created by the job by running:
+
+    ```sh
+    podName=$(kubectl get pods -l job-name=job-pod-failure-policy-config-issue -o jsonpath='{.items[0].metadata.name}')
+    ```
+
+    Then, apply the patch to the selected pod by running the following command:
+
+    ```sh
+    kubectl patch pod $podName --subresource=status --patch-file=patch.yaml
+    ```
+
+    If applied successfully, you will get a notification like this:
+
+    ```sh
+    pod/job-pod-failure-policy-config-issue-k6pvp patched
+    ```
+
+ 4. Delete the pod to transition it to the `Failed` phase by running the command:
+
+    ```sh
+    kubectl delete pods/$podName
+    ```
+
+ 5. Inspect the status of the Job by running:
+
+    ```sh
+    kubectl get jobs -l job-name=job-pod-failure-policy-config-issue -o yaml
+    ```
+
+    In the Job status, you will see a Job `Failed` condition with the `reason`
+    field equal to `PodFailurePolicy`. Additionally, the `message` field contains
+    more detailed information about the Job termination, such as:
+    `Pod default/job-pod-failure-policy-config-issue-k6pvp has condition ConfigIssue matching FailJob rule at index 0`
+    (a jsonpath query that extracts just this condition is sketched after this list).
+
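+ Instead of scanning the full YAML from the last step, you can also pull out just
+ the terminal condition with a jsonpath query; this is a sketch that uses the same
+ Job name as the rest of this task:
+
+ ```sh
+ # Print only the Failed condition from the Job status.
+ kubectl get jobs/job-pod-failure-policy-config-issue \
+   -o jsonpath='{.status.conditions[?(@.type=="Failed")]}'
+ ```
+
+ The printed condition should have the reason `PodFailurePolicy` and the message
+ quoted above.
+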
+ {{< note >}}
+ In a production environment, steps 3 and 4 should be automated by a
+ user-provided controller.
+ {{< /note >}}
+
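+ What such automation could look like is sketched below. This is illustrative
+ only: a polling shell loop standing in for a real controller (which would watch
+ the API server instead), reusing the label, condition, and patch from this task.
+
+ ```sh
+ # Illustrative sketch of automating steps 3 and 4.
+ cat <<EOF > config-issue-patch.yaml
+ status:
+   conditions:
+   - type: ConfigIssue
+     status: "True"
+     reason: "NonExistingImage"
+     lastTransitionTime: "$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
+ EOF
+
+ for pod in $(kubectl get pods -l job-name=job-pod-failure-policy-config-issue \
+     -o jsonpath='{.items[*].metadata.name}'); do
+   reason=$(kubectl get pod "$pod" \
+     -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}' 2>/dev/null)
+   if [ "$reason" = "ImagePullBackOff" ]; then
+     # Mark the root cause with the custom condition, then delete the Pod so it
+     # reaches a terminal phase and the FailJob rule can match.
+     kubectl patch pod "$pod" --subresource=status --patch-file=config-issue-patch.yaml
+     kubectl delete pod "$pod"
+   fi
+ done
+ ```
+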
+ ### Cleaning up
+
+ Delete the Job you created:
+
+ ```sh
+ kubectl delete jobs/job-pod-failure-policy-config-issue
+ ```
+
+ The cluster automatically cleans up the Pods.
+
## Alternatives

You could rely solely on the