Commit f5954cd

add a better example for pod replacement policy
Signed-off-by: Dejan Zele Pejchev <[email protected]>
1 parent 6c97965 commit f5954cd

content/en/blog/_posts/2025-0x-xx-jobs-podreplacementpolicy-goes-ga.md

Lines changed: 48 additions & 6 deletions
@@ -39,8 +39,8 @@ replaces terminating Pods, helping you avoid these issues.
 
 This enhancement means that Jobs in Kubernetes have an optional field `.spec.podReplacementPolicy`.
 You can choose one of two policies:
-- TerminatingOrFailed (default): Replaces Pods as soon as they start terminating.
-- Failed: Replaces Pods only after they fully terminate and transition to the `Failed` phase.
+- `TerminatingOrFailed` (default): Replaces Pods as soon as they start terminating.
+- `Failed`: Replaces Pods only after they fully terminate and transition to the `Failed` phase.
 
 Setting the policy to `Failed` ensures that a new Pod is only created after the previous one has completely terminated.
 
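Since `.spec.podReplacementPolicy` is an ordinary spec field, you can read it back from a live Job to confirm which policy is in effect. A minimal sketch, assuming a Job named `myjob` (the same placeholder name the post uses later):

```shell
kubectl get job myjob -o=jsonpath='{.spec.podReplacementPolicy}'
```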
@@ -49,20 +49,22 @@ See [Pod Failure Policy](/docs/concepts/workloads/controllers/job/#pod-failure-p
 
 You can check how many Pods are currently terminating by inspecting the Job’s `.status.terminating` field:
 
-```sh
+```shell
 kubectl get job myjob -o=jsonpath='{.status.terminating}'
 ```
 
 ## Example
 
-Here’s a simple Job spec that ensures Pods are replaced only after they terminate completely:
-
+Here’s a Job example that executes a task two times (`spec.completions: 2`) in parallel (`spec.parallelism: 2`) and
+replaces Pods only after they fully terminate (`spec.podReplacementPolicy: Failed`):
 ```yaml
 apiVersion: batch/v1
 kind: Job
 metadata:
   name: example-job
 spec:
+  completions: 2
+  parallelism: 2
   podReplacementPolicy: Failed
   template:
     spec:
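To try the example end to end, one option is to create the Job and follow the terminating count as Pods come and go. A sketch, assuming the spec above is saved as `example-job.yaml`:

```shell
# Create the Job from the manifest above
kubectl apply -f example-job.yaml

# Follow .status.terminating as Pods start and finish terminating;
# --watch re-prints the JSONPath value whenever the Job status changes
kubectl get job example-job -o=jsonpath='{.status.terminating}' --watch
```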
@@ -72,7 +74,47 @@ spec:
         image: your-image
 ```
 
-With this setting, Kubernetes won’t launch a replacement Pod while the previous Pod is still terminating.
+If a Pod receives a SIGTERM signal (deletion, eviction, preemption...), it begins terminating.
+When the container handles termination gracefully, cleanup may take some time.
+
+When the Job starts, we will see two Pods running:
+```shell
+kubectl get pods
+
+NAME                READY   STATUS    RESTARTS   AGE
+example-job-qr8kf   1/1     Running   0          2s
+example-job-stvb4   1/1     Running   0          2s
+```
+
+Let's delete one of the Pods (`example-job-qr8kf`).
+
+With the `TerminatingOrFailed` policy, as soon as one Pod (`example-job-qr8kf`) starts terminating, the Job controller immediately creates a new Pod (`example-job-b59zk`) to replace it.
+```shell
+kubectl get pods
+
+NAME                READY   STATUS        RESTARTS   AGE
+example-job-b59zk   1/1     Running       0          1s
+example-job-qr8kf   1/1     Terminating   0          17s
+example-job-stvb4   1/1     Running       0          17s
+```
+
+With the `Failed` policy, the new Pod (`example-job-b59zk`) is not created while the old Pod (`example-job-qr8kf`) is terminating.
+```shell
+kubectl get pods
+
+NAME                READY   STATUS        RESTARTS   AGE
+example-job-qr8kf   1/1     Terminating   0          17s
+example-job-stvb4   1/1     Running       0          17s
+```
+
+When the terminating Pod has fully transitioned to the `Failed` phase, a new Pod is created:
+```shell
+kubectl get pods
+
+NAME                READY   STATUS    RESTARTS   AGE
+example-job-b59zk   1/1     Running   0          1s
+example-job-stvb4   1/1     Running   0          25s
+```
 
 ## How can you learn more?
 
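To reproduce the walkthrough added above, delete one Pod and keep watching the list. A sketch; the Pod name is illustrative and will differ in your cluster, and it assumes the container traps SIGTERM so cleanup takes a while:

```shell
# Trigger graceful termination of one Pod
kubectl delete pod example-job-qr8kf

# Watch Pods terminate and get replaced; with podReplacementPolicy: Failed,
# the replacement appears only once the deleted Pod reaches the Failed phase
kubectl get pods --watch
```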