|
1 | 1 | ---
|
2 | 2 | reviewers:
|
3 | 3 | - janetkuo
|
4 |
| -title: Automatic Clean-up for Finished Jobs |
| 4 | +title: Automatic Cleanup for Finished Jobs |
5 | 5 | content_type: concept
|
6 | 6 | weight: 70
|
| 7 | +description: >- |
| 8 | + A time-to-live mechanism to clean up old Jobs that have finished execution. |
7 | 9 | ---
|
8 | 10 |
|
9 | 11 | <!-- overview -->
|
10 | 12 |
|
11 | 13 | {{< feature-state for_k8s_version="v1.23" state="stable" >}}
|
12 | 14 |
|
13 |
| -TTL-after-finished {{<glossary_tooltip text="controller" term_id="controller">}} provides a |
14 |
| -TTL (time to live) mechanism to limit the lifetime of resource objects that |
15 |
| -have finished execution. TTL controller only handles |
16 |
| -{{< glossary_tooltip text="Jobs" term_id="job" >}}. |
| 15 | +When your Job has finished, it's useful to keep that Job in the API (and not immediately delete the Job) |
| 16 | +so that you can tell whether the Job succeeded or failed. |
| 17 | + |
| 18 | +Kubernetes' TTL-after-finished {{<glossary_tooltip text="controller" term_id="controller">}} provides a |
| 19 | +TTL (time to live) mechanism to limit the lifetime of Job objects that |
| 20 | +have finished execution. |
17 | 21 |
|
18 | 22 | <!-- body -->
|
19 | 23 |
|
20 |
| -## TTL-after-finished Controller |
| 24 | +## Cleanup for finished Jobs |
21 | 25 |
|
22 |
| -The TTL-after-finished controller is only supported for Jobs. A cluster operator can use this feature to clean |
| 26 | +The TTL-after-finished controller is only supported for Jobs. You can use this mechanism to clean |
23 | 27 | up finished Jobs (either `Complete` or `Failed`) automatically by specifying the
|
24 | 28 | `.spec.ttlSecondsAfterFinished` field of a Job, as in this
|
25 | 29 | [example](/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically).
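
For orientation, a minimal sketch of a Job manifest that sets this field might look like the following. It mirrors the shape of the linked example; the Job name, container image, command, and the TTL value of 100 seconds are illustrative assumptions, not recommendations.

```yaml
# Illustrative Job: becomes eligible for automatic cleanup
# 100 seconds after it finishes (Complete or Failed).
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl            # assumed name, for illustration only
spec:
  ttlSecondsAfterFinished: 100 # TTL in seconds, counted from when the Job finishes
  template:
    spec:
      containers:
      - name: pi
        image: perl:5.34.0     # assumed image
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

If the field is left unset, the TTL-after-finished controller does not clean up the Job.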
|
26 |
| -The TTL-after-finished controller will assume that a job is eligible to be cleaned up |
27 |
| -TTL seconds after the job has finished, in other words, when the TTL has expired. When the |
| 30 | + |
| 31 | +The TTL-after-finished controller assumes that a Job is eligible to be cleaned up |
| 32 | +TTL seconds after the Job has finished. The timer starts once the |
| 33 | +status condition of the Job changes to show that the Job is either `Complete` or `Failed`; once the TTL has |
| 34 | +expired, that Job becomes eligible for |
| 35 | +[cascading](/docs/concepts/architecture/garbage-collection/#cascading-deletion) removal. When the |
28 | 36 | TTL-after-finished controller cleans up a job, it will delete it cascadingly, that is to say it will delete
|
29 |
| -its dependent objects together with it. Note that when the job is deleted, |
30 |
| -its lifecycle guarantees, such as finalizers, will be honored. |
| 37 | +its dependent objects together with it. |
| 38 | + |
| 39 | +Kubernetes honors object lifecycle guarantees on the Job, such as waiting for |
| 40 | +[finalizers](/docs/concepts/overview/working-with-objects/finalizers/). |
31 | 41 |
|
32 |
| -The TTL seconds can be set at any time. Here are some examples for setting the |
| 42 | +You can set the TTL seconds at any time. Here are some examples for setting the |
33 | 43 | `.spec.ttlSecondsAfterFinished` field of a Job:
|
34 | 44 |
|
35 |
| -* Specify this field in the job manifest, so that a Job can be cleaned up |
| 45 | +* Specify this field in the Job manifest, so that a Job can be cleaned up |
36 | 46 | automatically some time after it finishes.
|
37 |
| -* Set this field of existing, already finished jobs, to adopt this new |
38 |
| - feature. |
| 47 | +* Manually set this field of existing, already finished Jobs, so that they become eligible |
| 48 | + for cleanup. |
39 | 49 | * Use a
|
40 | 50 | [mutating admission webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks)
|
41 |
| - to set this field dynamically at job creation time. Cluster administrators can |
| 51 | + to set this field dynamically at Job creation time. Cluster administrators can |
42 | 52 | use this to enforce a TTL policy for finished jobs.
|
43 | 53 | * Use a
|
44 | 54 | [mutating admission webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks)
|
45 |
| - to set this field dynamically after the job has finished, and choose |
46 |
| - different TTL values based on job status, labels, etc. |
| 55 | + to set this field dynamically after the Job has finished, and choose |
| 56 | + different TTL values based on Job status, labels, and so on. For this case, the webhook needs |
| 57 | + to detect changes to the `.status` of the Job and only set a TTL when the Job |
| 58 | + is being marked as completed. |
| 59 | +* Write your own controller to manage the cleanup TTL for Jobs that match a particular |
| 60 | + {{< glossary_tooltip term_id="selector" text="selector" >}}. |
47 | 61 |
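
As a sketch of the second option in the list above (setting the field on a Job that already exists or has already finished), a merge patch could be as small as the fragment below. The file name `ttl-patch.yaml` and the value of 3600 seconds are assumptions chosen for illustration.

```yaml
# ttl-patch.yaml (hypothetical file name)
# Merge patch that sets only the TTL on an existing Job.
spec:
  ttlSecondsAfterFinished: 3600  # assumed value: one hour after the Job finished
```

One way to apply such a fragment, assuming a kubectl version that supports `--patch-file`, is `kubectl patch job <job-name> --type merge --patch-file ttl-patch.yaml`. If the Job finished longer ago than the chosen TTL, it becomes eligible for cleanup as soon as the controller processes the update.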
|
48 |
| -## Caveat |
| 62 | +## Caveats |
49 | 63 |
|
50 |
| -### Updating TTL Seconds |
| 64 | +### Updating TTL for finished Jobs |
51 | 65 |
|
52 |
| -Note that the TTL period, e.g. `.spec.ttlSecondsAfterFinished` field of Jobs, |
53 |
| -can be modified after the job is created or has finished. However, once the |
54 |
| -Job becomes eligible to be deleted (when the TTL has expired), the system won't |
55 |
| -guarantee that the Jobs will be kept, even if an update to extend the TTL |
56 |
| -returns a successful API response. |
| 66 | +You can modify the TTL period, that is, the `.spec.ttlSecondsAfterFinished` field of Jobs, |
| 67 | +after the Job is created or has finished. If you extend the TTL period after the |
| 68 | +existing `ttlSecondsAfterFinished` period has expired, Kubernetes doesn't guarantee |
| 69 | +to retain that Job, even if an update to extend the TTL returns a successful API |
| 70 | +response. |
57 | 71 |
|
58 |
| -### Time Skew |
| 72 | +### Time skew |
59 | 73 |
|
60 |
| -Because TTL-after-finished controller uses timestamps stored in the Kubernetes jobs to |
| 74 | +Because the TTL-after-finished controller uses timestamps stored in Kubernetes Jobs to |
61 | 75 | determine whether the TTL has expired or not, this feature is sensitive to time
|
62 |
| -skew in the cluster, which may cause TTL-after-finish controller to clean up job objects |
| 76 | +skew in your cluster, which may cause the control plane to clean up Job objects |
63 | 77 | at the wrong time.
|
64 | 78 |
|
65 | 79 | Clocks aren't always correct, but the difference should be
|
66 | 80 | very small. Please be aware of this risk when setting a non-zero TTL.
|
67 | 81 |
|
68 |
| - |
69 |
| - |
70 | 82 | ## {{% heading "whatsnext" %}}
|
71 | 83 |
|
72 |
| -* [Clean up Jobs automatically](/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically) |
73 |
| - |
74 |
| -* [Design doc](https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/592-ttl-after-finish/README.md) |
| 84 | +* Read [Clean up Jobs automatically](/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically) |
75 | 85 |
|
| 86 | +* Refer to the [Kubernetes Enhancement Proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/592-ttl-after-finish/README.md) |
| 87 | + (KEP) for adding this mechanism. |