Commit 8fa8c47

Update use of Job annotation instead of finalizer
And add note about keeping the legacy behavior

Signed-off-by: Aldo Culquicondor <[email protected]>
1 parent 187b2f6 commit 8fa8c47

1 file changed, 24 additions and 16 deletions

keps/sig-apps/2307-job-tracking-without-lingering-pods/README.md

Lines changed: 24 additions & 16 deletions
@@ -180,8 +180,9 @@ could be stopped at any point and executed again from the first step without
 losing information. Generally, all the steps happen in a single Job sync
 cycle.

-0. kube-apiserver adds the `batch.kubernetes.io/job-completion` finalizer
-   to newly created Jobs.
+0. kube-apiserver adds the `batch.kubernetes.io/job-completion` annotation
+   to newly created Jobs. This annotation allows the distinction of new Jobs
+   from Jobs that are already tracked with the legacy algorithm.
 1. The Job controller calculates the number of succeeded Pods as the sum of:
    - `.status.succeeded`,
    - the size of `job.status.uncountedTerminatedPods.succeeded` and
@@ -210,9 +211,6 @@ cycle.
    The counts increment the `.status.failed` and `.status.succeeded` and clears
    counted Pods from `.status.uncountedTerminatedPods` lists. The controller
    sends a status update.
-5. The Job controller removes the `batch.kubernetes.io/job-completion` finalizer
-   from the Job if it has completed (succeeded or failed) and no Job Pod's have
-   finalizers.

 Steps 2 to 4 might deal with a potentially big number of Pods. Thus, status
 updates can potentially stress the kube-apiserver. For this reason, the Job
@@ -269,9 +267,11 @@ failures.

 ### Deleted Jobs

-When a user or another controller deletes a Job, the job controller scans
-associated Pods and removes finalizers from them without updating any Job
-status.
+When a user or another controller deletes a Job, the cascading makes sure that
+each Pods gets a deletion timestamp. The job controller captures this Pod
+update event, adding the orphan Pod (Pod for which the Job controller doesn't
+exist) to a separate work queue. A single worker scans this work queue to
+remove the finalizer from the Pod.

 ### Pod adoption

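Reviewer note: the new "Deleted Jobs" flow in the hunk above can be pictured with a minimal Go sketch. It is only an illustration under stated assumptions: `Pod`, `onPodUpdate`, and `drainOrphans` are hypothetical names, a buffered channel stands in for the controller's work queue, and a plain map stands in for the Job lookup. Only the finalizer name comes from the KEP text.

```go
package main

import "fmt"

// jobTrackingFinalizer is the finalizer name quoted in the KEP.
const jobTrackingFinalizer = "batch.kubernetes.io/job-completion"

// Pod is a simplified, hypothetical stand-in for the real Pod object.
type Pod struct {
	Name       string
	OwnerJob   string   // name of the owning Job
	Deleting   bool     // true once a deletion timestamp is set
	Finalizers []string // finalizers currently on the Pod
}

// hasFinalizer reports whether the Pod still carries the tracking finalizer.
func hasFinalizer(p *Pod) bool {
	for _, f := range p.Finalizers {
		if f == jobTrackingFinalizer {
			return true
		}
	}
	return false
}

// removeFinalizer drops the tracking finalizer and keeps any other finalizers.
func removeFinalizer(p *Pod) {
	kept := p.Finalizers[:0]
	for _, f := range p.Finalizers {
		if f != jobTrackingFinalizer {
			kept = append(kept, f)
		}
	}
	p.Finalizers = kept
}

// onPodUpdate mimics the Pod update handler: a Pod that is being deleted,
// still has the finalizer, and whose Job no longer exists is an orphan and
// goes to the separate queue.
func onPodUpdate(p *Pod, existingJobs map[string]bool, orphans chan<- *Pod) {
	if p.Deleting && hasFinalizer(p) && !existingJobs[p.OwnerJob] {
		orphans <- p
	}
}

// drainOrphans plays the role of the single worker that scans the queue and
// removes the finalizer. It returns how many Pods it processed.
func drainOrphans(orphans <-chan *Pod) int {
	n := 0
	for p := range orphans {
		removeFinalizer(p)
		n++
	}
	return n
}

func main() {
	existingJobs := map[string]bool{"alive": true} // the Job "gone" was deleted
	pods := []*Pod{
		{Name: "a", OwnerJob: "gone", Deleting: true, Finalizers: []string{jobTrackingFinalizer}},
		{Name: "b", OwnerJob: "alive", Deleting: true, Finalizers: []string{jobTrackingFinalizer}},
	}
	orphans := make(chan *Pod, len(pods))
	for _, p := range pods {
		onPodUpdate(p, existingJobs, orphans)
	}
	close(orphans)
	fmt.Println("finalizers removed:", drainOrphans(orphans))
}
```

Keeping the orphan queue separate from the Job sync queue means this cleanup path never touches Job status, which matches the removed wording "without updating any Job status".
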
@@ -312,13 +312,20 @@ the owner reference.
 - Processing 5000 Pods per minute across any number of Jobs, with Pod creation
   having higher priority than status updates. This might depend on
   [Priority and Fairness](https://git.k8s.io/enhancements/keps/sig-api-machinery/1040-priority-and-fairness).
+- Ensure that tracking Jobs with big number of Pods doesn't cause starvation of
+  smaller jobs.
 - Metrics:
   - latency
   - errors
 - Tests are in Testgrid and linked in KEP

 #### Beta -> GA Graduation

+- Established a plan to remove legacy tracking and the use of
+  `batch.kubernetes.io/job-completion` as an annotation. The tentative
+  expectation is to keep them for two releases after the graduation to GA.
+  This time can be reduced if we can envision an algorithm to safely transition
+  a Job from legacy to new tracking.
 - E2E test graduates to conformance.
 - Job tracking scales to 10^5 completions per Job processed within an order of
   minutes.
@@ -328,19 +335,23 @@ the owner reference.
 When the feature `JobTrackingWithFinalizers` is enabled for the first
 time, the cluster can have Jobs whose Pods don't have the
 `batch.kubernetes.io/job-completion` finalizer. It would be hard to add the
-finalizer to all Pods while preventing race conditions.
+finalizer to all Pods while preventing race conditions. That is, at the time
+of migration to the new tracking, a Pod could not have the finalizer for two
+reasons: it wasn't migrated yet, or it was already counted.

-The job controller uses the existence of the finalizer
+The job controller uses the existence of the Job annotation
 `batch.kubernetes.io/job-completion` to determine if it should use tracking with
-finalizers. If the finalizer is not present, and the Job is not yet completed,
+finalizers. If the annotation is not present, and the Job is not yet completed,
 the job controllers tracks Pods using the legacy tracking (with lingering Pods).

-The kube-apiserver sets the `batch.kubernetes.io/job-completion` finalizer to
+The kube-apiserver sets the `batch.kubernetes.io/job-completion` annotation to
 newly created Jobs when the feature gate `JobTrackingWithFinalizers` is enabled.
+This annotation cannot be added in a Job update.

 When the feature is disabled after being enabled for some time, the next time
 the Job controller syncs a Job:
-1. It removes finalizers from the Job and all the Pods owned by it.
+1. It removes finalizers from the Pods owned by it and the annotation from the
+   Job.
 2. Sets `.status.uncountedTerminatedPods` to nil.

 After this point, the Job will no longer be tracked using finalizers, even if
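
Reviewer note: the annotation rules introduced in the hunk above (set on create when the feature gate is on, never added on update, and used by the controller to choose the tracking algorithm) might look roughly like the following Go sketch. `Job`, `prepareForCreate`, `validateUpdate`, and `usesFinalizerTracking` are hypothetical names, not the real kube-apiserver or controller functions, and the empty annotation value is an assumption because the KEP text only mentions the key.

```go
package main

import (
	"errors"
	"fmt"
)

// jobTrackingAnnotation is the annotation key quoted in the KEP.
const jobTrackingAnnotation = "batch.kubernetes.io/job-completion"

// Job is a simplified, hypothetical stand-in for the real Job object.
type Job struct {
	Annotations map[string]string
}

// prepareForCreate mimics the kube-apiserver step: new Jobs get the
// annotation only while the JobTrackingWithFinalizers gate is enabled.
func prepareForCreate(j *Job, gateEnabled bool) {
	if !gateEnabled {
		return
	}
	if j.Annotations == nil {
		j.Annotations = map[string]string{}
	}
	j.Annotations[jobTrackingAnnotation] = "" // value is an assumption
}

// validateUpdate rejects an update that adds the annotation to a Job that
// did not have it, so legacy Jobs cannot be switched to the new tracking.
func validateUpdate(oldJob, newJob *Job) error {
	_, had := oldJob.Annotations[jobTrackingAnnotation]
	_, has := newJob.Annotations[jobTrackingAnnotation]
	if !had && has {
		return errors.New("annotation cannot be added in a Job update")
	}
	return nil
}

// usesFinalizerTracking is the controller-side check: annotation present
// means tracking with finalizers, absent means legacy tracking.
func usesFinalizerTracking(j *Job) bool {
	_, ok := j.Annotations[jobTrackingAnnotation]
	return ok
}

func main() {
	legacy := &Job{} // created before the gate was enabled
	fresh := &Job{}
	prepareForCreate(fresh, true)

	fmt.Println("legacy uses finalizers:", usesFinalizerTracking(legacy))
	fmt.Println("fresh uses finalizers:", usesFinalizerTracking(fresh))
	fmt.Println("adding on update:", validateUpdate(legacy, fresh))
}
```

Rejecting the annotation on update is what keeps the check unambiguous: a Job that started under legacy tracking may have Pods that never received the finalizer, so it must stay on the legacy path.
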
@@ -480,9 +491,6 @@ previous answers based on experience in the field._
     - estimated throughput: one per Pod created by the Job controller, when Pod
       finishes or is removed.
     - originating component: kube-controller-manager
-  - PATCH Jobs, to remove finalizers.
-    - estimated throughput: one call for each Job created.
-    - originating component: kube-controller-manager
   - PUT Job status, to keep track of uncounted Pods.
     - estimated throughput: at least one per Job sync. The job controller
       throttles additional calls at 1 per a few seconds (precise throughput TBD
