Commit acb342a

Merge pull request kubernetes#2362 from alculquicondor/completed-indexes
Add completedIndexes to Indexed Job status
2 parents 3e2c87b + 660ebcf commit acb342a

1 file changed

+35
-14
lines changed

keps/sig-apps/2214-indexed-job/README.md

Lines changed: 35 additions & 14 deletions
@@ -15,6 +15,7 @@
1515
- [JobSpec API](#jobspec-api)
1616
- [Pod detail](#pod-detail)
1717
- [Job completion and restart policy](#job-completion-and-restart-policy)
18+
- [Track completed indexes in Job status](#track-completed-indexes-in-job-status)
1819
- [Job parallelism](#job-parallelism)
1920
- [Test Plan](#test-plan)
2021
- [Graduation Criteria](#graduation-criteria)
@@ -206,16 +207,28 @@ type JobSpec struct {
206207
// `NonIndexed`.
207208
CompletionMode CompletionMode
208209
}
210+
211+
type JobStatus struct {
212+
...
213+
214+
// CompletedIndexes holds the completed indexes when .spec.completionMode =
215+
// "Indexed" in a text format. The indexes are represented as decimal integers
216+
// separated by commas. The numbers are listed in increasing order. Two or
217+
// more consecutive numbers are compressed and represented by the first and
218+
// last element of the series, separated by a hyphen.
219+
// For example, if the completed indexes are 1, 3, 4, 5 and 7, they are
220+
// represented as "1,3-5,7".
221+
CompletedIndexes string
222+
}
209223
```
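The text format described in the `CompletedIndexes` comment can be illustrated with a short sketch. This is not the Job controller's code: `formatCompletedIndexes` is a hypothetical helper, and the choice to sort the input and ignore duplicates is an assumption made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// formatCompletedIndexes is a hypothetical helper (not taken from the
// Kubernetes code base). It renders indexes as comma-separated decimal
// integers in increasing order, compressing runs of two or more
// consecutive numbers into "first-last".
func formatCompletedIndexes(indexes []int) string {
	if len(indexes) == 0 {
		return ""
	}
	// Work on a sorted copy so the caller's slice is left untouched.
	sorted := append([]int(nil), indexes...)
	sort.Ints(sorted)

	var parts []string
	start, prev := sorted[0], sorted[0]
	// flush appends the run [start, prev] to parts.
	flush := func() {
		if start == prev {
			parts = append(parts, strconv.Itoa(start))
		} else {
			parts = append(parts, strconv.Itoa(start)+"-"+strconv.Itoa(prev))
		}
	}
	for _, idx := range sorted[1:] {
		switch {
		case idx == prev:
			// Assumption: duplicates are ignored.
		case idx == prev+1:
			prev = idx // extend the current run
		default:
			flush()
			start, prev = idx, idx
		}
	}
	flush()
	return strings.Join(parts, ",")
}

func main() {
	fmt.Println(formatCompletedIndexes([]int{1, 3, 4, 5, 7})) // 1,3-5,7
	fmt.Println(formatCompletedIndexes([]int{2, 3, 4, 6, 7})) // 2-4,6-7
}
```

Both inputs are the examples given in the KEP text, so the printed strings match the representations it quotes.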
210224

211-
As the comment describes, when `.spec.completionMode = "Indexed"`, the
212-
`.spec.completions` must be:
225+
As the comment describes, when `.spec.completionMode = "Indexed"`:
213226

214-
- a non-zero positive value. This is to trigger Job management strategy for
215-
*fixed completion count*. That is, `Indexed` mode cannot be used for work
216-
queue patterns.
217-
- less than or equal to `10^6`. This is to guarantee that we can keep track of
218-
completions per-index in the Job status in the future.
227+
- `.spec.completions` must be a non-zero positive value. This is to trigger Job
228+
management strategy for *fixed completion count*. That is, `Indexed` mode
229+
cannot be used for work queue patterns.
230+
- `.spec.parallelism` must be less than or equal to `10^5`. This is to guarantee
231+
that we can keep track of completions per-index in the Job status.
219232

220233
### Pod detail
221234

@@ -271,9 +284,7 @@ them.
271284
The kubelet handles container restarts as usual, according to the
272285
`spec.template.spec.restartPolicy`.
273286

274-
<<[UNRESOLVED TBD Beta: Track completed indexes in Job status]>>
275-
Once [kubernetes/kubernetes#28486](https://github.com/kubernetes/kubernetes/issues/28486)
276-
is resolved:
287+
#### Track completed indexes in Job status
277288

278289
The Job controller keeps track of completed indexes in
279290
`.status.completedIndexes`, a string that represents a list of numbers in a
@@ -287,9 +298,8 @@ CompletedIndexes: "2-4,6-7"
287298
The `kubectl describe` command crops the list of indexes if it's too long:
288299

289300
```
290-
Completed Indexes: [1-25,28,30-32,...]
301+
Completed Indexes: 1-25,28,30-32,...
291302
```
292-
<<[/UNRESOLVED]>>
293303

294304
### Job parallelism
295305

@@ -326,7 +336,13 @@ gate enabled and disabled.
326336
#### Alpha -> Beta Graduation
327337

328338
- Complete features:
329-
- Tracking completions by index in Job status
339+
- Indexed Jobs when tracking completion without lingering Pods
340+
[kubernetes/enhancements#2307](https://github.com/kubernetes/enhancements/issues/2307).
341+
342+
Keeping the size of `.status.completedIndexes` small is desirable to reduce load
343+
on watchers. We will evaluate holding off from counting completed Pods that
344+
have an outlying index. That is, contiguous indexes would be counted first.
345+
This allows us to keep the size of the compressed list small.
330346
- Gather feedback from end users and operators' developers. Open questions:
331347
- Are stable Pod names necessary?
332348
- Tests are in Testgrid and linked in KEP
@@ -498,7 +514,12 @@ the existing API objects?**
498514
Yes.
499515

500516
- API type(s): Job
501-
- Estimated increase in size: new field of about 30 bytes.
517+
- Estimated increase in size:
518+
- New field in Spec of about 30 bytes.
519+
- New field in Status. In the worst case scenario, completed indexes are
520+
non-consecutive. Since the API limits parallelism to 10^5, we could have
521+
up to 5*10^4 non-consecutive numbers, which can be represented in less
522+
than 1MB.
502523

503524
- API type(s): Pod, only when created with the new completion mode.
504525
- Estimated increase in size: new annotation of about 50 bytes.
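The worst-case estimate for the new Job status field can be sanity-checked with a short calculation. The sketch below assumes that indexes lie in the range 0 to 10^5 - 1 and that the worst case is every other index being completed (here, all odd indexes), so that no two can be compressed into a range. `worstCaseSize` is a hypothetical helper written for this check.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// worstCaseSize is a hypothetical helper. It returns the length in bytes of
// the string listing every odd index below limit, separated by commas.
// No two odd numbers are consecutive, so nothing is compressed.
func worstCaseSize(limit int) int {
	var b strings.Builder
	for i := 1; i < limit; i += 2 {
		if b.Len() > 0 {
			b.WriteByte(',')
		}
		b.WriteString(strconv.Itoa(i))
	}
	return b.Len()
}

func main() {
	// Assumption: 10^5 possible indexes, giving 5*10^4 non-consecutive entries.
	size := worstCaseSize(100000)
	fmt.Printf("%d bytes, under 1MB: %v\n", size, size < 1000000)
}
```

Under these assumptions the string comes to roughly 0.3MB, which is consistent with the "less than 1MB" bound stated in the KEP.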
