Skip to content

Commit df2aabc

Browse files
mimowoatiratreesoltysh
authored
Update the managedBy KEP with fix for terminating pods (kubernetes#4657)
* Update the managedBy KEP * Update keps/sig-apps/4368-support-managed-by-for-batch-jobs/README.md Co-authored-by: Filip Křepinský <[email protected]> * Update implementation history * Update keps/sig-apps/4368-support-managed-by-for-batch-jobs/README.md Co-authored-by: Maciej Szulik <[email protected]> * Mention FailureTarget and SuccessCriteriaMet conditions * Full links * Full links2 * Add the ForbidSoft to non-goals * go through second alpha * add validation for proceeding addition of Failed or Complete conditions * remarks about related issues * Add test plan * Links for the integration tests * Move validation to the second alpha * cleanup irrelevant paragraph now * New risks section Signed-off-by: Michal Wozniak <[email protected]> * add reviewers for the kep to reflect the status quo * typo fix * revert * new diagram * Add a note about the test for re-enabling --------- Signed-off-by: Michal Wozniak <[email protected]> Co-authored-by: Filip Křepinský <[email protected]> Co-authored-by: Maciej Szulik <[email protected]>
1 parent ed7065f commit df2aabc

File tree

5 files changed

+182
-12
lines changed

5 files changed

+182
-12
lines changed

keps/sig-apps/3329-retriable-and-non-retriable-failures/README.md

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -90,6 +90,7 @@ tags, and then generate with `hack/update-toc.sh`.
9090
- [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
9191
- [Job-level vs. pod-level spec](#job-level-vs-pod-level-spec)
9292
- [Relationship with Pod.spec.restartPolicy](#relationship-with-podspecrestartpolicy)
93+
- [The scope of the FailureTarget condition](#the-scope-of-the-failuretarget-condition)
9394
- [Current state review](#current-state-review)
9495
- [Preemption](#preemption)
9596
- [Taint-based eviction](#taint-based-eviction)
@@ -521,6 +522,20 @@ Job controller. For example, Kubelet could restart a failed container before the
521522
Job controller decides to terminate the corresponding job due to a rule using
522523
`onExitCodes`.
523524

525+
#### The scope of the FailureTarget condition
526+
527+
As part of this KEP we introduced the
528+
[FailureTarget condition](#interim-failuretarget-job-condition) scoped to the
529+
failures due to pod failure policy.
530+
531+
However, we are going to extend the scope of the condition to all Job failure
532+
scenarios (covering also backoffLimit exceeded and ActiveDeadlineSeconds
533+
exceeded), as part of fixing
534+
(issue #123775)[https://github.com/kubernetes/kubernetes/issues/123775].
535+
536+
See more details in the
537+
[Job API managed-by mechanism](https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/4368-support-managed-by-for-batch-jobs/README.md).
538+
524539
#### Current state review
525540

526541
Here we review the current state of kubernetes (version 1.24) regarding its

keps/sig-apps/3939-allow-replacement-when-fully-terminated/README.md

Lines changed: 32 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -96,6 +96,7 @@ tags, and then generate with `hack/update-toc.sh`.
9696
- [Design Details](#design-details)
9797
- [Job API Definition](#job-api-definition)
9898
- [Defaulting and validation](#defaulting-and-validation)
99+
- [Tracking the terminating pods](#tracking-the-terminating-pods)
99100
- [Implementation](#implementation)
100101
- [Test Plan](#test-plan)
101102
- [Prerequisite testing updates](#prerequisite-testing-updates)
@@ -406,6 +407,22 @@ is in use:
406407
When `podFailurePolicy` is in use, the only allowed value for `podFailurePolicy`
407408
is `Failed`.
408409

410+
#### Tracking the terminating pods
411+
412+
In order to allow the quota management for Job-level controllers [story 3](#story-3)
413+
we introduced the `.status.terminating` field which tracks the number of
414+
terminating pods. However, in the initial Beta implementation the field stops
415+
tracking the number of terminating pods as soon as the Job is marked as Failed
416+
with the `Failed` condition (see (issue #123775)[https://github.com/kubernetes/kubernetes/issues/123775]).
417+
The remaining pods may be occupying resources for an arbitrary amount of time.
418+
419+
In 1.31 we are going to fix this issue by delaying the
420+
addition of the `Failed` or `Complete` conditions until all pods are fully
421+
terminated. To indicate that a Job is doomed to fail or succeed, as soon as
422+
possible, we extend the scope of pre-existing conditions: `FailureTarget`, and
423+
`SuccessCriteriaMet`, respectively, See more details in
424+
[Job API managed-by mechanism](https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/4368-support-managed-by-for-batch-jobs/README.md).
425+
409426
### Implementation
410427

411428
As part of this KEP, we need to track pods that are terminating (`deletionTimestamp != nil` and `phase` is `Pending` or `Running`).
@@ -466,6 +483,11 @@ implementing this enhancement to ensure the enhancements have also solid foundat
466483
- `gc_controller.go`: `April 3rd 2023` - `82.4`
467484
a. Set `PodPhase` to `failed` when `JobPodReplacementPolicy` true but `PodDisruptionConditions` is false
468485

486+
The following scenarios related to [tracking the terminating pods](#tracking-the-terminating-pods) are covered:
487+
- `Failed` or `Complete` conditions are not added while there are still terminating pods
488+
- `FailureTarget` is added when backoffLimitCount is exceeded, or activeDeadlineSeconds timeout is exceeded
489+
- `SuccessCriteriaMet` is added when the `completions` are satisfied
490+
469491
##### Integration tests
470492

471493
<!--
@@ -515,8 +537,13 @@ Case for disable and reenable `JobPodReplacementPolicy`
515537
11. Verify that pod creation only occurs once pod is fully terminated.
516538
12. Verify that pod creation only occurs once deletion happens.
517539

518-
To cover cases with `PodDisruptionCondition` we really only need to worry about tracking terminating fields.
519-
Tests will verify counting of terminating fields regardless of `PodDisruptionCondition` being on or off.
540+
To cover cases with `PodDisruptionCondition` we really only need to worry about tracking terminating fields.
541+
Tests will verify counting of terminating fields regardless of `PodDisruptionCondition` being on or off.
542+
543+
The following scenarios related to [tracking the terminating pods](#tracking-the-terminating-pods) are covered:
544+
- `Failed` or `Complete` conditions are not added while there are still terminating pods
545+
- `FailureTarget` is added when backoffLimitCount is exceeded, or activeDeadlineSeconds timeout is exceeded
546+
- `SuccessCriteriaMet` is added when the `completions` are satisfied
520547

521548
##### e2e tests
522549

@@ -571,6 +598,9 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
571598
#### GA
572599

573600
- Address reviews and bug reports from Beta users
601+
- Allow Job API clients tracking the number of the terminating pods until all
602+
the resources are released (see [tracking the terminating pods](#tracking-the-terminating-pods)).
603+
Also, link provide links for the relevant integration tests in the KEP.
574604
- Lock the `JobPodReplacementPolicy` feature-gate to true
575605

576606
#### Deprecation

keps/sig-apps/3998-job-success-completion-policy/README.md

Lines changed: 20 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@
1515
- [Difference between &quot;Complete&quot; and &quot;SuccessCriteriaMet&quot;](#difference-between-complete-and-successcriteriamet)
1616
- [The CronJob concurrentPolicy is not affected by JobSuccessPolicy](#the-cronjob-concurrentpolicy-is-not-affected-by-jobsuccesspolicy)
1717
- [Status never switches from &quot;SuccessCriteriaMet&quot; to &quot;Failed&quot;](#status-never-switches-from-successcriteriamet-to-failed)
18+
- [The scope of the SuccessCriteriaMet condition](#the-scope-of-the-successcriteriamet-condition)
1819
- [Risks and Mitigations](#risks-and-mitigations)
1920
- [Design Details](#design-details)
2021
- [Job API](#job-api)
@@ -215,6 +216,18 @@ So, the status can never switch from `SucessCriteriaMet` to `Failed`.
215216
Additionally, once the job has `SuccessCriteriaMet=true` condition, the job definitely ends with `Complete=true` condition
216217
even if the lingering pods could potentially meet the failure policies.
217218

219+
#### The scope of the SuccessCriteriaMet condition
220+
221+
As part of this KEP we introduced the `SuccessCriteriaMet` condition scoped to
222+
the success policy.
223+
224+
However, we are going to extend the scope of the condition to the scenario when
225+
the Job completes by reaching the `.spec.completions`, as part of fixing
226+
(issue #123775)[https://github.com/kubernetes/kubernetes/issues/123775].
227+
228+
See more details in the
229+
[Job API managed-by mechanism](https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/4368-support-managed-by-for-batch-jobs/README.md).
230+
218231
### Risks and Mitigations
219232

220233
- If the job object's size reaches to limit of the etcd and
@@ -346,17 +359,20 @@ Then, after the lingering pods are terminated, the `Complete` condition is added
346359

347360
### Transition of "status.conditions"
348361

349-
When the job with successPolicies is submitted, the job `status.conditions` transits in the following:
350-
Note that the Job doesn't have an actual `Running` condition in the `status.conditions`.
362+
After extending the scope of the `SuccessCriteriaMet` and `FailureTarget` conditions
363+
as proposed in [The scope of the SuccessCriteriaMet condition](#the-scope-of-the-successcriteriamet-condition)
364+
the diagram of transitions looks like below:
351365

352366
```mermaid
353367
stateDiagram-v2
354368
[*] --> Running
355-
Running --> Failed: Exceeded backoffLimit
369+
Running --> FailureTarget: Exceeded backoffLimit
370+
Running --> FailureTarget: Exceeded activeDeadlineSeconds
356371
Running --> FailureTarget: Matched FailurePolicy with action=FailJob
357372
FailureTarget --> Failed: All pods are terminated
358373
Failed --> [*]
359374
Running --> SuccessCriteriaMet: Matched SuccessPolicy
375+
Running --> SuccessCriteriaMet: Achieved the expected completions
360376
SuccessCriteriaMet --> Complete: All pods are terminated
361377
Complete --> [*]
362378
```
@@ -365,6 +381,7 @@ It means that the job's `.status.conditions` follows the following rules:
365381

366382
- The job could have both `SuccessCriteriaMet=true` and `Complete=true` conditions.
367383
- The job can't have both `Failed=true` and `SuccessCriteriaMet=true` conditions.
384+
- The job can't have both `FailureTarget=true` and `SuccessCriteriaMet=true` conditions.
368385
- The job can't have both `Failed=true` and `Complete=true` conditions.
369386

370387
#### The situations where successPolicy conflicts other terminating policies

0 commit comments

Comments
 (0)