
Commit 7b92c46

leon-sony (Leon Barrett) authored and committed
Fix description of back-off count reset
By carefully reading the code in `job_controller.go`, I finally understood that the back-off count is reset when `forget` is true, which happens when `active` or `successful` changes without any new failures at that moment. That happens in this code: https://github.com/kubernetes/kubernetes/blob/dd649bb7ef4788bfe65c93ebc974962d64476b39/pkg/controller/job/job_controller.go#L588

That behavior does not match what this document says, so my change fixes the doc to match the code. It might be better to fix the behavior to match the doc, since the current behavior is awkward to describe, but the Kubernetes team will need to consider factors I'm not aware of before deciding to change job back-off behavior, so I am not proposing that change here.
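For illustration only, here is a minimal Go sketch of the rule described above. It is not the actual code from `job_controller.go` (see the linked line for that); the function and variable names are hypothetical, and it only models the idea that the back-off count is forgotten when the active or succeeded counts change during a sync in which no new pod failures appeared.

```go
// Hypothetical sketch of the back-off reset decision described in the commit
// message; the real logic lives in syncJob in pkg/controller/job/job_controller.go.
package main

import "fmt"

// shouldForgetBackoff returns true when the back-off count would be reset
// ("forgotten"): the active or succeeded pod counts changed in this sync and
// no new pod failures were observed at the same time.
func shouldForgetBackoff(prevActive, active, prevSucceeded, succeeded int32, newFailures bool) bool {
	statusChanged := prevActive != active || prevSucceeded != succeeded
	return statusChanged && !newFailures
}

func main() {
	// A pod finished successfully and nothing failed: the back-off resets.
	fmt.Println(shouldForgetBackoff(3, 2, 0, 1, false)) // true

	// A pod failed during the same sync: the back-off is kept and keeps growing.
	fmt.Println(shouldForgetBackoff(3, 2, 0, 0, true)) // false
}
```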
1 parent 6bf2feb commit 7b92c46

File tree

1 file changed (+2 -2): content/en/docs/concepts/workloads/controllers/job.md

content/en/docs/concepts/workloads/controllers/job.md

Lines changed: 2 additions & 2 deletions
@@ -215,8 +215,8 @@ To do so, set `.spec.backoffLimit` to specify the number of retries before
 considering a Job as failed. The back-off limit is set by default to 6. Failed
 Pods associated with the Job are recreated by the Job controller with an
 exponential back-off delay (10s, 20s, 40s ...) capped at six minutes. The
-back-off count is reset if no new failed Pods appear before the Job's next
-status check.
+back-off count is reset when a job pod is deleted or successful without any
+other pods failing around that time.
 
 {{< note >}}
 If your job has `restartPolicy = "OnFailure"`, keep in mind that your container running the Job
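For context, the setting documented in the hunk above is `.spec.backoffLimit` on the Job spec. A hedged example using the Go API types (`BackoffLimit` is the real field in `k8s.io/api/batch/v1`; the Job itself is made up for illustration):

```go
// Hypothetical example of a Job spec with .spec.backoffLimit set explicitly.
package main

import (
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func int32Ptr(i int32) *int32 { return &i }

func main() {
	job := batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "example"},
		Spec: batchv1.JobSpec{
			// Retry failed Pods (with exponential back-off) up to 4 times
			// before marking the Job as failed; the default is 6.
			BackoffLimit: int32Ptr(4),
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{
						{Name: "main", Image: "busybox", Command: []string{"false"}},
					},
				},
			},
		},
	}
	fmt.Println(job.Name, *job.Spec.BackoffLimit)
}
```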
