@@ -328,7 +328,7 @@ ensuring the job creation skew is not increasing.
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
- 99th percentile of cron_job_creation_skew <= 5 seconds per cluster-day.
+ 99th percentile over day for cron_job_creation_skew is <= 15s
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
###### Does this feature depend on any specific services running in the cluster?
- None.
+ CronJob's TimeZone support relies on an external TimeZone package; if it is
+ missing, golang's internal package is used instead.
+
+ - kube-controller-manager and kube-apiserver
+ - Usage description:
+ Both kube-controller-manager and kube-apiserver need to have the `CronJobTimeZone`
+ feature gate turned on for this feature to fully work.
+ - Impact of its outage on the feature:
+ CronJob's TimeZone functionality will not work.
+ - Impact of its degraded performance or high-error rates on the feature:
+ Delays in creating new Jobs.
+
+ - TimeZone package
+ - Usage description: CronJob's TimeZone support relies on an external TimeZone package;
+ if it is missing, golang's internal package is used instead.
+ - Impact of its outage on the feature:
+ TimeZone functionality will not work.
+ - Impact of its degraded performance or high-error rates on the feature:
+ Delays in creating new Jobs.
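
As a hedged illustration of the fallback described in the TimeZone package entry, the Go sketch below resolves a time zone name with `time.LoadLocation` and embeds Go's bundled database through the standard `time/tzdata` package, which is consulted when no system copy is found. The helper `validateTimeZone` and the sample zone names are hypothetical, and whether the CronJob controller wires the fallback exactly this way is an assumption.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embeds a copy of the tz database, used when no system copy is found
)

// validateTimeZone is a hypothetical helper: it reports whether the given
// name can be resolved to a time zone by the Go standard library.
func validateTimeZone(name string) error {
	_, err := time.LoadLocation(name)
	return err
}

func main() {
	// "Not/AZone" is a deliberately invalid name used for illustration.
	for _, tz := range []string{"Europe/Warsaw", "Not/AZone"} {
		if err := validateTimeZone(tz); err != nil {
			fmt.Printf("%q: invalid time zone: %v\n", tz, err)
			continue
		}
		fmt.Printf("%q: ok\n", tz)
	}
}
```

With the blank import in place, a valid name resolves even on an image that ships no system time zone data, while an unknown name still returns an error.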
### Scalability
@@ -386,14 +404,20 @@ We're not using it, yet.
###### What are other known failure modes?
- - [ Incorrect TimeZone]
+ - Incorrect TimeZone
- Detection: `UnknownTimeZone` events being reported for a CronJob.
- Mitigations: Fix the TimeZone or suspend a CronJob.
- Diagnostics: Logs containing the `TimeZone` phrase.
- Testing: A set of unit tests ensures that an invalid TimeZone is properly
handled both in the apiserver and in the controller itself, reporting
the problem to the user.
-
+ - Job creation problems
+ - Detection: The 99th percentile of `cron_job_creation_skew` exceeds the expected 15s over a day.
+ - Mitigations: Disable the `CronJobTimeZone` feature gate.
+ - Diagnostics: Check logs from the CronJob controller.
+ - Testing: A set of unit tests ensures that an invalid TimeZone is properly
+ handled both in the apiserver and in the controller itself, reporting
+ the problem to the user.
###### What steps should be taken if SLOs are not being met to determine the problem?