@igorhrcek

Problem
For cron-scheduled jobs, the scheduler used last_exec_time_sec to compute the next run. That timestamp could be stale (e.g., after restarts, long delays, or missed executions), leading to:

  • Negative delays: last_exec_time_sec + next_delay - current_time could be negative, causing immediate or incorrect execution.
  • Incorrect timing: the next run was based on a past timestamp instead of the current time.
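
To make the negative-delay case concrete, here is a minimal sketch of the arithmetic described above, using made-up numbers. The function name old_style_delay and the timestamps are hypothetical and chosen only for illustration; this is not the scheduler's actual code.

```python
def old_style_delay(last_exec_time_sec: float, next_delay: float, current_time: float) -> float:
    """Delay computed from the last recorded execution time (the old behavior)."""
    return last_exec_time_sec + next_delay - current_time


# Hypothetical numbers: the job last ran at t=1000 s and the next run is 300 s later,
# but the runner was down and only comes back at t=2000 s.
delay = old_style_delay(last_exec_time_sec=1000.0, next_delay=300.0, current_time=2000.0)
print(delay)  # -700.0: the computed delay is negative
```

When the runner checks in soon after the last execution (for example at t=1100 s), the same formula gives a sensible positive delay of 200 s, which is why the problem only appears once the timestamp has gone stale.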

This PR solves the following issues:

  1. All scheduled playbooks executing 120s after a Robusta restart.
  2. Scheduled playbooks executing at seemingly random times outside their schedule.

This patch has been running in my production environment for 2 weeks, and I haven't observed any issues.

Solution
The fix changes cron delay calculation to:

  • Use current time as the base: croniter(job.scheduling_params.cron_expression, now).get_next() - now computes the delay from now to the next occurrence.
  • Ignore stale last_exec_time_sec for cron schedules.
  • Preserve deployment protection: for new jobs (JobStatus.NEW), still enforce INITIAL_SCHEDULE_DELAY_SEC (120 seconds) to avoid race conditions during deployments when old and new runners might both see the job.
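
The three points above can be sketched roughly as follows. This is an illustrative approximation, not the code in scheduler.py: calc_cron_delay, the next_occurrence callable, the every_five_minutes helper, and the simplified JobStatus enum (including its RUNNING member) are assumptions made so the example runs without croniter installed.

```python
import time
from enum import Enum
from typing import Callable, Optional

INITIAL_SCHEDULE_DELAY_SEC = 120  # value stated in the description above


class JobStatus(Enum):  # simplified stand-in for the real JobStatus model
    NEW = 1
    RUNNING = 2  # assumed name for a non-NEW state


def calc_cron_delay(
    status: JobStatus,
    next_occurrence: Callable[[float], float],
    now: Optional[float] = None,
) -> float:
    """Seconds to wait until the next cron occurrence, measured from `now`."""
    if now is None:
        now = time.time()
    # Base the delay on the current time, never on last_exec_time_sec.
    next_delay = next_occurrence(now) - now
    if status == JobStatus.NEW:
        # New jobs still wait at least the initial deployment delay.
        return max(next_delay, INITIAL_SCHEDULE_DELAY_SEC)
    return next_delay


# Hypothetical stand-in for croniter: fires at every multiple of 300 s.
def every_five_minutes(now: float) -> float:
    return (now // 300 + 1) * 300


print(calc_cron_delay(JobStatus.RUNNING, every_five_minutes, now=1250.0))  # 250.0
print(calc_cron_delay(JobStatus.RUNNING, every_five_minutes, now=1450.0))  # 50.0
print(calc_cron_delay(JobStatus.NEW, every_five_minutes, now=1450.0))      # 120
```

With croniter available, next_occurrence could be something like lambda now: croniter(cron_expression, now).get_next(), which mirrors the expression quoted in the first bullet. Because the next occurrence is always strictly after now, the computed delay cannot be negative.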

Impact

  • Prevents negative delays and incorrect immediate runs.
  • Ensures cron jobs run at the correct time relative to the current moment.
  • Maintains the initial delay protection for new jobs during deployments.

@coderabbitai

coderabbitai bot commented Dec 6, 2025

Walkthrough

Modifies __calc_job_delay_for_next_run in the scheduler to implement explicit cron-based delay calculation for CronScheduleRepeat. For new jobs, applies minimum initial delay; for other statuses, returns computed next_delay directly. Removes reliance on last_exec_time_sec for cron scheduling.

Changes

Cohort / File(s) | Change Summary
Cron delay calculation logic (src/robusta/core/schedule/scheduler.py) | Refactors __calc_job_delay_for_next_run to add explicit cron-based delay path using croniter; applies minimum INITIAL_SCHEDULE_DELAY_SEC for CronScheduleRepeat with NEW state; returns direct computed delay for non-NEW states; preserves existing behavior for non-cron types; removes last_exec_time_sec dependency.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Requires verification that all scheduling type branches (CronScheduleRepeat, DynamicDelayRepeat, etc.) maintain correct behavior
  • Edge case validation needed for NEW vs. non-NEW job states to ensure timing assumptions hold
  • Impact on initial deployment timing and potential race condition protections should be verified


Suggested reviewers

  • Sheeproid

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title directly addresses the main issue fixed: random scheduling of playbooks caused by stale timestamp calculations in cron-based scheduling.
Description check | ✅ Passed | The description clearly explains the problem with stale last_exec_time_sec, the specific issues it caused, the solution implemented, and the positive impact, all directly related to the changeset.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/robusta/core/schedule/scheduler.py (1)

157-159: Dead code: CronScheduleRepeat branch is now unreachable.

Since all CronScheduleRepeat jobs now return early (line 143 for NEW, line 145 otherwise), this branch can never execute. Remove it to avoid confusion.

         if isinstance(job.scheduling_params, DynamicDelayRepeat):
             next_delay_idx = min(job.state.exec_count, len(job.scheduling_params.delay_periods) - 1)
             next_delay = job.scheduling_params.delay_periods[next_delay_idx]
-        elif isinstance(job.scheduling_params, CronScheduleRepeat):
-            now = time.time()
-            next_delay = croniter(job.scheduling_params.cron_expression, now).get_next() - now
         else:  # FIXED_DELAY_REPEAT type
             next_delay = job.scheduling_params.seconds_delay
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f7a22f7 and 626a8ea.

📒 Files selected for processing (1)
  • src/robusta/core/schedule/scheduler.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/robusta/core/schedule/scheduler.py (1)
src/robusta/core/schedule/model.py (3)
  • CronScheduleRepeat (25-26)
  • JobStatus (7-10)
  • DynamicDelayRepeat (21-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run_tests
🔇 Additional comments (2)
src/robusta/core/schedule/scheduler.py (2)

135-145: LGTM! Clean fix for the stale timestamp issue.

The cron-specific path correctly computes delay from current time using croniter, avoiding the stale last_exec_time_sec problem. The max() guard for NEW jobs preserves deployment race condition protection while allowing running jobs to execute at their correct scheduled times.


147-152: LGTM!

The refactored NEW job handling for non-cron types correctly preserves the original behavior.
