-
Notifications
You must be signed in to change notification settings - Fork 4
Fix runner workload readiness logic #260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughRemoved version-based waiting from the Run command, eliminated the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Organization UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used🧠 Learnings (2)📓 Common learnings📚 Learning: 2024-11-01T13:56:06.103ZApplied to files:
🧬 Code graph analysis (2)lib/command/run.rb (2)
lib/core/controlplane.rb (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
🔇 Additional comments (6)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Code Review - PR #260: Fix runner workload readiness logicSummaryThis PR addresses a critical issue where ✅ Strengths
🔍 Potential Concerns & Questions1. Error Handling & Timeout BehaviorThe
2. Behavior Change for Non-Suspended Workloads
3. Missing Context in Error MessagesThe old implementation had explicit step messages:
With 4. Potential Race ConditionThe code flow is now: create_runner_workload if cp.fetch_workload(runner_workload).nil?
update_runner_workload # Uses --ready flag
start_job
5. Test Coverage GapLooking at
📝 Recommendations
context "when runner workload is a suspended cron" do
# Test that update completes without hanging
end
# Use --ready flag to wait for resource to be ready before returning.
# This handles suspended cron workloads that don't update lastProcessedVersion.
cmd = "cpln apply #{gvc_org} --file #{f.path} --ready"
🎯 Overall AssessmentVerdict: ✅ Approve with minor suggestions This is a solid improvement that:
The main concerns are around edge cases and ensuring
Great work identifying and fixing this issue! 🚀 |
4dd0605 to
3dfa17b
Compare
Pull Request Review: Fix runner workload readiness logicOverviewThis PR addresses a critical bug where the run command could hang indefinitely when updating suspended cron workloads. The fix simplifies the logic by leveraging the --ready flag in cpln apply instead of manually polling for deployment status. Code Quality and Best PracticesStrengths:
Suggestions:
Potential Bugs and IssuesCritical Observation:
Concern - Error Handling: Recommendation: Consider documenting the timeout behavior of cpln apply --ready or adding logging to help debug if this blocks longer than expected. Performance ConsiderationsImprovements:
Potential concerns:
Security ConcernsNo security issues identified. The changes are purely related to deployment synchronization logic. Test CoverageGood news: Observations:
Recommendation: Consider adding a specific test case for the bug this PR fixes to catch potential regressions. Additional Observations
SummaryThis is a well-executed bug fix that simplifies the code while solving a real problem. The approach of using cpln apply --ready is much cleaner than manual polling. Action items before merge:
Overall assessment: Approved - This PR significantly improves reliability and code quality. The minor cleanup items are nice-to-haves but not blockers. Great work on identifying and fixing this subtle bug! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
CHANGELOG.mdlib/command/run.rblib/core/controlplane.rb
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: rafaelgomesxyz
Repo: shakacode/control-plane-flow PR: 183
File: examples/controlplane.yml:131-133
Timestamp: 2024-10-08T20:53:47.076Z
Learning: The `pre_deploy_hook` in the `examples/controlplane.yml` can be a command, not just a script path.
Learnt from: rafaelgomesxyz
Repo: shakacode/control-plane-flow PR: 183
File: examples/controlplane.yml:131-133
Timestamp: 2024-07-27T10:08:35.868Z
Learning: The `pre_deploy_hook` in the `examples/controlplane.yml` can be a command, not just a script path.
🧬 Code graph analysis (1)
lib/core/controlplane.rb (1)
lib/core/shell.rb (1)
cmd(66-73)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: claude-review
- GitHub Check: RSpec (Fast) / rspec
🔇 Additional comments (3)
CHANGELOG.md (1)
17-20: LGTM!The changelog entry clearly documents the bug fix and follows the proper format.
lib/command/run.rb (2)
100-100: LGTM!The removal of
expected_deployed_versionfrom the attribute reader is consistent with the elimination of explicit deployment version tracking. The new--readyflag approach inapply_templatenow handles readiness implicitly.
195-244: LGTM!The simplified update logic correctly relies on the
--readyflag behavior added toapply_templateinlib/core/controlplane.rb. The removal of explicit wait loops should resolve the indefinite hang issue described in the PR objectives, assuming the--readyflag works correctly for suspended cron workloads.
lib/core/controlplane.rb
Outdated
| f.write(data) | ||
| f.rewind | ||
| cmd = "cpln apply #{gvc_org} --file #{f.path}" | ||
| cmd = "cpln apply #{gvc_org} --file #{f.path} --ready" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
cpln apply --ready flag documentation Control Plane CLI
💡 Result:
I couldn't find any documented --ready flag for the cpln apply command in the official Control Plane CLI docs — the cpln apply guide and CLI reference list flags like --file, --gvc, --k8s and stdin usage but make no mention of --ready. [1][2]
If you meant a different command or a newer CLI version, tell me the version (cpln --version) and I’ll re-check the release notes and docs for that version.
Sources:
- cpln apply (Apply YAML Manifests) guide. [1]
- Control Plane CLI reference / overview. [2]
🏁 Script executed:
# Find cpln version specification in the project
find . -name "*.gemfile" -o -name "Gemfile*" -o -name "*.yml" -o -name "*.yaml" | head -20Repository: shakacode/control-plane-flow
Length of output: 792
🏁 Script executed:
# Search for references to --ready flag in the codebase
rg "\-\-ready" --type rbRepository: shakacode/control-plane-flow
Length of output: 98
🏁 Script executed:
# Look for lastProcessedVersion references
rg "lastProcessedVersion" --type rb -iRepository: shakacode/control-plane-flow
Length of output: 98
🏁 Script executed:
# Check the suspended cron workload reference at lib/command/run.rb line 173
head -n 180 lib/command/run.rb | tail -n 20Repository: shakacode/control-plane-flow
Length of output: 789
🏁 Script executed:
# Check Gemfile for cpln CLI version specification
cat GemfileRepository: shakacode/control-plane-flow
Length of output: 482
🏁 Script executed:
# Search for --ready flag in all Ruby files (fixing the rg command)
rg "\-\-ready" .Repository: shakacode/control-plane-flow
Length of output: 159
🏁 Script executed:
# Search for lastProcessedVersion references
rg "lastProcessedVersion" .Repository: shakacode/control-plane-flow
Length of output: 184
🏁 Script executed:
# Check for any cpln version or CLI setup
rg "cpln" Gemfile Gemfile.lock 2>/dev/null || echo "No cpln in Gemfile"Repository: shakacode/control-plane-flow
Length of output: 90
🏁 Script executed:
# Look for CLI documentation or version info in the project
find . -name "*.md" -type f | xargs grep -l "ready" 2>/dev/null | head -5Repository: shakacode/control-plane-flow
Length of output: 159
🏁 Script executed:
# Check README for cpln version requirements
grep -i "cpln" README.md | head -20Repository: shakacode/control-plane-flow
Length of output: 2009
🏁 Script executed:
# Check docs for cpln version info
grep -i "cpln\|version\|ready" docs/*.md | grep -i "ready\|version" | head -10Repository: shakacode/control-plane-flow
Length of output: 945
🏁 Script executed:
# Look at the context around lastProcessedVersion in controlplane.rb
grep -B 10 -A 10 "lastProcessedVersion" lib/core/controlplane.rbRepository: shakacode/control-plane-flow
Length of output: 701
🏁 Script executed:
# Check if there's any PR description or CHANGELOG mentioning --ready
find . -name "CHANGELOG*" -o -name "*.md" | xargs grep -l "ready" 2>/dev/nullRepository: shakacode/control-plane-flow
Length of output: 188
🏁 Script executed:
# Look for any .git directory and check recent commits mentioning --ready or lastProcessedVersion
git log --all --oneline --grep="ready\|lastProcessedVersion" 2>/dev/null | head -20Repository: shakacode/control-plane-flow
Length of output: 109
🏁 Script executed:
# Search for any issue references or PR discussions
find . -name "*.md" -o -name "*.txt" | xargs grep -i "ready.*flag\|--ready" 2>/dev/nullRepository: shakacode/control-plane-flow
Length of output: 56
🏁 Script executed:
# Check if there's version history or release notes
find . -name "CHANGELOG*" -o -name "HISTORY*" -o -name "RELEASES*" 2>/dev/nullRepository: shakacode/control-plane-flow
Length of output: 86
🏁 Script executed:
# Look at surrounding code context - check what changed around line 411
sed -n '400,425p' lib/core/controlplane.rbRepository: shakacode/control-plane-flow
Length of output: 857
🏁 Script executed:
# Search for any control plane API or CLI documentation URLs in the codebase
rg "control.*plane.*cli|cpln.*doc|api.*reference" -i . 2>/dev/null | head -10Repository: shakacode/control-plane-flow
Length of output: 959
🏁 Script executed:
# Check CHANGELOG for --ready flag or recent cpln changes
cat CHANGELOG.md | head -50Repository: shakacode/control-plane-flow
Length of output: 3075
🏁 Script executed:
# Also search CHANGELOG for lastProcessedVersion or ready-related changes
grep -i "ready\|lastProcessedVersion\|cpln apply" CHANGELOG.md | head -20Repository: shakacode/control-plane-flow
Length of output: 54
The --ready flag is not documented in the official Control Plane CLI, and its availability remains unverified.
The web search found no mention of this flag in the official cpln apply documentation. Additionally:
- No cpln version constraint exists in the project dependencies to specify when this flag became available
- No CHANGELOG entries document the
--readyflag or its behavior - The flag appears only in this one location, with no error handling or fallback logic if it fails
- How this flag specifically handles the
lastProcessedVersionissue for suspended cron workloads is unclear
Before merging, confirm:
- Which cpln CLI version introduced the
--readyflag - That the flag is available in the project's minimum supported cpln version
- That it correctly resolves the
lastProcessedVersionpolling issue for suspended cron workloads (as referenced in thecron_workload_deployed_versionmethod in this same file)
3dfa17b to
682689c
Compare
Code Review for PR #260: Fix runner workload readiness logicSummaryThis PR simplifies the runner workload update logic by leveraging the ✅ Strengths
🔍 Code Qualitylib/command/run.rb (line 100)
lib/core/controlplane.rb (line 412) cmd += " --ready" unless ENV.fetch("DISABLE_APPLY_READY", nil)
spec/spec_helper.rb (line 7)
🐛 Potential Issues1. Broader Impact on All
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
CHANGELOG.mdlib/command/run.rblib/core/controlplane.rbspec/spec_helper.rb
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: rafaelgomesxyz
Repo: shakacode/control-plane-flow PR: 183
File: examples/controlplane.yml:131-133
Timestamp: 2024-10-08T20:53:47.076Z
Learning: The `pre_deploy_hook` in the `examples/controlplane.yml` can be a command, not just a script path.
Learnt from: rafaelgomesxyz
Repo: shakacode/control-plane-flow PR: 183
File: examples/controlplane.yml:131-133
Timestamp: 2024-07-27T10:08:35.868Z
Learning: The `pre_deploy_hook` in the `examples/controlplane.yml` can be a command, not just a script path.
📚 Learning: 2024-11-01T13:43:58.455Z
Learnt from: zzaakiirr
Repo: shakacode/control-plane-flow PR: 240
File: spec/core/terraform_config/generator_spec.rb:337-345
Timestamp: 2024-11-01T13:43:58.455Z
Learning: In `spec/core/terraform_config/generator_spec.rb`, prefer to keep configuration values inline in the tests rather than extracting them into named constants.
Applied to files:
spec/spec_helper.rb
📚 Learning: 2024-11-01T13:55:15.053Z
Learnt from: zzaakiirr
Repo: shakacode/control-plane-flow PR: 240
File: spec/core/terraform_config/workload_spec.rb:24-409
Timestamp: 2024-11-01T13:55:15.053Z
Learning: The user prefers to keep tests in `spec/core/terraform_config/workload_spec.rb` written without shared examples, as they find the current version easier to read.
Applied to files:
spec/spec_helper.rb
📚 Learning: 2024-11-01T11:39:47.992Z
Learnt from: zzaakiirr
Repo: shakacode/control-plane-flow PR: 240
File: lib/command/terraform/generate.rb:28-36
Timestamp: 2024-11-01T11:39:47.992Z
Learning: In `lib/command/terraform/generate.rb`, the `generate_app_config` method uses `File.write` with `mode: "a+"` because multiple templates with the same kind can appear, and appending to existing files is necessary.
Applied to files:
lib/core/controlplane.rb
🧬 Code graph analysis (1)
lib/core/controlplane.rb (1)
lib/core/shell.rb (1)
cmd(66-73)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: claude-review
- GitHub Check: RSpec (Fast) / rspec
🔇 Additional comments (3)
spec/spec_helper.rb (1)
7-7: LGTM!Disabling the
--readyflag in tests is appropriate. This prevents tests from blocking while waiting for resource readiness and maintains existing test behavior.lib/command/run.rb (2)
100-100: LGTM!The removal of
expected_deployed_versionfrom the public interface is correct, as version tracking is no longer needed with the--readyflag handling deployment readiness.
128-130: Ensure--readyflag handles suspended workloads correctly.The explicit wait logic for runner workload deployment and updates has been removed, relying instead on the
--readyflag incpln applyto handle readiness. However, the runner workload is created as suspended (line 173 setssuspend: true), and the PR description mentions that suspended cron workloads don't always updatelastProcessedVersion.Verify that:
- The
--readyflag returns successfully for suspended workloads (doesn't wait indefinitely for a workload that will never become "ready" while suspended)- The workload is in the correct state when
start_jobexecutes at line 130This verification is related to the one requested in
lib/core/controlplane.rb.
lib/core/controlplane.rb
Outdated
| f.write(data) | ||
| f.rewind | ||
| cmd = "cpln apply #{gvc_org} --file #{f.path}" | ||
| cmd += " --ready" unless ENV.fetch("DISABLE_APPLY_READY", nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify cpln CLI supports --ready flag and check its behavior
# Check if --ready flag is documented in cpln apply help
echo "=== Checking cpln apply help for --ready flag ==="
cpln apply --help | grep -i "ready" || echo "No --ready flag found in help output"
# Search codebase for cpln version requirements or constraints
echo -e "\n=== Checking for cpln version specifications ==="
rg -n "cpln.*version|CPLN.*VERSION" -g '!*.md' -g '!CHANGELOG*'
# Check for any references to cpln in installation/setup files
echo -e "\n=== Checking for cpln installation or version specs ==="
find . -maxdepth 2 \( -name "*.gemfile" -o -name "Gemfile*" -o -name "*.yaml" -o -name "*.yml" -o -name "*.json" \) -type f | head -20Repository: shakacode/control-plane-flow
Length of output: 1676
🌐 Web query:
Control Plane CLI cpln apply --ready flag documentation and behavior with suspended cron workloads
💡 Result:
Short answer: cpln apply --ready waits for created/updated workloads to reach a "ready" state; because a suspended workload (including a suspended cron whose replicas are 0) will not create running replicas, it will not become ready—so using --ready against a suspended cron will block until the CLI's wait times out (or indefinitely if no timeout is supplied). [1][2]
What to do:
- Remove the workload's suspend (unsuspend/start it) before running cpln apply --ready, or
- Omit --ready when applying resources you intend to keep suspended, or
- Unsuspend as a separate step after apply and then wait for readiness. [2][1]
Sources:
- cpln apply / release note adding --ready flag. [1]
- Workload "Suspend" behavior (suspend sets scale to 0 / stops serving traffic). [2]
Critical issue: --ready flag will hang indefinitely when applying suspended cron workloads.
The --ready flag does exist in the cpln apply command and waits for workloads to reach a "ready" state. However, suspended cron workloads (with replicas set to 0) will never become ready, causing cpln apply --ready to block indefinitely or until timeout.
The current code at line 412 unconditionally adds the --ready flag unless DISABLE_APPLY_READY is explicitly set. This creates a significant risk: users applying configurations with suspended cron workloads will experience hangs.
Required solutions:
- Detect suspended workloads and conditionally omit
--ready, or - Document this behavior prominently and require explicit
DISABLE_APPLY_READYfor workflows with suspended resources, or - Implement a timeout mechanism with clear user messaging.
682689c to
5673b97
Compare
Pull Request Review: Fix runner workload readiness logicSummaryThis PR addresses a critical bug where the Code Quality ✅Strengths:
Minor observations:
Potential Issues 🔍1. Test Coverage GapThe PR adds Recommendation: Consider adding at least one test that exercises the
2. Environment Variable PatternThe use of # Current
cmd += " --ready" if wait && ENV.fetch("DISABLE_APPLY_READY", nil).nil?
# Alternative (more explicit)
cmd += " --ready" if wait && ENV["DISABLE_APPLY_READY"] != "true"Both work, but the second is slightly clearer about intent. Not a blocker. 3. Backward CompatibilityThe change fundamentally alters when
This is the intended fix, but consider:
Recommendation: Verify that Performance Considerations ⚡Positive impact:
Potential concern:
Security Concerns 🔒No security issues identified. The change:
Test Coverage 📊Current state:
Missing coverage:
Architecture Considerations 🏗️Good decisions:
Question:
Recommendations SummaryHigh Priority:
Medium Priority:
Low Priority (Nice to have):
Conclusion ✨This is a well-executed fix that simplifies the code while solving a critical bug. The main concern is the lack of test coverage for the new behavior due to Overall Assessment: Approve with recommendation to add test coverage for the new Great work on identifying and fixing this issue! 🎉 |
Summary
--readyflag tocpln applyto ensure resources are ready before proceedingwait_for_runner_workload_deploycheck (now handled by--ready)Problem
cpln applyfor suspended cron workloads doesn't always updatelastProcessedVersionin deployment status, causing the wait loop to hang indefinitely.Solution
Use
cpln apply --readyflag which waits for the resource to be ready before returning, making the explicit wait check unnecessary.Summary by CodeRabbit
Bug Fixes
New Behavior
Chores / Tests
✏️ Tip: You can customize this high-level summary in your review settings.