Better retries when a sub-agent fails by derekmisler · Pull Request #75 · docker/cagent-action

derekmisler · 2026-03-06T17:01:28Z

Summary

When a sub-agent fails (e.g., due to API overload), the root agent can recover and still post a review. Previously, however, the pipeline treated any non-zero exit code as a hard failure: it posted a redundant "Review Failed" comment and could retry the entire pipeline, producing duplicate reviews. This PR disables pipeline-level retries and adds smarter exit-code handling to distinguish a true failure from a partial success.

Changes

review-pr/action.yml — disable pipeline retries: Sets max-retries: "0" on the review step with an explanatory comment, since the root agent already recovers internally when sub-agents fail and retrying the pipeline produces duplicate reviews.
review-pr/action.yml — smarter failure detection: Exposes verbose-log-file from the review step output and, on a non-zero exit code, checks whether a pull request review was actually posted (by grepping for pullrequestreview-[0-9]+ in the log). If a review was found, the pipeline reports ⚠️ Review completed with warnings instead of ❌ Review failed and skips posting the fallback failure comment.
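
The failure-detection logic described above could be sketched roughly as follows. This is a minimal illustration, not the actual contents of review-pr/action.yml: the function name `classify_review_outcome`, its arguments, and the `success`/`warning`/`failure` labels are hypothetical. Only the `pullrequestreview-[0-9]+` pattern and the warning-versus-failure distinction come from the description.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): classify a review run from its
# exit code and the path to its verbose log file.
# Usage: classify_review_outcome <exit_code> <verbose_log_file>
classify_review_outcome() {
  local exit_code="$1"
  local log_file="$2"

  # A zero exit code is a plain success; no log inspection is needed.
  if [ "$exit_code" -eq 0 ]; then
    echo "success"
    return 0
  fi

  # Non-zero exit: a review ID in the log suggests the root agent
  # still managed to post a review despite the sub-agent failure.
  if [ -f "$log_file" ] && grep -qE 'pullrequestreview-[0-9]+' "$log_file"; then
    echo "warning"   # report "Review completed with warnings"; skip the fallback comment
  else
    echo "failure"   # report "Review failed"; post the fallback failure comment
  fi
}
```

A caller would pass the review step's exit code and the path exposed through verbose-log-file, then branch on the printed label. A missing or unreadable log is treated as a failure here, which is an assumption of this sketch rather than documented behavior.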

How to Test

Simulate a sub-agent failure (e.g., trigger an API overload mid-review) and confirm the pipeline exits with a warning status rather than posting a "Review Failed" comment when the root agent successfully posted a review.
Confirm that a genuine total failure (no review posted, non-zero exit) still posts the ❌ PR Review Failed comment to the PR as before.
Verify no duplicate reviews appear on a PR when a sub-agent fails and the root agent recovers.

Signed-off-by: Derek Misler <derek.misler@docker.com>

derekmisler · 2026-03-06T17:02:02Z

/describe

docker-agent · 2026-03-06T17:02:36Z

✅ PR description has been generated and updated!

docker-agent

Review Summary

Assessment: 🟢 APPROVE

This PR adds intelligent error recovery to handle cases where a sub-agent fails but the root agent successfully posts a review anyway. The changes are well-reasoned:

Key improvements:

✅ Disables retries (max-retries: "0") at the pipeline level since the review agent has internal recovery logic
✅ Adds detection for reviews posted despite non-zero exit codes by checking for pullrequestreview-[0-9]+ in verbose logs
✅ Treats this scenario as a "partial success" with a warning rather than a full failure
✅ Only posts error comments when no review was actually posted

Review notes:

The grep pattern pullrequestreview-[0-9]+ is specific enough to avoid false positives in practice — it matches GitHub's review ID format which only appears when the API successfully returns a review object
The error handling logic correctly distinguishes between "failed with recovery" and "failed without recovery" scenarios
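
To make the point about pattern specificity concrete, here is a small check of what the regular expression does and does not match. The sample lines are invented for illustration and the helper name `matches_review_id` is hypothetical; real verbose-log content may look different.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given line contains a review ID
# in the form pullrequestreview-<digits>.
matches_review_id() {
  printf '%s\n' "$1" | grep -qE 'pullrequestreview-[0-9]+'
}

# A review URL fragment with a numeric ID matches.
matches_review_id 'https://github.com/owner/repo/pull/75#pullrequestreview-1234567890' && echo "match"

# The bare prefix with no digits after the hyphen does not match.
matches_review_id 'creating pullrequestreview- (no ID yet)' || echo "no match"

# Ordinary error text that merely mentions a review does not match.
matches_review_id 'pull request review failed: overloaded' || echo "no match"
```

Because the pattern requires at least one digit after the hyphen, free-form error text about a "pull request review" is not enough to trigger the partial-success path.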

No bugs found in the changed code. This improves the robustness of the review workflow.

Better retries when a sub-agent fails

ed19d25

Signed-off-by: Derek Misler <derek.misler@docker.com>

derekmisler force-pushed the better-retries branch from 383fb0b to ed19d25 Compare March 6, 2026 17:01

derekmisler changed the title ~~Better retries when a sub-agent failes~~ Better retries when a sub-agent fails Mar 6, 2026

derekmisler marked this pull request as ready for review March 6, 2026 17:02

derekmisler requested a review from a team as a code owner March 6, 2026 17:02

docker-agent bot reviewed Mar 6, 2026

krissetto approved these changes Mar 6, 2026

derekmisler merged commit 68c6c67 into docker:main Mar 6, 2026
12 checks passed

derekmisler merged 1 commit into docker:main from derekmisler:better-retries
