
Commit 50f3baf

justin808 and claude authored
Fix CI safety check to evaluate latest workflow attempt (#2062)
## Summary

Fixes the `ensure-master-docs-safety` GitHub Action to check the **latest attempt** of workflow runs instead of the overall run conclusion. This prevents false positives when workflows are manually re-run and succeed.

## Problem

The safety check was blocking docs-only commits from skipping CI when previous master commits had workflows marked as "failed", even if those workflows had been successfully re-run. This happened because:

1. GitHub marks a workflow run as "failed" even after a manual rerun succeeds
2. The `run.conclusion` field is never updated to "success" when reruns succeed
3. The safety check was only looking at `run.conclusion`, not the actual latest attempt

This created a situation where:

- Commit A fails a workflow
- The workflow is manually re-run and succeeds
- Commit B (docs-only) is blocked because commit A's workflow is still marked as "failed"
- This continues indefinitely, blocking all subsequent commits

## Solution

Modified the action to:

1. Fetch the jobs for each workflow run via the GitHub API
2. Find the maximum `run_attempt` number to identify the latest attempt
3. Filter jobs to only those from the latest attempt
4. Check if any jobs in the **latest attempt** have failed conclusions
5. Only block docs-only commits if the latest attempt has actual failures

This allows the safety check to correctly recognize when failures have been resolved via manual reruns, while still preventing docs-only skips when there are genuine unresolved failures.
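The steps listed under Solution can be sketched as a small pure function over the job objects returned by the API. This is a minimal illustration, not the code in the action itself: the helper name `latestAttemptHasFailure` and the constant `FAILING_CONCLUSIONS` are hypothetical, and the sample job objects carry only the two fields the check reads (`run_attempt` and `conclusion`).

```javascript
// Conclusions treated as blocking, matching the list used in the action.
const FAILING_CONCLUSIONS = ['failure', 'timed_out', 'cancelled', 'action_required'];

// Hypothetical helper: returns true when the newest attempt found in `jobs`
// contains at least one job with a blocking conclusion.
function latestAttemptHasFailure(jobs) {
  if (jobs.length === 0) {
    // No jobs found: treated as incomplete, so the run counts as blocking.
    return true;
  }

  // The highest run_attempt value identifies the latest attempt.
  const latestAttempt = Math.max(...jobs.map((job) => job.run_attempt));

  // Only jobs from that attempt decide the outcome.
  return jobs
    .filter((job) => job.run_attempt === latestAttempt)
    .some((job) => FAILING_CONCLUSIONS.includes(job.conclusion));
}

// Attempt 1 failed, attempt 2 (a manual rerun) succeeded.
const sampleJobs = [
  { name: 'test', run_attempt: 1, conclusion: 'failure' },
  { name: 'test', run_attempt: 2, conclusion: 'success' },
];

console.log(latestAttemptHasFailure(sampleJobs)); // prints false
```

Keeping the decision in a function that takes plain job objects makes it possible to exercise the rerun cases without calling the GitHub API.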
## Test Plan

- [x] Code changes reviewed for correctness
- [x] Linting passes locally
- [ ] CI passes on this PR (will verify the fix works)
- [ ] After merge, verify that docs-only commits are no longer blocked by previously-fixed workflow failures

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary by CodeRabbit

* **Chores**
  * Improved the accuracy of documentation safety checks by enhancing how workflow failures are detected and assessed, ensuring more precise identification of build issues during the latest workflow attempts.

Co-authored-by: Claude <[email protected]>
1 parent beb70f0 commit 50f3baf

File tree: 1 file changed (+48 -13 lines changed)
  • .github/actions/ensure-master-docs-safety

.github/actions/ensure-master-docs-safety/action.yml

Lines changed: 48 additions & 13 deletions
@@ -70,6 +70,10 @@ runs:
 // Use run_number as tiebreaker since created_at might be identical for rapid reruns.
 // Note: If workflows are manually re-run out of order, we use the highest run_number
 // which represents the most recent attempt, regardless of trigger order.
+//
+// IMPORTANT: We need to fetch the latest attempt for each run, not just the run conclusion.
+// GitHub marks a run as "failed" even if a rerun succeeded, so we must check the actual
+// latest attempt to see if the failure was resolved.
 const latestByWorkflow = new Map();
 for (const run of workflowRuns) {
   const existing = latestByWorkflow.get(run.workflow_id);
@@ -100,19 +104,50 @@ runs:
   return;
 }
 
-// Check for workflows that failed on the previous commit.
-// We treat these conclusions as failures:
-// - 'failure': Obvious failure case
-// - 'timed_out': Infrastructure or performance issue that should be investigated
-// - 'cancelled': Might indicate timeout, CI infrastructure issues, or manual intervention needed
-// Being conservative here prevents a green checkmark when the previous commit
-// might have real issues that weren't fully validated
-// - 'action_required': Requires manual intervention
-// We treat 'skipped' and 'neutral' as non-blocking since they indicate
-// intentional skips or informational-only workflows.
-const failingRuns = Array.from(latestByWorkflow.values()).filter((run) => {
-  return ['failure', 'timed_out', 'cancelled', 'action_required'].includes(run.conclusion);
-});
+// For each workflow run, fetch the jobs to check the latest attempt's conclusion.
+// GitHub's run.conclusion reflects the overall run, but if a run was re-run and succeeded,
+// we want to consider that success, not the original failure.
+const failingRuns = [];
+
+for (const run of Array.from(latestByWorkflow.values())) {
+  // Fetch jobs for this run to check the latest attempt
+  const jobsResponse = await github.rest.actions.listJobsForWorkflowRun({
+    owner: context.repo.owner,
+    repo: context.repo.repo,
+    run_id: run.id,
+    per_page: 100
+  });
+
+  const jobs = jobsResponse.data.jobs;
+
+  if (jobs.length === 0) {
+    // No jobs found - treat as incomplete
+    failingRuns.push(run);
+    continue;
+  }
+
+  // Get the maximum run_attempt number to find the latest attempt
+  const latestAttempt = Math.max(...jobs.map(job => job.run_attempt));
+
+  // Get all jobs from the latest attempt
+  const latestJobs = jobs.filter(job => job.run_attempt === latestAttempt);
+
+  // Check if any job in the latest attempt has failed
+  // We treat these conclusions as failures:
+  // - 'failure': Obvious failure case
+  // - 'timed_out': Infrastructure or performance issue that should be investigated
+  // - 'cancelled': Might indicate timeout, CI infrastructure issues, or manual intervention needed
+  // - 'action_required': Requires manual intervention
+  // We treat 'skipped' and 'neutral' as non-blocking since they indicate
+  // intentional skips or informational-only workflows.
+  const hasFailedJob = latestJobs.some(job =>
+    ['failure', 'timed_out', 'cancelled', 'action_required'].includes(job.conclusion)
+  );
+
+  if (hasFailedJob) {
+    failingRuns.push(run);
+  }
+}
 
 if (failingRuns.length === 0) {
   core.info(`Previous master commit ${previousSha} completed without failures. Docs-only skip allowed.`);
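
The job lookup in this hunk requests a single page of up to 100 jobs, so a run with more jobs than that would be checked only partially. The REST endpoint also documents a `filter` parameter, which defaults to `latest` and can be set to `all` to include jobs from earlier attempts. The sketch below is not part of this commit. It shows one possible way to collect every job across pages, assuming `github` is the Octokit client that `actions/github-script` passes to the script and that its `paginate` method is available. The helper name `fetchAllJobs` and its argument shape are hypothetical.

```javascript
// Hypothetical helper, not part of the commit: collect all jobs for one run.
// `github` is assumed to be the Octokit client provided by actions/github-script.
async function fetchAllJobs(github, { owner, repo, runId }) {
  // paginate follows the pagination links and resolves to a flat array of jobs.
  return github.paginate(github.rest.actions.listJobsForWorkflowRun, {
    owner,
    repo,
    run_id: runId,
    filter: 'all', // request jobs from every attempt, not only the latest one
    per_page: 100,
  });
}
```

Inside the action this might be called as `await fetchAllJobs(github, { owner: context.repo.owner, repo: context.repo.repo, runId: run.id })`, with the `run_attempt` filtering from the diff applied to the returned array.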
