fix(ci): make release workflow retry-safe and resilient#5810
fix(ci): make release workflow retry-safe and resilient#5810killagu merged 1 commit intoeggjs:nextfrom
Conversation
📝 WalkthroughWalkthroughThe release workflow now includes retry-detection logic to idempotently handle version bumps and draft release creation, while a new publish script manages per-package publishing with version checks, retries, and npm registry validation across the monorepo. Changes
Sequence Diagram(s)sequenceDiagram
participant GHA as GitHub Actions
participant WF as Workflow
participant NPM as npm Registry
participant Workspace as Workspace Packages
participant Release as Draft Release
GHA->>WF: Trigger release workflow
WF->>WF: Detect existing release
alt Release exists
WF->>WF: Skip version bump
else No release found
WF->>Workspace: Bump version
WF->>NPM: Push version tag
end
WF->>Workspace: Discover publishable packages
loop For each package
WF->>NPM: Check published version
alt Version already published
WF->>WF: Skip package
else Version missing
WF->>NPM: Publish with pnpm
alt Publish fails
WF->>NPM: Retry publish once
end
end
end
WF->>Release: Create or reuse draft release
Release-->>WF: Return release URL
WF->>GHA: Report status and release URL
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @killagu-claw, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the robustness and reliability of the CI/CD release workflow. It introduces a new resilient publishing script for npm packages, improves retry mechanisms for release steps, and refines error handling and idempotency across various stages of the release process. Additionally, it simplifies qualifier registration logic in the Highlights
Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new, resilient publishing script and adapts the CI workflow to be retry-safe. The new script is a great addition for improving the reliability of releases. My review includes a few suggestions for the new script to improve argument parsing and reduce code duplication. Additionally, the changes in the tegg packages, which allow overwriting qualifiers to support retries, represent a significant change in behavior. I've added comments to highlight the potential impact of this change and to suggest ensuring it aligns with the overall design goals.
I am having trouble creating individual review comments. Click here to see my feedback.
scripts/publish.js (27-29)
The current parsing for the --tag argument is not fully robust. If the command is run with --tag= (an empty value), npmTag will be set to an empty string, which could cause the pnpm publish command to fail or behave unexpectedly. It's safer to ensure a value is present before assigning it.
if (tagArg) {
const value = tagArg.split('=')[1];
if (value) {
npmTag = value;
}
}
scripts/publish.js (78-133)
The logic for publishing a package is duplicated in the initial publishing loop and the retry loop (specifically lines 88-102 and 119-131). This includes the try/catch block for publishOne and the post-failure check with isPublished. This repetition makes the script harder to maintain. Consider refactoring this logic into a single helper function to avoid code duplication.
tegg/core/core-decorator/src/util/QualifierUtil.ts (8-16)
This change alters the method's behavior to silently overwrite existing qualifiers instead of throwing an error. While this supports retry-safe operations, it removes a safeguard against accidental qualifier redefinitions, which could mask bugs. Please consider if this is the desired behavior globally or if a more localized solution for retries would be safer. It would be beneficial to document this semantic change and its reasoning in the PR description.
tegg/core/dynamic-inject/src/QualifierImplUtil.ts (10-16)
Similar to the change in QualifierUtil.addProtoQualifier, removing this check changes the behavior to silently overwrite existing implementations. This makes the system more resilient for retries but also removes a safeguard against accidental re-definitions. It's worth confirming this is the desired behavior across all use cases.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## next #5810 +/- ##
==========================================
- Coverage 85.24% 85.20% -0.04%
==========================================
Files 650 650
Lines 12518 12514 -4
Branches 1436 1434 -2
==========================================
- Hits 10671 10663 -8
- Misses 1727 1731 +4
Partials 120 120 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/release.yml (1)
219-219:⚠️ Potential issue | 🟡 MinorChangelog URL has duplicate
vprefix.The
tagvariable already includes thevprefix (e.g.,v1.0.0), sov${tag}producesvv1.0.0. Also, comparing a tag to itself doesn't show a meaningful diff.🔧 Proposed fix
The changelog link should compare the previous tag to the new tag. If the previous tag isn't easily accessible, consider removing this line or using a placeholder:
- releaseBody += `**Full Changelog**: https://github.com/${{ github.repository }}/compare/v${tag}...${tag}`; + releaseBody += `**Full Changelog**: https://github.com/${{ github.repository }}/releases/tag/${tag}`;Or if you want to show changes since the previous release, you'd need to fetch the previous tag first.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/release.yml at line 219, The changelog URL construction uses `v${tag}` which adds an extra `v` because `tag` already includes the prefix and also compares the tag to itself; update the `releaseBody` assignment that builds the URL to use the raw `tag` value (not `v${tag}`) and change the range to compare the previous tag (e.g., `prevTag...tag`) instead of `tag...tag`; if you can't obtain a previous tag in this workflow, either remove the full changelog line or substitute a sensible placeholder, and ensure you reference the `releaseBody` and `tag` variables (and a `prevTag` variable if introduced) when making the change.
🧹 Nitpick comments (2)
scripts/publish.js (2)
58-72: Consider usingspawnSyncwith array arguments instead ofexecSyncwith string concatenation.While package names from npm are validated and shouldn't contain problematic characters, using
spawnSyncwith an argument array is more robust and avoids potential shell escaping issues.♻️ Alternative using spawnSync
+import { execSync, spawnSync } from 'node:child_process'; -import { execSync } from 'node:child_process'; function publishOne(pkg) { const publishArgs = [ '--filter', pkg.name, 'publish', '--no-git-checks', '--access', 'public', '--tag', npmTag, ]; if (useProvenance) publishArgs.push('--provenance'); if (isDryRun) publishArgs.push('--dry-run'); - execSync(`pnpm ${publishArgs.join(' ')}`, { - cwd: baseDir, - stdio: 'inherit', - env: { ...process.env, NPM_CONFIG_LOGLEVEL: 'verbose' }, - timeout: 120000, - }); + const result = spawnSync('pnpm', publishArgs, { + cwd: baseDir, + stdio: 'inherit', + env: { ...process.env, NPM_CONFIG_LOGLEVEL: 'verbose' }, + timeout: 120000, + }); + if (result.status !== 0) { + throw new Error(`pnpm publish failed with exit code ${result.status}`); + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/publish.js` around lines 58 - 72, In publishOne, replace the execSync call with child_process.spawnSync invoked with the command 'pnpm' and the publishArgs array (built in the function) as the args parameter to avoid shell concatenation; keep the same env merge ({ ...process.env, NPM_CONFIG_LOGLEVEL: 'verbose' }), cwd (baseDir), stdio ('inherit') and timeout (120000), and preserve the existing conditional pushes for useProvenance and isDryRun so behavior is unchanged; remove shell usage so errors are returned via spawnSync's status and error fields instead of relying on a shell-invoked execSync.
26-29: Edge case: malformed--tag=argument.If someone passes
--tag=without a value,tagArg.split('=')[1]will be an empty string, which could causepnpm publish --tag ""to fail or behave unexpectedly.🛡️ Proposed validation
const tagArg = args.find(arg => arg.startsWith('--tag=')); if (tagArg) { - npmTag = tagArg.split('=')[1]; + const tagValue = tagArg.split('=')[1]; + if (!tagValue) { + console.error('❌ --tag= requires a value (e.g., --tag=beta)'); + process.exit(1); + } + npmTag = tagValue; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/publish.js` around lines 26 - 29, The code that extracts npmTag from the CLI args (using args.find -> tagArg and then tagArg.split('=')[1]) does not handle a malformed `--tag=` with no value; update the logic in the tag parsing block to validate the split result (or use a regex/capture) so that if the value is missing or an empty string you either set a safe default or throw/exit with a clear error; specifically check tagArg after finding it, extract the value into npmTag only when the captured value is non-empty, and handle the empty case (e.g., log an error and exit or ignore the tag) to avoid passing an empty tag to pnpm publish.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In @.github/workflows/release.yml:
- Line 219: The changelog URL construction uses `v${tag}` which adds an extra
`v` because `tag` already includes the prefix and also compares the tag to
itself; update the `releaseBody` assignment that builds the URL to use the raw
`tag` value (not `v${tag}`) and change the range to compare the previous tag
(e.g., `prevTag...tag`) instead of `tag...tag`; if you can't obtain a previous
tag in this workflow, either remove the full changelog line or substitute a
sensible placeholder, and ensure you reference the `releaseBody` and `tag`
variables (and a `prevTag` variable if introduced) when making the change.
---
Nitpick comments:
In `@scripts/publish.js`:
- Around line 58-72: In publishOne, replace the execSync call with
child_process.spawnSync invoked with the command 'pnpm' and the publishArgs
array (built in the function) as the args parameter to avoid shell
concatenation; keep the same env merge ({ ...process.env, NPM_CONFIG_LOGLEVEL:
'verbose' }), cwd (baseDir), stdio ('inherit') and timeout (120000), and
preserve the existing conditional pushes for useProvenance and isDryRun so
behavior is unchanged; remove shell usage so errors are returned via spawnSync's
status and error fields instead of relying on a shell-invoked execSync.
- Around line 26-29: The code that extracts npmTag from the CLI args (using
args.find -> tagArg and then tagArg.split('=')[1]) does not handle a malformed
`--tag=` with no value; update the logic in the tag parsing block to validate
the split result (or use a regex/capture) so that if the value is missing or an
empty string you either set a safe default or throw/exit with a clear error;
specifically check tagArg after finding it, extract the value into npmTag only
when the captured value is non-empty, and handle the empty case (e.g., log an
error and exit or ignore the tag) to avoid passing an empty tag to pnpm publish.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.github/workflows/release.ymlscripts/publish.jstegg/core/core-decorator/src/util/QualifierUtil.tstegg/core/dynamic-inject/src/QualifierImplDecoratorUtil.tstegg/core/dynamic-inject/src/QualifierImplUtil.tstegg/plugin/tegg/src/lib/AppLoadUnit.ts
💤 Files with no reviewable changes (1)
- tegg/core/dynamic-inject/src/QualifierImplUtil.ts
Add retry detection to skip version bump when re-running a failed release. Replace pnpm -r publish with per-package publish script that skips already-published versions, retries failures, and handles race conditions. - Add scripts/publish.js for resilient per-package publishing - Detect existing version tag/commit on retry to skip duplicate bump - Separate push into own step for better failure granularity - Make GitHub Release creation idempotent (check before create) - Run cnpm sync and summary even on partial publish failure - Fix broken error handling (duplicate || tail on publish) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
0348579 to
3ced92f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/release.yml (1)
239-258:⚠️ Potential issue | 🟡 MinorSummary can incorrectly report success after failed publish/build steps.
This step runs on
!cancelled(), so it executes after failures too, but it still writes a “Release Completed” heading. That can mislead operators during partial-failure retries.💡 Suggested fix
else - if [ "${{ steps.detect.outputs.skip_version_bump }}" == "true" ]; then + if [ "${{ job.status }}" != "success" ]; then + echo "### ⚠️ Release Finished With Failures" >> $GITHUB_STEP_SUMMARY + elif [ "${{ steps.detect.outputs.skip_version_bump }}" == "true" ]; then echo "### 🔄 Retry Release Completed" >> $GITHUB_STEP_SUMMARY else echo "### ✅ Release Completed" >> $GITHUB_STEP_SUMMARY fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/release.yml around lines 239 - 258, The summary step currently runs under if: ${{ !cancelled() }} and unconditionally writes "### ✅ Release Completed", which can misreport success after failures; change the step to run always (use always() instead of !cancelled()) and inside the run block inspect the job outcome (e.g. via ${{ job.status }} or a prior step's outcome) and only echo "### ✅ Release Completed" when the job status equals "success" (still preserving the steps.detect.outputs.skip_version_bump check); otherwise emit a clear "Release failed/incomplete" heading so partial failures or failed publish/builds are not reported as successful.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/publish.js`:
- Around line 92-107: The dry-run logic currently treats caught publish errors
as success because failures are only queued into toRetry and retries are skipped
when isDryRun is true, leaving finalFailed empty; fix by treating dry-run
publish errors as real failures: in the catch block (around the
isDryRun/isPublished check) ensure that when isDryRun is true you do not mark
the package as "already published" but instead record it as failed (add the
package/label to finalFailed or a dryRunFailed list and log an error), and
likewise in the retry section (where toRetry is handled when !isDryRun) add a
branch that when isDryRun is true moves all toRetry entries into finalFailed (or
logs them as failed) so the script exits non-zero or reports failures correctly;
reference the isDryRun, toRetry, finalFailed variables and the catch block
handling the publish result.
---
Outside diff comments:
In @.github/workflows/release.yml:
- Around line 239-258: The summary step currently runs under if: ${{
!cancelled() }} and unconditionally writes "### ✅ Release Completed", which can
misreport success after failures; change the step to run always (use always()
instead of !cancelled()) and inside the run block inspect the job outcome (e.g.
via ${{ job.status }} or a prior step's outcome) and only echo "### ✅ Release
Completed" when the job status equals "success" (still preserving the
steps.detect.outputs.skip_version_bump check); otherwise emit a clear "Release
failed/incomplete" heading so partial failures or failed publish/builds are not
reported as successful.
| } catch { | ||
| // Double-check: the publish might have actually succeeded | ||
| // (e.g. npm returned non-zero but the package landed) | ||
| if (!isDryRun && isPublished(pkg.name, pkg.version)) { | ||
| console.log(` ⏭️ ${label} already published (confirmed after error)`); | ||
| skipped.push(label); | ||
| } else { | ||
| console.error(` ❌ ${label} failed, will retry`); | ||
| toRetry.push(pkg); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Retry failed packages once | ||
| const finalFailed = []; | ||
| if (toRetry.length > 0 && !isDryRun) { |
There was a problem hiding this comment.
Dry-run failures are currently swallowed and reported as success.
Failed dry-run publishes are only queued in toRetry, but retry is skipped in dry-run mode, so finalFailed stays empty and the script exits 0 with a success message.
💡 Suggested fix
const published = [];
const skipped = [];
const toRetry = [];
+const finalFailed = [];
for (const pkg of packages) {
const label = `${pkg.name}@${pkg.version}`;
@@
try {
publishOne(pkg);
console.log(` ✅ ${label}`);
published.push(label);
} catch {
+ if (isDryRun) {
+ console.error(` ❌ ${label} failed (dry-run)`);
+ finalFailed.push(label);
+ continue;
+ }
+
// Double-check: the publish might have actually succeeded
// (e.g. npm returned non-zero but the package landed)
if (!isDryRun && isPublished(pkg.name, pkg.version)) {
console.log(` ⏭️ ${label} already published (confirmed after error)`);
skipped.push(label);
@@
-const finalFailed = [];
if (toRetry.length > 0 && !isDryRun) {
@@
-console.log('\n✅ All packages published successfully!');
+console.log(
+ isDryRun
+ ? '\n✅ Dry-run publish checks completed successfully!'
+ : '\n✅ All packages published successfully!'
+);Also applies to: 141-149
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/publish.js` around lines 92 - 107, The dry-run logic currently treats
caught publish errors as success because failures are only queued into toRetry
and retries are skipped when isDryRun is true, leaving finalFailed empty; fix by
treating dry-run publish errors as real failures: in the catch block (around the
isDryRun/isPublished check) ensure that when isDryRun is true you do not mark
the package as "already published" but instead record it as failed (add the
package/label to finalFailed or a dryRunFailed list and log an error), and
likewise in the retry section (where toRetry is handled when !isDryRun) add a
branch that when isDryRun is true moves all toRetry entries into finalFailed (or
logs them as failed) so the script exits non-zero or reports failures correctly;
reference the isDryRun, toRetry, finalFailed variables and the catch block
handling the publish result.
Summary
scripts/publish.jsfor resilient per-package publishing that skips already-published versions, retries failures once, and handles race conditions!cancelled())|| tailwith incorrect logic)Test plan
node scripts/publish.js --tag=beta --dry-runpasses for all 74 packages🤖 Generated with Claude Code
Summary by CodeRabbit