fix(workflow-engine): replay completed void steps on restart#4186
fix(workflow-engine): replay completed void steps on restart#4186NathanFlurry wants to merge 1 commit intomainfrom
Conversation
|
🚅 Deployed to the rivet-pr-4186 environment in rivet-frontend
|
How to use the Graphite Merge QueueAdd the label merge-queue to this PR to add it to the merge queue. You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. This stack of pull requests is managed by Graphite. Learn more about stacking. |
Code ReviewSummaryThis PR fixes a critical bug in the workflow engine where steps that return Positive Aspects ✅
Potential Concerns 🔍1. Performance Consideration (Minor)Location: The fix loads metadata for every step replay, even when Current flow: // Line 378: Fast path for steps with defined output
if (stepData.output !== undefined) {
return stepData.output as T;
}
// Line 389: Loads metadata for ALL remaining steps
const metadata = await loadMetadata(...);
// Line 395: Checks completed status for void steps
if (metadata.status === "completed") {
return stepData.output as T;
}Consideration: The metadata load happens
Question: Is there a concern about the performance impact of loading metadata for void steps? The Potential optimization (if needed): // Fast path: output exists AND not explicitly undefined in serialization
if (stepData.output !== undefined) {
return stepData.output as T;
}
// Load metadata once for both void step check and retry logic
const metadata = await loadMetadata(...);
// Check if completed (handles void steps)
if (metadata.status === "completed") {
return stepData.output as T;
}
// Continue with retry logic...This is already what the PR does, so the implementation is optimal given the constraint. ✅ 2. Edge Case: What if metadata is missing?Location: The Question: Are there any scenarios where:
This seems unlikely in normal operation, but could occur if:
Recommendation: Consider if this scenario needs defensive handling or is considered an invalid/corrupted state that should fail loudly. 3. Test Coverage Gap (Minor)The new test validates restart behavior, but doesn't explicitly test:
Recommendation: Consider adding tests for: it("should retry failed void steps", async () => {
let attempts = 0;
const workflow = async (ctx: WorkflowContextInterface) => {
await ctx.step("failing-void", async () => {
attempts++;
if (attempts < 2) throw new Error("fail");
// return void on success
});
};
// ... test that it retries and eventually succeeds
});Code Style ✅
Security & Correctness ✅
RecommendationApprove with minor suggestions The fix is correct and well-implemented. The performance impact is minimal due to metadata caching. Consider:
The PR is ready to merge as-is. The bug fix is important and the implementation is sound. Review generated by Claude Code - Learn more |

Description
Please include a summary of the changes and the related issue. Please also include relevant motivation and context.
Type of change
How Has This Been Tested?
Please describe the tests that you ran to verify your changes.
Checklist: