@MasterPtato MasterPtato commented Jan 14, 2026

Fixes RVT-5333

linear bot commented Jan 14, 2026

MasterPtato commented Jan 14, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

claude bot commented Jan 14, 2026

Code Review: PR #3896 - Fix engine-runner shutdown close code handling

Summary

This PR fixes a race condition in the WebSocket close event handler by restructuring how shutdown state is managed. The changes simplify the shutdown flow and ensure proper handling of close events during shutdown.

Key Changes Analysis

1. Removal of #shuttingDown flag ✅

Before: Used separate #shuttingDown and #shutdown flags
After: Only uses #shutdown flag

Assessment: Good simplification. The dual-flag approach was confusing and led to race conditions. The #shuttingDown flag was set immediately at the start of shutdown(), but #shutdown was set later, creating a window where state was inconsistent.

2. Moved #shutdown = true to beginning of shutdown method ✅

Location: engine/sdks/typescript/runner/src/mod.ts:488

// Before:
this.#shuttingDown = true;
// ... log and cleanup ...
this.#shutdown = true;

// After:
this.#shutdown = true;
// ... log and cleanup ...

Assessment: Correct fix. Setting #shutdown = true as the first statement of shutdown() makes the state consistent immediately, so any concurrent shutdown() call or close event that fires during cleanup already sees the flag set.
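To illustrate why the ordering matters, here is a minimal sketch of a flag-first shutdown. It is not the code from mod.ts: RunnerSketch, isShutdown, and cleanupRuns are hypothetical names, and the early-return guard is an assumption about how re-entry is rejected.

```typescript
// Hypothetical stand-in for the runner. Only the flag-first ordering is taken
// from the PR; every name below is illustrative.
class RunnerSketch {
  // The real runner uses a `#shutdown` private field; a TypeScript `private`
  // member plays the same role here.
  private isShutdown = false;
  cleanupRuns = 0;

  // Returns true if this call performed the shutdown, false if it was a no-op.
  shutdown(): boolean {
    // Assumption: a guard like this exists. The flag alone does not reject
    // a second call; the guard that reads it does.
    if (this.isShutdown) return false;

    // Flip the flag before any logging or cleanup so every later reader
    // (close handler, second shutdown call) observes it.
    this.isShutdown = true;

    this.cleanupRuns += 1; // stands in for "log and cleanup"
    return true;
  }
}

const runner = new RunnerSketch();
const firstCall = runner.shutdown();
const secondCall = runner.shutdown();
```

With the flag set first, the second call returns without repeating cleanup. If the flag were set after cleanup, a call arriving in between would pass the guard.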

3. Restructured WebSocket close handler ✅

Location: engine/sdks/typescript/runner/src/mod.ts:918-976

The close handler now has a clear if/else structure:

  • If NOT shutting down: Handle reconnection logic, parse close reasons, schedule reconnects
  • If shutting down: Log clean closure and call onDisconnected callback

Assessment: Much cleaner logic flow. The previous version had confusing nested conditions where eviction and runner_shutdown were handled differently, with some paths duplicating cleanup logic.
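The two branches can be sketched roughly as follows. This is an illustration of the described structure only; CloseHandlerSketch and its counters are hypothetical, and the real handler also parses close reasons and logs.

```typescript
// Minimal shape of a close event; the browser CloseEvent has more fields.
interface CloseEventLike {
  code: number;
  reason: string;
}

// Hypothetical model of the if/else structure. Counters replace the real
// side effects (scheduling a reconnect, invoking onDisconnected).
class CloseHandlerSketch {
  shutdown = false;
  reconnectsScheduled = 0;
  disconnectedCalls = 0;

  handleClose(_ev: CloseEventLike): void {
    if (!this.shutdown) {
      // Unexpected close: the runner should try to come back.
      this.reconnectsScheduled += 1;
    } else {
      // Expected close during shutdown: report it and stop.
      this.disconnectedCalls += 1;
    }
  }
}

const live = new CloseHandlerSketch();
live.handleClose({ code: 1006, reason: "" });

const stopping = new CloseHandlerSketch();
stopping.shutdown = true;
stopping.handleClose({ code: 1000, reason: "" });
```

Each close event takes exactly one branch, so reconnection and disconnect notification cannot both run for the same event.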

Issues Found

⚠️ Potential Issue: Ack interval cleanup duplication

Location: engine/sdks/typescript/runner/src/mod.ts:941-945 and the shutdown() method at lines 509-512

The ack interval is cleared in two places:

  1. In the close handler when !this.#shutdown (lines 941-945)
  2. In the shutdown() method itself (lines 509-512)

Concern: Clearing twice is harmless, but the duplication raises an ordering question:

  • If WebSocket closes while shutdown() is executing, which one clears first?
  • The close handler only clears the interval when NOT shutting down, but the shutdown method also clears it

Recommendation: No change needed. The two paths cover different cases: the close handler clears the interval on unexpected disconnections (when !this.#shutdown), and the shutdown method clears it on graceful shutdowns. The redundancy is intentional, but consider adding a comment that explains the pattern.
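One way to make the double clear obviously safe is to route both call sites through a single idempotent helper. The sketch below assumes a hypothetical AckTimerSketch class; the field and method names are not taken from mod.ts.

```typescript
// Hypothetical holder for the ack interval. Both the close handler and
// shutdown() would call clearAckInterval(); whichever runs second is a no-op.
class AckTimerSketch {
  private ackInterval: ReturnType<typeof setInterval> | undefined;
  clears = 0;

  start(): void {
    // The callback body is irrelevant here; the real one would send acks.
    this.ackInterval = setInterval(() => {}, 1000);
  }

  clearAckInterval(): void {
    if (this.ackInterval === undefined) return; // already cleared
    clearInterval(this.ackInterval);
    this.ackInterval = undefined;
    this.clears += 1;
  }

  get active(): boolean {
    return this.ackInterval !== undefined;
  }
}

const ackTimer = new AckTimerSketch();
ackTimer.start();
ackTimer.clearAckInterval(); // e.g. from the close handler
ackTimer.clearAckInterval(); // e.g. from shutdown(); does nothing
```

Because the helper resets the handle to undefined, the order in which the two paths run does not matter.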

🔍 Question: Eviction handling during shutdown

Location: engine/sdks/typescript/runner/src/mod.ts:921-929

if (!this.#shutdown) {
    const closeError = parseWebSocketCloseReason(ev.reason);
    if (closeError?.group === "ws" && closeError?.error === "eviction") {
        // ... handle eviction
        await this.shutdown(true);
    }
}

Question: What happens if an eviction occurs while the runner is already shutting down? The eviction is ignored, which seems correct but should be verified:

  • Is it possible for the server to send an eviction after shutdown() is called but before the WebSocket closes?
  • If so, is ignoring it the correct behavior?

Recommendation: Add a comment explaining that evictions during shutdown are intentionally ignored since shutdown is already in progress.
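For reference, the decision being asked about can be written as a small pure function. This sketch assumes close reasons have a "group.error" shape (inferred from ws/eviction and pegboard.runner_shutdown); parseCloseReasonSketch is a hypothetical stand-in, not the real parseWebSocketCloseReason.

```typescript
interface CloseReason {
  group: string;
  error: string;
}

// Hypothetical parser. Assumes the reason string is "<group>.<error>" and
// returns undefined for anything else.
function parseCloseReasonSketch(reason: string): CloseReason | undefined {
  const dot = reason.indexOf(".");
  if (dot <= 0 || dot === reason.length - 1) return undefined;
  return { group: reason.slice(0, dot), error: reason.slice(dot + 1) };
}

// True only when an eviction arrives while the runner is NOT shutting down.
function shouldShutdownOnClose(shutdown: boolean, reason: string): boolean {
  // An eviction that arrives mid-shutdown is ignored: shutdown is already
  // in progress, so there is nothing further to trigger.
  if (shutdown) return false;
  const parsed = parseCloseReasonSketch(reason);
  return parsed !== undefined && parsed.group === "ws" && parsed.error === "eviction";
}

const evictedWhileLive = shouldShutdownOnClose(false, "ws.eviction");
const evictedWhileStopping = shouldShutdownOnClose(true, "ws.eviction");
const otherReason = shouldShutdownOnClose(false, "pegboard.runner_shutdown");
```

Writing the rule this way makes the "ignored during shutdown" case explicit and easy to cover with a unit test.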

⚠️ Behavior change: Removed special handling for runner_shutdown close code

The previous code had special logic for the pegboard.runner_shutdown close code. This PR removes it.

Assessment: Without seeing the requirements, I cannot determine if this is correct. Questions:

  • Was the special handling for runner_shutdown actually needed?
  • Does removing it change expected behavior?

Recommendation: Verify with tests or ticket requirements that removing runner_shutdown special handling is intentional.

Positive Aspects

  1. Simplified state management: Single source of truth for shutdown state
  2. Clearer control flow: The if/else structure in the close handler is much easier to follow
  3. Fixes race condition: Setting #shutdown = true immediately prevents concurrent shutdown calls
  4. Better separation of concerns: Reconnection logic is now clearly separated from shutdown logic

Testing Recommendations

  1. Test shutdown during active connection: Verify shutdown() properly closes WebSocket and no reconnection occurs
  2. Test eviction handling: Verify eviction triggers shutdown and onDisconnected is called once
  3. Test concurrent shutdown calls: Verify multiple shutdown calls are handled gracefully
  4. Test close during shutdown: Verify WebSocket close during shutdown does not trigger reconnection logic
  5. Test runner_shutdown close code: Verify behavior is correct without special handling (if this was intentionally removed)
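As a starting point, scenarios 1, 3, and 4 could be exercised against fakes along these lines. FakeSocket and FakeRunner are hypothetical test doubles with a synchronous close event; they model the behavior described in this review, not the actual runner API.

```typescript
// Hypothetical socket double: close() fires the close callback at most once.
class FakeSocket {
  onclose: ((reason: string) => void) | undefined;
  closed = false;

  close(reason: string): void {
    if (this.closed) return;
    this.closed = true;
    if (this.onclose) this.onclose(reason);
  }
}

// Hypothetical runner double with the flag-first shutdown and the
// if/else close handler.
class FakeRunner {
  private isShutdown = false;
  reconnects = 0;
  disconnects = 0;

  constructor(private socket: FakeSocket) {
    socket.onclose = (reason) => this.handleClose(reason);
  }

  shutdown(): void {
    if (this.isShutdown) return; // assumption: repeated calls are no-ops
    this.isShutdown = true;
    this.socket.close("shutdown");
  }

  private handleClose(_reason: string): void {
    if (!this.isShutdown) this.reconnects += 1;
    else this.disconnects += 1;
  }
}

// Scenarios 1, 3, 4: shutdown (called twice) closes the socket, no reconnect.
const gracefulSocket = new FakeSocket();
const gracefulRunner = new FakeRunner(gracefulSocket);
gracefulRunner.shutdown();
gracefulRunner.shutdown();

// Control case: a close without shutdown goes down the reconnect path.
const droppedSocket = new FakeSocket();
const droppedRunner = new FakeRunner(droppedSocket);
droppedSocket.close("connection lost");
```

Real tests would need an asynchronous close event and the actual onDisconnected callback, but the expected counts stay the same.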

Code Quality

  • ✅ Follows TypeScript best practices
  • ✅ Logging is appropriate and structured
  • ✅ Error handling is consistent
  • ✅ Code is more readable after refactoring

Security Considerations

No security concerns identified. The changes are internal state management improvements.

Performance Considerations

Negligible performance impact. The restructuring might be marginally faster due to reduced branching complexity.

Conclusion

Overall Assessment: ✅ LGTM with minor suggestions

This is a solid bug fix that simplifies the shutdown logic and fixes a race condition. The code quality is good, and the refactoring improves readability.

Action Items:

  1. ✅ Add comment explaining why ack interval cleanup exists in multiple places (optional)
  2. ✅ Add comment explaining evictions during shutdown are ignored (optional)
  3. ⚠️ Verify removal of runner_shutdown special handling is intentional (recommended)
  4. ✅ Ensure test coverage for shutdown scenarios (recommended)

The PR is ready to merge once the runner_shutdown behavior change is confirmed as intentional.

pkg-pr-new bot commented Jan 14, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3896

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3896

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3896

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3896

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3896

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3896

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3896

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@3896

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3896

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3896

commit: 53bd400

@MasterPtato MasterPtato force-pushed the 01-14-fix_engine-runner_handle_shutdown_close_codes_correctly branch from 600206d to 55a38d5 Compare January 14, 2026 22:45
@MasterPtato MasterPtato force-pushed the 01-13-chore_gas_add_overview_and_history_docs branch 2 times, most recently from 654b4e9 to c023ef5 Compare January 14, 2026 22:47
@MasterPtato MasterPtato force-pushed the 01-14-fix_engine-runner_handle_shutdown_close_codes_correctly branch 2 times, most recently from 715b5e8 to d6cf484 Compare January 14, 2026 22:52
@MasterPtato MasterPtato force-pushed the 01-13-chore_gas_add_overview_and_history_docs branch 2 times, most recently from af232ee to 1c80e13 Compare January 14, 2026 23:02
@MasterPtato MasterPtato force-pushed the 01-14-fix_engine-runner_handle_shutdown_close_codes_correctly branch 2 times, most recently from 13e2511 to ce0f27d Compare January 14, 2026 23:07
@MasterPtato MasterPtato force-pushed the 01-13-chore_gas_add_overview_and_history_docs branch from 5889ab3 to 7d75756 Compare January 14, 2026 23:39
@MasterPtato MasterPtato force-pushed the 01-14-fix_engine-runner_handle_shutdown_close_codes_correctly branch from ce0f27d to 53bd400 Compare January 14, 2026 23:39
graphite-app bot commented Jan 14, 2026

Merge activity

  • Jan 14, 11:40 PM UTC: MasterPtato added this pull request to the Graphite merge queue.
  • Jan 14, 11:41 PM UTC: CI is running for this pull request on a draft pull request (#3908) due to your merge queue CI optimization settings.
  • Jan 14, 11:42 PM UTC: Merged by the Graphite merge queue via draft PR: #3908.

graphite-app bot pushed a commit that referenced this pull request Jan 14, 2026
@graphite-app graphite-app bot closed this Jan 14, 2026
@graphite-app graphite-app bot deleted the 01-14-fix_engine-runner_handle_shutdown_close_codes_correctly branch January 14, 2026 23:42