Skip to content

Conversation

@anthony-murphy
Copy link
Contributor

Summary

  • Fix unhandled promise rejections in PrefetchDocumentStorageService when parallel prefetch requests fail
  • Add unit tests for prefetch error handling and caching behavior

Problem

The PrefetchDocumentStorageService fires prefetch requests using fire-and-forget pattern (void this.cachedRead(blob)). When these requests failed, the .catch() handler was re-throwing errors, creating rejected promises stored in cache that no one awaited - causing uncaughtException errors.

Evidence from telemetry showed 98 uncaughtException errors from a single process within ~70ms, all originating from readBlob via prefetch.

Solution

Store the original promise in the cache and attach .catch() for side effects only (cache cleanup on retryable errors). Do not re-throw - callers who await the cached promise still receive the rejection properly.

Test plan

  • New unit tests for PrefetchDocumentStorageService
  • driver-utils tests pass (59 tests)
  • routerlicious-driver tests pass (132 tests)

claude added 2 commits January 6, 2026 17:34
The PrefetchDocumentStorageService was storing a promise that re-threw
errors in the cache. When prefetch fired requests with fire-and-forget
pattern (void this.cachedRead()), failures created rejected promises
that no one awaited, causing unhandled rejection errors.

The fix stores the original promise in the cache and attaches a .catch()
handler for side effects only (cache cleanup on retryable errors).
Callers who await the cached promise still receive the rejection properly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add unit tests verifying:
- Errors propagate to callers who await readBlob
- Cache is cleared on retryable errors, allowing retry
- Prefetch successfully caches blobs for later reads

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Copilot AI review requested due to automatic review settings January 7, 2026 01:52
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes unhandled promise rejections in PrefetchDocumentStorageService that occurred when fire-and-forget prefetch requests failed. The fix attaches a .catch() handler for side effects only (cache cleanup) without re-throwing, while still allowing callers who await the cached promise to receive rejections properly.

Key Changes:

  • Modified error handling to attach .catch() handler before caching the promise, preventing unhandled rejections
  • Added comprehensive unit tests for error propagation, retry behavior, and prefetch caching

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
packages/loader/driver-utils/src/prefetchDocumentStorageService.ts Refactored error handling to attach catch handler for side effects without re-throwing, stores original promise in cache
packages/loader/driver-utils/src/test/prefetchDocumentStorageService.spec.ts New test file with 3 test cases covering error propagation, retryable error cache clearing, and successful prefetch caching

// Note: Callers who await the cached promise will still see the rejection
prefetchedBlobPFromStorage.catch((error) => {
if (canRetryOnError(error)) {
this.prefetchCache.delete(blobId);
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a potential race condition between the error handler and cache operations. If an error occurs and canRetryOnError(error) is true, the cache entry is deleted asynchronously in the catch handler. However, between the time the promise is stored in the cache (line 69) and when the catch handler runs, another caller could retrieve the same promise from the cache (line 56-57). When the catch handler executes and deletes the cache entry, the second caller would still have a reference to the rejected promise but the cache would be cleared, potentially causing inconsistent behavior.

Consider checking if the cache entry still contains the same promise reference before deleting it in the error handler to avoid clearing a newer cached promise.

Suggested change
this.prefetchCache.delete(blobId);
// Only clear the cache entry if it still points to this promise
const cachedPromise = this.prefetchCache.get(blobId);
if (cachedPromise === prefetchedBlobPFromStorage) {
this.prefetchCache.delete(blobId);
}

Copilot uses AI. Check for mistakes.
await prefetchService.getSnapshotTree();

// Wait for prefetch
await new Promise((resolve) => setTimeout(resolve, 10));
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test uses a hardcoded timeout to wait for async prefetch operations. This can lead to flaky tests if the prefetch takes longer than expected on slower CI systems. Consider using a more deterministic approach, such as:

  1. Exposing a way to await prefetch completion (e.g., tracking promises)
  2. Using a helper that polls for the expected state with a timeout
  3. Mocking/stubbing the async behavior to make it synchronous in tests

Copilot uses AI. Check for mistakes.
Comment on lines +48 to +93
describe("PrefetchDocumentStorageService", () => {
let mockStorage: MockStorageService;
let prefetchService: PrefetchDocumentStorageService;

beforeEach(() => {
mockStorage = new MockStorageService();
prefetchService = new PrefetchDocumentStorageService(
mockStorage as unknown as IDocumentStorageService,
);
});

afterEach(() => {
prefetchService.stopPrefetch();
});

it("should propagate errors to callers who await readBlob", async () => {
mockStorage.shouldFail = true;
const testError = new Error("Network failure");
mockStorage.failureError = testError;

// Direct readBlob call should receive the error
await assert.rejects(
async () => prefetchService.readBlob("someBlob"),
(error: Error) => error.message === "Network failure",
);
});

it("should clear cache on retryable errors allowing retry", async () => {
const retryableError = new Error("Retryable error");
(retryableError as any).canRetry = true;
mockStorage.failureError = retryableError;
mockStorage.shouldFail = true;

// First call fails
await assert.rejects(async () => prefetchService.readBlob("blob1"));

// Reset mock to succeed
mockStorage.shouldFail = false;
mockStorage.readBlobCalls = [];

// Second call should retry (not use cached error)
const result = await prefetchService.readBlob("blob1");
assert.ok(result);
assert.strictEqual(mockStorage.readBlobCalls.length, 1);
});

Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't fully cover the original bug scenario where fire-and-forget prefetch requests (via void this.cachedRead(blob)) caused unhandled rejections. Consider adding a test that:

  1. Triggers a prefetch via getSnapshotTree() (which uses fire-and-forget pattern)
  2. Has the mock service fail during prefetch
  3. Verifies that no unhandled rejection occurs (the promise rejection is handled silently)
  4. Optionally verifies that a subsequent explicit readBlob() call still receives the error properly

This would directly test the fix for the reported issue of 98 uncaughtException errors.

Copilot uses AI. Check for mistakes.
// Second call should retry (not use cached error)
const result = await prefetchService.readBlob("blob1");
assert.ok(result);
assert.strictEqual(mockStorage.readBlobCalls.length, 1);
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion message is misleading. The test verifies that readBlobCalls.length === 1, which means the blob was fetched once (a retry happened). However, the comment says "Second call should retry (not use cached error)" which might confuse readers. The assertion confirms that a new read was performed (the retry), so the comment correctly describes the behavior, but the assertion message could be clearer about what it's verifying - that a new network call was made because the cache was cleared.

Suggested change
assert.strictEqual(mockStorage.readBlobCalls.length, 1);
assert.strictEqual(
mockStorage.readBlobCalls.length,
1,
"Second readBlob call should perform exactly one new underlying read after cache is cleared",
);

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants