feat(browser): proof of concept — IndexedDB offline dead letter queue#3314
Draft
marquesds wants to merge 2 commits intoPostHog:mainfrom
Draft
feat(browser): proof of concept — IndexedDB offline dead letter queue#3314marquesds wants to merge 2 commits intoPostHog:mainfrom
marquesds wants to merge 2 commits intoPostHog:mainfrom
Conversation
Made-with: Cursor
|
@marquesds is attempting to deploy a commit to the PostHog Team on Vercel. A member of the Team first needs to authorize it. |
Contributor
Prompt To Fix All With AIThis is a comment left during a code review.
Path: packages/browser/src/posthog-core.ts
Line: 1180-1196
Comment:
**`stored_at` reset bypasses `dlq_max_age_hours` TTL**
When `_onFlushFailure` writes an already-in-DLQ event back (after a drain attempt exhausts its own retries), it sets `stored_at: Date.now()`. Because `write()` uses `store.put()` (upsert by UUID), the existing entry is silently replaced with a fresh timestamp. As a result, `evictExpired()` will never evict the event — its age keeps resetting with every failed drain, making `dlq_max_age_hours` effectively meaningless for events that fail repeatedly.
The original `stored_at` should be preserved when re-writing an event that is already in the DLQ. One option is to pass the original `stored_at` through the request options (e.g., as a custom field on the queued request) so `_onFlushFailure` can use it instead of `Date.now()`. Another option is to read the existing entry from the DLQ before writing and keep the oldest `stored_at`.
```typescript
// instead of always using now:
events.push({ uuid, data: item, stored_at: item._dlq_stored_at ?? now })
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: packages/browser/src/posthog-core.ts
Line: 1086-1095
Comment:
**`online` event listener is never removed**
`addEventListener(window, 'online', callback)` registers an anonymous arrow function that has no handle, so it can never be deregistered. If `_initDlq()` were ever called more than once (e.g., after a `reset()` + re-`init()`), multiple identical listeners would accumulate, each scheduling a drain on every connectivity change.
Consider storing the listener reference so it can be removed if needed, or guard `_initDlq` with a flag that prevents re-registration:
```typescript
const onlineHandler = () => setTimeout(() => this._drainDlq(), 1000)
addEventListener(window, 'online', onlineHandler)
// store this._dlqOnlineHandler = onlineHandler for future removeEventListener
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: packages/browser/src/__tests__/dlq.test.ts
Line: 108-115
Comment:
**Test name contradicts the implementation**
The test description "deduplicates via ConstraintError" implies a `ConstraintError` thrown by `store.add()`, but the implementation uses `store.put()`, which silently overwrites duplicate keys without any error. The test correctly verifies the deduplication behaviour, but the name will mislead future readers into thinking there is `add()`-based error handling.
```suggestion
it('deduplicates via put upsert', async () => {
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "chore: add changeset for offline DLQ fea..." | Re-trigger Greptile |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Users of offline-first or intermittently-connected applications permanently lose analytics events when the in-memory retry queue exhausts its 10 attempts while the tab is still open. For field/mission apps — where users operate offline for extended periods — this means most captured events never reach PostHog.
Reported in #1583.
Changes
This PR introduces an IndexedDB-backed dead letter queue (DLQ) that catches events after the existing
RetryQueuegives up, persists them to disk, and re-sends them on the next page load or when the browser comes back online.Architecture
flowchart TD A["posthog.capture()"] --> B["RequestQueue\n(batches events)"] B --> C["RetryQueue\n(up to 10 retries with\nexponential backoff)"] C -->|"✅ 200 OK"| D["Event delivered\nto PostHog"] C -->|"4xx client error"| E["Dropped\n(not retriable)"] C -->|"❌ 10 retries exhausted\n(5xx / network error)"| F{{"enable_offline_dlq?"}} F -->|false| G["🗑️ Event lost\n(current behavior)"] F -->|true| H["_onFlushFailure()"] H --> I[("IndexedDB\nOfflineDlq\n(posthog_dlq)")] J["🌐 Browser 'online' event"] --> K["_drainDlq()"] L["📄 Page load\n(3s delay)"] --> K K --> M{"is_capturing()?\n(consent check)"} M -->|"no (opted out)"| N["🗑️ Clear DLQ"] M -->|"yes"| O["evictExpired()\nenforceMaxEntries()"] O --> P["Read all → batch → send\nvia _send_retriable_request()"] P -->|"✅ success"| Q["Delete sent events\nfrom IndexedDB"] P -->|"❌ failure"| R["Stop drain\n(retry next trigger)"] style I fill:#f9e6a0,stroke:#d4a017 style D fill:#d4edda,stroke:#28a745 style G fill:#f8d7da,stroke:#dc3545 style N fill:#f8d7da,stroke:#dc3545What's included
New file:
packages/browser/src/dlq.ts—OfflineDlqclass (~280 lines)posthog_dlqdatabase,eventsobject store,uuidkeyPath)open(),write(),readAll(),delete(),evictExpired(),enforceMaxEntries(),clear(),close()versionchangelistener for graceful close when another tab upgrades the schemaConfig options (added to
PostHogConfigin@posthog/types)enable_offline_dlq— opt-in toggle, defaults tofalse(zero risk for existing users)dlq_max_age_hours(default 24),dlq_max_entries(default 1000)dlq_drain_on_online(default true),dlq_drain_on_load(default true)SDK integration (in
posthog-core.tsandretry-queue.ts)_retryQueuein_init(), gated by_shouldEnableDlq()_onFlushFailure()called fromRetryQueuewhen retries exhaust — writes failed events to DLQ_drainDlq()with concurrent-drain guard, consent re-check, Web Locks (progressive), batched sendopt_out_capturing()clears the DLQ to respect privacyTests
OfflineDlq(usingfake-indexeddb)retry-queue.test.tsupdated and passingWhat's NOT included (future phases)
$$dlq_summaryinternal metric eventsendBeaconremains the safety net)Key design decisions
enable_offline_dlq: falseby default — fully implemented but opt-in only until validatedsendBeaconis the correct mechanismstore.put()for idempotent writes — natural dedup by UUID, simpler thanstore.add()+ ConstraintError handlingimport()) — required for Rollup IIFE output compatibilityRelease info Sub-libraries affected
Libraries affected
Checklist
If releasing new changes
pnpm changesetto generate a changeset file