fix(codex): dedupe repeated token snapshots #878
albincsergo wants to merge 1 commit into ryoppippi:main
📝 Walkthrough
Modified `data-loader.ts` to always compute per-event usage deltas from cumulative totals when available, falling back to last-usage snapshots only when totals are null. Added deduplication logic and unit tests for duplicate total snapshot handling.
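The behaviour described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the reduced `RawUsage` shape, the `TokenCountEvent` type, the `computeDeltas` and `usage` helpers, the zero-clamping in `subtractRawUsage`, and the rule of skipping all-zero deltas are assumptions made for the example.

```typescript
// Hypothetical, reduced shapes for illustration only.
type RawUsage = { input_tokens: number; output_tokens: number; total_tokens: number };
type TokenCountEvent = {
  total: RawUsage | null; // cumulative snapshot (total_token_usage)
  last: RawUsage | null; // per-turn snapshot (last_token_usage)
};

// Field-wise difference; a missing baseline counts as zero.
// Clamping at zero is an assumption of this sketch.
function subtractRawUsage(current: RawUsage, previous: RawUsage | null): RawUsage {
  return {
    input_tokens: Math.max(current.input_tokens - (previous?.input_tokens ?? 0), 0),
    output_tokens: Math.max(current.output_tokens - (previous?.output_tokens ?? 0), 0),
    total_tokens: Math.max(current.total_tokens - (previous?.total_tokens ?? 0), 0),
  };
}

function computeDeltas(events: TokenCountEvent[]): RawUsage[] {
  const deltas: RawUsage[] = [];
  let previousTotals: RawUsage | null = null;
  for (const event of events) {
    let raw: RawUsage | null;
    if (event.total != null) {
      // Prefer cumulative totals: a repeated snapshot diffs to zero.
      raw = subtractRawUsage(event.total, previousTotals);
      previousTotals = event.total;
    } else {
      // Fallback: use the per-turn snapshot as-is.
      raw = event.last;
    }
    if (raw == null) continue;
    // Assumption: zero-delta events are dropped rather than emitted.
    if (raw.input_tokens === 0 && raw.output_tokens === 0 && raw.total_tokens === 0) continue;
    deltas.push(raw);
  }
  return deltas;
}

// Hypothetical helper to build a usage record.
const usage = (input: number, output: number): RawUsage => ({
  input_tokens: input,
  output_tokens: output,
  total_tokens: input + output,
});

// The second event repeats the first snapshot and contributes nothing.
const deltas = computeDeltas([
  { total: usage(100, 20), last: usage(100, 20) },
  { total: usage(100, 20), last: usage(100, 20) },
  { total: usage(150, 40), last: usage(50, 20) },
]);
console.log(deltas.map((d) => d.total_tokens)); // [120, 70]
```

In this made-up sequence, summing `last` per event would count 120 + 120 + 70 = 310 tokens, while diffing the totals yields 190, which matches the final cumulative total.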
🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/codex/src/data-loader.ts (1)
295-306: ⚠️ Potential issue | 🟠 Major
Advance `previousTotals` when falling back to `last_token_usage`.
After Line 302 uses `lastUsage`, `previousTotals` is not advanced. If a later event resumes `total_token_usage`, Line 300 diffs against stale/null totals and re-emits already-counted usage. Example: a `last=100` event followed by a `total=150` event currently emits `100` and `150`, not `100` and `50`. Please also add a regression for a `last-only -> total` transition.
Suggested fix
```diff
+function addRawUsage(base: RawUsage | null, delta: RawUsage): RawUsage {
+  return {
+    input_tokens: (base?.input_tokens ?? 0) + delta.input_tokens,
+    cached_input_tokens: (base?.cached_input_tokens ?? 0) + delta.cached_input_tokens,
+    output_tokens: (base?.output_tokens ?? 0) + delta.output_tokens,
+    reasoning_output_tokens:
+      (base?.reasoning_output_tokens ?? 0) + delta.reasoning_output_tokens,
+    total_tokens: (base?.total_tokens ?? 0) + delta.total_tokens,
+  };
+}
 ...
 let raw: RawUsage | null = null;
 if (totalUsage != null) {
   // Prefer cumulative totals when available. Codex can emit duplicate token_count
   // snapshots (e.g. with different rate-limit buckets) where last_token_usage repeats.
   // Diffing totals collapses those duplicates into zero-delta events.
   raw = subtractRawUsage(totalUsage, previousTotals);
-} else {
-  raw = lastUsage;
-}
-
-if (totalUsage != null) {
   previousTotals = totalUsage;
+} else {
+  raw = lastUsage;
+  if (lastUsage != null) {
+    previousTotals = addRawUsage(previousTotals, lastUsage);
+  }
 }
```
Based on learnings, "Use payload.info.total_token_usage as cumulative totals and payload.info.last_token_usage as per-turn delta; when only cumulative totals exist, compute delta by subtracting previous totals".
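The `last-only -> total` transition from the comment above can be walked through in a self-contained sketch. It is an illustration under stated assumptions rather than the repository's test: the reduced three-field `RawUsage`, the `Snapshot` type, and the `emitDeltas` and `usage` helpers are hypothetical, and `subtractRawUsage` is re-implemented here only so the example runs on its own.

```typescript
// Hypothetical, reduced shapes for illustration only.
type RawUsage = { input_tokens: number; output_tokens: number; total_tokens: number };
type Snapshot = { total: RawUsage | null; last: RawUsage | null };

// Adds a per-turn delta onto the cumulative baseline (null counts as zero).
function addRawUsage(base: RawUsage | null, delta: RawUsage): RawUsage {
  return {
    input_tokens: (base?.input_tokens ?? 0) + delta.input_tokens,
    output_tokens: (base?.output_tokens ?? 0) + delta.output_tokens,
    total_tokens: (base?.total_tokens ?? 0) + delta.total_tokens,
  };
}

// Stand-in for the existing helper; clamping at zero is an assumption.
function subtractRawUsage(current: RawUsage, previous: RawUsage | null): RawUsage {
  return {
    input_tokens: Math.max(current.input_tokens - (previous?.input_tokens ?? 0), 0),
    output_tokens: Math.max(current.output_tokens - (previous?.output_tokens ?? 0), 0),
    total_tokens: Math.max(current.total_tokens - (previous?.total_tokens ?? 0), 0),
  };
}

// Returns the total_tokens emitted for each snapshot that carries usage.
function emitDeltas(snapshots: Snapshot[]): number[] {
  const emitted: number[] = [];
  let previousTotals: RawUsage | null = null;
  for (const { total, last } of snapshots) {
    let raw: RawUsage | null = null;
    if (total != null) {
      raw = subtractRawUsage(total, previousTotals);
      previousTotals = total;
    } else {
      raw = last;
      if (last != null) {
        // Keep the baseline in step with usage that was already emitted.
        previousTotals = addRawUsage(previousTotals, last);
      }
    }
    if (raw != null) emitted.push(raw.total_tokens);
  }
  return emitted;
}

// Hypothetical helper to build a usage record.
const usage = (input: number, output: number): RawUsage => ({
  input_tokens: input,
  output_tokens: output,
  total_tokens: input + output,
});

const emitted = emitDeltas([
  { total: null, last: usage(80, 20) }, // last=100, no cumulative totals
  { total: usage(120, 30), last: null }, // total=150 resumes
]);
console.log(emitted); // [100, 50]
```

Without the `addRawUsage` step in the fallback branch, the second snapshot would be diffed against a null baseline and the sketch would emit 100 and 150.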
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/codex/src/data-loader.ts` around lines 295 - 306, When falling back to last_token_usage (the lastUsage branch) we must advance previousTotals so future total_token_usage diffs against an updated cumulative baseline; update the branch that sets raw = lastUsage to also set previousTotals = previousTotals ? addRawUsage(previousTotals, lastUsage) : lastUsage (or equivalent) so the per-turn delta is incorporated into previousTotals, and add a regression test covering a last-only -> total transition (e.g., last=100 then total=150 should emit 100 then 50). Reference variables/functions: lastUsage, totalUsage, previousTotals, subtractRawUsage (and use or add an addRawUsage helper if needed).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bbd53fd3-a9c8-4070-a336-f9c4e6781005
📒 Files selected for processing (1)
apps/codex/src/data-loader.ts
Summary
- Compute per-event token deltas from `total_token_usage` when available
- Dedupe `token_count` snapshots that repeat `last_token_usage`
Verification
- `pnpm --filter @ccusage/codex typecheck`
- `pnpm --filter @ccusage/codex test`
Repro
A duplicated Codex snapshot sequence counted 2540 tokens before this change and 1440 tokens after it.
Summary by CodeRabbit
Bug Fixes
Tests