Skip to content

Fix deletion bugs: scope escalation, checkpoint recovery, progress UX#26

Merged
wesm merged 20 commits intomainfrom
deletion-bugs
Feb 3, 2026
Merged

Fix deletion bugs: scope escalation, checkpoint recovery, progress UX#26
wesm merged 20 commits intomainfrom
deletion-bugs

Conversation

@wesm
Copy link
Copy Markdown
Owner

@wesm wesm commented Feb 3, 2026

Summary

Fixes #8, #11. Comprehensive fixes to the deletion system covering OAuth scope handling, checkpoint/resume robustness, and CLI output quality.

OAuth scope escalation

When a user runs delete-staged for the first time, the existing OAuth token typically only has gmail.readonly + gmail.modify scopes. The deletion system now:

  • Proactively detects insufficient scopes before attempting any API calls by checking stored token scope metadata

  • Displays a clear permission upgrade prompt explaining what's needed:

    ======================================================================
    PERMISSION UPGRADE REQUIRED
    ======================================================================
    
    Batch deletion requires elevated Gmail permissions.
    
    Your current OAuth token was granted with limited permissions that
    don't include batch delete. To proceed, msgvault needs to:
    
      1. Delete your existing OAuth token
      2. Re-authorize with full Gmail access (mail.google.com scope)
    
    This elevated permission allows msgvault to permanently delete
    messages in bulk. You can revoke access anytime at:
      https://myaccount.google.com/permissions
    
    Upgrade permissions now? [y/N]:
    

    For trash (non-batch) deletion, a simpler variant is shown requesting gmail.modify scope.

  • Canceling is clean: answering "N" returns to the prompt with no error, using a sentinel errUserCanceled error

  • Reactive fallback: if the proactive check can't run (legacy tokens without scope metadata), scope errors from the API are caught and the same prompt is shown

  • Legacy tokens (saved before scope tracking) are detected via HasScopeMetadata() and handled gracefully

Checkpoint and resume robustness

  • Failed message IDs are tracked in ExecuteBatch (previously only in single-message Execute)
  • Retry on resume: when resuming a manifest with FailedIDs, those IDs are retried before continuing with remaining messages
  • Scope errors save accurate checkpoints: on insufficient-scope errors mid-retry, only unattempted + still-failed IDs are saved (not already-succeeded ones)
  • CancelManifest now surfaces real I/O errors instead of silently continuing
  • Fallback loop (batch→individual) tracks LastProcessedIndex per-message instead of per-batch
  • Bounds checking for corrupted startIndex values

TUI drill-down context fix

  • Fixed SourceID overwrite bug where "All Accounts" filter would clear the drill-down filter's source, causing deletion staging to ignore the current subgroup view

CLI output

  • Progress bar with visual bar, percentage, count, ETA, and failure count
  • ANSI escape sequences guarded by TTY detection — falls back to plain line-per-update when piped/redirected
  • Executor logs downgraded to Debug to avoid colliding with progress output
  • Clean completion summary line

Test plan

  • delete-staged with insufficient scopes → shows permission upgrade prompt
  • Canceling scope upgrade → clean return, no error
  • Batch deletion with 404s (already deleted) → treated as success
  • Resume after partial failure → retries failed IDs first
  • Scope error mid-execution → saves checkpoint, propagates error
  • make test && make lint pass
  • Progress bar displays correctly on TTY
  • Output is clean when redirected (delete-staged 2>&1 | cat)

🤖 Generated with Claude Code

wesm and others added 20 commits February 2, 2026 19:41
When viewing a sub-aggregate (e.g., time periods within a specific
sender), pressing 'd' to stage for deletion only applied the
sub-aggregate filter (time period) and ignored the parent drill-down
filter (sender). This caused far more messages to be staged than
intended.

Pass the drill-down filter through to StageForDeletion and use it as
the base filter when resolving Gmail IDs, so both parent and child
filters are combined.

Fixes #8

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, delete-staged would attempt the API call with the existing
token, get a 403 insufficient scope error, and only then offer to
re-authorize. This was confusing because users expected the tool to
request deletion permissions upfront.

Now tokens are saved with scope metadata, and delete-staged checks
proactively whether the token has the required mail.google.com scope
before making any API calls. Legacy tokens without scope metadata
fall back to the existing reactive detection.

Also extracts the scope escalation prompt into a shared function used
by both the proactive and reactive paths.

Fixes #11

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…wing

The executor's ExecuteBatch and Execute methods were catching 403
insufficient scope errors and silently counting them as failures.
This prevented the scope escalation prompt in delete-staged from ever
triggering, since the executor returned nil (success with failures)
instead of propagating the scope error.

Now both methods detect scope errors and return them immediately,
allowing the command layer to prompt for re-authorization.

Also adds .githooks/post-commit to .gitignore.

Fixes #11

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Three related fixes for deletion reliability:

1. ExecuteBatch's fallback-to-individual loop now detects scope errors
   and returns immediately instead of counting them as generic failures.
   Both ExecuteBatch and Execute save checkpoints before returning on
   scope errors so the resume position is correct.

2. CancelManifest now works for both pending and in-progress manifests,
   not just pending. This lets users clear corrupted in-progress batches.

3. cancel-deletion supports --all to clear all pending and in-progress
   batches at once, useful for recovering from persistent failures.

Also adds .githooks/post-commit to .gitignore.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Set Content-Type: application/json header on API requests with a body,
fixing Gmail batchDelete returning 400 "No message ids specified". Also
add per-message progress reporting in the individual-delete fallback so
the CLI no longer appears frozen.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Return nil on user cancel in promptScopeEscalation for clean exit (#4105)
- Fix CancelManifest to surface real errors, only continue on NotExist (#4108)
- Fix fallback loop LastProcessedIndex to track per-item position (#4108)
- Surface list errors in cancel-deletion --all (#4108)
- Add tests for scope error propagation in Execute, ExecuteBatch, and
  fallback paths (#4107, #4108)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ch-id

promptScopeEscalation now returns errUserCanceled instead of nil so the
proactive caller doesn't proceed with the old token. The proactive path
maps it to a clean exit; the reactive path already exited cleanly.

Also reject --all combined with a positional batch-id argument in
cancel-deletion, and add tests for canceling in-progress manifests.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Show initial 0% progress line on start so the CLI doesn't appear frozen.
Add ETA estimate to the progress line. Log each batch before the API call
so users see activity during long-running batch deletes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Only overwrite SourceID from accountFilter when non-nil, preserving
  drillFilter's account context in deletion staging (#4115 high)
- Surface non-cancel errors from promptScopeEscalation in reactive path
  instead of swallowing them (#4114 medium)
- Log manifest.Save errors on scope error checkpoint paths (#4115 low)
- Add HasScopeMetadata() for reliable legacy token detection instead of
  using HasScope(gmail.readonly) as proxy (#4114 low)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…Batch

ExecuteBatch now tracks FailedIDs (like Execute already did) and retries
them via individual deletes on resume instead of carrying forward stale
failure counts. This handles the case where a previous run failed due to
a transient issue (e.g. missing Content-Type header, scope errors) and
left a checkpoint with failed IDs that should be retried.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
On scope error during retry, save only unattempted + already-failed IDs
(not the full original list) so succeeded IDs aren't retried and
double-counted. Set Failed = len(FailedIDs) for internal consistency.

Add test for scope error mid-retry verifying correct checkpoint state.
Add corrupt token file test for HasScopeMetadata. Check json.Marshal
errors in tests.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace raw INFO log lines during deletion with Debug level so they
don't collide with the progress bar. Redesign progress display with a
visual progress bar, ETA, and elapsed time. Show batch description
instead of ID in the execution header.

Before:
  Executing: 20260202-202547-Senders-order-update (1879 messages)
  time=... level=INFO msg="executing batch deletion" ...
  Progress: 0/1879 (0.0%) ...time=... level=INFO msg="deleting batch" ...

After:
  [1/2] Senders-order-update@example.com (1879 messages)
  [########----------------------] 25.0%  475/1879  3m12s remaining

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
On non-TTY outputs (pipes, CI, redirected files), ANSI sequences like
\033[K render as raw characters. Detect TTY via os.Stdout.Stat() and
fall back to plain newline-separated progress on non-terminals.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
OnStart now calls OnProgress(0,0,0) so users see the progress bar
at 0% right away instead of a blank line while the first batch
is being sent to the API.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
OnProgress now returns early when total <= 0, preventing division
by zero that would produce NaN% output.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@wesm wesm merged commit 0d91419 into main Feb 3, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Deletion batches based on aggregations of sender and time flags too many messages, ignores sender.

1 participant