Merged
21 commits
f1d7d5e
feat: add TUI remote support for NAS deployment
EconoBen Mar 9, 2026
288e061
fix: address review findings for remote engine and API
wesm Mar 9, 2026
70fc6ca
fix: SQLite engine fallback and conversation_id in summaries
wesm Mar 9, 2026
b20680a
fix: wire context filters through remote deep search
wesm Mar 11, 2026
45131d0
fix: restore deep search default limit to 100 rows
wesm Mar 14, 2026
0c4a900
fix: pagination defaults, total counts, and date filter merging
wesm Mar 14, 2026
d5c44cb
fix: accurate has_more, limit=0 rejection, and date bound intersection
wesm Mar 14, 2026
8fa99f9
fix: address roborev findings for stale cache, limit clamping, and da…
wesm Mar 14, 2026
1a3c99e
fix: detect deletion-driven cache staleness, normalize date params to…
wesm Mar 14, 2026
3d94255
fix: normalize cache sync timestamp to match SQLite datetime format
wesm Mar 14, 2026
6eb3755
fix: pre-export cache watermark, exempt thread fetches from page cap
wesm Mar 14, 2026
b55820d
fix: use >= for deletion watermark comparison, cap fast search limit
wesm Mar 14, 2026
93f782d
docs: update API response shapes to match implementation
wesm Mar 14, 2026
8586071
docs: fix inconsistent has_more value in /messages/filter example
wesm Mar 14, 2026
08acfe7
fix: reject unsupported filter params in deep search endpoint
wesm Mar 14, 2026
5697a43
fix: reject unsupported filter params in fast search endpoint
wesm Mar 14, 2026
f66e9b8
fix: rebuild cache on any sync mutation, handle TimeRange in SQLite s…
wesm Mar 14, 2026
dab3423
fix: use full rebuild when cache staleness is from deletions
wesm Mar 15, 2026
7d98ba0
fix: return structured staleness from cacheNeedsBuild
wesm Mar 15, 2026
bacab6f
fix: handle remote deleted messages and cache updates
wesm Mar 18, 2026
2cd0980
fix: use sync run ids for cache staleness
wesm Mar 18, 2026
6 changes: 6 additions & 0 deletions .roborev.toml
@@ -14,6 +14,12 @@ HTTP remote defaults, plaintext key display in interactive CLI,
enabled=true override on account creation, and page-aligned pagination
are documented design decisions — see code comments at each site.

Remote engine query string reconstruction in buildSearchQueryString is
intentionally simplified — phrase quoting edge cases are acceptable since
the search parser on the server re-parses the query. Empty search queries
sending q= is expected; the server returns empty results gracefully.
TimeGranularity defaults to "month" when unspecified, which is correct.

This is a single-user personal tool with no privilege separation, no
setuid, no shared directories, and no multi-tenant access. Do not flag
symlink-following, local file overwrites, or similar CWE patterns that
4 changes: 4 additions & 0 deletions CLAUDE.md
@@ -53,6 +53,7 @@ make lint # Run linter
# TUI and analytics
./msgvault tui # Launch TUI
./msgvault tui --account you@gmail.com # Filter by account
./msgvault tui --local # Force local (override remote config)
./msgvault build-cache # Build Parquet cache
./msgvault build-cache --full-rebuild # Full rebuild
./msgvault stats # Show archive stats
@@ -63,6 +64,9 @@ make lint # Run linter
./msgvault import-emlx --account me@gmail.com # Specific account(s)
./msgvault import-emlx /path/to/dir --identifier me@gmail.com # Manual fallback

# Daemon mode (NAS/server deployment)
./msgvault serve # Start HTTP API + scheduled syncs

# Maintenance
./msgvault repair-encoding # Fix UTF-8 encoding issues
```
31 changes: 30 additions & 1 deletion README.md
@@ -81,15 +81,20 @@ msgvault tui
| `add-account EMAIL` | Authorize a Gmail account (use `--headless` for servers) |
| `sync-full EMAIL` | Full sync (`--limit N`, `--after`/`--before` for date ranges) |
| `sync EMAIL` | Sync only new/changed messages |
| `tui` | Launch the interactive TUI (`--account` to filter) |
| `tui` | Launch the interactive TUI (`--account` to filter, `--local` to force local) |
| `search QUERY` | Search messages (`--json` for machine output) |
| `show-message ID` | View full message details (`--json` for machine output) |
| `mcp` | Start the MCP server for AI assistant integration |
| `serve` | Run daemon with scheduled sync and HTTP API for remote TUI |
| `stats` | Show archive statistics |
| `list-accounts` | List synced email accounts |
| `verify EMAIL` | Verify archive integrity against Gmail |
| `export-eml` | Export a message as `.eml` |
| `import-mbox` | Import email from an MBOX export or `.zip` of MBOX files |
| `import-emlx` | Import email from an Apple Mail directory tree |
| `build-cache` | Rebuild the Parquet analytics cache |
| `update` | Update msgvault to the latest version |
| `setup` | Interactive first-run configuration wizard |
| `repair-encoding` | Fix UTF-8 encoding issues |
| `list-senders` / `list-domains` / `list-labels` | Explore metadata |

@@ -125,6 +130,30 @@ See the [Configuration Guide](https://msgvault.io/configuration/) for all option

msgvault includes an MCP server that lets AI assistants search, analyze, and read your archived messages. Connect it to Claude Desktop or any MCP-capable agent and query your full message history conversationally. See the [MCP documentation](https://msgvault.io/usage/chat/) for setup instructions.

## Daemon Mode (NAS/Server)

Run msgvault as a long-running daemon for scheduled syncs and remote access:

```bash
msgvault serve
```

Configure scheduled syncs in `config.toml`:

```toml
[[accounts]]
email = "you@gmail.com"
schedule = "0 2 * * *" # 2am daily (cron)
enabled = true

[server]
api_port = 8080
bind_addr = "0.0.0.0"
api_key = "your-secret-key"
```

The TUI can connect to a remote server by configuring `[remote].url`. Use `--local` to force local database when remote is configured. See [docs/api.md](docs/api.md) for the HTTP API reference.

## Documentation

- [Setup Guide](https://msgvault.io/guides/oauth-setup/): OAuth, first sync, headless servers
51 changes: 43 additions & 8 deletions cmd/msgvault/cmd/build_cache.go
@@ -27,10 +27,11 @@ var fullRebuild bool
 // files (_last_sync.json, parquet directories) can corrupt the cache.
 var buildCacheMu sync.Mutex

-// syncState tracks the last exported message ID for incremental updates.
+// syncState tracks the message and sync-run watermarks covered by the cache.
 type syncState struct {
-	LastMessageID int64     `json:"last_message_id"`
-	LastSyncAt    time.Time `json:"last_sync_at"`
+	LastMessageID          int64     `json:"last_message_id"`
+	LastSyncAt             time.Time `json:"last_sync_at"`
+	LastCompletedSyncRunID int64     `json:"last_completed_sync_run_id,omitempty"`
 }

 var buildCacheCmd = &cobra.Command{
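The `omitempty` tag on the new field suggests that state files written before this change remain readable, with the sync-run watermark simply decoding to zero. A minimal sketch of that behavior follows; `decodeState` and `encodeState` are hypothetical helpers introduced for illustration and are not part of the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// syncState mirrors the struct shown in the hunk above.
type syncState struct {
	LastMessageID          int64     `json:"last_message_id"`
	LastSyncAt             time.Time `json:"last_sync_at"`
	LastCompletedSyncRunID int64     `json:"last_completed_sync_run_id,omitempty"`
}

// decodeState (hypothetical) parses the contents of a state file.
func decodeState(data []byte) (syncState, error) {
	var s syncState
	err := json.Unmarshal(data, &s)
	return s, err
}

// encodeState (hypothetical) serializes a state back to JSON text.
func encodeState(s syncState) (string, error) {
	out, err := json.Marshal(s)
	return string(out), err
}

func main() {
	// A state file written before this change has no sync-run field.
	old := []byte(`{"last_message_id":42,"last_sync_at":"2026-03-14T02:00:00Z"}`)
	s, err := decodeState(old)
	if err != nil {
		panic(err)
	}
	// The missing field decodes to the zero value.
	fmt.Println(s.LastMessageID, s.LastCompletedSyncRunID)

	// With omitempty, a zero run ID is left out again when re-encoding.
	out, err := encodeState(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

A zero run ID therefore looks the same as "never recorded", so a freshness check that compares run IDs would presumably treat an old state file as behind any completed sync run.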
@@ -116,13 +117,39 @@ func buildCache(dbPath, analyticsDir string, fullRebuild bool) (*buildResult, er
 	}

 	var maxMessageID sql.NullInt64
+	var lastCompletedSyncRunID int64
 	// Use indexed query: id is PRIMARY KEY, sent_at has an index
 	maxIDQuery := `SELECT MAX(id) FROM messages WHERE sent_at IS NOT NULL`
 	if err := sqliteDB.QueryRow(maxIDQuery).Scan(&maxMessageID); err != nil {
-		sqliteDB.Close()
+		if closeErr := sqliteDB.Close(); closeErr != nil {
+			return nil, fmt.Errorf("get max message id: %w; close sqlite: %v", err, closeErr)
+		}
 		return nil, fmt.Errorf("get max message id: %w", err)
 	}
-	sqliteDB.Close()
+	var hasSyncRunsTable int
+	if err := sqliteDB.QueryRow(`
+		SELECT COUNT(*) FROM sqlite_master
+		WHERE type = 'table' AND name = 'sync_runs'
+	`).Scan(&hasSyncRunsTable); err != nil {
+		if closeErr := sqliteDB.Close(); closeErr != nil {
+			return nil, fmt.Errorf("check sync_runs table: %w; close sqlite: %v", err, closeErr)
+		}
+		return nil, fmt.Errorf("check sync_runs table: %w", err)
+	}
+	if hasSyncRunsTable > 0 {
+		if err := sqliteDB.QueryRow(`
+			SELECT COALESCE(MAX(id), 0) FROM sync_runs
+			WHERE status = 'completed' AND completed_at IS NOT NULL
+		`).Scan(&lastCompletedSyncRunID); err != nil {
+			if closeErr := sqliteDB.Close(); closeErr != nil {
+				return nil, fmt.Errorf("get last completed sync run id: %w; close sqlite: %v", err, closeErr)
+			}
+			return nil, fmt.Errorf("get last completed sync run id: %w", err)
+		}
+	}
+	if err := sqliteDB.Close(); err != nil {
+		return nil, fmt.Errorf("close sqlite after metadata check: %w", err)
+	}

 	maxID := int64(0)
 	if maxMessageID.Valid {
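The error paths in this hunk all follow one pattern: the query error is wrapped with `%w`, while a failure from `Close` is appended as text with `%v`. The sketch below illustrates what that choice means for callers. `wrapWithClose`, `errQuery`, and `errClose` are hypothetical names used only for this illustration; the PR inlines the pattern rather than using a helper.

```go
package main

import (
	"errors"
	"fmt"
)

// errQuery and errClose stand in for a failed query and a failed Close.
var (
	errQuery = errors.New("no such table")
	errClose = errors.New("database is locked")
)

// wrapWithClose (hypothetical) mirrors the pattern in the diff: the primary
// error is wrapped with %w, and a close error, if any, is appended with %v.
func wrapWithClose(what string, err error, closeFn func() error) error {
	if closeErr := closeFn(); closeErr != nil {
		return fmt.Errorf("%s: %w; close sqlite: %v", what, err, closeErr)
	}
	return fmt.Errorf("%s: %w", what, err)
}

func main() {
	err := wrapWithClose("get max message id", errQuery, func() error { return errClose })
	fmt.Println(err)
	// The query error stays in the unwrap chain.
	fmt.Println(errors.Is(err, errQuery)) // true
	// The close error is only present as message text.
	fmt.Println(errors.Is(err, errClose)) // false
}
```

Under this pattern the original failure remains the one that `errors.Is` and `errors.As` can match, and the close failure is reported without masking it.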
@@ -180,6 +207,12 @@ func buildCache(dbPath, analyticsDir string, fullRebuild bool) (*buildResult, er
 	fmt.Println("Building cache...")
 	buildStart := time.Now()

+	// Capture deletion watermark before export starts. Any deletion
+	// with deleted_from_source_at after this timestamp may not be
+	// reflected in the exported Parquet data and will trigger a
+	// cache rebuild on the next freshness check.
+	cacheWatermark := time.Now().UTC().Truncate(time.Second)
+
 	// Build WHERE clause for incremental exports
 	idFilter := ""
 	if !fullRebuild && lastMessageID > 0 {
@@ -391,10 +424,12 @@ func buildCache(dbPath, analyticsDir string, fullRebuild bool) (*buildResult, er
 		exportedCount = 0
 	}

-	// Save sync state
+	// Save sync state using the pre-export watermark so any deletion
+	// that occurs during or after the build is detected as stale.
 	state := syncState{
-		LastMessageID: maxID,
-		LastSyncAt:    time.Now(),
+		LastMessageID:          maxID,
+		LastSyncAt:             cacheWatermark,
+		LastCompletedSyncRunID: lastCompletedSyncRunID,
 	}
 	stateData, _ := json.Marshal(state)
 	if err := os.WriteFile(stateFile, stateData, 0644); err != nil {
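The saved state now carries three watermarks, and the commit list refers to a structured staleness result from `cacheNeedsBuild` with a full rebuild when staleness comes from deletions. That function is not shown in this excerpt, so the sketch below is only a guess at how such a check could be shaped. Every type and name here (`cacheState`, `dbSnapshot`, `staleness`, `checkStaleness`) is hypothetical, and the choice to treat a newer sync run as an incremental rebuild is an assumption.

```go
package main

import "fmt"

// cacheState holds the watermarks saved with the cache (hypothetical subset).
type cacheState struct {
	LastMessageID          int64
	LastCompletedSyncRunID int64
}

// dbSnapshot holds the values a freshness check would read from SQLite (hypothetical).
type dbSnapshot struct {
	MaxMessageID            int64
	LastCompletedSyncRunID  int64
	DeletionsSinceWatermark bool
}

// staleness is a structured result instead of a bare bool (hypothetical shape).
type staleness struct {
	NeedsBuild  bool
	FullRebuild bool
	Reason      string
}

// checkStaleness compares saved watermarks against the database snapshot.
// Deletions force a full rebuild because an incremental export only appends.
func checkStaleness(c cacheState, db dbSnapshot) staleness {
	switch {
	case db.DeletionsSinceWatermark:
		return staleness{NeedsBuild: true, FullRebuild: true, Reason: "messages deleted since last build"}
	case db.LastCompletedSyncRunID > c.LastCompletedSyncRunID:
		return staleness{NeedsBuild: true, Reason: "sync run completed since last build"}
	case db.MaxMessageID > c.LastMessageID:
		return staleness{NeedsBuild: true, Reason: "new messages since last build"}
	}
	return staleness{}
}

func main() {
	cache := cacheState{LastMessageID: 100, LastCompletedSyncRunID: 7}
	fmt.Printf("%+v\n", checkStaleness(cache, dbSnapshot{MaxMessageID: 100, LastCompletedSyncRunID: 7}))
	fmt.Printf("%+v\n", checkStaleness(cache, dbSnapshot{MaxMessageID: 100, LastCompletedSyncRunID: 8}))
	fmt.Printf("%+v\n", checkStaleness(cache, dbSnapshot{MaxMessageID: 100, LastCompletedSyncRunID: 7, DeletionsSinceWatermark: true}))
}
```

Returning a reason alongside the flags lets a caller decide between an incremental export and `--full-rebuild` without re-querying the database.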