fix: sync performance, startup logging, and insights fixes by wesm · Pull Request #25 · wesm/agentsview

wesm · 2026-02-24T13:34:48Z

Summary

Sync performance: Replace SHA-256 hash skip check with file_size + file_mtime comparison, add path-based DB lookup for codex/gemini files, and cache non-interactive session results. Reduces sync from ~14s to sub-second for unchanged sessions with 7000+ files.
Startup logging: Add timing to each startup phase (discovery, sync, file watcher, total) so it's clear where time is spent.
Insights filtering: Stop filtering insights list by project so global insights remain visible when a project is selected.
CLI failure recovery: Try parsing claude CLI stdout before checking exit status, so insights are captured when the CLI exits non-zero but produced valid output.

Test plan

CGO_ENABLED=1 go test -tags fts5 ./... passes
npx tsc --noEmit passes in frontend
Start agentsview with large session count, verify timing logs appear and startup is faster on second run
Generate insight with project filter active, verify global insights stay visible
Verify insights still generate and save correctly

🤖 Generated with Claude Code

Log elapsed time for each startup phase so it's clear where time is spent with large session counts. Discovery phase logs per-agent file counts, file watcher logs directory count and setup duration, and the "listening" line shows total startup time. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove project filter from the insights list API call so global insights remain visible when a project is selected. The backend still accepts the parameter but the frontend no longer sends it. In generateClaude, try parsing stdout before checking cmd.Run error. If the CLI exits non-zero but produced valid JSON with a non-empty result, use it instead of discarding. This captures insights when claude is terminated after completing generation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Three changes to reduce sync time from ~14s to sub-second for unchanged sessions: 1. Replace SHA-256 hash skip check with file_size + file_mtime. The mtime is free from the existing os.Stat call, eliminating a full file read per session just for the skip check. 2. Add path-based DB lookup (GetFileInfoByPath) for codex and gemini files. These agents require parsing to get the session ID, but the path-based check lets us skip unchanged files before parsing. 3. Generalize the failed-file tombstone into a skip cache that also covers files returning nil sessions (non-interactive codex sessions). On second sync, ~7000 non-interactive files are skipped via in-memory mtime check instead of being re-parsed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T13:40:02Z

roborev: Combined Review (`b50a062`)

Summary Verdict: The PR introduces sync performance optimizations, but contains flaws in the new mtime-based sync skip logic that
can lead to data loss or permanently stale sessions.

High

Race condition in cacheSkipFromPath caches incorrect mtime for actively written files
File: internal/sync/engine.go (in cacheSkipFromPath and its callers)

Description: cacheSkipFromPath calls os.Stat to cache the mtime of a failed/skipped file. If the file is actively being written to (e.g., streaming agent output), its mtime can change between the time it was parsed and the
time this stat call occurs. The newer mtime gets cached as a "skip", causing the new content to be permanently ignored on subsequent syncs.
Suggested remediation: Pass the original info.ModTime().UnixNano() down through the processResult struct during processFile,
and use that exact value for caching instead of re-stating the file.

Medium

Metadata-only sync skip can be bypassed or miss real content changes
Files: [internal/sync/engine.go:419-439](/home/roborev/.rob
orev/clones/wesm/agentsview/internal/sync/engine.go:419), [internal/sync/engine.go:499-525](/home/roborev/.roborev/clones/wesm/agentsview/internal/sync/
engine.go:499), internal/sync/engine_integration_test.go:312
**
Description:** The new fast-path skip trusts only file_size + file_mtime to decide whether to avoid parsing, removing previous hash verifications. mtime is mutable and size is easy to preserve (e.g., same-size rewrites), meaning modified file contents can evade re-sync
and keep stale DB/session state.
Suggested remediation: Keep size+mtime as a fast prefilter, but require a fallback hash verification strategy for ambiguous/equal metadata cases before the final skip, or periodically revalidate hashes.
SyncSingleSession can be short-circuited
by the new skip cache for Codex exec sessions
Files: internal/sync/engine.go#L322, internal
/sync/engine.go#L379, [internal/sync/engine.go#L679](/home/roborev/.
roborev/clones/wesm/agentsview/internal/sync/engine.go#L679), [internal/sync/engine.go#L683](/home/roborev/.roborev/clones/wesm/agentsview/internal/sync/engine
.go#L683)
Description: SyncAll now caches sess == nil files, and processFile returns skip before parsing if mtime matches cache. In SyncSingleSession, if res.skip { return nil } runs before the Codex
includeExec=true fallback, so that fallback can become unreachable.
Suggested remediation: Bypass the skip-cache in SyncSingleSession for Codex IDs, or force includeExec parsing when agent==codex even if res.skip is true.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

- Remove unnecessary selectedId reset and load() from setProject since load no longer filters by project - Update frontend test assertions to match no-args listInsights call - Add TestGenerateClaude_SalvageOnNonZeroExit covering 6 cases for stdout salvage on CLI failure Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T13:45:26Z

roborev: Combined Review (`02c6d25`)

The code review identified 3 issues (1 High, 2 Medium) requiring attention, primarily affecting concurrency and data integrity in the sync engine.

High

1. TOCT
OU Race Condition Causing Live Session Data Omission

File: internal/sync/engine.go (Lines 309-311, 322-323, 728-732)
Issue: There is a Time-
of-Check to Time-of-Use (TOCTOU) race condition when populating the skipCache for files that fail to parse. cacheSkipFromPath(r.path) calls os.Stat(path) again. If an AI agent has flushed more data between the parse failure and
this new os.Stat call, the new mtime is incorrectly cached as a failure. When the watcher subsequently queues a sync for that new mtime, the file will be permanently skipped and stuck in an incomplete state.
Remediation: Capture the file's
mtime when it is initially stat'd inside processFile and pass it through the worker via the processResult struct. Update collectAndBatch to use r.mtime directly (e.g., e.cacheSkip(r.path, r.mtime) ), completely removing the cacheSkipFromPath helper to eliminate the extra os.Stat call.

Medium

2. Integrity Bypass in Sync Skip Logic

File: internal/sync/engine.go (Lines 417, 419, 4
36, 499, 525)
Issue: Change-detection now trusts only file_size + file_mtime (including the new path-based fast path) to decide whether to skip parsing. If file content changes while preserving size and mtime (
e.g., restoring timestamps, or due to coarse FS timestamp granularity), the sync will incorrectly skip and keep stale indexed content in the DB.
Remediation: Keep the mtime/size fast path, but add a fallback integrity check for edge cases. For example, require a hash confirmation before the
final skip when metadata is suspicious, or add an opt-in periodic hash verification pass.

3. Nondeterministic Database Lookup for File Paths

File: internal/db/sessions.go:405, internal/db/schema.sql:75

Issue: GetFileInfoByPath executes SELECT ... WHERE file_path = ? using QueryRow without ORDER BY or LIMIT semantics. Since file_path is indexed but not unique, if multiple rows share a path, the skip behavior can become nondeterministic and
may skip/parse based on the wrong row.

Remediation: Enforce uniqueness for file_path where valid, or make the lookup deterministic by appending ORDER BY created_at DESC LIMIT 1 (with a defined tie-breaker).

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

- Pass mtime from initial os.Stat through processResult instead of re-stat'ing in cacheSkipFromPath, preventing race where a file written between parse and cache gets its new mtime permanently cached as a skip entry - Remove cacheSkipFromPath; all callers now use cacheSkip with the captured mtime directly - Clear skip cache in SyncSingleSession before processing so codex exec sessions cached as non-interactive during SyncAll can still be found via the includeExec fallback - Make GetFileInfoByPath deterministic with ORDER BY/LIMIT for the theoretical case of duplicate file_path rows - Add TestSyncSingleSessionCodexExecBypassesCache integration test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T13:52:52Z

roborev: Combined Review (`ac3330c`)

Summary Verdict: The code contains one medium severity issue related
to file syncing skip logic; no high or critical issues were found.

Medium

File/Line: internal/sync/engine.go:432, internal/sync/engine.go:451, internal/sync/engine.go:516
*
Issue: Skip checks now rely only on file_size + file_mtime (including Codex/Gemini fast path), so content changes that preserve mtime/size can be silently skipped, leaving stale DB data. This is a correctness regression from hash-based skip behavior.
- Suggested Fix
  : Keep mtime as fast path, but when size+mtime match and a stored hash exists, do a hash confirmation before skipping (or equivalent stronger fingerprint check). Add an integration test for “same-size rewrite with preserved mtime”.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

@EXIT

The fakeClaudeBin helper used a #!/bin/sh script which doesn't exist on Windows. Use a .cmd batch file with @type/@EXIT on Windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T14:01:29Z

roborev: Combined Review (`692c8f5`)

The PR introduces optimizations and new features but contains a high-severity test breakage on Windows and medium-severity logic flaws in file syncing and process cancellation.

High Severity

Accidental Find-and-Replace in Windows Test Script
- File/Line: internal/insight/ generate_test.go (around line 282)
- Issue: An accidental global find-and-replace modified the Windows batch script command type to @frontend/src/lib/api/types.ts, which will break the script execution on Windows.

Remediation: Change " @frontend/src/lib/api/types.ts %q\r\n @exit /b %d\r\n" back to " @type %q\r\n @exit /b %d\r\n".

Medium Severity

Integrity Bypass in Sync Skip Decisions
- File/Line: internal/sync/engine.go:430,
  internal/sync/engine.go:452, [internal/db/sessions.go:404](/home/roborev/.robore
  v/clones/wesm/agentsview/internal/db/sessions.go:404)
- Issue: Skip decisions were changed from hash-based validation to file_size + file_mtime only. mtime is mutable and can be spoofed/preserved; with the
  same size and forged/unchanged mtime, modified content can be skipped and never re-parsed. This is an integrity bypass (stale/poisoned session data accepted as unchanged).
- Remediation: Reintroduce content hash verification in skip checks (at least when size+mtime match), or add a secondary integrity check (e.g., periodic hash validation, ctime/inode checks, or hash-on-match fallback).
Broken Cancellation Semantics in Salvage Path
- File/Line: internal/insight/generate.go#L13
  3
- Issue: The code now returns parsed stdout before checking runErr. If the command ends due to context.C anceled/deadline after emitting JSON, this can return success instead of honoring cancellation.
- Remediation: Only salvage when ctx.Err() == nil (or when runErr is non-context exit failure), and return cancellation errors first.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

Check ctx.Err() before attempting to salvage stdout from a failed CLI invocation. Without this, a cancelled/timed-out context could return success if the CLI emitted valid JSON before being killed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T14:20:42Z

roborev: Combined Review (`157649b`)

Summary Verdict: The PR introduces performance optimizations and new features, but contains a high-severity test regression on Windows and a medium-severity cache invalidation edge case.

High

Bad global find-and-replace in Windows batch script payload

Location: internal/insight/generate_test.go:283
Problem: A bad global find-and-replace altered the Windows batch script payload. The script currently uses @frontend/src /lib/api/types.ts instead of the Windows @type command, which will cause the salvage tests to fail on Windows.
Suggested fix: Revert the string literal to use the correct command (e.g., " @type %q\r\n @exit /b %d\ r\n").

Medium

mtime/size-only skip can miss real file changes and leave stale session data

Location: internal/sync/engine.go:432, internal/sync/engine.go:451, internal/sync/ engine.go:516, internal/sync/engine.go:543
Problem: Skip logic now trusts (file_size, file_mtime) (including pre-parse path-based skips for Codex/Gemini). If content changes while mtime is preserved
/coarse-grained and size is unchanged, sync incorrectly skips and DB data goes stale.
Suggested fix: Add a fallback integrity check (hash compare) when (size,mtime) match, or at least gate fallback by conditions where mtime reliability is weak.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

wesm · 2026-02-24T14:24:36Z

"Neither finding is valid.

High — "Bad global find-and-replace": Hallucinated again. The file contains "@type %q\r\n@exit /b %d\r\n" —
verified two messages ago at line 284. The reviewer keeps confusing the batch type command with types.ts. This
is a false positive.

Medium — "mtime/size-only skip": Same finding for the fourth time. Already addressed.
"

wesm and others added 3 commits February 24, 2026 07:16

fix: use .cmd batch file for Windows in salvage test

692c8f5

The fakeClaudeBin helper used a #!/bin/sh script which doesn't exist on Windows. Use a .cmd batch file with @type/@EXIT on Windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

wesm merged commit 25ad0a8 into main Feb 24, 2026
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: sync performance, startup logging, and insights fixes#25

fix: sync performance, startup logging, and insights fixes#25
wesm merged 7 commits intomainfrom
misc-improvements

wesm commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

wesm commented Feb 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

wesm commented Feb 24, 2026

Summary

Test plan

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (b50a062)

High

Medium

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (02c6d25)

High

Medium

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (ac3330c)

Medium

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (692c8f5)

High Severity

Medium Severity

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (157649b)

High

Medium

Uh oh!

wesm commented Feb 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

roborev: Combined Review (`b50a062`)

roborev: Combined Review (`02c6d25`)

roborev: Combined Review (`ac3330c`)

roborev: Combined Review (`692c8f5`)

roborev: Combined Review (`157649b`)