| 1 | +## Agent Time-Travel — Product and Technical Specification |
| 2 | + |
| 3 | +### Summary |
| 4 | + |
| 5 | +Agent Time-Travel lets a user review an agent’s coding session and jump back to precise moments in time to intervene by inserting a new chat message. Seeking to a timestamp restores the corresponding filesystem state using snapshot anchors. The feature integrates across CLI, TUI, WebUI, and REST, and builds on the snapshot provider model referenced by other docs (see `docs/fs-snapshots/overview.md`). |
| 6 | + |
| 7 | +### Goals |
| 8 | + |
| 9 | +- Enable scrubbing through an agent session with exact visual terminal playback and consistent filesystem state. |
| 10 | +- Allow the user to pause at any moment, inspect the workspace at that time, and branch a new session with an injected instruction. |
| 11 | +- Provide first-class support for ZFS/Btrfs/NILFS2 where available; offer robust fallbacks on APFS (macOS), VSS (Windows), and non‑CoW Linux. |
| 12 | +- Expose a consistent API and UX across WebUI, TUI, and CLI. |
| 13 | + |
| 14 | +### Non-Goals |
| 15 | + |
| 16 | +- Full semantic capture of each application’s internal state (e.g., Vim buffers). We replay terminal output and restore filesystem state. |
| 17 | +- Reflowing terminal content to arbitrary sizes. Playback uses a fixed terminal grid with recorded resize events. |
| 18 | +- A kernel-level journaling subsystem; we rely on filesystem snapshots and pragmatic fallbacks. |
| 19 | + |
| 20 | +### Concepts and Terminology |
| 21 | + |
| 22 | +- **Recording (cast)**: A terminal I/O timeline (e.g., asciinema v2). Visually faithful to what the user saw; does not encode TUI semantics. |
| 23 | +- **Marker**: A labeled point in the recording timeline (auto or manual). Used for navigation. |
| 24 | +- **Anchor**: A marker that has an associated filesystem snapshot reference (snapshots are created near-synchronously with the marker). |
| 25 | +- **Frame/Poster**: A visual state at a specific timestamp; the player can seek and render the frame. |
| 26 | +- **Timeline**: The ordered set of events (log, markers, anchors, resizes) across a session. |
| 27 | +- **Branch**: A new session created from an anchor’s filesystem state with an injected chat message. |
| 28 | + |
| 29 | +### Architecture Overview |
| 30 | + |
| 31 | +- **Recorder**: Captures terminal output as an asciinema cast (preferred) or ttyrec; emits markers at logical boundaries (e.g., per-command). |
| 32 | +- **Anchor Manager**: Creates and tracks snapshot anchors; maintains mapping {timestamp → snapshotId}. |
| 33 | +- **Snapshot Provider Abstraction**: Chooses provider per host (ZFS → Btrfs → APFS/VSS → NILFS2/Overlay → copy). See Provider Matrix below. |
| 34 | +- **Timeline Service (REST)**: Lists anchors/markers, seeks, and branches; streams timeline events via SSE. |
| 35 | +- **Players (WebUI/TUI)**: Embed the recording; render markers; orchestrate seek/branch actions. |
| 36 | +- **Workspace Manager**: Mounts read-only snapshots for inspection and prepares writable clones/upper layers for branches. |
| 37 | + |
| 38 | +### Recording and Timeline Model |
| 39 | + |
| 40 | +- **Format**: asciinema v2 (asciicast): a JSON header line followed by newline-delimited events `[time, type, data]`; input events are optional and allow richer analysis. Idle compression is configurable. |
| 41 | +- **Markers**: Auto markers at shell boundaries (preexec/precmd/DEBUG trap) and runtime milestones (provisioned, tests passed). Manual markers via UI/CLI. |
| 42 | +- **Random Access**: Web player supports `startAt`, `poster`, and markers; for power users/offline analysis we may store a parallel ttyrec to enable IPBT usage. |
| 43 | +- **Alternate Screen Semantics**: Full-screen TUIs (vim, less, nano) switch to the alternate screen; scrollback of earlier output is not available while paused on the alternate screen. Navigation uses timeline seek rather than scrollback. |
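To make the recording model concrete, here is a minimal sketch of reading an asciicast v2 stream and collecting the terminal output up to a seek timestamp. The helper names `load_cast` and `output_until` are hypothetical, and a real player would feed these bytes through a terminal emulator to render the frame rather than simply concatenating them.

```python
import json
from typing import Iterable, List, Tuple

Event = Tuple[float, str, str]  # (time in seconds, event type, data)

def load_cast(lines: Iterable[str]) -> Tuple[dict, List[Event]]:
    """Parse an asciicast v2 stream: one JSON header line, then one JSON event array per line."""
    non_empty = (ln for ln in lines if ln.strip())
    header = json.loads(next(non_empty))
    events = [tuple(json.loads(ln)) for ln in non_empty]
    return header, events

def output_until(events: List[Event], ts: float) -> str:
    """Concatenate the data of all output ("o") events with time <= ts."""
    return "".join(data for t, kind, data in events if kind == "o" and t <= ts)

# Small in-memory recording used for illustration only.
sample = [
    '{"version": 2, "width": 80, "height": 24}',
    '[0.5, "o", "$ git clone repo\\r\\n"]',
    '[12.34, "o", "done\\r\\n"]',
    '[13.0, "i", "l"]',
    '[13.1, "o", "$ ls\\r\\n"]',
]
header, events = load_cast(sample)
```

Input (`"i"`) events are skipped when rebuilding output, which matches the idea that input capture is an optional layer on top of the visual timeline.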
| 44 | + |
| 45 | +### Snapshot Anchors and Providers (multi‑OS) |
| 46 | + |
| 47 | +- **Creation Policy**: |
| 48 | + - Default: Create an anchor at each shell command boundary and at important runtime milestones. |
| 49 | + - Apply a maximum anchor frequency and deduplicate anchors to avoid thrashing during rapid events. |
| 50 | + - Anchors include: id, ts, label, provider, snapshotRef, notes. |
| 51 | + |
| 52 | +- **Provider Preference (host‑specific)**: |
| 53 | + - Linux: |
| 54 | + - ZFS: instantaneous snapshots and cheap writable clones (branch from snapshot via clone). |
| 55 | + - Btrfs: subvolume snapshots (constant-time), cheap writable snapshots for branching. |
| 56 | + - NILFS2: continuous checkpoints; promote relevant checkpoints to snapshots (`chcp ss <cno>`, or `mkcp -s` to create a new snapshot directly), since only snapshots are mountable; mount them read-only (`-o ro,cp=<cno>`). |
| 57 | + - Overlay fallback: lower = base tree, upper/work on fast storage (tmpfs or RAM-backed NILFS2/zram/brd) for ephemeral branches. |
| 58 | + - Copy fallback: `cp --reflink=auto` when possible; otherwise deep copy (last resort). |
| 59 | + - macOS: |
| 60 | + - APFS snapshots: read-only, instantaneous; mountable for inspection. For branch, create an overlay-style writable workspace using a read-only snapshot as lower with a writable upper (FSKit/macFUSE backend when available) or fast copy-on-write file clones where feasible. |
| 61 | + - Windows: |
| 62 | + - VSS shadow copies: read-only snapshots at volume level; expose snapshot content for inspection. For branch, materialize a writable workspace via a differencing VHD(X) layered over the snapshot materialization or via copy-on-write using a WinFsp-backed overlay. |
| 63 | + |
| 64 | +- **Branch Semantics**: |
| 65 | + - Writable clones are native on ZFS/Btrfs. On APFS/VSS, branching is emulated via overlay or virtual disk differencing over the read-only snapshot view. |
| 66 | + - Branches are isolated workspaces; original session remains immutable. |
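The provider preference above amounts to walking an ordered list per host and taking the first provider that works. A minimal sketch follows; the provider labels, the `copy` fallback on macOS and Windows, and the `is_available` probe are assumptions for illustration, not part of the specified interface.

```python
from typing import Callable, Dict, List, Optional

# Preference order per host OS, following the list above.
# The trailing "copy" entries for macOS and Windows are an assumption.
PREFERENCE: Dict[str, List[str]] = {
    "linux": ["zfs", "btrfs", "nilfs2", "overlay", "copy"],
    "macos": ["apfs", "copy"],
    "windows": ["vss", "copy"],
}

# Providers with native writable clones; all others emulate branching.
NATIVE_CLONE = {"zfs", "btrfs"}

def choose_provider(host_os: str, is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first available provider for this host, or None if none applies."""
    for name in PREFERENCE.get(host_os, []):
        if is_available(name):
            return name
    return None

def branch_is_native(provider: str) -> bool:
    """True when a branch can be a native clone rather than an overlay or differencing disk."""
    return provider in NATIVE_CLONE
```

For example, `choose_provider("linux", lambda n: n in {"btrfs", "copy"})` selects `"btrfs"`, and `branch_is_native("apfs")` is false because APFS branches are emulated.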
| 67 | + |
| 68 | +### Syncing Terminal Time to Filesystem State |
| 69 | + |
| 70 | +- **Shell Integration (default)**: |
| 71 | + - zsh: `preexec`/`precmd` hooks to emit markers and trigger snapshot anchors. |
| 72 | + - bash: a `DEBUG` trap paired with `PROMPT_COMMAND` to delimit commands. |
| 73 | + - fish: `fish_preexec`/`fish_postexec` equivalents. |
| 74 | +- **Runtime Integration**: The runner emits timeline events (SSE) at milestones; the anchor manager resolves each event timestamp to the nearest anchor at or before it. |
| 75 | +- **Advanced (future)**: eBPF capture of PTY I/O and/or FS mutations; rr-based post‑facto reconstruction of casts; out of scope for v1 but compatible with this model. |
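Resolving a timestamp to the nearest anchor at or before it is a sorted lookup. A minimal sketch using binary search, assuming anchors are held as `(ts, id)` pairs sorted by `ts`; the function name is hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

def nearest_anchor_at_or_before(anchors: List[Tuple[float, str]], ts: float) -> Optional[str]:
    """Return the id of the latest anchor with anchor.ts <= ts, or None if ts precedes all anchors.

    `anchors` must be sorted by timestamp in ascending order.
    """
    times = [t for t, _ in anchors]
    i = bisect_right(times, ts)  # index of the first anchor strictly after ts
    return anchors[i - 1][1] if i else None

anchors = [(12.40, "a1"), (30.0, "a2"), (45.5, "a3")]
```

Here a seek to `29.9` resolves to `a1`, a seek to exactly `30.0` resolves to `a2`, and a seek before the first anchor returns `None`, which a caller could treat as the initial workspace state.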
| 76 | + |
| 77 | +### REST API Extensions |
| 78 | + |
| 79 | +- `GET /api/v1/sessions/{id}/timeline` |
| 80 | + - Returns markers and anchors ordered by time. |
| 81 | + - Response: |
| 82 | + ```json |
| 83 | + { |
| 84 | + "sessionId": "...", |
| 85 | + "durationSec": 1234.5, |
| 86 | + "recording": {"format": "cast", "uri": "s3://.../cast.json"}, |
| 87 | + "markers": [ |
| 88 | + {"id": "m1", "ts": 12.34, "label": "git clone", "kind": "auto"} |
| 89 | + ], |
| 90 | + "anchors": [ |
| 91 | + {"id": "a1", "ts": 12.40, "label": "post-clone", "provider": "btrfs", "snapshot": {"id": "repo@tt-001", "mount": "/.snapshots/..."}} |
| 92 | + ] |
| 93 | + } |
| 94 | + ``` |
| 95 | + |
| 96 | +- `POST /api/v1/sessions/{id}/anchors` |
| 97 | + - Create a manual anchor near a timestamp; returns anchor with snapshot ref. |
| 98 | + |
| 99 | +- `POST /api/v1/sessions/{id}/seek` |
| 100 | + - Parameters: `ts` or `anchorId`. |
| 101 | + - Returns a short‑lived read‑only mount (host path and/or container path) for inspection; optionally pauses the session player at `ts`. |
| 102 | + |
| 103 | +- `POST /api/v1/sessions/{id}/branch` |
| 104 | + - Parameters: `fromTs` or `anchorId`, `name`, optional `injectedMessage`. |
| 105 | + - Creates a new session with a writable workspace cloned/overlaid from the anchor snapshot. |
| 106 | + - Response includes new `sessionId` and workspace mount info. |
| 107 | + |
| 108 | +- `GET /api/v1/sessions/{id}/snapshots` |
| 109 | + - Lists underlying provider snapshots/checkpoints with metadata (for diagnostics and retention tooling). |
| 110 | + |
| 111 | +- SSE additions on `/sessions/{id}/events` |
| 112 | + - New event types: `timeline.marker`, `timeline.anchor.created`, `timeline.branch.created`. |
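As an illustration of the branch endpoint's parameters, the sketch below builds a request body for `POST /api/v1/sessions/{id}/branch`. Treating `fromTs` and `anchorId` as mutually exclusive is an assumption, since the spec only says "or", and `build_branch_request` is a hypothetical client helper.

```python
from typing import Optional

def build_branch_request(
    name: str,
    from_ts: Optional[float] = None,
    anchor_id: Optional[str] = None,
    injected_message: Optional[str] = None,
) -> dict:
    """Build the JSON body for a branch request from a timestamp or an anchor."""
    # Assumption: exactly one of fromTs / anchorId identifies the branch point.
    if (from_ts is None) == (anchor_id is None):
        raise ValueError("provide exactly one of from_ts or anchor_id")
    body: dict = {"name": name}
    if from_ts is not None:
        body["fromTs"] = from_ts
    else:
        body["anchorId"] = anchor_id
    if injected_message is not None:
        body["injectedMessage"] = injected_message
    return body

example = build_branch_request("retry-tests", anchor_id="a1", injected_message="Use pytest -x")
```

The resulting dictionary can be serialized with `json.dumps` and sent by any HTTP client.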
| 113 | + |
| 114 | +### CLI Additions |
| 115 | + |
| 116 | +- `aw timeline list <SESSION_ID>` — Show markers and anchors. |
| 117 | +- `aw timeline anchor add <SESSION_ID> [--ts <sec>] [--label <str>]` — Create manual anchor. |
| 118 | +- `aw timeline seek <SESSION_ID> (--ts <sec> | --anchor <ID>) [--open-ide]` — Mount read‑only view; optionally open IDE. |
| 119 | +- `aw timeline branch <SESSION_ID> (--ts <sec> | --anchor <ID>) --name <branch-name> [--message <chat>]` — Start a branched session from that point. |
| 120 | + |
| 121 | +### WebUI UX |
| 122 | + |
| 123 | +- **Player Panel**: Embed `<asciinema-player>` with `poster`, markers, and a scrubber. Time cursor shows nearest anchor and label. |
| 124 | +- **Pause & Intervene**: On pause, surface “Inspect snapshot” and “Branch from here”. |
| 125 | +- **Inspect Snapshot**: Mounts read‑only view; open a lightweight file browser and offer “Open IDE at this point”. |
| 126 | +- **Branch From Here**: Dialog to enter an injected message and name; creates a new session; link both sessions for side‑by‑side comparison. |
| 127 | +- **History View**: Timeline list with filters (auto/manual markers, anchors only). |
| 128 | + |
| 129 | +### TUI UX |
| 130 | + |
| 131 | +- **Timeline Bar**: Keyboard scrubbing with markers (jump prev/next), current time, and anchor badges. |
| 132 | +- **Keys**: |
| 133 | + - Space: pause/resume |
| 134 | + - [ / ]: prev/next marker; { / }: prev/next anchor |
| 135 | + - i: Intervene (branch dialog) |
| 136 | + - s: Seek and open read‑only snapshot in left pane; right pane keeps the player/logs |
| 137 | + |
| 138 | +### Data Model Additions (Session) |
| 139 | + |
| 140 | +- `recording`: `{ format: "cast"|"ttyrec", uri, width, height, hasInput }` |
| 141 | +- `timeline`: `{ durationSec, markers: [...], anchors: [...] }` |
| 142 | +- `anchors[*]`: `{ id, ts, label, provider, snapshot: { id, mount?, details? } }` |
| 143 | +- `branchOf` (optional): parent session id and anchor id when branched. |
| 144 | + |
| 145 | +### Security and Privacy |
| 146 | + |
| 147 | +- **Keystrokes**: If input capture is enabled, redact known password prompts (heuristics based on ECHO off and common prompts). Make input capture opt‑in. |
| 148 | +- **Access Control**: Timeline/seek/branch require the same permissions as session access; snapshot mounts use least‑privilege read‑only where applicable. |
| 149 | +- **Data Retention**: Separate retention for recordings vs snapshots; defaults minimize data exposure. Encrypt at rest when stored remotely. |
| 150 | + |
| 151 | +### Performance, Retention, and Limits |
| 152 | + |
| 153 | +- **Snapshot Rate Limits**: Min interval between anchors; coalesce within a small window (e.g., 250–500 ms) to avoid bursty commands creating many anchors. |
| 154 | +- **Retention**: Policies by count/age/size. Prune unreferenced checkpoints (e.g., NILFS2) and expired provider snapshots. |
| 155 | +- **Storage**: Cast files compressed; offload to object storage. Mounts are short‑lived and garbage‑collected. |
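The snapshot rate limit above can be sketched as a simple coalescing pass over requested anchor times. This version keeps the first request of each burst (leading edge); keeping the last request instead, so that the snapshot reflects the end of the burst, is an equally plausible policy. The function name and the 0.5 s default are illustrative.

```python
from typing import Iterable, List

def coalesce(timestamps: Iterable[float], window_sec: float = 0.5) -> List[float]:
    """Keep one anchor request per burst.

    A request is dropped when it falls within `window_sec` of the last kept request.
    """
    kept: List[float] = []
    for ts in sorted(timestamps):
        if not kept or ts - kept[-1] >= window_sec:
            kept.append(ts)
    return kept

requested = [0.0, 0.1, 0.3, 1.0, 1.2, 2.0]
anchored = coalesce(requested)
```

With the sample input, six requests collapse to three anchors, one per burst.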
| 156 | + |
| 157 | +### Failure Modes and Recovery |
| 158 | + |
| 159 | +- **Snapshot Creation Fails**: Create a marker with `anchor=false` and a failure reason; continue recording; allow manual retry. |
| 160 | +- **Seek Failure**: Report provider error and suggest nearest valid anchor. |
| 161 | +- **Provider Degraded**: Fall back per provider preference, with explicit event logged to the timeline. |
| 162 | + |
| 163 | +### Provider Semantics Matrix (summary) |
| 164 | + |
| 165 | +- **ZFS**: Snapshots and clones — ideal for anchors and branches. |
| 166 | +- **Btrfs**: Subvolume snapshots — ideal for anchors and branches. |
| 167 | +- **NILFS2**: Continuous checkpoints; promote to snapshots; mount via `cp=<cno>`; branch via overlay. |
| 168 | +- **APFS**: Read‑only snapshots; branch via overlay or file clones (no native writable clone of snapshot). |
| 169 | +- **VSS**: Read‑only shadow copies; branch via differencing VHD/overlay. |
| 170 | +- **Overlay/Copy**: Universal fallbacks when CoW is unavailable. |
| 171 | + |
| 172 | +### Open Issues and Future Work |
| 173 | + |
| 174 | +- eBPF PTY and FS hooks for automatic, runner‑independent capture. |
| 175 | +- rr‑based post‑facto reconstruction of casts and fine‑grained anchors. |
| 176 | +- IPBT integration for advanced timeline browsing on ttyrec recordings. |
| 177 | +- FSKit backend maturation on macOS for robust overlay branching without kexts. |
| 178 | +- Windows containers integration to provide stronger per‑session isolation when branching. |
| 179 | + |
| 180 | + |