Live Rooms V1 + Agent Direct-Client by backnotprop · Pull Request #591 · backnotprop/plannotator

backnotprop · 2026-04-19T18:15:19Z

Status

Draft, not for merge. Opening this so the code is reviewable during the SF hackathon demo window. The room service will be deployed from this branch to plannotator-room-hack.*.workers.dev as a time-boxed preview; production room.plannotator.ai DNS move is a separate follow-up after the event.

Summary

End-to-end Live Rooms V1 plus the Slice 6a agent direct-client CLI. Three cohesive waves of work stacked on main.

Wave 1 — Protocol & server (Slices 1–4)

packages/shared/collab — zero-knowledge room protocol: HKDF-derived keys, HMAC proofs, AES-GCM payload encryption, WebSocket transport contracts, canonical JSON for admin binding, URL helpers, image-stripping for room snapshots.
apps/room-service — Cloudflare Worker + SQLite-backed Durable Object: room creation, event sequencing, admin lock/unlock/delete, lifecycle, static SPA shell served via ASSETS binding. Strict CSP on /c/:roomId.
packages/shared/collab/client-runtime — browser/direct-agent client: auth handshake, snapshot decrypt, event subscription, mutation ops, presence, reconnect, admin command flow. 2709 lines of tests.

Wave 2 — Consolidation pass

Post-Slice-5 cleanup (anchor v0.17.10-preconsolidation): five mechanical phases plus today's ThemeProvider hoist + dialog pattern alignment. Every phase is bisect-friendly. See pure-twirling-swing.md plan trail for the protocol.

Wave 3 — Slice 6a agent direct-client

apps/collab-agent — Bun CLI that lets an AI agent join a room as a first-class peer: join, read-plan (with block-id markers), read-annotations, read-presence, comment (block-level targeting), demo (scripted heading tour with per-heading echo confirmation + randomized cursor x/y so parallel agents don't stack).
Agent identity convention <user>-agent-<type> with CLI-side admin-URL strip guard. packages/ui/utils/agentIdentity.ts holds the pure helpers (slug normalization, identity construction/detection) — deliberately NOT added to identity.ts which imports ConfigStore.
⚙ marker on agent avatars + cursor labels so observers can distinguish bots from humans.
Room menu → "Copy agent instructions": clipboard payload pre-filled with THIS room's URL and the creator's slugged identity, so that once it is pasted into an LLM prompt the agent has everything in one message.
Selection-accuracy characterization matrix (packages/ui/hooks/useAnnotationHighlighter.matrix.test.tsx) documents 12 scenarios against the existing /api/external-annotations matcher. Not a fix — a measurement. Follow-up spec at specs/v1-selection-accuracy.md.

Deferred (documented non-goals)

Slice 6b: /api/external-annotations SSE → room-ops forwarding.
Rate limiting / WAF on POST /api/rooms (deployment hardening, tracked).
Selection-accuracy fix (measured, not fixed).
useCollabRoomSession memo-stability verification (no observed bug; deferred with a concrete trigger in the consolidation plan's appendix).
Radix Dialog migration for collab modals (noted; cleaner PR on its own).

Test plan

bun run typecheck green across shared, ai, server, ui.slice5, editor.
bun run test — 1072 pass / 3 skip / 0 fail.
bun run --cwd apps/room-service build:shell green.
bun run --cwd apps/hook build green.
Manual end-to-end: browser creator + browser participant + agent CLI joiner against a deployed Worker.
Manual: Copy agent instructions paste → agent CLI round-trip.

Zero-knowledge room collaboration primitives for Plannotator Live Rooms. No server, no UI: types, crypto, and helpers with 84 tests.

- HKDF key derivation (auth, event, presence, admin keys)
- HMAC verifier/proof generation with null-byte delimited inputs
- AES-256-GCM encrypt/decrypt for event, presence, and snapshot channels
- Canonical JSON for admin proof binding
- Base64url encoding with padding normalization
- Room URL parsing (client-only barrel) and construction
- Annotation image stripping for V1 room compatibility
- Server-safe vs client barrel exports

For provenance purposes, this commit was AI assisted.
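To illustrate the "canonical JSON for admin proof binding" and "null-byte delimited inputs" items in the commit above, here is a minimal sketch. It assumes canonicalization means recursively sorted object keys with no insignificant whitespace; the actual rules in `packages/shared/collab` are not shown in this PR text, and `buildProofInput` is a hypothetical name.

```typescript
// Sketch only: the real canonicalization in packages/shared/collab may differ.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted, so two structurally equal commands
// always produce byte-identical strings before they are fed into an HMAC.
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  const keys = Object.keys(value).sort();
  const members = keys.map(
    (k) => JSON.stringify(k) + ":" + canonicalJson(value[k] as Json),
  );
  return "{" + members.join(",") + "}";
}

// Hypothetical helper: join proof fields with a null byte so that
// ("ab", "c") and ("a", "bc") can never collapse to the same HMAC input.
function buildProofInput(parts: string[]): string {
  return parts.join("\u0000");
}
```

Without the delimiter, concatenated fields would be ambiguous, which is the usual reason for delimiting HMAC inputs.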

…(Slice 2)

Cloudflare Worker with raw Durable Object for room.plannotator.ai. Room creation, WebSocket challenge-response auth, and hibernation-safe connection state. 49 tests across validation, auth proofs, and CORS.

- POST /api/rooms creates room in DO with verifiers + encrypted snapshot
- GET /ws/<roomId> upgrades to WebSocket with 30s challenge-response auth
- WebSocket attachments survive DO hibernation (no in-memory Maps)
- RoomStatus gains "expired" as first-class terminal state with lazy enforcement
- Explicit CORS policy: ALLOW_LOCALHOST_ORIGINS flag, Vary: Origin, no implicit bypass
- Room ID validation: exactly 22 base64url chars (matches generateRoomId)
- Verifier validation: exactly 43 base64url chars (matches HMAC-SHA-256 output)
- Snapshot size limit (1.5 MB) with clear 413 response
- Auth response fields validated before proof verification
- markExpired() purges sensitive material with try/catch and fail-closed semantics
- Smoke test script for repeatable integration testing against wrangler dev

For provenance purposes, this commit was AI assisted.
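The length checks in the commit above follow from base64url arithmetic: unpadded base64url carries 6 bits per character, so a 32-byte HMAC-SHA-256 output encodes to 43 characters and 22 characters correspond to 16 bytes. A hedged sketch of such validators follows; the function names are illustrative, and reading "1.5 MB" as 1.5 × 1024 × 1024 bytes is an assumption.

```typescript
// Illustrative validators; names and the exact byte limit are assumptions.
const ROOM_ID_PATTERN = /^[A-Za-z0-9_-]{22}$/;
const VERIFIER_PATTERN = /^[A-Za-z0-9_-]{43}$/;
const MAX_SNAPSHOT_BYTES = 1.5 * 1024 * 1024; // assumption: MB read as MiB

// Number of unpadded base64url characters needed to encode `bytes` bytes.
function base64urlLength(bytes: number): number {
  return Math.ceil((bytes * 8) / 6);
}

function isRoomId(value: string): boolean {
  return ROOM_ID_PATTERN.test(value);
}

function isVerifier(value: string): boolean {
  return VERIFIER_PATTERN.test(value);
}

// Mirrors the "clear 413 response" behavior: oversize snapshots are
// rejected with the HTTP status for "Payload Too Large".
function checkSnapshotSize(
  byteLength: number,
): { ok: true } | { ok: false; status: 413 } {
  return byteLength <= MAX_SNAPSHOT_BYTES
    ? { ok: true }
    : { ok: false, status: 413 };
}
```

Anchoring the patterns with `^` and `$` and excluding `=` means padded or over-long inputs fail fast, before any proof verification runs.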

…e (Slice 3)

Post-auth WebSocket behavior for Plannotator Live Rooms. Two clients can now exchange encrypted annotations in real time, reconnect with replay, and creator can lock/unlock/delete via challenge-response.

- Event sequencing: server-assigned seq, per-key storage (event:NNNNNNNNNN), atomic metadata+event write, broadcast to all including sender for lastSeq tracking
- Presence relay: broadcast to others only, no storage, no seq
- Reconnect replay: snapshot + incremental based on lastSeq, handles fresh join, future claims, and (future) compaction boundary
- Admin challenge-response: fresh per-command challenges stored in hibernation-safe WebSocket attachment, verifyAdminProof with canonicalJson command binding, clientId spoofing rejection, lifecycle enforcement
- Lock/unlock/delete: room.status broadcast, locked rooms reject events with room.error but keep socket open, delete purges sensitive material and event keys with fail-closed tombstone write
- Per-channel size limits: event 512 KB, presence 8 KB, snapshot 1.5 MB
- Batch event-key purge (128 per call) and lazy expiry cleanup
- RoomStatus.expired enforced across auth, event, admin, and upgrade paths
- Live smoke script: 20 integration checks verified against wrangler dev

For provenance purposes, this commit was AI assisted.
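The `event:NNNNNNNNNN` key shape and the replay decision described in the commit above can be sketched as follows. The ten-digit zero padding is taken from the key pattern; the `planReplay` rules (treating a missing, future, or pre-snapshot `lastSeq` as a fresh join) are an assumption about how those cases might be handled, not the Durable Object's actual logic.

```typescript
// Zero-padded keys sort lexicographically in the same order as their
// numeric sequence, which keeps range scans over storage keys ordered.
function eventKey(seq: number): string {
  return "event:" + String(seq).padStart(10, "0");
}

type ReplayPlan =
  | { mode: "snapshot"; fromSeq: number }
  | { mode: "incremental"; fromSeq: number };

// Hypothetical replay planner. Assumed behavior:
//  - no lastSeq (fresh join)          -> snapshot, then events after it
//  - lastSeq ahead of the server head -> untrusted claim, treat as fresh
//  - lastSeq older than the snapshot  -> compaction boundary, treat as fresh
//  - otherwise                        -> only events after lastSeq
function planReplay(
  lastSeq: number | undefined,
  snapshotSeq: number,
  headSeq: number,
): ReplayPlan {
  if (lastSeq === undefined || lastSeq > headSeq || lastSeq < snapshotSeq) {
    return { mode: "snapshot", fromSeq: snapshotSeq + 1 };
  }
  return { mode: "incremental", fromSeq: lastSeq + 1 };
}
```

Broadcasting each event back to its sender, as the commit notes, is what lets every client keep the `lastSeq` it later passes to a planner like this.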

Adds packages/shared/collab/client-runtime/ (CollabRoomClient class, createRoom/joinRoom factories, apply-event reducer, backoff, opId dedupe, mock-websocket test harness) and packages/ui/hooks/useCollabRoom.ts as the React wrapper. Extends url.ts with admin-fragment parsing and buildAdminRoomUrl. Server-side validation and replay hardening land alongside: admin error-code contract comment, U+0000 replay cursor, cors/WS close constants shared between client and server. For provenance purposes, this commit was AI assisted.
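For readers unfamiliar with the two client-runtime pieces named above, backoff and opId dedupe, a rough sketch is given here. The parameters (base delay, cap, full jitter) and the bounded-set eviction policy are assumptions chosen for illustration; the PR text does not state the values or strategy `CollabRoomClient` actually uses.

```typescript
interface BackoffOptions {
  baseMs: number;
  maxMs: number;
  factor: number;
}

// Capped exponential backoff with full jitter. `random` is injectable so
// the delay is deterministic in tests.
function backoffDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxMs, opts.baseMs * Math.pow(opts.factor, attempt));
  return Math.floor(random() * ceiling);
}

// Remembers recently seen opIds so an op echoed back by the server (or
// replayed on reconnect) is applied at most once. Oldest ids are evicted
// first; a Set iterates in insertion order.
class OpIdDedupe {
  private seen = new Set<string>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns true the first time an opId is observed, false on repeats.
  firstTime(opId: string): boolean {
    if (this.seen.has(opId)) return false;
    this.seen.add(opId);
    if (this.seen.size > this.capacity) {
      const oldest = this.seen.values().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    return true;
  }
}
```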

Anchor commit for the Live Rooms V1 consolidation pass. Captures the working tree as-of Slice 5 + presence fixes + fake-presence harness tuning. This commit is NOT intended to ship; it exists so downstream consolidation phases have a stable rollback point via `git reset --hard` or the `v0.17.10-preconsolidation` tag. Baseline at snapshot time: - typecheck: green (shared, ai, server, ui.slice5, editor all clean). - tests: 1028 pass, 3 skip, 0 fail (bun run test). - branch: feat/collab at b78019e. Scope by category: - collab hooks: useCollabRoom (modified), useCollabRoomSession, usePresenceThrottle, useRoomAdminActions, useRoomMode, useAnnotationController, useAnnotationHighlightReconciler (all new) - collab components: packages/ui/components/collab/* (RoomPanel replaced by RoomHeaderControls + RoomMenu; AdminControls, ParticipantAvatars, RemoteCursorLayer, StartRoomModal, JoinRoomGate, ImageStripNotice, RoomStatusBadge, RoomAdminErrorToast) - collab utils: adminSecretStorage, presenceColor, roomIdentityConfirmed - room-service shell: entry.tsx, index.html, static/favicon.svg, vite.config.ts, tsconfig.browser.json, plus room-do.ts / handler.ts presence + admin updates - editor shell: AppRoot.tsx, RoomApp.tsx, roomIdentityHandoff.ts, env.d.ts, tsconfig.json, plus App.tsx start-room / admin flow - shared collab: validation.ts, redact-url.ts, strip-images update, client-runtime/client.ts presence + admin updates, types.ts with room.participant.left broadcast - dev tooling: apps/room-service/scripts/fake-presence.ts (fake multi-participant harness with continuous-lerp motion model); scripts/dev-live-room-local.sh - CI: minor workflow tweaks (release.yml, test.yml) - specs: v1.md and slice plans updated; v1-cursor-presence.md added DO NOT squash-merge. The consolidation PR preserves this commit as an anchor; revert destination for all downstream phases. For provenance purposes, this commit was AI assisted.

Hooks moved (9 files total — 5 source + 4 test): - useCollabRoom(+test) - useCollabRoomSession(+test) - usePresenceThrottle(+test) - useRoomAdminActions - useRoomMode(+test) Pure structural refactor. Zero behavior change. Behavioral surface preserved: - Every hook exports the same symbol with the same signature. - Every consumer reaches it through the wildcard package export (packages/ui/package.json `./hooks/*`), so no export-map change was needed. - tsconfig.slice5.json include paths updated to the new location so the scoped Slice-5 typecheck still covers these files. Consumers updated (6 import sites): - App.tsx: useRoomAdminActions + dynamic-import type for UseCollabRoomSessionReturn - AppRoot.tsx: useRoomMode - RoomApp.tsx: useCollabRoomSession, usePresenceThrottle - useAnnotationController.ts: type import for UseCollabRoomReturn - useAnnotationController.room.test.tsx: same Intra-group relative imports (sibling `./useCollabRoom`) stayed unchanged since all moved together. One cross-folder relative import (useRoomAdminActions.ts → components/collab/AdminControls) needed a bump to `../../` to account for the extra nesting. Review gate passed: - grep `ui/hooks/useCollabRoom*|useRoomAdminActions|useRoomMode|usePresenceThrottle|useCollabRoomSession` in packages/ and apps/: zero old-path hits. - bun run typecheck: clean across shared, ai, server, ui.slice5, editor. - bun run test: 1028 pass, 3 skip, 0 fail (unchanged from baseline). No barrel file added. packages/ui/package.json wildcard already resolves the new paths; introducing a collab/index.ts would add surface area for no gain. Part of the Live Rooms V1 consolidation pass (Phase 2 of 5). Anchor commit: 38c6c51 (tag v0.17.10-preconsolidation). For provenance purposes, this commit was AI assisted.

Net change to App.tsx: -250 / +73. Start-live-room workflow lives in its own hook under packages/editor/hooks/collab/ alongside the other creator-side room logic. Moved out of App.tsx (~250 lines): - Module-level `getRoomBaseUrl` helper (precedence: window override → VITE_ROOM_BASE_URL → production default). - State: showStartRoomModal, startRoomInFlight, startRoomError, startRoomAbortRef. - Memo: imageAnnotationsToStrip (single-source-of-truth for the modal notice AND the &stripped=N URL handoff count). - Handlers: handleStartLiveRoom, handleCancelStartRoom, handleConfirmStartRoom (including pre-open placeholder tab, identity persistence, createRoom dynamic import, URL fragment construction, popup-blocker fallback). Stays in App.tsx: - canStartLiveRoom gate (isApiMode && !roomModeActive). - <StartRoomModal> JSX block — still reads getIdentity()/getPresenceColor() for initial values and destructures everything else from the hook. Invariants preserved (documented in the new hook's header): - `window.open('', '_blank')` remains synchronous in the click path, before any await. Popup blocker bypass depends on this. - Four `await import(...)` calls stay dynamic for bundle splitting. - Identity persistence (setCustomIdentity / setPresenceColor) happens at the same lifecycle point inside handleConfirmStartRoom, not shared with RoomApp.tsx (different flows, different timings). - Hook does NOT import from App.tsx — hard line to prevent the extraction from re-tangling. Hook lives in packages/editor/hooks/collab/ (not packages/ui/) because it depends on import.meta.env.VITE_ROOM_BASE_URL whose ambient typing lives in packages/editor/env.d.ts, and the flow itself is editor-shell workflow, not generic reusable UI. Review gate: - bun run typecheck: clean. - bun run test: 1028 pass / 3 skip / 0 fail (unchanged from baseline). - grep `getRoomBaseUrl|setCustomIdentity|setPresenceColor|stripRoomAnnotationImages` in App.tsx: zero matches. 
- grep `showStartRoomModal|handleConfirmStartRoom` in App.tsx: only hook destructuring + modal JSX site (no stray state defs). - grep `from ['"]..` in useStartLiveRoom.ts: no parent-relative imports (hook does not reach into App.tsx). - Sync-before-await invariant: window.open at line 155 precedes the first `await import(...)` at line 171. Depends on Phase 2 (collab hooks move) — born in its final home at packages/editor/hooks/collab/, same pattern as the peer hooks moved in e4ccdcd. Part of the Live Rooms V1 consolidation pass (Phase 1 of 5). Anchor commit: 38c6c51 (tag v0.17.10-preconsolidation). For provenance purposes, this commit was AI assisted.
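The `getRoomBaseUrl` precedence mentioned in this commit (window override, then `VITE_ROOM_BASE_URL`, then production default) might look roughly like the sketch below. It takes both inputs as parameters so it stays testable outside a browser; the default value is assumed from the `room.plannotator.ai` hostname mentioned elsewhere in this PR, and treating empty strings as unset is also an assumption.

```typescript
// Assumed default, inferred from the production hostname named in the PR.
const PRODUCTION_ROOM_BASE_URL = "https://room.plannotator.ai";

// Hypothetical pure variant of getRoomBaseUrl: the first non-empty
// candidate wins, in the order window override -> env value -> default.
function resolveRoomBaseUrl(windowOverride?: string, envValue?: string): string {
  for (const candidate of [windowOverride, envValue]) {
    if (candidate !== undefined && candidate.trim() !== "") {
      return candidate.trim();
    }
  }
  return PRODUCTION_ROOM_BASE_URL;
}
```

In the hook itself the two arguments would come from a window-level override and `import.meta.env.VITE_ROOM_BASE_URL`; passing them in keeps this function free of ambient globals.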

…heckboxOverrides (Phase 3)

Net change to App.tsx: -56 lines of controller-internal knowledge (the pending/failed/pendingAdditions walking + ann-checkbox-<id> prefix matching + blockId resolution) moved to a sibling pure helper `derivePendingCheckboxBlockIds` alongside the consumer hook.

Design choice: sibling helper, not a second hook.
- Keeps the annotation-controller and checkbox-hook decoupled.
- Caller still owns the useMemo (with the correct deps) so the "local mode returns undefined" conditional lives at the App.tsx call site where roomModeActive is already in scope.
- Pure function, fully unit-testable in isolation.

Signature:

    export function derivePendingCheckboxBlockIds(
      controller: AnnotationController,
    ): ReadonlySet<string>;

Helper reads `controller.annotations` directly (same array App.tsx destructures at line 147).

Call-site in App.tsx drops from ~30 lines (scan + resolve + walk) to a single useMemo:

    const pendingCheckboxBlockIds = useMemo(
      () => roomModeActive ? derivePendingCheckboxBlockIds(annotationController) : undefined,
      [roomModeActive, controller.pending, controller.failed, controller.pendingAdditions, controller.annotations],
    );

Review gate:
- grep `ann-checkbox-` in App.tsx: one remaining match at line 1282. That is the delete-handler's local-mode immediate-revert (different concern, intentional, stays).
- bun run typecheck: clean.
- bun run test: 1028 pass / 3 skip / 0 fail.
- bun test --cwd packages/editor: 13/13 pass, covering useCheckboxOverrides.

The busy-gate and revert-gate semantics are unchanged; this is a pure relocation. Manual smoke (rapid double-click on a room-mode checkbox) preserved via the existing `pendingBlockIds` wiring downstream; the hook consumes the Set identically to before.

Part of the Live Rooms V1 consolidation pass (Phase 3 of 5). Anchor commit: 38c6c51 (tag v0.17.10-preconsolidation).

For provenance purposes, this commit was AI assisted.

…ase 5) Three small fixes, no behavior change: 1. packages/ui/components/collab/RemoteCursorLayer.tsx - `findScrollViewport` docstring previously claimed LocalPresenceEmitter emits `document`-space cursors. Current sender (RoomApp.tsx:551) emits `block`-space with a sticky anchor. Rewrote the comment to reflect that all three coordinate spaces are protocol-supported and the layer handles them uniformly; the bundled UI's hot path is `block`. - `case 'block':` inline comment previously said "our bundled UI no longer emits this case." It does. Comment rewritten to say the bundled UI writes block-space with a sticky anchor, and direct-agent senders that emit block still resolve here. 2. specs/v1-slice4-plan.md line 61 - "The smoke.ts reference incorrectly uses `eventKey` for both channels" — no longer true. apps/room-service/scripts/smoke.ts now encrypts presence with `presenceKey` (line 238) and events with `eventKey` (line 211). Comment rewritten to describe the correct architecture without the stale bug note. 3. apps/room-service/scripts/fake-presence.ts - The hardcoded SWATCHES tuple was a 1:1 duplicate of packages/ui/utils/presenceColor.ts's PRESENCE_SWATCHES. Replaced with an import. presenceColor.ts is pure (no React / DOM / ConfigStore per its own header), safe to pull into a Bun script. Palette is now single-sourced — any UI-side swatch change automatically propagates to fake participants. Review gate: - bun run typecheck: clean. - bun run test: 1028 pass / 3 skip / 0 fail (baseline unchanged). - Script load smoke: `bun run apps/room-service/scripts/fake-presence.ts` exits with `Error: Missing --url. …` — confirms the palette import resolved (if it hadn't, Bun would have surfaced a module-not-found before parseArgs). Root typecheck does NOT cover apps/room-service/scripts because the room-service tsconfig excludes scripts/**. The script load smoke is the explicit gate for Phase 5's import change. Part of the Live Rooms V1 consolidation pass (Phase 5 of 5). 
Anchor commit: 38c6c51 (tag v0.17.10-preconsolidation). For provenance purposes, this commit was AI assisted.

Eliminates the last duplicated literal set between client and server: both sides now reference a single shared tuple in `packages/shared/collab/constants.ts`, enforced by a contract test that iterates every code and asserts runtime admin-scoped routing. Shared source of truth (packages/shared/collab/constants.ts): - `AdminErrorCode` const namespace with 11 keyed codes. - `AdminErrorCode` type = string-literal union derived from the namespace values. - `ADMIN_ERROR_CODES` readonly tuple derived from Object.values(). - Documented add-a-code workflow: "Add a key; use it via sendAdminError; the tuple and runtime Set derive automatically; contract test enforces end-to-end routing." Server (apps/room-service/core/room-do.ts): - Private `sendAdminError(ws, code: AdminErrorCode, message)` wrapper introduced. 12 admin-path call sites rewritten from raw literal strings to `AdminErrorCode.<Key>`. The wrapper's type narrow on `code` means any future typo or non-admin code surfaces as a compile error at the call site, not as silent drift. - Non-admin paths (`validation_error`, `room_locked`, `event_persist_failed`) keep using raw `sendError` — they're event-channel errors and are explicitly excluded from the admin contract. Client (packages/shared/collab/client-runtime/client.ts): - `ADMIN_SCOPED_ERROR_CODES` Set now derived from `ADMIN_ERROR_CODES` tuple (`new Set<string>(ADMIN_ERROR_CODES)`), replacing the hardcoded 11-literal duplicate. Kept as `ReadonlySet<string>` (not `Set<AdminErrorCode>`) because .has() is called on wire-string input — forward-compatibility with unknown future codes is intentional, they fall through as non-admin. Contract test (client-runtime/client.test.ts): - New behavior test iterates every entry in `ADMIN_ERROR_CODES`, sets up a pending admin command, injects a server error with that code, asserts the promise rejects with AdminRejectedError within 500ms (not at 5s admin timeout). 
This catches: * Adding a tuple entry without wiring it to the runtime Set. * Any regression in the admin-scoped dispatch path. * Typos that would make a code silently route as mutation. Audit anchor (from plan's snapshot action): - 12 admin-path sendError sites confirmed in room-do.ts (matches plan expectation: 11 distinct codes with `invalid_state` emitted from 3 sites — lock/unlock/delete guards). - Every code previously in client.ts's literal Set is present in the new tuple. No silent drops, no dead handlers. Review gate: - grep admin-code literals (`'admin_validation_error'` etc.) in client.ts + room-do.ts: zero matches. Only legit occurrences are in constants.ts (definition) and test files (mock wire data). - grep `this.sendError(ws, '<admin-code>'` in room-do.ts: zero — every admin path goes through `sendAdminError`. - bun run typecheck: clean (root + room-service). - bun run test: 1029 pass / 3 skip / 0 fail (up from 1028 — one new contract test). - Targeted: client.test.ts 76 pass (up from 75). Part of the Live Rooms V1 consolidation pass (Phase 4 of 5 — final). Anchor commit: 38c6c51 (tag v0.17.10-preconsolidation). For provenance purposes, this commit was AI assisted.
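The single-source pattern described in this commit (const namespace, derived union type, derived tuple, derived runtime Set) can be sketched as below. Only two of the 11 codes are named in this text, so only those appear; the key names `ValidationError` and `InvalidState` and the `send` callback shape are assumptions.

```typescript
// Const namespace: the one place a code is spelled out.
// Only the two code strings named in the commit text are shown;
// the key names are assumed.
const AdminErrorCode = {
  ValidationError: "admin_validation_error",
  InvalidState: "invalid_state",
} as const;

// String-literal union derived from the namespace values.
type AdminErrorCode = (typeof AdminErrorCode)[keyof typeof AdminErrorCode];

// Tuple and runtime Set both derive from the namespace, so adding a key
// updates every consumer without a second edit.
const ADMIN_ERROR_CODES = Object.values(AdminErrorCode) as readonly AdminErrorCode[];
const ADMIN_SCOPED_ERROR_CODES: ReadonlySet<string> = new Set<string>(ADMIN_ERROR_CODES);

// Client side: .has() takes a plain wire string, so unknown future codes
// simply fall through as non-admin.
function isAdminScoped(wireCode: string): boolean {
  return ADMIN_SCOPED_ERROR_CODES.has(wireCode);
}

// Server side: the narrowed `code` parameter turns a typo or a non-admin
// code into a compile error at the call site.
function sendAdminError(
  send: (payload: string) => void,
  code: AdminErrorCode,
  message: string,
): void {
  send(JSON.stringify({ type: "room.error", code, message }));
}
```

With this shape, `sendAdminError(send, "room_locked", "…")` fails to type-check, which is the drift protection the commit describes.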

…pattern Two related visual bugs on the room origin addressed in one commit: 1. ThemeProvider lived inside `<App>` (App.tsx:1606), so the pre-join screens rendered by `<RoomApp>` — `<JoinRoomGate>` and the room-deleted/expired terminal screens — showed up BEFORE the theme class was applied to `<html>`. Every theme-token Tailwind class (`bg-background`, `bg-card`, `text-muted-foreground`, `border-border`, `bg-foreground text-background` on the primary submit) resolved to empty `var(--*)` lookups, so the gate looked unstyled (black text on white, no visible borders, no depth). Hoist `ThemeProvider` to wrap `AppRoot` — it now encloses all three mode-fork branches (local, invalid-room terminal, and room mode including the gate). Removed the now-duplicate wrapper inside `<App>` since every entry point (apps/hook, apps/portal, apps/room-service) mounts AppRoot as the default export. The `invalid-room` terminal screen in AppRoot itself benefits too — its `bg-background` and `text-muted-foreground` classes now resolve correctly on the public room origin. 2. StartRoomModal and JoinRoomGate had drifted from the canonical Plannotator dialog pattern that ConfirmDialog, Settings, ExportModal, AISetupDialog, PermissionModeSetup, etc. all share. The review showed the backdrop dim felt too weak on Start and the Join gate looked inconsistent with every other dialog in the product. Aligned both: Backdrop (was hand-rolled, now matches the 8-dialog canon): fixed inset-0 z-[100] flex items-center justify-center bg-background/80 backdrop-blur-sm p-4 Content card (was `bg-background shadow-lg` or `bg-card border` with no shadow, now matches): bg-card border border-border rounded-xl shadow-2xl Net effect: - Start: proper theme-tinted 80% backdrop + blur instead of flat `bg-black/40`; card now has depth via shadow-2xl and an explicit border (previously `bg-background` + `shadow-lg` only). 
- Join: same backdrop + card treatment as every other dialog; no longer reads as a raw form on a blank page. Accessibility gaps (missing `role="dialog"`, `aria-modal`, focus trap, Esc handling, hover states on swatch + primary buttons) are NOT addressed here — those require either hand-rolled ARIA + focus plumbing per dialog, or a Radix Dialog migration across all Plannotator dialogs. Tracked separately; the theme + pattern fix above doesn't depend on either. Review gate: - bun run typecheck: clean. - bun run test: 1029 pass / 3 skip / 0 fail (unchanged). - bun run --cwd apps/room-service build:shell: green. For provenance purposes, this commit was AI assisted.

First commit of the Slice 6a agent direct-client work. Creates the `apps/collab-agent/` workspace package with a minimal CLI skeleton whose only job right now is to verify the dependency graph resolves cleanly under Bun. Layout: - apps/collab-agent/package.json — workspace, deps on shared + ui only - apps/collab-agent/tsconfig.json — Bun target, strict mode - apps/collab-agent/index.ts — dispatcher skeleton with --help - apps/collab-agent/subcommands/ — empty dir (populated in Phases 3/5/6) Import smoke: index.ts eagerly references `parseRoomUrl` from the shared client barrel, `parseMarkdownToBlocks` from `@plannotator/ui/utils/parser`, and `hashNameToSwatch` + `PRESENCE_SWATCHES` from `@plannotator/ui/utils/presenceColor`. None of these modules pull ConfigStore, React, or DOM — verified by the skeleton running `bun run apps/collab-agent/index.ts --help` with exit 0 and no resolution errors. This is the Phase 1 gate from the plan: if any utility grew non-pure imports, Bun would surface it on load here before any subcommand code lands. Package.json deliberately omits `@plannotator/ui` from the avoid list — we import specific utils only. `packages/ui/utils/identity.ts` stays untouched; the agent-identity detector goes into a new pure module in Phase 2. No subcommands wired yet. CLI returns exit 2 for unknown subcommands. --help prints the planned subcommand list so the interface shape is visible. Review gate: - bun run apps/collab-agent/index.ts --help → exit 0, prints help - bunx tsc --noEmit (package-local) → clean - bun run typecheck (root) → clean - bun run test → 1029 pass / 3 skip / 0 fail (unchanged) Part of Slice 6a (Phase 1 of 8). Anchor: v0.17.10-preslice6a. For provenance purposes, this commit was AI assisted.

…rtbeat (Phase 2) Three primitives the subcommands in later phases will rely on: 1. `packages/ui/utils/agentIdentity.ts` — NEW pure module (no imports) with: - `AGENT_TYPES` / `AgentType` — canonical set {claude, codex, opencode, junie, other}. - `isAgentIdentity(name)` — true when `name` ends in `-agent-<known-type>`. Case-sensitive so human names containing "Agent" don't false-positive. - `getAgentType(name)` — returns the type for later UI tooltip/icon choices. - `constructAgentIdentity({ user, type })` — single source of truth for the `<user>-agent-<type>` format. Normalises `user` to lowercase-trimmed and validates charset. Throws `InvalidAgentIdentityError` on malformed input. Deliberately NOT placed next to `packages/ui/utils/identity.ts`, which imports `../config` (ConfigStore with cookies + server-sync). The agent CLI imports this new file directly and never touches ConfigStore. 2. `apps/collab-agent/identity.ts` — CLI-side identity helpers: - Re-exports the pure primitives for ergonomic import. - `isAgentType(v)` — argv type guard. - `stripAdminFragment(url)` — removes `admin=<secret>` from the fragment while preserving `key=<secret>` and other params. Returns `{ url, stripped }` so the CLI can print a one-shot warning when an admin link was given. Hard default; NO `--as-admin` opt-in in V1 (agents are not admins, adding the flag without a use case is footgun creation per the plan). 3. `apps/collab-agent/heartbeat.ts` — presence heartbeat manager. `startHeartbeat(client, initial)` returns an `{ update, stop }` handle; the caller sends the initial presence themselves, then the manager re-emits the last-known presence every 10 s (3× headroom under the 30s `PRESENCE_TTL_MS`). Swallows send failures (presence is lossy by protocol contract) and no-ops during non-authenticated windows. 
This solves the "agent vanishes from the observer while thinking" problem without any change to receiver-side TTL logic: the agent's cursor stays fresh because the agent keeps re-asserting it. Test coverage: - `packages/ui/utils/agentIdentity.test.ts` — 12 tests: detection, construction round-trip, case sensitivity, invalid-user / invalid-type rejection, null/undefined handling. - `apps/collab-agent/identity.test.ts` — 7 tests: admin-fragment strip with and without other params, no-fragment and no-admin passthrough, argv-type guard. No heartbeat unit test yet. The manager's contract is "re-send on a timer while authenticated"; it's exercised end-to-end in Phase 3's `join` subcommand against a live local room. Adding a timer-mock unit test now would be ceremony over insight. Review gate: - bun run typecheck (root) — clean. - bunx tsc --noEmit in apps/collab-agent — clean. - bun test (targeted) — 19 new pass, 0 fail. - bun run test (full suite) — 1048 pass / 3 skip / 0 fail (+19 from 1029 Phase 1 baseline). Part of Slice 6a (Phase 2 of 8). Anchor: v0.17.10-preslice6a. For provenance purposes, this commit was AI assisted.
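A compact sketch of the identity helpers and the admin-fragment strip described in this commit follows. The function names match the commit text, but the bodies are illustrative: the allowed user charset (`[a-z0-9-]`) and the `&`-separated fragment layout are assumptions, and the real modules may differ.

```typescript
const AGENT_TYPES = ["claude", "codex", "opencode", "junie", "other"] as const;
type AgentType = (typeof AGENT_TYPES)[number];

class InvalidAgentIdentityError extends Error {}

// Case-sensitive suffix match, so a human called "Secret Agent Smith"
// is not mistaken for an agent.
function getAgentType(name: string | null | undefined): AgentType | undefined {
  if (!name) return undefined;
  const match = /^.+-agent-([a-z]+)$/.exec(name);
  if (!match) return undefined;
  return AGENT_TYPES.find((t) => t === match[1]);
}

function isAgentIdentity(name: string | null | undefined): boolean {
  return getAgentType(name) !== undefined;
}

// Builds `<user>-agent-<type>`. Charset rule is an assumption.
function constructAgentIdentity(input: { user: string; type: string }): string {
  const user = input.user.trim().toLowerCase();
  const type = AGENT_TYPES.find((t) => t === input.type);
  if (!/^[a-z0-9-]+$/.test(user) || type === undefined) {
    throw new InvalidAgentIdentityError("invalid agent identity input");
  }
  return user + "-agent-" + type;
}

// Removes `admin=<secret>` from the URL fragment and leaves `key=<secret>`
// and any other params untouched. Works on the raw string to avoid
// re-encoding the remaining secrets.
function stripAdminFragment(url: string): { url: string; stripped: boolean } {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) return { url, stripped: false };
  const base = url.slice(0, hashIndex);
  const params = url.slice(hashIndex + 1).split("&");
  const kept = params.filter((p) => p.split("=")[0] !== "admin");
  if (kept.length === params.length) return { url, stripped: false };
  return { url: kept.length > 0 ? base + "#" + kept.join("&") : base, stripped: true };
}
```

Returning the `stripped` flag alongside the URL is what lets the CLI print its one-shot warning when an admin link was supplied.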

…ce (Phase 3)

The agent CLI now has a real dispatcher and four read-oriented subcommands. Each is a thin wrapper over `CollabRoomClient` with shared helpers for the identical "parse argv → strip admin → construct identity → joinRoom → emit presence" boilerplate.

New files:
- apps/collab-agent/subcommands/_lib.ts: Shared helpers: parseCommonArgs (--url/--user/--type), flag readers (string/bool/number), openAgentSession (strips admin, constructs identity, joinRoom with autoConnect), awaitInitialSnapshot (10s timeout), wireSignalShutdown. `UsageError` class bubbles up to the dispatcher for consistent argv-error handling.
- apps/collab-agent/subcommands/join.ts: Connect, await snapshot, emit initial presence, start 10s heartbeat, stream room.event / room.presence / room.status / room.error to stdout as NDJSON until SIGINT.
- apps/collab-agent/subcommands/read-plan.ts: One-shot. `--with-block-ids` prefixes each block with `[block:<id>]` using the same `parseMarkdownToBlocks` the browser calls, so ids round-trip to what the observer sees in the DOM.
- apps/collab-agent/subcommands/read-annotations.ts: One-shot. Prints the annotations array as pretty JSON.
- apps/collab-agent/subcommands/read-presence.ts: One-shot with a 2s settle delay so inbound presence can land before we print. Prints `remotePresence` as JSON with a stderr banner clarifying "recent presence, NOT a participant roster"; connected-but-idle peers won't appear.

index.ts rewritten from Phase 1 skeleton to a real dispatcher. Exit codes: 0 success, 1 runtime error, 2 usage error. Root package.json gains `agent:run` convenience script.

Key design decisions (from the plan):
- `join` emits an initial presence on connect AND runs a 10s heartbeat: the V1 protocol has no roster broadcast, peers only appear via received presence, and the observer's 30s TTL sweeps idle cursors. Heartbeat at 3× headroom keeps the agent visible while it's "thinking".
- Admin-fragment strip is applied in every subcommand through openAgentSession. Agents never get admin capability.
- Block ids derive from the SAME `parseMarkdownToBlocks` the browser uses, so an agent's `--block <id>` (Phase 5) targets the same block the observer renders.
- One-shot subcommands emit presence once (visible flash) without heartbeat; the process exits before the TTL matters.
- `read-presence` is deliberately named and banner-labeled so that it does not lie about what it returns.

Review gate:
- bunx tsc --noEmit in apps/collab-agent: clean.
- bun run typecheck (root): clean.
- bun run test: 1048 pass / 3 skip / 0 fail (unchanged).
- bun run agent:run --help: prints usage, exit 0.
- bun run apps/collab-agent/index.ts read-plan (no --url): prints "Missing --url", exit 2.

Manual end-to-end smoke against a live local room is deferred to the full Slice 6a verification pass (Phase 8 final).

Part of Slice 6a (Phase 3 of 8). Anchor: v0.17.10-preslice6a.

For provenance purposes, this commit was AI assisted.
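The heartbeat that `join` relies on (re-send the last-known presence every 10 s, skip while unauthenticated, swallow send failures) can be sketched as follows. The `startHeartbeat(client, initial)` signature and the `{ update, stop }` handle come from the Phase 2 commit text; the `PresenceClient` interface with `isAuthenticated` and `sendPresence` is a hypothetical stand-in for whatever `CollabRoomClient` exposes.

```typescript
// Hypothetical minimal view of the room client; real method names may differ.
interface PresenceClient<P> {
  isAuthenticated(): boolean;
  sendPresence(presence: P): Promise<void>;
}

interface HeartbeatHandle<P> {
  update(next: P): void;
  stop(): void;
}

const HEARTBEAT_INTERVAL_MS = 10000; // 3x headroom under a 30s presence TTL

// One heartbeat tick: no-op when not authenticated, and never throws,
// because presence is lossy by protocol contract.
function createBeat<P>(client: PresenceClient<P>, getLast: () => P): () => void {
  return () => {
    if (!client.isAuthenticated()) return;
    try {
      client.sendPresence(getLast()).catch(() => {});
    } catch {
      // Synchronous send failures are swallowed as well.
    }
  };
}

// The caller sends the initial presence itself; this only re-asserts it.
function startHeartbeat<P>(
  client: PresenceClient<P>,
  initial: P,
  intervalMs: number = HEARTBEAT_INTERVAL_MS,
): HeartbeatHandle<P> {
  let last = initial;
  const timer = setInterval(createBeat(client, () => last), intervalMs);
  return {
    update(next: P) {
      last = next;
    },
    stop() {
      clearInterval(timer);
    },
  };
}
```

Splitting the tick out of the timer keeps the re-send rule checkable without mocking timers, which is one way to cover the contract the Phase 2 commit chose to leave to end-to-end testing.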

Minimal UI allowance so observers can tell agent peers apart from human peers. Detection is purely identity-based via `isAgentIdentity` from the pure `packages/ui/utils/agentIdentity.ts` module (added in Phase 2). No protocol change; no new fields on PresenceState.

Two surfaces updated:

1. `packages/ui/components/collab/ParticipantAvatars.tsx`
   - Added `isAgent` + `agentType` derivation per participant.
   - Agent avatars get a small `⚙` marker pinned to the bottom-right corner of the chip, styled with theme tokens so it reads on both light and dark themes.
   - Tooltip on agents reads `<name> (agent · <type>)`; human tooltips unchanged.
   - New `data-participant-kind="agent"|"human"` attribute for test assertions and any future selector work.
2. `packages/ui/components/collab/RemoteCursorLayer.tsx`
   - `isAgentIdentity(p?.user?.name)` passed to each `<RemoteCursor>` instance.
   - Onscreen cursor label prepends `⚙ ` when the peer is an agent. Same color, same position.
   - Offscreen edge pill (shown when the cursor is outside the viewport) also prepends `⚙ ` so the marker persists in both render modes.
   - New `data-client-kind="agent"|"human"` attribute on the cursor node for test/debug inspection.

Agent cursor persistence is handled separately by the agent-side heartbeat (Phase 2); the receiver's TTL sweep is unchanged. An agent that emits presence once and goes quiet still disappears after 30s unless it keeps its heartbeat running, which the `join` / `demo` subcommands do.

Test coverage:
- `ParticipantAvatars.test.tsx` gains one case asserting the marker renders on agent identities and does NOT render on humans; tooltip text is also asserted.
- `RemoteCursorLayer` doesn't have an existing test file (happy-dom + rAF + getBoundingClientRect make it hard to unit-test reliably). Marker is covered by manual smoke in the Slice 6a final verification.

Review gate:
- bun run typecheck: clean.
- bun run test: 1049 pass / 3 skip / 0 fail (+1 new avatar agent-marker case from 1048 Phase 3 baseline).

Part of Slice 6a (Phase 4 of 8). Anchor: v0.17.10-preslice6a.

For provenance purposes, this commit was AI assisted.
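As a rough illustration of identity-based detection, the sketch below parses the `<user>-agent-<type>` convention and builds the agent tooltip text. The helper names `parseAgentIdentity` and `tooltipFor` are hypothetical, and the real `isAgentIdentity` in `packages/ui/utils/agentIdentity.ts` may handle edge cases differently.

```typescript
// Hypothetical sketch of identity-based agent detection.
interface ParsedAgentIdentity {
  user: string;
  type: string;
}

const AGENT_MARKER = "-agent-";

// Returns the user and agent type for names shaped like
// `<user>-agent-<type>`, or null for anything else.
function parseAgentIdentity(name: string | undefined): ParsedAgentIdentity | null {
  if (!name) return null;
  const idx = name.lastIndexOf(AGENT_MARKER);
  if (idx <= 0) return null; // marker missing, or no user part before it
  const type = name.slice(idx + AGENT_MARKER.length);
  if (type.length === 0) return null; // no agent type after the marker
  return { user: name.slice(0, idx), type };
}

// Tooltip text: agents get a suffix, humans are shown unchanged.
function tooltipFor(name: string): string {
  const parsed = parseAgentIdentity(name);
  return parsed ? `${name} (agent · ${parsed.type})` : name;
}
```

Because detection depends only on the display name, nothing in the presence payload has to change, which matches the "no protocol change" note above.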

…(Phase 5)

Adds the `comment` subcommand so an agent can post block-level COMMENT annotations to a room.

Usage:
- `--block <blockId>`: target block id (from read-plan --with-block-ids)
- `--text <body>`: comment body
- `--list-blocks`: print available blocks as JSON and exit without posting

Targeting is block-level only in V1 by plan decision: startOffset=0, endOffset=block.content.length, originalText=block.content. This side-steps the `/api/external-annotations` selection-accuracy issue that plagues inline-text matching against rendered DOM. The agent doesn't have to "extract" any text; it just references the block whose content it wants to comment on.

Block ids are derived from `parseMarkdownToBlocks` (the same call the browser makes on the same markdown), so the --block id the agent passes is byte-identical to what the observer renders in the DOM.

The command is echo-driven, not send-driven: `sendAnnotationAdd` resolves when the op is queued, but we wait for the `state` event showing our annotation id in canonical state (echoed) or a mutation-scoped `lastError` (server rejected the op). A 10s timeout keeps a silent failure from hanging the CLI.

Exit codes:
- 0: annotation echoed back from server (success)
- 1: snapshot timeout, echo timeout, unknown --block id, or server-side mutation rejection (e.g. room locked)
- 2: argv/usage error (propagated from dispatcher)

Annotation ids are `ann-agent-<uuid>`. The distinct prefix makes agent-posted rows identifiable in logs and exports; the server treats the id as opaque, so the prefix is purely a readability convention.

Agent identity flows through `author` on the annotation. An observer sees the annotation attributed to e.g. `swift-falcon-tater-agent-claude` and (per Phase 4) an avatar / cursor marker indicating an agent.

Review gate:
- bunx tsc --noEmit in apps/collab-agent: clean.
- bun run typecheck (root): clean.
- bun run test: 1049 pass / 3 skip / 0 fail (unchanged from Phase 4).
- `bun run apps/collab-agent/index.ts comment` (no flags): prints "Missing --url", exit 2.

Manual end-to-end smoke against a live local room is deferred to the Slice 6a final verification after Phase 6 lands.

Part of Slice 6a (Phase 5 of 8). Anchor: v0.17.10-preslice6a.

For provenance purposes, this commit was AI assisted.
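The block-level targeting rule above can be sketched as a small pure function. The `Block` and `AgentComment` shapes, the `buildBlockComment` name, and the `blockId` / `type` / `text` field names are assumptions; only `startOffset`, `endOffset`, `originalText`, `author`, and the `ann-agent-<uuid>` id prefix come from the commit message.

```typescript
// Hypothetical shapes; the real annotation type in the repo may differ.
interface Block {
  id: string;
  content: string;
}

interface AgentComment {
  id: string;
  blockId: string;
  type: "COMMENT";
  text: string;
  originalText: string;
  startOffset: number;
  endOffset: number;
  author: string;
}

// Builds a whole-block comment. `makeUuid` is injected so the caller decides
// how ids are generated (for example `crypto.randomUUID()` where available).
function buildBlockComment(
  blocks: Block[],
  blockId: string,
  text: string,
  author: string,
  makeUuid: () => string,
): AgentComment {
  const block = blocks.filter((b) => b.id === blockId)[0];
  if (!block) throw new Error(`unknown --block id: ${blockId}`);
  return {
    id: `ann-agent-${makeUuid()}`,
    blockId: block.id,
    type: "COMMENT",
    text,
    // Whole-block anchor: no inline text extraction, so no matcher drift.
    originalText: block.content,
    startOffset: 0,
    endOffset: block.content.length,
    author,
  };
}
```

Anchoring to the full block content means the annotation's `originalText` is always a string the observer already has, which is the point of the V1 decision described above.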

…cursor (Phase 6)

Adds the `demo` subcommand that gives a visible "an agent is participating" narrative for observers watching a browser tab. It walks the plan's heading blocks in document order, anchors the agent's cursor to each heading via `coordinateSpace: 'block'` presence (so the observer's RemoteCursorLayer resolves against its own rendered block rect, not a fake y-coordinate), pauses, and posts a block-level comment at each stop.

Args:
- `--duration <sec>`: total wall time, pauses scaled to fit (default 120s; clamped per-heading to 3–6s so dense plans stay readable and sparse plans don't camp)
- `--comment-template <str>`: comment body per heading; {heading} and {level} get substituted (default "[demo] reviewing {heading}")
- `--dry-run`: move the cursor + heartbeat without posting; useful for demoing the visual presence path in isolation

Design decisions locked in from the plan:
- Cursor is block-space, not rank × 200px document-y. Block anchoring is robust to the observer's viewport; approximate document coords would clamp to the edge indicator on narrow viewports or miss short sections on tall ones. The protocol already supports this; we just use it.
- Comments are fire-and-forget within the demo loop. `sendAnnotationAdd` resolves on SEND, not on echo. Awaiting echo per heading would stall the visible cursor animation noticeably. A silent mutation rejection still surfaces via the `error` event stream in `join` logs if the user runs demo while tailing join output, which is the expected demo posture.
- Heartbeat runs for the demo duration so the agent avatar stays in the observer's avatar row between cursor moves. Without it, the 3–6s pauses could in aggregate cross the 30s TTL boundary over long durations, though only theoretically; the heartbeat is belt-and-braces.
- NDJSON progress events on stdout (`demo.start`, `demo.visit` per heading, `demo.end`) so an invoking agent can see what the demo is doing programmatically.

Exit codes:
- 0: demo completed
- 1: snapshot timeout, no headings in plan, unexpected runtime error during walk
- 2: argv/usage error

Review gate:
- bunx tsc --noEmit in apps/collab-agent: clean.
- bun run typecheck (root): clean.
- bun run test: 1049 pass / 3 skip / 0 fail (unchanged).
- Help text now lists all six subcommands.

Manual smoke (run against a live local room with a browser observer) is deferred to the final Slice 6a verification in Phase 7/8.

Part of Slice 6a (Phase 6 of 8). Anchor: v0.17.10-preslice6a.

For provenance purposes, this commit was AI assisted.
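One plausible reading of "pauses scaled to fit, clamped per-heading to 3–6s" is an even split of the duration followed by a clamp. The sketch below assumes that reading and also shows the `{heading}` / `{level}` substitution. Function names are hypothetical and the real `demo.ts` may scale pauses differently.

```typescript
// Hypothetical sketch: split the duration evenly across headings, then clamp.
const MIN_PAUSE_SEC = 3;
const MAX_PAUSE_SEC = 6;

function pausePerHeadingSec(durationSec: number, headingCount: number): number {
  if (headingCount <= 0) throw new Error("no headings in plan");
  const evenSplit = durationSec / headingCount;
  return Math.min(MAX_PAUSE_SEC, Math.max(MIN_PAUSE_SEC, evenSplit));
}

// Substitutes {heading} and {level} in a single pass, so a heading that
// happens to contain the text "{level}" is not substituted a second time.
function renderComment(template: string, heading: string, level: number): string {
  return template.replace(/\{(heading|level)\}/g, (_match: string, key: string) =>
    key === "heading" ? heading : String(level),
  );
}
```

With the default 120s and ten headings, the even split of 12s is clamped down to 6s per stop; with one hundred headings the 1.2s split is raised to 3s, so the tour would run longer than the requested duration under this assumption.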

Two docs:

- AGENT_INSTRUCTIONS.md: prose for an AI agent's prompt. Covers identity convention, the six subcommands, block-level-only comment targeting, rules (no admin, no images, server-authoritative mutation echo), and troubleshooting. Explicitly tells agents not to attempt inline text-range targeting and to put specific wording in the comment body rather than as an anchor. This steers around the selection-accuracy issue in /api/external-annotations without changing any server code.
- README.md: human-readable CLI usage. Subcommand table, common-flag reference, exit-code table, admin-URL strip note, local-dev end-to-end recipe (dev:live-room + agent:run join), and a short "internals" section pointing at the reused CollabRoomClient primitives for anyone extending the tool.

No code or test changes.

Part of Slice 6a (Phase 7 of 8). Anchor: v0.17.10-preslice6a.

For provenance purposes, this commit was AI assisted.

Measurable characterization of the current `/api/external-annotations` text matcher (`packages/ui/hooks/useAnnotationHighlighter.ts:173-236`). Twelve scenarios covering the drift shapes the user has observed: trailing/leading whitespace, extra / tab / non-breaking spaces, markdown syntax drift (bold / italic), cross-block spans, repeated-substring ambiguity, smart quotes, exact whole-block content.

This is NOT a fix. It's a regression fence:
- Current classification recorded per case.
- Summary count (`found=4 missed=8 ambiguous=1`) surfaces in test output so any fix effort sees the number drop.
- Any future change to the matcher that flips behavior in a case surfaces as a test failure pointing at the row, forcing the author to update the classification or the matcher intentionally.

Files:
- packages/ui/hooks/useAnnotationHighlighter.matrix.test.tsx: 13 tests total (12 per-case characterization + 1 summary count invariant). Matcher logic is vendored verbatim from the source closure (lines 173-236) because the closure isn't exported; an in-sync-with-source comment marks the dependency.
- specs/v1-selection-accuracy.md: human-readable write-up. It buckets the 8 misses into three bug families (whitespace drift, markdown drift, block-boundary spans), sketches fix candidates per family, flags the ambiguity case as a separate product decision, points at the candidate fix site, and gives a sequencing suggestion for any slice that picks this up.

The Slice 6a agent CLI sidesteps this entire surface by using block-level targeting for comments. It ships `originalText = block.content` (always exactly one of the DOM strings), so the matcher's strictness doesn't bite. Agents continuing to use `/api/external-annotations` for inline targeting ARE exposed; the AGENT_INSTRUCTIONS.md from Phase 7 tells them to put specific wording in the comment body rather than as an anchor.

Review gate:
- bun run typecheck (root): clean.
- bun run test: 1062 pass / 3 skip / 0 fail (+13 matrix tests from Phase 7's 1049 baseline).
- bun test --cwd packages/ui hooks/useAnnotationHighlighter.matrix.test.tsx: 13 pass, prints the summary line "[selection-accuracy matrix] found=4 missed=8 ambiguous=1".

Part of Slice 6a (Phase 8 of 8, FINAL). Anchor: v0.17.10-preslice6a.

For provenance purposes, this commit was AI assisted.
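To make the "regression fence" idea concrete, here is a minimal characterization sketch built on a stand-in matcher. The strict per-block substring search below is NOT the vendored matcher from `useAnnotationHighlighter.ts`, and its three categories are mutually exclusive, which may not match how the real matrix counts ambiguity. All names are hypothetical.

```typescript
// Stand-in matcher for illustration only.
type Classification = "found" | "missed" | "ambiguous";

// Counts (possibly overlapping) occurrences of `needle` in `haystack`.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let from = haystack.indexOf(needle);
  while (from !== -1) {
    count += 1;
    from = haystack.indexOf(needle, from + 1);
  }
  return count;
}

// Searches each block separately, so a span crossing a block boundary misses.
function classify(blockTexts: string[], needle: string): Classification {
  let total = 0;
  for (const text of blockTexts) total += countOccurrences(text, needle);
  if (total === 0) return "missed";
  return total === 1 ? "found" : "ambiguous";
}

// Formats a summary line a test can pin as an invariant.
function summarize(results: Classification[]): string {
  const tally = { found: 0, missed: 0, ambiguous: 0 };
  for (const r of results) tally[r] += 1;
  return `found=${tally.found} missed=${tally.missed} ambiguous=${tally.ambiguous}`;
}
```

A characterization test would record the current `classify` result for each scenario and assert the summary string, so any matcher change that flips a row fails loudly instead of silently.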

…omization

Two follow-ups to the Slice 6a `demo` subcommand, landing together because both touch the per-heading presence + annotation-post loop.

**Echo confirmation (reviewer high-severity finding).** Prior demo.ts issued `void client.sendAnnotationAdd([annotation])` (fire-and-forget), which meant demo could exit 0 while no comments actually landed. `sendAnnotationAdd`'s own promise resolves when queued for send, not when echoed (see `packages/shared/collab/client-runtime/client.ts:493`), and the runtime's send path rejects on disconnected or non-active rooms (same file:1435). With the promise discarded, those rejections were either unhandled or silently swallowed. The commit message claim that "failures surface through the error stream" was also wrong: demo never subscribed to the `error` event.

Fix:
- Extract the echo-confirmation pattern (previously inline in `comment.ts`) into a shared helper `awaitAnnotationEcho` in `apps/collab-agent/subcommands/_lib.ts`. Resolves when the annotation appears in canonical state, rejects on a mutation-scoped lastError advance or timeout. The helper's header doc pins the "subscribe before send" ordering contract so future callers don't reintroduce a race.
- Refactor `comment.ts` to use the helper (DRY, ~30 lines out).
- `demo.ts` now subscribes via the helper before each `sendAnnotationAdd`, awaits the echo with a tighter 5s timeout (vs 10s in `comment.ts`, because demo is time-boxed and a locked-room reject should fall through fast), collects per-heading failures into a `CommentFailure[]` array, keeps walking the tour so the observer still sees the cursor finish, and returns exit code 1 if any comment failed to land. The final `demo.end` NDJSON event now carries `headings`, `failed`, and `failures` so an invoking script can see what happened.

**Cursor x/y randomization.** When multiple agents run demo against the same room, their cursor labels previously stacked at the exact `(x: 0, y: 0)` corner of each heading block: same pixel, same label, visually one agent. User testing flagged the case. demo.ts now picks integer-floored `x ∈ [20, 200)` and `y ∈ [0, 24)` per visit. Parallel agents land at different pixels; a single agent's cursor nudges around within the block rather than parking at the origin. Ranges are conservative: most block widths handle 200px horizontal without clipping, and when they don't the existing RemoteCursorLayer edge-indicator path takes over. Vertical range stays near the heading's text baseline.

Review gate:
- bun run typecheck: clean.
- bun run test: 1072 pass / 3 skip / 0 fail.
- Manual smoke: demo ran end-to-end against a live local room with --dry-run (cursor tour visible to the observer, clean disconnect at end, exit 0).

Part of Slice 6a follow-up. Anchor: v0.17.10-preslice6a.

For provenance purposes, this commit was AI assisted.
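The two follow-ups can be sketched as below. The `RoomState` and `EchoSource` shapes, the `classifyEcho` and `randomCursorOffset` names, and the parameter list of `awaitAnnotationEcho` are assumptions for illustration; the real helper in `_lib.ts` works against the actual client API and may look quite different.

```typescript
// Hypothetical shapes: the real client state and event API may differ.
interface RoomState {
  annotationIds: string[];
  lastError?: { seq: number; message: string };
}
interface EchoSource {
  // Registers a listener and returns an unsubscribe function.
  onState(listener: (state: RoomState) => void): () => void;
}
type EchoStatus = "echoed" | "rejected" | "pending";

// Pure decision: has our annotation landed, or did a newer error arrive?
function classifyEcho(state: RoomState, annotationId: string, baselineErrorSeq: number): EchoStatus {
  if (state.annotationIds.indexOf(annotationId) !== -1) return "echoed";
  if (state.lastError && state.lastError.seq > baselineErrorSeq) return "rejected";
  return "pending";
}

// Call this BEFORE sending so the echo cannot arrive unobserved, then await
// the returned promise after the send.
function awaitAnnotationEcho(
  source: EchoSource,
  annotationId: string,
  baselineErrorSeq: number,
  timeoutMs: number,
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    let settled = false;
    let unsubscribe: () => void = () => {};
    const finish = (err?: Error): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      unsubscribe();
      if (err) reject(err);
      else resolve();
    };
    const timer = setTimeout(() => finish(new Error(`echo timeout after ${timeoutMs}ms`)), timeoutMs);
    unsubscribe = source.onState((state) => {
      const status = classifyEcho(state, annotationId, baselineErrorSeq);
      if (status === "echoed") finish();
      else if (status === "rejected") finish(new Error(state.lastError ? state.lastError.message : "mutation rejected"));
    });
    if (settled) unsubscribe(); // listener fired synchronously during subscribe
  });
}

// Integer-floored x in [20, 200) and y in [0, 24); `rand` is injectable for tests.
function randomCursorOffset(rand: () => number = Math.random): { x: number; y: number } {
  return { x: 20 + Math.floor(rand() * 180), y: Math.floor(rand() * 24) };
}
```

In a demo-style loop, a caller would create the echo promise first, then send, then await the promise with the 5s timeout, pushing any rejection into a failures list rather than aborting the tour.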

Adds the user-facing surface the user asked for: "Copy agent instructions" in the room dropdown, parallel to the existing "Copy agent instructions for external annotations" item that `/api/external-annotations` has at `PlanHeaderMenu.tsx:156`. The clipboard payload pre-fills THIS room's URL and the inviter's identity, so when the payload is pasted into an LLM prompt the agent has everything it needs to join in a single message, with no "also, the URL is…" follow-ups.

Files:
- `packages/ui/utils/roomAgentInstructions.ts` (new): builds the markdown payload. Same shape as `planAgentInstructions.ts`; tailored to the `apps/collab-agent/` CLI instead of /api/external-annotations. Covers identity, subcommand reference with runnable examples (read-plan, read-annotations, read-presence, comment, join, demo), block-level-only targeting rule, admin-strip guarantee, and troubleshooting. Crucially the URL and user identity are pre-filled, so agents don't have to parse them out of prose.
- `packages/ui/utils/agentIdentity.ts`: adds `toAgentUserSlug(name, fallback='participant')`. Normalizes any display name into a CLI-safe `--user` slug: trim → lowercase → non-[a-z0-9-] → `-` → collapse runs → trim leading/trailing. Falls back when the slug is empty. This exists because `constructAgentIdentity` enforces a strict `/^[a-z0-9][a-z0-9-]*$/` charset, and human display names routinely carry spaces, uppercase, and punctuation. Feeding raw names into templated CLI examples would have the shell split on spaces (silent truncation, e.g. `--user Michael Ramos` → agent joins as `michael-agent-claude`, dropping "Ramos") or throw `InvalidAgentIdentityError` if quoted. The builder now runs the user identity through this slug before templating.
- `packages/ui/utils/agentIdentity.test.ts`: 10 new tests for `toAgentUserSlug`: space→dash, already-slug passthrough, trim, lowercase, punctuation/symbol replacement, collapse runs, strip edges, fallback on empty/all-punctuation, custom fallback, round-trip through `constructAgentIdentity`.
- `packages/ui/components/collab/RoomMenu.tsx`: new "Copy agent instructions" menu item between "Copy consolidated feedback" and the admin section. Non-admin participants see it too (any participant can invite an agent on their behalf). New robot icon distinct from the existing link/copy icons.
- `packages/ui/components/collab/RoomHeaderControls.tsx`: threads the new `onCopyAgentInstructions` prop through.
- `packages/editor/App.tsx`: new `handleCopyRoomAgentInstructions` using the existing `copyToClipboardWithToast` helper. Reads `roomSession.joinUrl` (never the admin URL, as defensive layering on top of the CLI's own `#admin=` strip) and `getIdentity()`, hands both to the builder, copies the result, toasts "Agent instructions copied".

Review gate:
- bun run typecheck: clean.
- bun run test: 1072 pass / 3 skip / 0 fail (+10 new slug tests from 1062 baseline).
- bun run --cwd apps/room-service build:shell: green.
- Manual smoke against a live local room: menu item appears in the Room dropdown, click copies a markdown payload with the current room's URL and the user's slugged identity pre-filled; the payload's example commands are shell-safe and CLI-valid.

Part of Slice 6a follow-up. Anchor: v0.17.10-preslice6a.

For provenance purposes, this commit was AI assisted.
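The slug pipeline described above (trim → lowercase → replace → collapse → strip edges → fallback) maps directly onto a chain of string operations. The sketch below follows those stated steps; it is written from the description, not copied from `agentIdentity.ts`, so the repo's actual implementation may differ in detail.

```typescript
// Sketch of the described normalization; not the repo's actual code.
function toAgentUserSlug(name: string, fallback: string = "participant"): string {
  const slug = name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9-]/g, "-") // anything outside the allowed charset becomes a dash
    .replace(/-+/g, "-") // collapse runs of dashes
    .replace(/^-+|-+$/g, ""); // strip leading and trailing dashes
  return slug.length > 0 ? slug : fallback;
}

// The strict charset quoted in the commit message for `constructAgentIdentity`.
const STRICT_SLUG = /^[a-z0-9][a-z0-9-]*$/;

function isStrictSlug(value: string): boolean {
  return STRICT_SLUG.test(value);
}
```

For the example from the commit message, `toAgentUserSlug("Michael Ramos")` yields `michael-ramos`, which is a single shell token and passes the strict charset check, so a templated `--user` argument is no longer split or rejected.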

Needed so `bunx wrangler login` / `bunx wrangler deploy` work from the repo root without a separate global install. Already a transitive dep via apps/room-service; hoisting explicitly makes hackathon-style ad-hoc deploys reproducible from any checkout. For provenance purposes, this commit was AI assisted.

Reworks the clipboard payload so any agent receiving it kicks off visible activity without waiting for the user to give a task. The hackathon demo needs this: the presenter shares a link in front of an audience and the room needs to show cursor + comments within seconds, not after a back-and-forth.

New section "## What to do right now" sits between the URL block and the reference material. It tells the agent:
- Run `demo` as the default first action (cursor tour + per-heading comments; ~2 minutes of immediate visible activity with the distinguishing `⚙` marker so the audience can tell bot vs human).
- Follow up with one or two thoughtful observations via `comment` on substantive blocks (real opinions, not template text) as the signal that the agent actually read the plan.
- Drop the default if the user has given a specific task.

The former standalone "## Demo mode" section becomes a flag reference ("## Demo mode — flags reference") since the canonical usage moved up-front. The three read-* examples moved under a new "## Reference — reading the plan and peers" header so the top-of-doc reads "identity → URL → ACTION → reference material".

No API change, no tests to update. This is pure prose in a template string that's already covered by the surrounding UI flow (click menu item, get string, paste into agent).

Review gate:
- bun run typecheck: clean.
- bun run test: 1072 pass / 3 skip / 0 fail (unchanged).
- bun run --cwd apps/room-service build:shell: green.

Next: rebuild shell + redeploy `plannotator-room-hack` so participants hitting the deployed Worker also see the updated instruction payload (not just the creator's local editor).

For provenance purposes, this commit was AI assisted.

backnotprop added 23 commits April 18, 2026 12:08

backnotprop changed the title ~~Live Rooms V1 + Agent Direct-Client (Slice 6a)~~ Live Rooms V1 + Agent Direct-Client Apr 19, 2026

backnotprop wants to merge 23 commits into main from feat/collab
