`.cursor/rules/benchmarking.mdc` — 5 additions & 5 deletions
@@ -16,7 +16,7 @@ When working on **`packages/react`** or comparing data-client to other React dat
- **Where it lives**: `examples/benchmark-react/`
- **How to run**: From repo root: `yarn build:benchmark-react`, then `yarn workspace example-benchmark-react preview &` and in another terminal `cd examples/benchmark-react && yarn bench`
-- **What it measures**: Browser-based mount/update duration, ref-stability counts, sorted-view (Query memoization), optional memory (heap delta), startup metrics (FCP/TBT), and React Profiler commit times. Compares data-client, TanStack Query, SWR, and a plain React baseline.
+- **What it measures**: Browser-based init/update duration, ref-stability counts, sorted-view (Query memoization), optional memory (heap delta), startup metrics (FCP/TBT), and React Profiler commit times. Compares data-client, TanStack Query, SWR, and a plain React baseline.
- **CI**: `.github/workflows/benchmark-react.yml` runs on changes to `packages/react/src/**`, `packages/core/src/**`, `packages/endpoint/src/schemas/**`, `packages/normalizr/src/**`, or `examples/benchmark-react/**` and reports via `rhysd/github-action-benchmark` (customSmallerIsBetter). CI runs **data-client only** (hot-path scenarios) to track regressions; competitor libraries (TanStack Query, SWR, baseline) are for local comparison only.
- **Report viewer**: Open `examples/benchmark-react/bench/report-viewer.html` in a browser and paste `react-bench-output.json` to view a comparison table and charts. Toggle "React commit" and "Trace" filters. Use "Load history" for time-series.
@@ -26,8 +26,8 @@ See `@examples/benchmark-react/README.md` for methodology, adding a new library,
Use this mapping when deciding which React benchmark scenarios are relevant to a change:
`examples/benchmark-react/README.md` — 50 additions & 14 deletions
@@ -11,14 +11,14 @@ The repo has two benchmark suites:
## Methodology
-- **What we measure:** Wall-clock time from triggering an action (e.g. `mount(100)` or `updateAuthor('author-0')`) until the harness sets `data-bench-complete` (after two `requestAnimationFrame` callbacks). Optionally we also record React Profiler commit duration and, with `BENCH_TRACE=true`, Chrome trace duration.
+- **What we measure:** Wall-clock time from triggering an action (e.g. `init(100)` or `updateAuthor('author-0')`) until the harness sets `data-bench-complete` (after two `requestAnimationFrame` callbacks). Optionally we also record React Profiler commit duration and, with `BENCH_TRACE=true`, Chrome trace duration.
- **Why:** Normalized caching should show wins on shared-entity updates (one store write, many components update), ref stability (fewer new object references), and derived-view memoization (`Query` schema avoids re-sorting when entities haven't changed). See [js-framework-benchmark "How the duration is measured"](https://github.com/krausest/js-framework-benchmark/wiki/How-the-duration-is-measured) for a similar timeline-based approach.
- **Statistical:** Warmup runs are discarded; we report median and 95% CI. Libraries are interleaved per round to reduce environmental variance.
-- **No CPU throttling:** Runs at native speed with more samples (3 warmup + 30 measurement locally, 15 in CI) for statistical significance rather than artificial slowdown.
+- **No CPU throttling:** Runs at native speed with more samples for statistical significance rather than artificial slowdown. Small (cheap) scenarios use 3 warmup + 15 measurement runs locally (10 in CI); large (expensive) scenarios use 1 warmup + 4 measurement runs.
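The double-`requestAnimationFrame` completion signal mentioned under **What we measure** can be sketched as follows (a hypothetical stand-in, not the repo's harness code; `markComplete` and the synchronous `raf` stub are invented for illustration):

```typescript
// Sketch of the completion signal: the second requestAnimationFrame callback
// fires after the frame triggered by the first has been processed,
// approximating "the update has painted".
function markComplete(
  setAttr: () => void, // e.g. () => document.body.setAttribute('data-bench-complete', 'true')
  raf: (cb: () => void) => void, // requestAnimationFrame in the browser
): void {
  raf(() => raf(setAttr));
}

// Outside a browser we can exercise the logic with a synchronous stub.
let done = false;
const immediateRaf = (cb: () => void) => cb();
markComplete(() => {
  done = true;
}, immediateRaf);
console.log(done); // true
```

The runner then stops its wall-clock timer when it observes the attribute.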
## Scenario categories
-- **Hot path (in CI, data-client only)** — JS-only: mount, update propagation, ref-stability, sorted-view, bulk-ingest. No simulated network. CI runs only `data-client` scenarios to track our own regressions; competitor libraries are benchmarked locally for comparison.
+- **Hot path (in CI, data-client only)** — JS-only: init (fetch + render), update propagation, ref-stability, sorted-view. No simulated network. CI runs only `data-client` scenarios to track our own regressions; competitor libraries are benchmarked locally for comparison.
- **With network (local comparison)** — Same shared-author update but with simulated network delay (consistent ms per "request"). Used to compare overfetching: data-client needs one store update (1 × delay); non-normalized libs typically invalidate/refetch multiple queries (N × delay). **Not run in CI** — run locally with `yarn bench` (no `CI` env) to include these.
- **Memory (local only)** — Heap delta after repeated mount/unmount cycles.
- **Startup (local only)** — FCP and task duration via CDP `Performance.getMetrics`.
@@ -27,20 +27,19 @@ The repo has two benchmark suites:
**Hot path (CI)**
-- **Mount** (`mount-100-items`, `mount-500-items`) — Time to mount 100 or 500 item rows (unit: ms).
+- **Init** (`init-100`, `init-500`) — Time to show a ListView component that auto-fetches 100 or 500 items from the list endpoint, then renders (unit: ms). Exercises the full fetch + normalization + render pipeline.
- **Update single entity** (`update-single-entity`) — Time to update one item and propagate to the UI (unit: ms).
- **Update shared author** (`update-shared-author-duration`) — 100 components, shared authors; update one author. Measures time to propagate (unit: ms). Normalized cache: one store update, all views of that author update.
- **Update shared author (scaling)** (`update-shared-author-500-mounted`, `update-shared-author-1000-mounted`) — Same update with 500/1000 mounted components to test subscriber scaling.
- **Ref-stability** (`ref-stability-item-changed`, `ref-stability-author-changed`) — Count of components that received a **new** object reference after an update (unit: count; smaller is better). Normalization keeps referential equality for unchanged entities.
- **Sorted view mount** (`sorted-view-mount-500`) — Mount 500 items through a sorted/derived view. data-client uses `useQuery(sortedItemsQuery)` with `Query` schema memoization; competitors use `useMemo` + sort.
- **Sorted view update** (`sorted-view-update-entity`) — After mounting a sorted view, update one entity. data-client's `Query` memoization avoids re-sorting when sort keys are unchanged.
-- **Bulk ingest** (`bulk-ingest-500`) — Generate fresh data at runtime, ingest into cache, and render. Exercises the full normalization pipeline.
- **Optimistic update** (`optimistic-update`) — data-client only; applies an optimistic mutation via `getOptimisticResponse`.
- **Invalidate and resolve** (`invalidate-and-resolve`) — data-client only; invalidates a cached endpoint and immediately re-resolves. Measures Suspense boundary round-trip.
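The ref-stability metric above amounts to counting rows whose props object changed identity after an update. A minimal sketch (helper and fixture names invented for illustration):

```typescript
// Count items that received a *new* object reference after an update.
// Under a normalized cache, unchanged entities keep referential equality,
// so this count stays small; smaller is better.
function countNewRefs<T extends object>(
  before: readonly T[],
  after: readonly T[],
): number {
  return after.reduce((n, item, i) => n + (item === before[i] ? 0 : 1), 0);
}

const author = { id: 'author-0', name: 'A' };
const items = [
  { id: 'item-1', title: 'old', author },
  { id: 'item-2', title: 'other', author },
];
// Only the first row was actually replaced by the update.
const updated = [{ ...items[0], title: 'new' }, items[1]];
console.log(countNewRefs(items, updated)); // 1
```

A non-normalized store that rebuilds the whole list on each write would report a count equal to the list length instead.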
**With network (local comparison)**
-- **Update shared author with network** (`update-shared-author-with-network`) — Same as above with a simulated delay (e.g. 50 ms) per "request." data-client uses 1 request; other libs use N requests (one per affected item query) to model overfetching.
+- **Update shared author with network** (`update-shared-author-with-network`) — Same as above with a simulated delay (e.g. 50 ms) per "request." data-client propagates via normalization (no extra request); other libs invalidate/refetch the list endpoint.
**Memory (local only)**
@@ -57,19 +56,18 @@ These are approximate values to help calibrate expectations. Exact numbers vary
| `sorted-view-update-entity` | Fast (Query memoization skips re-sort) | Re-sorts on every item change | Re-sorts on every item change | Re-sorts on every item change |
Regressions >5% on stable scenarios or >15% on volatile scenarios are worth investigating.
@@ -84,7 +82,7 @@ Regressions >5% on stable scenarios or >15% on volatile scenarios are worth inve
## Adding a new library
1. Add a new app under `src/<lib>/index.tsx` (e.g. `src/urql/index.tsx`).
-2. Implement the `BenchAPI` interface on `window.__BENCH__`: `mount`, `updateEntity`, `updateAuthor`, `unmountAll`, `getRenderedCount`, `captureRefSnapshot`, `getRefStabilityReport`, and optionally `mountUnmountCycle`, `bulkIngest`, `mountSortedView`. Use the shared presentational `ItemRow` from `@shared/components` and fixtures from `@shared/data`.
+2. Implement the `BenchAPI` interface on `window.__BENCH__`: `init`, `updateEntity`, `updateAuthor`, `unmountAll`, `getRenderedCount`, `captureRefSnapshot`, `getRefStabilityReport`, and optionally `mountUnmountCycle`, `mountSortedView`. Use the shared presentational `ItemsRow` from `@shared/components` and fixtures from `@shared/data`. The harness (`useBenchState`) provides default `init`, `unmountAll`, `mountUnmountCycle`, `getRenderedCount`, and ref-stability methods; libraries only need to supply `updateEntity`, `updateAuthor`, and any overrides.
3. Add the library to `LIBRARIES` in `bench/scenarios.ts`.
4. Add a webpack entry in `webpack.config.cjs` for the new app and an `HtmlWebpackPlugin` entry so the app is served at `/<lib>/`.
5. Add the dependency to `package.json` and run `yarn install`.
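The harness-defaults-plus-overrides arrangement in step 2 could look roughly like this (a sketch only — the types and default bodies are assumptions, not the actual `useBenchState` contract):

```typescript
// Illustrative subset of the BenchAPI surface; the real interface lives in
// the benchmark harness.
type BenchAPI = {
  init: (count: number) => void;
  updateEntity: (id: string) => void;
  updateAuthor: (id: string) => void;
  unmountAll: () => void;
  getRenderedCount: () => number;
};

// Defaults a harness like useBenchState could provide (no-op stand-ins here).
const harnessDefaults: Omit<BenchAPI, 'updateEntity' | 'updateAuthor'> = {
  init: () => {},
  unmountAll: () => {},
  getRenderedCount: () => 0,
};

// A new library supplies only the store-write methods, plus any overrides.
function registerBench(
  lib: Pick<BenchAPI, 'updateEntity' | 'updateAuthor'> & Partial<BenchAPI>,
): BenchAPI {
  return { ...harnessDefaults, ...lib };
}

const bench = registerBench({
  updateEntity: () => {},
  updateAuthor: () => {},
});
console.log(typeof bench.init); // "function"
```

In the real app this object would be assigned to `window.__BENCH__` so the runner can drive it.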
@@ -133,6 +131,44 @@ Regressions >5% on stable scenarios or >15% on volatile scenarios are worth inve
- `BENCH_PORT=<port>` — port for `preview` server and bench runner (default `5173`)
- `BENCH_BASE_URL=<url>` — full base URL override (takes precedence over `BENCH_PORT`)
+4. **Filtering scenarios**
+
+The runner supports CLI flags (with env var fallbacks) to select a subset of scenarios:
+When running all scenarios (`yarn bench`), each group runs with its own warmup/measurement count. Use `--size` to run only one group.
## Output
The runner prints a JSON array in `customSmallerIsBetter` format (name, unit, value, range) to stdout. In CI this is written to `react-bench-output.json` and sent to the benchmark action.
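For reference, a single entry in that array has this shape (field meanings from github-action-benchmark's `customSmallerIsBetter` tool; the concrete scenario names and numbers below are invented):

```typescript
// One element of the JSON array the runner prints to stdout.
interface BenchResult {
  name: string;   // scenario plus library, e.g. 'data-client/init-100'
  unit: string;   // 'ms' for durations, 'count' for ref-stability
  value: number;  // the reported (median) measurement
  range?: string; // spread of the samples, e.g. '± 0.8'
}

const output: BenchResult[] = [
  { name: 'data-client/init-100', unit: 'ms', value: 12.4, range: '± 0.8' },
  { name: 'data-client/ref-stability-author-changed', unit: 'count', value: 1 },
];
console.log(JSON.stringify(output));
```

Because the format is "smaller is better," the benchmark action flags any entry whose `value` grows beyond its configured threshold.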