docs(propose): init/increment perf — bulk graph writes + cached ignore #338

Merged: HumanBean17 merged 2 commits into master from plan/init-increment-perf-propose on Jun 22, 2026
HumanBean17 (Owner) commented on Jun 21, 2026

What

Adds propose/active/INIT-INCREMENT-PERF-PROPOSE.md — a proposal-only design for the init/increment performance program. No production code changed.

Why now

Profiling java-codebase-rag init on a medium Java corpus (Shopizer: 1210 files → 1167 indexed, 3879 chunks, ~32k graph edges, 395s total) showed graph writes are ~81% of init, not the vectors stage:

| phase | time | share |
| --- | --- | --- |
| LadybugDB graph write (per-row MERGE/CREATE) | ~321s | ~81% |
| cocoindex vectors | ~68s | ~17% |
| optimize | ~5s | ~1% |
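
The share column follows from the phase timings. A quick check using only the numbers reported above (the variable names are illustrative, not taken from the profiler):

```python
# Phase timings from the profile above, in seconds (all approximate).
phase_seconds = {
    "graph write": 321,
    "vectors": 68,
    "optimize": 5,
}
# Reported init total. The listed phases sum to 394, so about 1s is
# unaccounted for, presumably rounding.
total_seconds = 395

shares = {
    phase: round(100 * seconds / total_seconds)
    for phase, seconds in phase_seconds.items()
}

for phase, pct in shares.items():
    print(f"{phase}: ~{pct}%")
```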

Init/increment latency is the project's stated pain point, so this redirects effort to the real bottleneck.

Highlights

Three PRs under one proposal:

  • PR-1 — Bulk in-memory-pyarrow COPY FROM for the full rebuild path (init/reprocess). Replaces ~21 per-row conn.execute sites (build_ast_graph.py:3096/3250-3398) with staged bulk loads via COPY <table> FROM $param. A micro-benchmark on the real Symbol schema measured ~300× (5.6 ms/row → 0.018 ms/row). Projected ~321s graph-write → tens of seconds; init ~395s → ~120s. Staging invariants spelled out: REL FROM/TO column rule, CALLS dedup + callee_declaring_role materialized at staging, node-before-edge order. Carries a mandatory equivalence harness (old per-row build vs new bulk build → identical counts + meta + full edge property rows + query results).
  • PR-2 — Same primitive extended to the incremental path (preserves the Route-MERGE dedup at build_ast_graph.py:3819-3821). Depends on PR-1.
  • PR-3 — Hoist LayeredIgnore(project_root) to a flow-lifespan cocoindex ContextKey and memoize is_ignored: ~25s → ~0s. Independent of PR-1/2.
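
The PR-1 staging invariants can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration: the column names (`id`, `role`, `from`, `to`, `line`), the CALLS dedup key, and the helper names are hypothetical and do not come from build_ast_graph.py. The sketch only shows the shape of the idea: dedup edges and materialize callee_declaring_role while staging, keep endpoint columns first for REL tables, and order node tables ahead of edge tables.

```python
def dedup_edges(edges, key_fields):
    """Keep the first edge seen for each key, preserving input order."""
    seen = set()
    unique = []
    for edge in edges:
        key = tuple(edge[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(edge)
    return unique


def materialize_role(edges, role_by_symbol):
    """Copy the callee's role onto each edge while staging, not after loading."""
    return [
        {**edge, "callee_declaring_role": role_by_symbol.get(edge["to"], "")}
        for edge in edges
    ]


def to_columns(rows, column_order):
    """Pivot row dicts into one list per column, in a fixed column order."""
    return {name: [row[name] for row in rows] for name in column_order}


def plan_bulk_load(nodes, calls):
    """Return (table_name, columns) steps, node tables ahead of edge tables."""
    role_by_symbol = {node["id"]: node["role"] for node in nodes}
    # Hypothetical dedup key; the real CALLS key may differ.
    calls = dedup_edges(calls, key_fields=("from", "to", "line"))
    calls = materialize_role(calls, role_by_symbol)
    return [
        ("Symbol", to_columns(nodes, ["id", "role"])),
        # Assumed REL rule: endpoint columns come first, then properties.
        ("CALLS", to_columns(calls, ["from", "to", "line", "callee_declaring_role"])),
    ]


nodes = [
    {"id": "OrderService.place", "role": "service"},
    {"id": "OrderRepository.save", "role": "repository"},
]
calls = [
    {"from": "OrderService.place", "to": "OrderRepository.save", "line": 42},
    {"from": "OrderService.place", "to": "OrderRepository.save", "line": 42},  # duplicate
]
steps = plan_bulk_load(nodes, calls)
for table_name, columns in steps:
    print(table_name, columns)
```

In a real build, each column dict would presumably be turned into an in-memory pyarrow table (for example with `pyarrow.Table.from_pydict`) and passed as the parameter of one COPY <table> FROM $param statement per table, replacing one conn.execute call per row.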

No ontology bump (graph contents identical; proven by the harness). All PRs re-index-free — only the write mechanism / a cache change.
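
As a rough picture of what the equivalence harness has to establish, the sketch below compares two builds on order-independent facts: node and edge counts, meta, and the multiset of full edge property rows. The dict-based graph shape and the `ontology` meta key are hypothetical stand-ins for reading both databases back, and the query-result comparison that the proposal also requires is left out.

```python
from collections import Counter


def snapshot(graph):
    """Reduce a graph to order-independent facts: counts, meta, edge rows."""
    edge_rows = Counter(
        (e["type"], e["from"], e["to"], tuple(sorted(e["props"].items())))
        for e in graph["edges"]
    )
    return {
        "node_count": len(graph["nodes"]),
        "edge_count": len(graph["edges"]),
        "meta": dict(graph["meta"]),
        "edge_rows": edge_rows,  # a multiset, so duplicated edges are detected
    }


def diff_builds(old, new):
    """Return the snapshot keys on which the two builds disagree."""
    a, b = snapshot(old), snapshot(new)
    return [key for key in a if a[key] != b[key]]


e1 = {"type": "CALLS", "from": "A.run", "to": "B.save", "props": {"line": 10}}
e2 = {"type": "DECLARES", "from": "A", "to": "A.run", "props": {}}

old = {"nodes": ["A", "A.run", "B.save"], "meta": {"ontology": 1}, "edges": [e1, e2]}
reordered = {**old, "edges": [e2, e1]}       # same content, different write order
duplicated = {**old, "edges": [e1, e2, e2]}  # a dedup rule was lost

print(diff_builds(old, reordered))   # []
print(diff_builds(old, duplicated))  # ['edge_count', 'edge_rows']
```

Comparing multisets rather than sorted lists keeps the check independent of write order while still catching a lost dedup rule, which is the failure a bulk rewrite is most likely to introduce.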

Reviewed

A 5-lens subagent review (39 agents, ~1.3M tokens) empirically validated the load-bearing claims (kuzu 0.11.3 COPY FROM into REL tables works; ~300× speedup; line citations spot-checked). Its main catch: an earlier PR-4 (default embedding device to MPS) was dropped — its premise was false. The flow already auto-selects MPS (SBERT_DEVICE unset → device=None → cuda→mps→cpu), so the profiled init ran on MPS (~16s), not CPU; there was no CPU→MPS win to recover. That rejection is recorded in the proposal's Out of scope.
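
The fallback chain that made PR-4 redundant can be written down as a small sketch. `resolve_device` is a hypothetical helper, not the flow's actual code; in practice the two availability flags would likely come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the override from the SBERT_DEVICE environment variable.

```python
def resolve_device(override, cuda_available, mps_available):
    """Pick an embedding device: explicit override first, then cuda, mps, cpu."""
    if override:
        return override
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# Apple Silicon machine with SBERT_DEVICE unset: no CUDA, MPS present.
print(resolve_device(None, cuda_available=False, mps_available=True))  # mps
```

With no override on such a machine the result is already `mps`, which is why changing the default could not have produced the projected saving.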

Tests

Docs-only; baseline unchanged.

Out of scope

References

HumanBean17 and others added 2 commits June 21, 2026 21:35
… ignore, mps)

Proposal-only. Profiles init at ~395s on a medium Java corpus and sequences
three measured, independent levers as four PRs:
- PR-1: bulk COPY FROM for the full rebuild path (init/reprocess) — the ~81%
  graph-write lever; init projected ~395s -> ~120s.
- PR-2: same primitive extended to the incremental path.
- PR-3: hoist LayeredIgnore to a flow-lifespan ContextKey — ~25s -> ~0s.
- PR-4: default embedding device cuda -> mps -> cpu — ~28s -> ~16s on Apple Silicon.

No ontology bump; PR-1/2/3 re-index-free; PR-4 optional re-index callout.
ANN index (parked, #337) and watch mode (#336) explicitly out of scope.

Co-Authored-By: Claude <noreply@anthropic.com>
…4, align format)

5-lens subagent review of the proposal found:
- PR-4 (MPS device default) was built on a false premise: the flow already
  auto-selects MPS (SBERT_DEVICE unset -> device=None -> cuda->mps->cpu), so
  the profiled init embedded on MPS (~16s), not CPU. Dropped; rationale moved
  to Out of scope.
- PR-1 mechanism corrected to in-memory pyarrow COPY FROM $param (not Parquet
  file); staging invariants made explicit (REL FROM/TO column rule, CALLS dedup
  + callee_declaring_role materialization at staging, node-before-edge order);
  atomicity note added.
- PR-3 broadened: also memoize is_ignored, not just hoist the constructor.
- Citations fixed: full-rebuild node writer is _write_nodes at :3096 (not the
  incremental MERGE path at 824-825); ~21 per-row sites in write fns (not 44);
  _CREATE_SYMBOL/_MERGE_SYMBOL at :3007-3026.

Also aligned the doc to the repo's current propose format (matches
LADYBUG-DB-MIGRATE-PROPOSE): natural-English H1, Scope with In/Out subsections,
no TL;DR, no PR-body-template section, no edit-history narration.

Co-Authored-By: Claude <noreply@anthropic.com>
HumanBean17 changed the title from "docs(propose): init/increment perf program (bulk graph writes, cached ignore, mps)" to "docs(propose): init/increment perf — bulk graph writes + cached ignore" on Jun 21, 2026
HumanBean17 merged commit 0396492 into master on Jun 22, 2026
1 check passed
HumanBean17 added a commit that referenced this pull request Jun 22, 2026
)

* docs(plans): execution plan for init/increment perf (PR-P1..PR-P3)

Adds plans/active/PLAN-INIT-INCREMENT-PERF.md and the companion
plans/AGENT-PROMPTS-INIT-INCREMENT-PERF.md implementing the approved proposal
propose/active/INIT-INCREMENT-PERF-PROPOSE.md.

Three PRs:
- PR-P1: bulk in-memory-pyarrow COPY FROM for the full rebuild path; equivalence
  harness is the merge gate.
- PR-P2: same primitive for the incremental path (Route-MERGE dedup retained).
- PR-P3: lifespan-cached LayeredIgnore (ContextKey) + is_ignored _mega memo.

No production code. Stacks behind proposal PR #338.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs(plans): apply review feedback to init/increment perf plan

5-lens subagent review of the plan found the PR-P1/P2 boundary was
architecturally wrong: the graph write helpers are SHARED between the full
and incremental paths, so a "full-path-only" split is impossible.
- Verified call graph: _write_edges/_write_routes_and_exposes/_write_nodes_impl/
  _write_meta are each called by BOTH paths; _write_clients_producers_and_calls
  is incremental-only (global pass5/6).
- Re-split by write-FUNCTION: PR-P1 = _bulk_copy + _write_edges (the ~250s
  prize, accelerates both paths); PR-P2 = _write_nodes_impl +
  _write_routes_and_exposes + _write_clients_producers_and_calls; PR-P3 = ignore
  cache (independent).
- GraphMeta (_write_meta) left on MERGE (shared, one row) — reverses Open Q1.
- Fixed all binding sentinel greps: PR-P1 zeros the edge _CREATE_* only;
  PR-P2 zeros node/route/client constants + _MERGE_SYMBOL only after both
  routes functions convert; PR-P3 sentinel narrowed to
  LayeredIgnore(project_root).is_ignored (the bare-constructor grep wrongly
  matched once-per-run sites :177/:569, which are correctly left alone).
- Load-order §1f corrected (UnresolvedCallSite before UNRESOLVED_AT;
  Route/Client/Producer before their edges). Test files qualified
  (test_brownfield_routes / test_mcp_v2_compose / test_vectors_progress /
  test_path_filtering). PR-P2 tests placed in TestIncrementalOrchestrator.
  Baseline flagged as equivalence anchor, not production invariant.
  PR-P1 DoD lists the four test names.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
HumanBean17 added a commit that referenced this pull request Jun 22, 2026
)

All of the init/increment-perf work has landed — the original plan
(PR-P1..P3: #340 cached ignore, #341 _write_edges bulk, #342 nodes/routes
bulk) and the post-review follow-ups (PR-P4 #343 dependent refresh +
DECLARES dedup, PR-P5 #344 annotation-scope fix + route bulk + overrides
invariant), plus its proposal (#338). Relocate the plan, agent-prompts,
and proposal from active/ to completed/, matching the Ladybug/INDEX-OUTPUT
close-out convention (pure rename, no content edits).

Co-authored-by: Claude <noreply@anthropic.com>