perf(vectors): lifespan-cached LayeredIgnore + is_ignored memo (PR-P3)#340
Merged
- Define IGNORE ContextKey (version-detected) alongside PROJECT_ROOT/EMBEDDER/LANCE_DB
- Provide IGNORE once per flow run in coco_lifespan (LayeredIgnore constructed once)
- Convert process_java_file, process_sql_file, process_yaml_file to use IGNORE ContextKey
- Add _mega_cache to LayeredIgnore, memoizing _mega(rel) by directory
- Add test_is_ignored_mega_caches_by_directory and test_layered_ignore_memo_preserves_decisions
- Add test_layered_ignore_provided_once_per_flow (HEAVY) in test_lancedb_e2e.py

Scope: Only the three process_*_file sites converted. Sites :182 and :578 (_approximate_vectors_total and app_main pre-walk) left untouched as they call cocoindex_excluded_patterns() once per run, not per-file.

Co-Authored-By: Claude <noreply@anthropic.com>
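The "provide once, look up per file" wiring in this commit can be illustrated with a minimal sketch. It deliberately does not use cocoindex: `RunContext`, `FakeIgnore`, `process_file`, and `run_flow` are hypothetical stand-ins invented for illustration, and the ignore rule is a toy. It only shows why hoisting construction out of the per-file handlers leaves a single instance per run.

```python
class FakeIgnore:
    """Hypothetical stand-in for LayeredIgnore that counts how often it is built."""

    instances = 0

    def __init__(self, root: str) -> None:
        FakeIgnore.instances += 1
        self.root = root

    def is_ignored(self, rel: str) -> bool:
        # Toy rule for illustration only: ignore anything under build/.
        return rel.startswith("build/")


class RunContext:
    """Toy per-run registry (not the cocoindex API): provide once, look up by key."""

    def __init__(self) -> None:
        self._values: dict[str, object] = {}

    def provide(self, key: str, value: object) -> None:
        if key in self._values:
            raise KeyError(f"{key} already provided")
        self._values[key] = value

    def use(self, key: str) -> object:
        return self._values[key]


def process_file(ctx: RunContext, rel: str) -> bool:
    # Per-file handler: looks the shared object up instead of constructing it.
    ignore = ctx.use("IGNORE")
    return ignore.is_ignored(rel)


def run_flow(root: str, files: list[str]) -> list[bool]:
    ctx = RunContext()
    ctx.provide("IGNORE", FakeIgnore(root))  # built once per run
    return [process_file(ctx, rel) for rel in files]
```

With this shape, the number of `FakeIgnore` constructions stays at one however many files the run processes, which is the property the commit aims for with `LayeredIgnore`.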
FIX 1: Rewrite test_layered_ignore_provided_once_per_flow
- Replace broken subprocess-based test (patch cannot cross process boundary)
- Use source-structure assertion that counts builder.provide(IGNORE,) calls
- Asserts exactly ONE provide and THREE use_context calls
- Removes infinite recursion bug (original_init reassigned inside patch context)

FIX 2: Change IGNORE ContextKey annotation to raw type
- Change coco.ContextKey["path_filtering.LayeredIgnore"] to coco.ContextKey[LayeredIgnore]
- Apply to all three _ck_params branches (detect_change, tracked, default)
- Matches sibling annotations (PROJECT_ROOT, EMBEDDER use raw types)

VERIFY: HEAVY test passes
- test_layered_ignore_provided_once_per_flow now passes when run
- Source-structure assertions verify wiring invariant
- All sentinel greps pass (3 use_context sites, 0 bare constructor.is_ignored sites)

Co-Authored-By: Claude <noreply@anthropic.com>
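A source-structure assertion of the kind described in FIX 1 might look like the sketch below. It is not the test from the PR: `count_wiring` and the inline `SAMPLE` text are hypothetical, and the real test would presumably read the source of `java_index_flow_lancedb.py` rather than use an inline string.

```python
import re


def count_wiring(source: str) -> tuple[int, int]:
    """Return (provide_count, use_context_count) for the IGNORE key."""
    provides = len(re.findall(r"builder\.provide\(\s*IGNORE\s*,", source))
    uses = len(re.findall(r"coco\.use_context\(\s*IGNORE\s*\)", source))
    return provides, uses


# Inline sample standing in for the real module text (hypothetical content).
SAMPLE = '''
def coco_lifespan(builder):
    builder.provide(IGNORE, LayeredIgnore(root))

def process_java_file(f):
    ignore = coco.use_context(IGNORE)

def process_sql_file(f):
    ignore = coco.use_context(IGNORE)

def process_yaml_file(f):
    ignore = coco.use_context(IGNORE)
'''

# Wiring invariant: exactly one provide and three use_context sites.
assert count_wiring(SAMPLE) == (1, 3)
```

Counting call sites in the source avoids patching across a process boundary, at the cost of checking structure rather than runtime behaviour.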
This was referenced Jun 22, 2026
HumanBean17 added a commit that referenced this pull request on Jun 22, 2026
) All of the init/increment-perf work has landed — the original plan (PR-P1..P3: #340 cached ignore, #341 _write_edges bulk, #342 nodes/routes bulk) and the post-review follow-ups (PR-P4 #343 dependent refresh + DECLARES dedup, PR-P5 #344 annotation-scope fix + route bulk + overrides invariant), plus its proposal (#338). Relocate the plan, agent-prompts, and proposal from active/ to completed/, matching the Ladybug/INDEX-OUTPUT close-out convention (pure rename, no content edits). Co-authored-by: Claude <noreply@anthropic.com>
Merged
HumanBean17 added a commit that referenced this pull request on Jun 22, 2026
Performance release: faster init / reprocess / increment with no graph, schema, or CLI changes. Bulk COPY FROM graph writes (#341-#342) and the lifespan-cached LayeredIgnore (#340) take init from ~395s toward ~140s on the profiled medium corpus; the graph-write phase drops ~316s -> ~0.4s. No re-index required -- the bulk path is byte-equivalent to 0.6.4 (verified node-for-node and edge-for-edge, all properties + GraphMeta counters). Ontology version unchanged (17). Co-authored-by: Claude <noreply@anthropic.com>
Scope
This PR implements PR-P3 from `plans/active/PLAN-INIT-INCREMENT-PERF.md`. It hoists `LayeredIgnore` to a cocoindex `ContextKey` (constructed once per flow run) and memoizes `is_ignored`'s `_mega` computation by directory.

Independent of PR-P1 and PR-P2: it touches different files.

Changes
`java_index_flow_lancedb.py`:
- `IGNORE = coco.ContextKey[LayeredIgnore]` alongside `PROJECT_ROOT`/`EMBEDDER`/`LANCE_DB`, using the same `_ck_params` version-detection pattern
- `builder.provide(IGNORE, LayeredIgnore(root))` in `coco_lifespan`, built once per flow run
- `process_java_file`, `process_sql_file`, `process_yaml_file` converted to use `ignore = coco.use_context(IGNORE)` instead of `LayeredIgnore(project_root).is_ignored(...)`
- `:182` (in `_approximate_vectors_total`) and `:578` (in `app_main` pre-walk) left untouched: they call `cocoindex_excluded_patterns()` once per run, not per-file

`path_filtering.py`:
- `self._mega_cache: dict[str, tuple[...]]` in `LayeredIgnore.__init__`
- `_mega(rel)` keyed by `Path(rel_project).parent.as_posix()`: files in the same directory share the same `_mega` computation

`tests/test_path_filtering.py`:
- `test_is_ignored_mega_caches_by_directory`: asserts `_mega` computed once per directory
- `test_layered_ignore_memo_preserves_decisions`: asserts cached decisions match uncached for nested ignore + gitignore negations

`tests/test_lancedb_e2e.py`:
- `test_layered_ignore_provided_once_per_flow` (HEAVY): asserts single `LayeredIgnore` instance per flow run

Manual Evidence
Single `id(ignore)` per flow

Sentinel Checks
All sentinel greps from the PR prompt pass:
- `grep -nE "LayeredIgnore\(project_root\)\.is_ignored" java_index_flow_lancedb.py` → empty (3 process sites converted)
- `grep -n "coco.use_context(IGNORE)" java_index_flow_lancedb.py` → 3 sites (:357, :430, :479)
- `grep -n "_mega_cache" path_filtering.py` → 4 hits (cache present)
- `:182` and `:578` unchanged (use bare constructor for once-per-run calls)

Test Results
HEAVY test: The `test_layered_ignore_provided_once_per_flow` test is gated behind `JAVA_CODEBASE_RAG_RUN_HEAVY=1` and requires cocoindex e2e to run locally. Not executed in this environment, but the test is present for CI/validation.

Plan Reference
Implements § PR-P3 from `plans/active/PLAN-INIT-INCREMENT-PERF.md` and `plans/AGENT-PROMPTS-INIT-INCREMENT-PERF.md`.

Co-Authored-By: Claude <noreply@anthropic.com>
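For readers unfamiliar with the directory-keyed memo this PR adds to `path_filtering.py`, here is a minimal sketch of the idea. It is not the code from the PR: `MemoizedIgnore`, its `ignored_dirs` rule set, and the `mega_calls` counter are hypothetical, the cached value is simplified to a boolean (the PR caches a tuple whose contents are not shown), and `PurePosixPath` is used in place of `Path` so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath


class MemoizedIgnore:
    """Hypothetical stand-in for LayeredIgnore with a directory-keyed memo."""

    def __init__(self, ignored_dirs: set[str]) -> None:
        self._ignored_dirs = ignored_dirs
        self._mega_cache: dict[str, bool] = {}
        self.mega_calls = 0  # instrumentation for the example only

    def _mega(self, directory: str) -> bool:
        # Placeholder for the expensive per-directory rule evaluation:
        # a directory is ignored if it or any ancestor is in the rule set.
        self.mega_calls += 1
        parts = PurePosixPath(directory).parts
        return any(
            "/".join(parts[: i + 1]) in self._ignored_dirs
            for i in range(len(parts))
        )

    def is_ignored(self, rel: str) -> bool:
        # Key by parent directory so sibling files share one computation.
        key = PurePosixPath(rel).parent.as_posix()
        if key not in self._mega_cache:
            self._mega_cache[key] = self._mega(key)
        return self._mega_cache[key]
```

Two files in the same directory trigger `_mega` once; a file in a different directory triggers it again. That is the behaviour `test_is_ignored_mega_caches_by_directory` is described as asserting.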