fix(cli): erase removes graph/cocoindex.db/.graph_hashes.json by type (#346) #348
Merged
…#346)

`erase` reported success but left code_graph.lbug on disk because its deletion was type-blind: shutil.rmtree silently no-ops on a regular file (code_graph.lbug) and Path.unlink raises IsADirectoryError on a directory (cocoindex.db), both swallowed; .graph_hashes.json was never targeted. The next init then refused (exit 2), deadlooping the documented `erase --yes` -> `init` clean-slate workflow.

Replace the type-blind deletes with a _rm_any helper that dispatches on type (file/dir/symlink; a symlinked dir is unlinked, never recursed into, so the target is not followed), so both the file-backed and dir-backed LadybugDB layouts are handled. erase now also removes .graph_hashes.json and lists it in the "Will delete:" preview. Deletion failures are warned to stderr instead of swallowed, so erase no longer reports success while leaving an artifact behind (the same silent-failure class as #346).

`reprocess` is unaffected: its full rebuild opens the existing .lbug and _drop_all()s every node + edge table in place, and _init_hash_tracker resets .graph_hashes.json, so it never relies on the broken deletion.

Tests: add an always-on regression that creates a real lbug-file / cocoindex.db-dir / hash-store layout and asserts erase removes all three; convert the false-green test_init_after_erase_succeeds into a real build -> erase -> re-init lifecycle check.

Co-Authored-By: Claude <noreply@anthropic.com>
8d0d9be to
679d22c
Self-review before requesting review
I ran a high-effort code review on this diff (3 finder angles + verification) and addressed the actionable findings:
Addressed
Considered and intentionally left out (out of scope for #346)
Validation
This was referenced Jun 24, 2026
Open
Summary
Fixes #346.
`java-codebase-rag erase` reported `success: true` but did not delete the LadybugDB graph (`code_graph.lbug`), because the deletion was type-blind:

| Artifact | Deletion call | Result |
| --- | --- | --- |
| `code_graph.lbug` | `shutil.rmtree(path, ignore_errors=True)` | `rmtree` on a file raises → swallowed → no-op |
| `cocoindex.db` | `Path.unlink()` | `IsADirectoryError` → swallowed by `except OSError` → no-op |
| `.graph_hashes.json` | (none) | never targeted |

The surviving `code_graph.lbug` then made the next `init` refuse (exit 2, pointing back at `erase --yes`), which deadlooped the documented clean-slate workflow (`erase --yes` → `init`).

Fix
Add a `_rm_any(path)` helper that dispatches on type (file / dir / symlink), so both the file-backed and directory-backed LadybugDB layouts are handled. `erase` now removes `code_graph.lbug`, `cocoindex.db`, and `.graph_hashes.json`, and lists the hash store in the `Will delete:` preview.

Does `reprocess` have the same problem? No

Investigated explicitly per the task.
`reprocess` (default path → `run_refresh_pipeline`) rebuilds in place and never relies on the broken deletion:

- The full rebuild opens the existing `.lbug` and calls `_drop_all()` on every node + edge/REL table (`Symbol`, `Route`, `Client`, `Producer`, `GraphMeta`, `CALLS`, `HTTP_CALLS`, `ASYNC_CALLS`, `EXTENDS`, `IMPLEMENTS`, `DECLARES`, …), then recreates the schema and rewrites fresh.
- `_init_hash_tracker` deliberately resets `.graph_hashes.json` to mirror exactly the indexed files (no stale hashes).
- `--full-reprocess` rebuilds Lance + `cocoindex.db`.

A repo-wide grep confirms `cli.py:625,628` were the only type-blind deletes of `ladybug_path` / `cocoindex_db`, so `init` / `increment` / `reprocess` are all unaffected. No change to `reprocess`.

Tests
- `test_erase_removes_graph_file_cocoindex_dir_and_hash_store`: creates a real on-disk layout (`code_graph.lbug` file, `cocoindex.db/` dir, `.graph_hashes.json` file), runs `erase --yes`, and asserts all three are gone. No embedding-model dependency → runs on every CI job. Watched this fail (`erase left code_graph.lbug on disk`) before the fix and pass after (TDD red→green).
- `test_init_after_erase_succeeds` previously erased an empty index dir and then inited (so it never erased a real graph). Converted to a real `init → erase → re-init` lifecycle: asserts `code_graph.lbug` is gone after erase and that the second `init` succeeds (rc 0).

Manual evidence (issue reproduction)
Validation
- `.venv/bin/ruff check .` → clean
- `.venv/bin/python -m pytest tests -q` (serial) → 851 passed, 14 skipped, 0 errors

🤖 Generated with Claude Code
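
The always-on regression described under Tests could be sketched roughly as follows. This is an illustration under stated assumptions, not the test from this PR: `erase_index` and `make_layout` are hypothetical stand-ins (the real test drives the `erase --yes` CLI), and the file contents are placeholders. Only the three artifact names and the test name come from the description above.

```python
import shutil
import tempfile
from pathlib import Path

ARTIFACTS = ("code_graph.lbug", "cocoindex.db", ".graph_hashes.json")


def erase_index(index_dir: Path) -> list[Path]:
    """Hypothetical stand-in for `erase --yes`; returns artifacts left behind."""
    for name in ARTIFACTS:
        path = index_dir / name
        if path.is_symlink() or path.is_file():
            path.unlink()
        elif path.is_dir():
            shutil.rmtree(path)
    return [index_dir / n for n in ARTIFACTS if (index_dir / n).exists()]


def make_layout(index_dir: Path) -> None:
    """Create the mixed layout from the issue: file, directory, file."""
    (index_dir / "code_graph.lbug").write_bytes(b"graph")  # file-backed graph
    coco = index_dir / "cocoindex.db"
    coco.mkdir()  # directory-backed store
    (coco / "data.bin").write_bytes(b"rows")
    (index_dir / ".graph_hashes.json").write_text("{}")


def test_erase_removes_graph_file_cocoindex_dir_and_hash_store() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        index_dir = Path(tmp)
        make_layout(index_dir)
        leftovers = erase_index(index_dir)
        assert leftovers == []
        for name in ARTIFACTS:
            assert not (index_dir / name).exists()
```

Because the layout is built from plain files and directories, a test of this shape needs no embedding model and can run on every CI job.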