Description
Summary
git filter-repo --analyze fails with "fatal: unable to read ea5ee0839a0c30442f568e9672dd614a988a39c6" after processing ~1.5M commits in a large mono-repo. The critical issue: this commit doesn't exist in the repository and no Git command can find any reference to it, yet filter-repo consistently attempts to read it.
Error Message
Processed 7971661 blob sizes
Processed 1596704 commitsfatal: unable to read ea5ee0839a0c30442f568e9672dd614a988a39c6
Processed 1596737 commits
Error: rev-list|diff-tree pipeline failed; see above.
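To confirm across repeated runs that the failure always names the same object at the same point, the captured output can be parsed mechanically. The following is a minimal sketch, assuming the output has been saved as text; the function name `parse_analyze_failure` is hypothetical and not part of filter-repo.

```python
import re

# Patterns match the message shapes shown in the error output above.
FATAL_RE = re.compile(r"fatal: unable to read ([0-9a-f]{40})")
PROGRESS_RE = re.compile(r"Processed (\d+) commits")


def parse_analyze_failure(log_text):
    """Return (missing_object_id, last_commit_count_before_error), or None.

    Hypothetical helper: it only inspects saved output text and never
    touches the repository.
    """
    fatal = FATAL_RE.search(log_text)
    if fatal is None:
        return None
    # Only look at progress lines printed before the fatal message.
    counts = PROGRESS_RE.findall(log_text[:fatal.start()])
    last_count = int(counts[-1]) if counts else None
    return fatal.group(1), last_count
```

Comparing the returned tuple between runs gives a quick check that both the object ID and the commit count are stable.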
Key Findings
✅ Reproducible: Error occurs consistently at the same commit count with the same SHA
✅ Object doesn't exist: git cat-file -t ea5ee0839a0c30442f568e9672dd614a988a39c6 fails
✅ Git reports healthy: git fsck --full completes with no errors or warnings
✅ No Git command finds it: Tested extensively with no results:
  - git rev-list --all --objects | grep ea5ee... (empty)
  - git log --all --format="%H %P" | grep ea5ee... (empty)
  - git verify-pack -v .git/objects/pack/*.idx | grep ea5ee... (empty)
  - git reflog --all | grep ea5ee... (empty)
  - grep -r ea5ee... .git/ (no matches)
✅ Other tools work: BFG Repo-Cleaner runs successfully on the same repository
✅ Not a clone issue: Fresh clone, not shallow (git rev-parse --is-shallow-repository = false)
Environment
- Repository: Large mono-repo (~1.5M commits, ~8M blobs)
- Command: git filter-repo --analyze
- Repository state: Fresh clone
Questions for Maintainers
- How does filter-repo find this commit? What mechanism is it using that differs from standard Git commands like git rev-list, git fsck, and git log?
- Could this be a filter-repo internal state issue? Since no Git tool can locate this SHA, could filter-repo be generating or caching it internally?
- Why does BFG succeed? What does BFG do differently in traversing the repository that allows it to handle this scenario?
- Workaround? Are there any flags or options to make filter-repo skip missing objects or be more lenient with repository inconsistencies?
This appears to be an edge case where filter-repo's repository traversal encounters a commit reference that Git's own integrity checking doesn't flag as problematic. Any guidance would be greatly appreciated!
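Since the error text names a rev-list|diff-tree pipeline, one way to narrow this down might be to run a similar pipeline outside filter-repo and note the last commit that git diff-tree reported before it died. The sketch below is an approximation, not filter-repo's actual code: the flag list is an assumption based on recollection of the analyze step and should be checked against the installed version, and the `COMMIT ` marker in `--format` and the names `last_commit_seen` and `run_pipeline` are hypothetical, introduced only for this illustration.

```python
import subprocess

# ASSUMPTION: these flags approximate what `git filter-repo --analyze` runs;
# verify them against the filter-repo version actually installed.
REV_LIST = ["git", "rev-list", "--topo-order", "--reverse", "--all"]
DIFF_TREE = [
    "git", "diff-tree", "--stdin", "--always", "--root",
    "--format=COMMIT %H",  # hypothetical marker so commit headers are easy to spot
    "-M", "-t", "-c", "--raw", "--combined-all-paths",
]


def last_commit_seen(lines):
    """Return the ID from the last 'COMMIT <id>' header line, or None."""
    last = None
    for line in lines:
        if line.startswith("COMMIT "):
            last = line.split()[1]
    return last


def run_pipeline(repo="."):
    """Run rev-list | diff-tree in `repo`.

    Returns (last_commit_header_seen, rev_list_exit, diff_tree_exit).
    """
    rev_list = subprocess.Popen(REV_LIST, cwd=repo, stdout=subprocess.PIPE)
    diff_tree = subprocess.Popen(
        DIFF_TREE, cwd=repo, stdin=rev_list.stdout,
        stdout=subprocess.PIPE, text=True, errors="replace",
    )
    # Close our copy so rev-list sees a broken pipe if diff-tree exits early.
    rev_list.stdout.close()
    last = last_commit_seen(diff_tree.stdout)
    diff_tree_rc = diff_tree.wait()
    rev_list_rc = rev_list.wait()
    return last, rev_list_rc, diff_tree_rc

# Example call (path is a placeholder):
#   last, rl_rc, dt_rc = run_pipeline("/path/to/fresh/clone")
```

If git diff-tree exits non-zero, the commit that triggers the read is probably either the last one reported or the one immediately after it in rev-list order. Running git diff-tree with the same flags on just that commit may then show which path or parent refers to the unreadable object.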