filter-repo: accelerate is_ancestor() for --analyze mode
The --analyze mode was extremely slow for the freebsd/freebsd repo on
GitHub; digging in, the is_ancestor() function was being called a huge
number of times -- about 22 times per commit on average (and about 17
million times overall). The analyze mode uses is_ancestor() to
determine whether a rename equivalency class should be broken (i.e.
renaming A->B means all versions of A and B are just different versions
of the same file, but if someone adds a new A in a commit that has the
A->B rename in its history, then this equivalence class no longer
holds). Each is_ancestor() call potentially has to walk a tree
of dependencies all the way back to a sufficient depth where it can
realize that the commit cannot be an ancestor; this can be a very long
walk.
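
To illustrate why a single call can be expensive, here is a minimal
sketch of an uncached ancestry walk. It is not the filter-repo code: the
function name, the graph layout (commit -> (depth, parents), with depth
being one more than the deepest parent), and the choice to treat a
commit as its own ancestor are all assumptions made for the example.

```python
def is_ancestor_walk(graph, possible_ancestor, commit):
    """Return True if possible_ancestor is reachable from commit.

    graph is a hypothetical dict: commit -> (depth, list_of_parents).
    An ancestor always has a strictly smaller depth than its
    descendants, so the walk can stop descending once it reaches
    commits that are no deeper than possible_ancestor.
    """
    target_depth = graph[possible_ancestor][0]
    stack = [commit]
    visited = set()
    while stack:
        current = stack.pop()
        if current == possible_ancestor:
            return True
        if current in visited:
            continue
        visited.add(current)
        depth, parents = graph[current]
        if depth <= target_depth:
            # Too shallow to have possible_ancestor in its history.
            continue
        stack.extend(parents)
    return False
```

Even with the depth cutoff, every call starts from scratch, so a
negative answer for a recent commit and an old candidate may visit most
of the history in between.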
We can speed this up by keeping track of some previous is_ancestor()
results. If commit F is not an ancestor of commit G, then F cannot be
an ancestor of a child of G (unless that child has multiple parents;
but even in that case F can only be an ancestor through one of the
parents other than G). Similarly, if F is an ancestor of commit G, then
F will always be an ancestor of any children of G. Cache results from
previous calls to is_ancestor() and use them to accelerate subsequent
calls.
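
The caching idea can be sketched roughly as follows. This is an
illustration under stated assumptions, not the actual change: the class
name, the cache keyed on (possible_ancestor, commit) pairs, and the
"expanded" counter (present only to make the saving observable) are
hypothetical.

```python
class AncestryGraph:
    """Hypothetical commit graph with memoized ancestry queries."""

    def __init__(self):
        self.graph = {}    # commit -> (depth, parents)
        self.cache = {}    # (possible_ancestor, commit) -> bool
        self.expanded = 0  # commits whose parents were looked up

    def add_commit(self, commit, parents=()):
        # Depth is one more than the deepest parent; roots have depth 1.
        depth = 1 + max((self.graph[p][0] for p in parents), default=0)
        self.graph[commit] = (depth, list(parents))

    def is_ancestor(self, possible_ancestor, commit):
        key = (possible_ancestor, commit)
        target_depth = self.graph[possible_ancestor][0]
        stack, visited = [commit], set()
        while stack:
            current = stack.pop()
            cached = self.cache.get((possible_ancestor, current))
            if cached is False:
                # Known non-descendant: nothing to find via this parent,
                # but other parents on the stack are still checked.
                continue
            if cached is True or current == possible_ancestor:
                self.cache[key] = True
                return True
            if current in visited:
                continue
            visited.add(current)
            self.expanded += 1
            depth, parents = self.graph[current]
            if depth <= target_depth:
                continue
            stack.extend(parents)
        self.cache[key] = False
        return False
```

With this layout, a query about a child of G can usually be answered
after expanding only the child itself, because the result for G is
already cached; a merge commit still has its other parents examined.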
Signed-off-by: Elijah Newren <[email protected]>