Commit 242e520

Help minimize export dumps differences when inspecting --dry-run results
When diffing the fast-export dumps produced by `--dry-run`, it appears that git-filter-repo is using Python's default sorting algorithm for Git filenames, rather than the fast-export-specific algorithm, which is implemented at:
https://github.com/git/git/blob/14de3eb34435db79c6e7edc8082c302a26a8330a/builtin/fast-export.c#L444-L448

```
$ ./git-filter-repo --dry-run --proceed
$ diff -u .git/filter-repo/fast-export.original .git/filter-repo/fast-export.filtered
...
@@ -1451,25 +1329,23 @@
 D testcases/expected/case1-twenty
 D testcases/inputs/case1
 M 100755 0a13abf testcases/t9390-repo-filter.sh
+M 100644 de3799f testcases/t9390/case1
 M 100644 e0c8845 testcases/t9390/case1-filename
 M 100644 a1aa78f testcases/t9390/case1-ten
 M 100644 488cbd9 testcases/t9390/case1-twenty
-M 100644 de3799f testcases/t9390/case1
```

Note: this has no consequences on the resulting Git repository. git-filter-repo doesn't write tree objects directly; it has git-fast-import do that. git-fast-import states that the order of the filemodify directives given to it does not matter.

This change only avoids distracting differences when inspecting the modified fast-export stream.

Signed-off-by: Sylvain Beucler <[email protected]>
1 parent 2d39146 commit 242e520
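
The ordering difference described in the commit message can be reproduced in isolation. The following is a minimal sketch, not code from git-filter-repo: the helper name `fast_export_cmp` is hypothetical, the paths are taken from the diff excerpt above, and filenames are assumed to be `bytes`.

```python
import functools

def fast_export_cmp(fn_a, fn_b):
    """Hypothetical helper: compare the common-length prefix first,
    then put the longer name first when one is a prefix of the other."""
    min_len = min(len(fn_a), len(fn_b))
    head_a, head_b = fn_a[:min_len], fn_b[:min_len]
    if head_a != head_b:
        # memcmp-style result: -1, 0 or 1
        return (head_a > head_b) - (head_a < head_b)
    # Same prefix: the longer name sorts first
    return len(fn_b) - len(fn_a)

# Paths taken from the diff excerpt in the commit message
paths = [
    b"testcases/t9390/case1-ten",
    b"testcases/t9390/case1",
    b"testcases/t9390/case1-twenty",
    b"testcases/t9390/case1-filename",
]

# Python's default ordering puts the shorter prefix 'case1' first
default_order = sorted(paths)
# The depth-first ordering puts 'case1' after the names it prefixes
export_order = sorted(paths, key=functools.cmp_to_key(fast_export_cmp))

for p in export_order:
    print(p.decode())
```

With the default ordering, `testcases/t9390/case1` comes first, as on the `+` line of the excerpt; with the comparator it comes last, as on the `-` line.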


git-filter-repo

Lines changed: 24 additions & 1 deletion
@@ -33,6 +33,7 @@ operations; however:
 import argparse
 import collections
 import fnmatch
+import functools
 import gettext
 import io
 import os
@@ -3940,7 +3941,29 @@ class RepoFilter(object):
         continue
       # Otherwise, record the change
       new_file_changes[change.filename] = change
-    commit.file_changes = [v for k,v in sorted(new_file_changes.items())]
+
+    # Use the git fast-export sorting algorithm for filenames
+    # https://github.com/git/git/blob/14de3eb34435db79c6e7edc8082c302a26a8330a/builtin/fast-export.c#L444-L448
+    def depth_first(a, b):
+      fn_a = a[0]
+      fn_b = b[0]
+
+      # Sort 'd/e' before 'd'
+      # first compare common length, then if equal give priority to longer one
+      min_len = min(len(fn_a), len(fn_b))
+      # memcmp equivalent https://docs.python.org/3.0/whatsnew/3.0.html#ordering-comparisons
+      cmp = (fn_a[:min_len] > fn_b[:min_len]) - (fn_a[:min_len] < fn_b[:min_len])
+      if cmp != 0: # different content
+        return cmp # return normal comparison
+      cmp = len(fn_b) - len(fn_a)
+      if cmp != 0: # different size
+        return cmp # longer one first
+
+      # 'R' (rename) entries last
+      cmp = (a[1].type == 'R') - (b[1].type == 'R')
+      return cmp
+    commit.file_changes = [v for k,v in sorted(new_file_changes.items(),
+                           key=functools.cmp_to_key(depth_first))]

   def _tweak_commit(self, commit, aux_info):
     if self._args.replace_message:
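
As a design note, the filename part of the comparator in this hunk could also be written as a plain sort key, which avoids `functools.cmp_to_key`. This is not what the commit does; it is only a sketch under the assumption that filenames are `bytes`, and the names `depth_first_cmp` and `depth_first_key` are hypothetical. The rename tie-break is left out because dictionary keys are unique, so two entries never share a filename.

```python
import functools

def depth_first_cmp(fn_a, fn_b):
    """Filename comparison as in the hunk above (rename tie-break omitted)."""
    min_len = min(len(fn_a), len(fn_b))
    cmp = (fn_a[:min_len] > fn_b[:min_len]) - (fn_a[:min_len] < fn_b[:min_len])
    if cmp != 0:
        return cmp
    return len(fn_b) - len(fn_a)

def depth_first_key(filename):
    """Hypothetical key: iterating bytes yields ints in 0..255, so a
    trailing 256 sorts after any real byte. A name that is a prefix of
    another therefore sorts after the longer name."""
    return tuple(filename) + (256,)

names = [b"d", b"d/e", b"d/e/f", b"a", b"d.txt", b"e"]

by_cmp = sorted(names, key=functools.cmp_to_key(depth_first_cmp))
by_key = sorted(names, key=depth_first_key)

print(by_cmp == by_key)
```

Both orderings place `d/e/f` before `d/e` and `d/e` before `d`, matching the "Sort 'd/e' before 'd'" comment in the diff.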
