Commit 94b82d5
newren authored and gitster committed
rename: bump limit defaults yet again

These were last bumped in commit 92c57e5 (bump rename limit defaults
(again), 2011-02-19), and were bumped both because processors had gotten
faster, and because people were getting ugly merges that caused problems
and reporting it to the mailing list (suggesting that folks were willing
to spend more time waiting).

Since that time:
  * Linus has continued recommending kernel folks to set
    diff.renameLimit=0 (maps to 32767, currently)
  * Folks with repositories with lots of renames were happy to set
    merge.renameLimit above 32767, once the code supported that, to
    get correct cherry-picks
  * Processors have gotten faster
  * It has been discovered that the timing methodology used last time
    probably used too large example files.

The last point is probably worth explaining a bit more:

  * The "average" file size used appears to have been average blob size
    in the linux kernel history at the time (probably v2.6.25 or
    something close to it).
  * Since bigger files are modified more frequently, such a computation
    weights towards larger files.
  * Larger files may be more likely to be modified over time, but are
    not more likely to be renamed -- the mean and median blob size
    within a tree are a bit higher than the mean and median of blob
    sizes in the history leading up to that version for the linux
    kernel.
  * The mean blob size in v2.6.25 was half the average blob size in
    history leading to that point
  * The median blob size in v2.6.25 was about 40% of the mean blob size
    in v2.6.25.
  * Since the mean blob size is more than double the median blob size,
    any file as big as the mean will not be compared to any files of
    median size or less (because they'd be more than 50% dissimilar).
  * Since it is the number of files compared that provides the O(n^2)
    behavior, median-sized files should matter more than mean-sized
    ones.

The combined effect of the above is that the file size used in past
calculations was likely about 5x too large. Combine that with a CPU
performance improvement of ~30%, and we can increase the limits by a
factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated time
limits.

Keeping the same approximate time limit probably makes sense for
diff.renameLimit (there is no progress feedback in e.g. git log -p),
but the experience above suggests merge.renameLimit could be extended
significantly. In fact, it probably would make sense to have an
unlimited default setting for merge.renameLimit, but that would
likely need to be coupled with changes to how progress is displayed.
(See https://lore.kernel.org/git/YOx+Ok%[email protected]/
for details in that area.) For now, let's just bump the approximate
time limit from 10s to 1m.

(Note: We do not want to use actual time limits, because getting results
that depend on how loaded your system is that day feels bad, and because
we don't discover that we won't get all the renames until after we've
put in a lot of work rather than just upfront telling the user there are
too many files involved.)

Using the original time limit of 2s for diff.renameLimit, and bumping
merge.renameLimit from 10s to 60s, I found the following timings using
the simple script at the end of this commit message (on an AWS c5.xlarge
which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"):

      N     Timing
     1300    1.995s
     7100   59.973s

So let's round down to nice even numbers and bump the limits from
400->1000, and from 1000->7000.

Here is the measure_rename_perf script (adapted from
https://lore.kernel.org/git/[email protected]/
in particular to avoid triggering the linear handling from
basename-guided rename detection):

    #!/bin/bash

    n=$1; shift

    rm -rf repo
    mkdir repo && cd repo
    git init -q -b main

    mkdata() {
      mkdir $1
      for i in `seq 1 $2`; do
        (sed "s/^/$i /" <../sample
         echo tag: $1
        ) >$1/$i
      done
    }

    mkdata initial $n
    git add .
    git commit -q -m initial

    mkdata new $n
    git add .
    cd new
    for i in *; do git mv $i $i.renamed; done
    cd ..
    git rm -q -rf initial
    git commit -q -m new

    time git diff-tree -M -l0 --summary HEAD^ HEAD

Signed-off-by: Elijah Newren <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>

1 parent 9dd29db commit 94b82d5
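
The scaling argument in the commit message can be checked with a few lines of shell. This is only a sketch of the arithmetic, not anything from the git source: `scale_limit` is a hypothetical helper, and the extra time-budget factor (10s to 60s, so 6x) is an assumption that follows from the O(n^2) cost model the message describes. The commit's final values (1000 and 7000) come from the measured timings, so the model's estimates are expected to differ somewhat.

```shell
#!/bin/sh
# scale_limit OLD_LIMIT SIZE_FACTOR CPU_GAIN TIME_FACTOR  (hypothetical helper)
#
# Assumed model: total time ~ n^2 * per-pair cost. If the per-pair cost was
# overestimated by SIZE_FACTOR, CPUs are CPU_GAIN faster (0.3 = 30%), and the
# time budget grows by TIME_FACTOR, then n can grow by
#   sqrt(SIZE_FACTOR / (1 - CPU_GAIN) * TIME_FACTOR).
scale_limit() {
    awk -v n="$1" -v size="$2" -v cpu="$3" -v t="$4" \
        'BEGIN { printf "%d\n", n * sqrt(size / (1 - cpu) * t) }'
}

# diff.renameLimit: old default 400, same ~2s budget (factor 2.67)
scale_limit 400 5 0.3 1     # prints 1069

# merge.renameLimit: old default 1000, budget raised from 10s to 60s
scale_limit 1000 5 0.3 6    # prints 6546
```

Both estimates land close to the chosen defaults, which were instead derived from the measured N values of 1300 and 7100 and rounded down.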

5 files changed: +5 -5 lines changed

Documentation/config/diff.txt

Lines changed: 1 addition & 1 deletion
@@ -120,7 +120,7 @@ diff.orderFile::
 diff.renameLimit::
 	The number of files to consider in the exhaustive portion of
 	copy/rename detection; equivalent to the 'git diff' option
-	`-l`. If not set, the default value is currently 400. This
+	`-l`. If not set, the default value is currently 1000. This
 	setting has no effect if rename detection is turned off.
 
 diff.renames::

Documentation/config/merge.txt

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ merge.renameLimit::
 	rename detection during a merge. If not specified, defaults
 	to the value of diff.renameLimit. If neither
 	merge.renameLimit nor diff.renameLimit are specified,
-	currently defaults to 1000. This setting has no effect if
+	currently defaults to 7000. This setting has no effect if
 	rename detection is turned off.
 
 merge.renames::
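
The lookup order described in the merge.renameLimit documentation can be sketched from the command line. This is an illustration rather than git's internal logic: `effective_merge_rename_limit` is a hypothetical helper, it relies only on `git config --get` exiting non-zero when a key is unset, and it ignores any special meaning git gives to zero or negative values.

```shell
#!/bin/sh
# Hypothetical helper: report the rename limit a merge would start from,
# following the documented order:
#   merge.renameLimit, else diff.renameLimit, else the new default of 7000.
effective_merge_rename_limit() {
    git config --get merge.renameLimit ||
    git config --get diff.renameLimit ||
    echo 7000
}

effective_merge_rename_limit
```

To opt in to a higher limit for one repository, `git config merge.renameLimit <n>` sets the first key in that chain.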

diff.c

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@
 
 static int diff_detect_rename_default;
 static int diff_indent_heuristic = 1;
-static int diff_rename_limit_default = 400;
+static int diff_rename_limit_default = 1000;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
 static int diff_color_moved_default;

merge-ort.c

Lines changed: 1 addition & 1 deletion
@@ -2558,7 +2558,7 @@ static void detect_regular_renames(struct merge_options *opt,
 	diff_opts.detect_rename = DIFF_DETECT_RENAME;
 	diff_opts.rename_limit = opt->rename_limit;
 	if (opt->rename_limit <= 0)
-		diff_opts.rename_limit = 1000;
+		diff_opts.rename_limit = 7000;
 	diff_opts.rename_score = opt->rename_score;
 	diff_opts.show_rename_progress = opt->show_rename_progress;
 	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;

merge-recursive.c

Lines changed: 1 addition & 1 deletion
@@ -1879,7 +1879,7 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *opt,
 	 */
 	if (opts.detect_rename > DIFF_DETECT_RENAME)
 		opts.detect_rename = DIFF_DETECT_RENAME;
-	opts.rename_limit = (opt->rename_limit >= 0) ? opt->rename_limit : 7000;
+	opts.rename_limit = (opt->rename_limit >= 0) ? opt->rename_limit : 7000;
 	opts.rename_score = opt->rename_score;
 	opts.show_rename_progress = opt->show_rename_progress;
 	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
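
The two merge backends apply the new default under slightly different conditions, which is easy to miss when reading the hunks side by side. The sketch below transliterates just the two conditions shown in the diff into shell; `ort_limit` and `recursive_limit` are hypothetical names, and what each backend later does with a limit of 0 is outside this commit and not modelled here.

```shell
#!/bin/sh
# merge-ort.c:  if (opt->rename_limit <= 0) diff_opts.rename_limit = 7000;
ort_limit() {
    if [ "$1" -le 0 ]; then echo 7000; else echo "$1"; fi
}

# merge-recursive.c:  (opt->rename_limit >= 0) ? opt->rename_limit : 7000
recursive_limit() {
    if [ "$1" -ge 0 ]; then echo "$1"; else echo 7000; fi
}

# A negative (unset) limit falls back to 7000 in both backends ...
ort_limit -1          # prints 7000
recursive_limit -1    # prints 7000

# ... but an explicit 0 is replaced only by the merge-ort condition.
ort_limit 0           # prints 7000
recursive_limit 0     # prints 0
```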
