Commit 34d9819

peff authored and gitster committed
diff-highlight: match multi-line hunks
Currently we only bother highlighting single-line hunks. The
rationale was that the purpose of highlighting is to point out small
changes between two similar lines that are otherwise hard to see.
However, that meant we missed similar cases where two lines were
changed together, like:

   -foo(buf);
   -bar(buf);
   +foo(obj->buf);
   +bar(obj->buf);

Each of those changes is simple, and would benefit from highlighting
(the "obj->" parts in this case).

This patch considers whole hunks at a time. For now, we consider only
the case where the hunk has the same number of removed and added
lines, and assume that the lines from each segment correspond
one-to-one. While this is just a heuristic, in practice it seems to
generate sensible results (especially because we now omit
highlighting on completely-changed lines, so when our heuristic is
wrong, we tend to avoid highlighting at all).

Based on an original idea and implementation by Michał Kiedrowicz.

Signed-off-by: Jeff King <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
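
The one-to-one assumption in the message above can be illustrated with a small standalone sketch. It is not taken from the patch: `pair_by_position` is a hypothetical helper name, and it returns pairs rather than printing highlighted output.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of diff-highlight): pair removed and added
# lines by position. Returns an empty list when either side is empty or
# when the two sides have different lengths.
sub pair_by_position {
    my ($removed, $added) = @_;
    return () if !@$removed || @$removed != @$added;
    return map { [ $removed->[$_], $added->[$_] ] } 0 .. $#$removed;
}

# The example from the commit message: two lines changed together.
my @pairs = pair_by_position(
    [ "-foo(buf);", "-bar(buf);" ],
    [ "+foo(obj->buf);", "+bar(obj->buf);" ],
);
printf "%s  <=>  %s\n", $_->[0], $_->[1] for @pairs;
```

Here the first removed line is matched with the first added line and the second with the second. A hunk with unequal counts yields no pairs, which corresponds to the case the patch leaves unhighlighted.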
1 parent 6463fd7 commit 34d9819

2 files changed: +52 -34 lines changed
contrib/diff-highlight/README

Lines changed: 9 additions & 7 deletions
@@ -14,13 +14,15 @@ Instead, this script post-processes the line-oriented diff, finds pairs
 of lines, and highlights the differing segments. It's currently very
 simple and stupid about doing these tasks. In particular:
 
-  1. It will only highlight a pair of lines if they are the only two
-     lines in a hunk. It could instead try to match up "before" and
-     "after" lines for a given hunk into pairs of similar lines.
-     However, this may end up visually distracting, as the paired
-     lines would have other highlighted lines in between them. And in
-     practice, the lines which most need attention called to their
-     small, hard-to-see changes are touching only a single line.
+  1. It will only highlight hunks in which the number of removed and
+     added lines is the same, and it will pair lines within the hunk by
+     position (so the first removed line is compared to the first added
+     line, and so forth). This is simple and tends to work well in
+     practice. More complex changes don't highlight well, so we tend to
+     exclude them due to the "same number of removed and added lines"
+     restriction. Or even if we do try to highlight them, they end up
+     not highlighting because of our "don't highlight if the whole line
+     would be highlighted" rule.
 
   2. It will find the common prefix and suffix of two lines, and
      consider everything in the middle to be "different". It could
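
The prefix/suffix rule in item 2 and the "don't highlight if the whole line would be highlighted" rule in item 1 can be sketched on plain strings. This is a simplified illustration, not the script's `highlight_pair`: `split_common` and `worth_highlighting` are hypothetical names, the inputs are assumed to have no leading `-`/`+` marker and no color escapes, and "boring" is assumed to mean whitespace only.

```perl
use strict;
use warnings;

# Hypothetical simplification: split two strings into
# (common prefix, old middle, new middle, common suffix).
sub split_common {
    my ($old, $new) = @_;
    my $max = length($old) < length($new) ? length($old) : length($new);

    # Longest common prefix.
    my $p = 0;
    $p++ while $p < $max
        && substr($old, $p, 1) eq substr($new, $p, 1);

    # Longest common suffix that does not overlap the prefix.
    my $s = 0;
    $s++ while $s < $max - $p
        && substr($old, -1 - $s, 1) eq substr($new, -1 - $s, 1);

    return (
        substr($old, 0, $p),
        substr($old, $p, length($old) - $p - $s),
        substr($new, $p, length($new) - $p - $s),
        substr($old, length($old) - $s),
    );
}

# Assumed reading of the whole-line rule: a pair is only worth highlighting
# if something other than whitespace is shared between the two lines.
sub worth_highlighting {
    my ($prefix, $suffix) = @_;
    return ($prefix . $suffix) =~ /\S/ ? 1 : 0;
}

my ($pre, $old_mid, $new_mid, $suf) = split_common("foo(buf);", "foo(obj->buf);");
printf "prefix=[%s] old=[%s] new=[%s] suffix=[%s] highlight=%d\n",
    $pre, $old_mid, $new_mid, $suf, worth_highlighting($pre, $suf);
```

For the pair from the commit message, the old middle is empty and the new middle is `obj->`, so only that segment would be highlighted. Two lines with nothing in common produce an empty prefix and suffix, and `worth_highlighting` returns 0.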

contrib/diff-highlight/diff-highlight

Lines changed: 43 additions & 27 deletions
@@ -10,23 +10,28 @@ my $UNHIGHLIGHT = "\x1b[27m";
 my $COLOR = qr/\x1b\[[0-9;]*m/;
 my $BORING = qr/$COLOR|\s/;
 
-my @window;
+my @removed;
+my @added;
+my $in_hunk;
 
 while (<>) {
-    # We highlight only single-line changes, so we need
-    # a 4-line window to make a decision on whether
-    # to highlight.
-    push @window, $_;
-    next if @window < 4;
-    if ($window[0] =~ /^$COLOR*(\@| )/ &&
-        $window[1] =~ /^$COLOR*-/ &&
-        $window[2] =~ /^$COLOR*\+/ &&
-        $window[3] !~ /^$COLOR*\+/) {
-        print shift @window;
-        show_hunk(shift @window, shift @window);
+    if (!$in_hunk) {
+        print;
+        $in_hunk = /^$COLOR*\@/;
+    }
+    elsif (/^$COLOR*-/) {
+        push @removed, $_;
+    }
+    elsif (/^$COLOR*\+/) {
+        push @added, $_;
     }
     else {
-        print shift @window;
+        show_hunk(\@removed, \@added);
+        @removed = ();
+        @added = ();
+
+        print;
+        $in_hunk = /^$COLOR*[\@ ]/;
     }
 
     # Most of the time there is enough output to keep things streaming,
@@ -42,26 +47,37 @@ while (<>) {
     }
 }
 
-# Special case a single-line hunk at the end of file.
-if (@window == 3 &&
-    $window[0] =~ /^$COLOR*(\@| )/ &&
-    $window[1] =~ /^$COLOR*-/ &&
-    $window[2] =~ /^$COLOR*\+/) {
-    print shift @window;
-    show_hunk(shift @window, shift @window);
-}
-
-# And then flush any remaining lines.
-while (@window) {
-    print shift @window;
-}
+# Flush any queued hunk (this can happen when there is no trailing context in
+# the final diff of the input).
+show_hunk(\@removed, \@added);
 
 exit 0;
 
 sub show_hunk {
     my ($a, $b) = @_;
 
-    print highlight_pair($a, $b);
+    # If one side is empty, then there is nothing to compare or highlight.
+    if (!@$a || !@$b) {
+        print @$a, @$b;
+        return;
+    }
+
+    # If we have mismatched numbers of lines on each side, we could try to
+    # be clever and match up similar lines. But for now we are simple and
+    # stupid, and only handle multi-line hunks that remove and add the same
+    # number of lines.
+    if (@$a != @$b) {
+        print @$a, @$b;
+        return;
+    }
+
+    my @queue;
+    for (my $i = 0; $i < @$a; $i++) {
+        my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]);
+        print $rm;
+        push @queue, $add;
+    }
+    print @queue;
 }
 
 sub highlight_pair {
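
The new main loop is a small state machine: lines outside a hunk pass straight through, and inside a hunk the `-` and `+` lines are queued until a context line or a new hunk header ends the run. The sketch below mirrors that control flow but, as an assumption made for illustration, collects the runs instead of printing them. `collect_runs` is a hypothetical name; only `$COLOR` and the regexes are taken from the diff.

```perl
use strict;
use warnings;

my $COLOR = qr/\x1b\[[0-9;]*m/;

# Hypothetical variant of the patched loop: rather than printing, return
# each run of removed/added lines as a [\@removed, \@added] pair.
sub collect_runs {
    my @lines = @_;
    my (@removed, @added, @runs);
    my $in_hunk = 0;

    for (@lines) {
        if (!$in_hunk) {
            # File headers such as "--- a/file" land here and are not queued.
            $in_hunk = /^$COLOR*\@/ ? 1 : 0;
        }
        elsif (/^$COLOR*-/) {
            push @removed, $_;
        }
        elsif (/^$COLOR*\+/) {
            push @added, $_;
        }
        else {
            # A context line or hunk header ends the current run.
            push @runs, [ [@removed], [@added] ] if @removed || @added;
            @removed = ();
            @added = ();
            $in_hunk = /^$COLOR*[\@ ]/ ? 1 : 0;
        }
    }

    # Flush a run that reaches end of input without trailing context.
    push @runs, [ [@removed], [@added] ] if @removed || @added;
    return @runs;
}

my @runs = collect_runs(
    "--- a/file\n",
    "+++ b/file\n",
    "\@\@ -1,3 +1,3 \@\@\n",
    " context\n",
    "-foo(buf);\n",
    "-bar(buf);\n",
    "+foo(obj->buf);\n",
    "+bar(obj->buf);\n",
);
printf "%d removed, %d added\n", scalar @{ $_->[0] }, scalar @{ $_->[1] } for @runs;
```

The `$in_hunk` flag is what keeps the `---` and `+++` file header lines from being mistaken for removed or added content, and the final flush corresponds to the trailing `show_hunk` call in the patch.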
