Skip to content

Commit ca1f4ae

Browse files
stefanbellergitster
authored andcommitted
diff.c: add white space mode to move detection that allows indent changes
The option of --color-moved has proven to be useful as observed on the mailing list. However when refactoring sometimes the indentation changes, for example when partitioning a functions into smaller helper functions the code usually mostly moved around except for a decrease in indentation. To just review the moved code ignoring the change in indentation, a mode to ignore spaces in the move detection as implemented in a previous patch would be enough. However the whole move coloring as motivated in commit 2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought up the notion of the reviewer being able to trust the move of a "block". As there are languages such as python, which depend on proper relative indentation for the control flow of the program, ignoring any white space change in a block would not uphold the promises of 2e2d5ac that allows reviewers to pay less attention to the inside of a block, as inside the reviewer wants to assume the same program flow. This new mode of white space ignorance will take this into account and will only allow the same white space changes per line in each block. This patch even allows only for the same change at the beginning of the lines. As this is a white space mode, it is made exclusive to other white space modes in the move detection. This patch brings some challenges, related to the detection of blocks. We need a wide net to catch the possible moved lines, but then need to narrow down to check if the blocks are still intact. Consider this example (ignoring block sizes): - A - B - C + A + B + C At the beginning of a block when checking if there is a counterpart for A, we have to ignore all space changes. However at the following lines we have to check if the indent change stayed the same. Checking if the indentation change did stay the same, is done by computing the indentation change by the difference in line length, and then assume the change is only in the beginning of the longer line, the common tail is the same. That is why the test contains lines like: - <TAB> A ... + A <TAB> ... As the first line starting a block is caught using a compare function that ignores white spaces unlike the rest of the block, where the white space delta is taken into account for the comparison, we also have to think about the following situation: - A - B - A - B + A + B + A + B When checking if the first A (both in the + and - lines) is a start of a block, we have to check all 'A' and record all the white space deltas such that we can find the example above to be just one block that is indented. Signed-off-by: Stefan Beller <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent e2fe6ab commit ca1f4ae

File tree

4 files changed

+252
-2
lines changed

4 files changed

+252
-2
lines changed

Documentation/diff-options.txt

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -307,6 +307,11 @@ ignore-space-change::
307307
ignore-all-space::
308308
Ignore whitespace when comparing lines. This ignores differences
309309
even if one line has whitespace where the other line has none.
310+
allow-indentation-change::
311+
Initially ignore any white spaces in the move detection, then
312+
group the moved code blocks only into a block if the change in
313+
whitespace is the same per line. This is incompatible with the
314+
other modes.
310315
--
311316

312317
--word-diff[=<mode>]::

diff.c

Lines changed: 156 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -302,12 +302,18 @@ static int parse_color_moved_ws(const char *arg)
302302
ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
303303
else if (!strcmp(sb.buf, "ignore-all-space"))
304304
ret |= XDF_IGNORE_WHITESPACE;
305+
else if (!strcmp(sb.buf, "allow-indentation-change"))
306+
ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
305307
else
306308
error(_("ignoring unknown color-moved-ws mode '%s'"), sb.buf);
307309

308310
strbuf_release(&sb);
309311
}
310312

313+
if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
314+
(ret & XDF_WHITESPACE_FLAGS))
315+
die(_("color-moved-ws: allow-indentation-change cannot be combined with other white space modes"));
316+
311317
string_list_clear(&l, 0);
312318

313319
return ret;
@@ -737,7 +743,91 @@ struct moved_entry {
737743
struct hashmap_entry ent;
738744
const struct emitted_diff_symbol *es;
739745
struct moved_entry *next_line;
746+
struct ws_delta *wsd;
747+
};
748+
749+
/**
750+
* The struct ws_delta holds white space differences between moved lines, i.e.
751+
* between '+' and '-' lines that have been detected to be a move.
752+
* The string contains the difference in leading white spaces, before the
753+
* rest of the line is compared using the white space config for move
754+
* coloring. The current_longer indicates if the first string in the
755+
* comparision is longer than the second.
756+
*/
757+
struct ws_delta {
758+
char *string;
759+
unsigned int current_longer : 1;
740760
};
761+
#define WS_DELTA_INIT { NULL, 0 }
762+
763+
static int compute_ws_delta(const struct emitted_diff_symbol *a,
764+
const struct emitted_diff_symbol *b,
765+
struct ws_delta *out)
766+
{
767+
const struct emitted_diff_symbol *longer = a->len > b->len ? a : b;
768+
const struct emitted_diff_symbol *shorter = a->len > b->len ? b : a;
769+
int d = longer->len - shorter->len;
770+
771+
out->string = xmemdupz(longer->line, d);
772+
out->current_longer = (a == longer);
773+
774+
return !strncmp(longer->line + d, shorter->line, shorter->len);
775+
}
776+
777+
static int cmp_in_block_with_wsd(const struct diff_options *o,
778+
const struct moved_entry *cur,
779+
const struct moved_entry *match,
780+
struct moved_entry *pmb,
781+
int n)
782+
{
783+
struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
784+
int al = cur->es->len, cl = l->len;
785+
const char *a = cur->es->line,
786+
*b = match->es->line,
787+
*c = l->line;
788+
789+
int wslen;
790+
791+
/*
792+
* We need to check if 'cur' is equal to 'match'.
793+
* As those are from the same (+/-) side, we do not need to adjust for
794+
* indent changes. However these were found using fuzzy matching
795+
* so we do have to check if they are equal.
796+
*/
797+
if (strcmp(a, b))
798+
return 1;
799+
800+
if (!pmb->wsd)
801+
/*
802+
* No white space delta was carried forward? This can happen
803+
* when we exit early in this function and do not carry
804+
* forward ws.
805+
*/
806+
return 1;
807+
808+
/*
809+
* The indent changes of the block are known and carried forward in
810+
* pmb->wsd; however we need to check if the indent changes of the
811+
* current line are still the same as before.
812+
*
813+
* To do so we need to compare 'l' to 'cur', adjusting the
814+
* one of them for the white spaces, depending which was longer.
815+
*/
816+
817+
wslen = strlen(pmb->wsd->string);
818+
if (pmb->wsd->current_longer) {
819+
c += wslen;
820+
cl -= wslen;
821+
} else {
822+
a += wslen;
823+
al -= wslen;
824+
}
825+
826+
if (strcmp(a, c))
827+
return 1;
828+
829+
return 0;
830+
}
741831

742832
static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
743833
const void *entry,
@@ -750,6 +840,16 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
750840
unsigned flags = diffopt->color_moved_ws_handling
751841
& XDF_WHITESPACE_FLAGS;
752842

843+
if (diffopt->color_moved_ws_handling &
844+
COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
845+
/*
846+
* As there is not specific white space config given,
847+
* we'd need to check for a new block, so ignore all
848+
* white space. The setup of the white space
849+
* configuration for the next block is done else where
850+
*/
851+
flags |= XDF_IGNORE_WHITESPACE;
852+
753853
return !xdiff_compare_lines(a->es->line, a->es->len,
754854
b->es->line, b->es->len,
755855
flags);
@@ -765,6 +865,7 @@ static struct moved_entry *prepare_entry(struct diff_options *o,
765865
ret->ent.hash = xdiff_hash_string(l->line, l->len, flags);
766866
ret->es = l;
767867
ret->next_line = NULL;
868+
ret->wsd = NULL;
768869

769870
return ret;
770871
}
@@ -820,6 +921,37 @@ static void pmb_advance_or_null(struct diff_options *o,
820921
}
821922
}
822923

924+
static void pmb_advance_or_null_multi_match(struct diff_options *o,
925+
struct moved_entry *match,
926+
struct hashmap *hm,
927+
struct moved_entry **pmb,
928+
int pmb_nr, int n)
929+
{
930+
int i;
931+
char *got_match = xcalloc(1, pmb_nr);
932+
933+
for (; match; match = hashmap_get_next(hm, match)) {
934+
for (i = 0; i < pmb_nr; i++) {
935+
struct moved_entry *prev = pmb[i];
936+
struct moved_entry *cur = (prev && prev->next_line) ?
937+
prev->next_line : NULL;
938+
if (!cur)
939+
continue;
940+
if (!cmp_in_block_with_wsd(o, cur, match, pmb[i], n))
941+
got_match[i] |= 1;
942+
}
943+
}
944+
945+
for (i = 0; i < pmb_nr; i++) {
946+
if (got_match[i]) {
947+
/* Carry the white space delta forward */
948+
pmb[i]->next_line->wsd = pmb[i]->wsd;
949+
pmb[i] = pmb[i]->next_line;
950+
} else
951+
pmb[i] = NULL;
952+
}
953+
}
954+
823955
static int shrink_potential_moved_blocks(struct moved_entry **pmb,
824956
int pmb_nr)
825957
{
@@ -837,6 +969,10 @@ static int shrink_potential_moved_blocks(struct moved_entry **pmb,
837969

838970
if (lp < pmb_nr && rp > -1 && lp < rp) {
839971
pmb[lp] = pmb[rp];
972+
if (pmb[rp]->wsd) {
973+
free(pmb[rp]->wsd->string);
974+
FREE_AND_NULL(pmb[rp]->wsd);
975+
}
840976
pmb[rp] = NULL;
841977
rp--;
842978
lp++;
@@ -924,7 +1060,11 @@ static void mark_color_as_moved(struct diff_options *o,
9241060
if (o->color_moved == COLOR_MOVED_PLAIN)
9251061
continue;
9261062

927-
pmb_advance_or_null(o, match, hm, pmb, pmb_nr);
1063+
if (o->color_moved_ws_handling &
1064+
COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
1065+
pmb_advance_or_null_multi_match(o, match, hm, pmb, pmb_nr, n);
1066+
else
1067+
pmb_advance_or_null(o, match, hm, pmb, pmb_nr);
9281068

9291069
pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr);
9301070

@@ -935,7 +1075,17 @@ static void mark_color_as_moved(struct diff_options *o,
9351075
*/
9361076
for (; match; match = hashmap_get_next(hm, match)) {
9371077
ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
938-
pmb[pmb_nr++] = match;
1078+
if (o->color_moved_ws_handling &
1079+
COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) {
1080+
struct ws_delta *wsd = xmalloc(sizeof(*match->wsd));
1081+
if (compute_ws_delta(l, match->es, wsd)) {
1082+
match->wsd = wsd;
1083+
pmb[pmb_nr++] = match;
1084+
} else
1085+
free(wsd);
1086+
} else {
1087+
pmb[pmb_nr++] = match;
1088+
}
9391089
}
9401090

9411091
flipped_block = (flipped_block + 1) % 2;
@@ -5583,6 +5733,10 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
55835733
if (o->color_moved) {
55845734
struct hashmap add_lines, del_lines;
55855735

5736+
if (o->color_moved_ws_handling &
5737+
COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
5738+
o->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE;
5739+
55865740
hashmap_init(&del_lines, moved_entry_cmp, o, 0);
55875741
hashmap_init(&add_lines, moved_entry_cmp, o, 0);
55885742

diff.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -214,6 +214,9 @@ struct diff_options {
214214
} color_moved;
215215
#define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
216216
#define COLOR_MOVED_MIN_ALNUM_COUNT 20
217+
218+
/* XDF_WHITESPACE_FLAGS regarding block detection are set at 2, 3, 4 */
219+
#define COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE (1<<5)
217220
int color_moved_ws_handling;
218221
};
219222

t/t4015-diff-whitespace.sh

Lines changed: 88 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1814,4 +1814,92 @@ test_expect_success 'only move detection ignores white spaces' '
18141814
test_cmp expected actual
18151815
'
18161816

1817+
test_expect_success 'compare whitespace delta across moved blocks' '
1818+
1819+
git reset --hard &&
1820+
q_to_tab <<-\EOF >text.txt &&
1821+
QIndented
1822+
QText across
1823+
Qsome lines
1824+
QBut! <- this stands out
1825+
QAdjusting with
1826+
QQdifferent starting
1827+
Qwhite spaces
1828+
QAnother outlier
1829+
QQQIndented
1830+
QQQText across
1831+
QQQfive lines
1832+
QQQthat has similar lines
1833+
QQQto previous blocks, but with different indent
1834+
QQQYetQAnotherQoutlierQ
1835+
EOF
1836+
1837+
git add text.txt &&
1838+
git commit -m "add text.txt" &&
1839+
1840+
q_to_tab <<-\EOF >text.txt &&
1841+
QQIndented
1842+
QQText across
1843+
QQsome lines
1844+
QQQBut! <- this stands out
1845+
Adjusting with
1846+
Qdifferent starting
1847+
white spaces
1848+
AnotherQoutlier
1849+
QQIndented
1850+
QQText across
1851+
QQfive lines
1852+
QQthat has similar lines
1853+
QQto previous blocks, but with different indent
1854+
QQYetQAnotherQoutlier
1855+
EOF
1856+
1857+
git diff --color --color-moved --color-moved-ws=allow-indentation-change >actual.raw &&
1858+
grep -v "index" actual.raw | test_decode_color >actual &&
1859+
1860+
q_to_tab <<-\EOF >expected &&
1861+
<BOLD>diff --git a/text.txt b/text.txt<RESET>
1862+
<BOLD>--- a/text.txt<RESET>
1863+
<BOLD>+++ b/text.txt<RESET>
1864+
<CYAN>@@ -1,14 +1,14 @@<RESET>
1865+
<BOLD;MAGENTA>-QIndented<RESET>
1866+
<BOLD;MAGENTA>-QText across<RESET>
1867+
<BOLD;MAGENTA>-Qsome lines<RESET>
1868+
<RED>-QBut! <- this stands out<RESET>
1869+
<BOLD;MAGENTA>-QAdjusting with<RESET>
1870+
<BOLD;MAGENTA>-QQdifferent starting<RESET>
1871+
<BOLD;MAGENTA>-Qwhite spaces<RESET>
1872+
<RED>-QAnother outlier<RESET>
1873+
<BOLD;MAGENTA>-QQQIndented<RESET>
1874+
<BOLD;MAGENTA>-QQQText across<RESET>
1875+
<BOLD;MAGENTA>-QQQfive lines<RESET>
1876+
<BOLD;MAGENTA>-QQQthat has similar lines<RESET>
1877+
<BOLD;MAGENTA>-QQQto previous blocks, but with different indent<RESET>
1878+
<RED>-QQQYetQAnotherQoutlierQ<RESET>
1879+
<BOLD;CYAN>+<RESET>QQ<BOLD;CYAN>Indented<RESET>
1880+
<BOLD;CYAN>+<RESET>QQ<BOLD;CYAN>Text across<RESET>
1881+
<BOLD;CYAN>+<RESET>QQ<BOLD;CYAN>some lines<RESET>
1882+
<GREEN>+<RESET>QQQ<GREEN>But! <- this stands out<RESET>
1883+
<BOLD;CYAN>+<RESET><BOLD;CYAN>Adjusting with<RESET>
1884+
<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>different starting<RESET>
1885+
<BOLD;CYAN>+<RESET><BOLD;CYAN>white spaces<RESET>
1886+
<GREEN>+<RESET><GREEN>AnotherQoutlier<RESET>
1887+
<BOLD;CYAN>+<RESET>QQ<BOLD;CYAN>Indented<RESET>
1888+
<BOLD;CYAN>+<RESET>QQ<BOLD;CYAN>Text across<RESET>
1889+
<BOLD;CYAN>+<RESET>QQ<BOLD;CYAN>five lines<RESET>
1890+
<BOLD;CYAN>+<RESET>QQ<BOLD;CYAN>that has similar lines<RESET>
1891+
<BOLD;CYAN>+<RESET>QQ<BOLD;CYAN>to previous blocks, but with different indent<RESET>
1892+
<GREEN>+<RESET>QQ<GREEN>YetQAnotherQoutlier<RESET>
1893+
EOF
1894+
1895+
test_cmp expected actual
1896+
'
1897+
1898+
test_expect_success 'compare whitespace delta incompatible with other space options' '
1899+
test_must_fail git diff \
1900+
--color-moved-ws=allow-indentation-change,ignore-all-space \
1901+
2>err &&
1902+
test_i18ngrep allow-indentation-change err
1903+
'
1904+
18171905
test_done

0 commit comments

Comments
 (0)