Commit 0133dab

jherland authored and gitster committed
--dirstat-by-file: Make it faster and more correct
Currently, when using --dirstat-by-file, it first does the full --dirstat analysis (using diffcore_count_changes()), and then resets 'damage' to 1, if any damage was found by diffcore_count_changes().

But --dirstat-by-file is not interested in the file damage per se. It only cares if the file changed at all. In that sense it only cares if the blob object for a file has changed. We therefore only need to compare the object names of each file pair in the diff queue and we can skip the entire --dirstat analysis and simply set 'damage' to 1 for each entry where the object name has changed.

This makes --dirstat-by-file faster, and also bypasses --dirstat's practice of ignoring rearranged lines within a file.

The patch also contains an added testcase verifying that --dirstat-by-file now detects changes that only rearrange lines within a file.

Signed-off-by: Johan Herland <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
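The object-name comparison described in the message can be sketched in isolation. This is a minimal illustration, not the code from diff.c: `struct side` and `by_file_damage()` are hypothetical stand-ins for git's file-pair structures, and plain `memcmp()` is used in place of git's `hashcmp()`.

```c
#include <string.h>

#define HASH_LEN 20	/* a SHA-1 object name is 20 raw bytes */

/* Hypothetical, simplified stand-in for one side of a file pair. */
struct side {
	unsigned char sha1[HASH_LEN];
	int sha1_valid;	/* 0 when the object name is not known */
};

/*
 * By-file damage: 1 if the pair counts as changed, 0 otherwise.
 * File contents are never read; only the object names are compared.
 */
static int by_file_damage(const struct side *one, const struct side *two)
{
	if (one->sha1_valid && two->sha1_valid)
		return memcmp(one->sha1, two->sha1, HASH_LEN) != 0;
	/* Without two valid names, equality cannot be shown: assume changed. */
	return 1;
}
```

With two sides carrying identical valid names the function returns 0; with differing names, or with `sha1_valid` unset on either side, it returns 1. Every changed file therefore contributes the same damage, regardless of how many lines changed.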
1 parent 204f01a commit 0133dab

3 files changed, +25 -5 lines changed


diff.c

Lines changed: 20 additions & 5 deletions
@@ -1539,9 +1539,27 @@ static void show_dirstat(struct diff_options *options)
 		struct diff_filepair *p = q->queue[i];
 		const char *name;
 		unsigned long copied, added, damage;
+		int content_changed;
 
 		name = p->one->path ? p->one->path : p->two->path;
 
+		if (p->one->sha1_valid && p->two->sha1_valid)
+			content_changed = hashcmp(p->one->sha1, p->two->sha1);
+		else
+			content_changed = 1;
+
+		if (DIFF_OPT_TST(options, DIRSTAT_BY_FILE)) {
+			/*
+			 * In --dirstat-by-file mode, we don't really need to
+			 * look at the actual file contents at all.
+			 * The fact that the SHA1 changed is enough for us to
+			 * add this file to the list of results
+			 * (with each file contributing equal damage).
+			 */
+			damage = content_changed ? 1 : 0;
+			goto found_damage;
+		}
+
 		if (DIFF_FILE_VALID(p->one) && DIFF_FILE_VALID(p->two)) {
 			diff_populate_filespec(p->one, 0);
 			diff_populate_filespec(p->two, 0);
@@ -1564,14 +1582,11 @@ static void show_dirstat(struct diff_options *options)
 		/*
 		 * Original minus copied is the removed material,
 		 * added is the new material. They are both damages
-		 * made to the preimage. In --dirstat-by-file mode, count
-		 * damaged files, not damaged lines. This is done by
-		 * counting only a single damaged line per file.
+		 * made to the preimage.
 		 */
 		damage = (p->one->size - copied) + added;
-		if (DIFF_OPT_TST(options, DIRSTAT_BY_FILE) && damage > 0)
-			damage = 1;
 
+found_damage:
 		ALLOC_GROW(dir.files, dir.nr + 1, dir.alloc);
 		dir.files[dir.nr].name = name;
 		dir.files[dir.nr].changed = damage;

t/t4013-diff-various.sh

Lines changed: 2 additions & 0 deletions
@@ -302,6 +302,8 @@ diff master master^ side
 diff --dirstat master~1 master~2
 # --dirstat doesn't notice changes that simply rearrange existing lines
 diff --dirstat initial rearrange
+# ...but --dirstat-by-file does notice changes that only rearrange lines
+diff --dirstat-by-file initial rearrange
 EOF
 
 test_expect_success 'log -S requires an argument' '
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+$ git diff --dirstat-by-file initial rearrange
+100.0% dir/
+$
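Why plain --dirstat reports nothing for the same pair of commits can be illustrated with a rough, order-insensitive damage estimate. This is a sketch under stated assumptions, not diffcore_count_changes(): `line_damage()` is a hypothetical helper that matches whole lines and measures sizes in lines, whereas the real analysis differs in detail. Only the final formula, (original size - copied) + added, follows the one shown in diff.c.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, simplified line-level damage estimate. Each new line that
 * matches a not-yet-used old line counts as "copied"; any other new line
 * counts as "added". Line order is deliberately ignored.
 */
static unsigned long line_damage(const char **old_lines, size_t n_old,
				 const char **new_lines, size_t n_new)
{
	unsigned long copied = 0, added = 0;
	unsigned char *used = calloc(n_old ? n_old : 1, 1);
	size_t i, j;

	if (!used)
		return n_old + n_new;	/* out of memory: assume everything changed */

	for (i = 0; i < n_new; i++) {
		int found = 0;

		for (j = 0; j < n_old; j++) {
			if (!used[j] && !strcmp(new_lines[i], old_lines[j])) {
				used[j] = 1;	/* each old line is matched at most once */
				found = 1;
				break;
			}
		}
		if (found)
			copied++;
		else
			added++;
	}
	free(used);

	/* Removed material plus new material, as in the diff.c comment. */
	return (n_old - copied) + added;
}
```

For a file whose lines are only reordered, every new line finds a match, so the estimate is 0 and the file drops out of the line-based statistics. Comparing object names instead still flags the file, which is what the new expected output above records for dir/.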
