Skip to content

Commit 957876f

Browse files
peffgitster
authored andcommitted
combine-diff: handle --find-object in multitree code path
When doing combined diffs, we have two possible code paths: - a slower one which independently diffs against each parent, applies any filters, and then intersects the resulting paths - a faster one which walks all trees simultaneously When the diff options specify that we must do certain filters, like pickaxe, then we always use the slow path, since the pickaxe code only knows how to handle filepairs, not the n-parent entries generated for combined diffs. But there are two problems with the slow path: 1. It's slow. Running: git rev-list HEAD | git diff-tree --stdin -r -c in git.git takes ~3s on my machine. But adding "--find-object" to that increases it to ~6s, even though find-object itself should incur only a few extra oid comparisons. On linux.git, it's even worse: 35s versus 215s. 2. It doesn't catch all cases where a particular path is interesting. Consider a merge with parent blobs X and Y for a particular path, and end result Z. That should be interesting according to "-c", because the result doesn't match either parent. And it should be interesting even with "--find-object=X", because "X" went away in the merge. But because we perform each pairwise diff independently, this confuses the intersection code. The change from X to Z is still interesting according to --find-object. But in the other parent we went from Y to Z, so the diff appears empty! That causes the intersection code to think that parent didn't change the path, and thus it's not interesting for "-c". This patch fixes both by implementing --find-object for the multitree code. It's a bit unfortunate that we have to duplicate some logic from diffcore-pickaxe, but this is the best we can do for now. In an ideal world, all of the diffcore code would stop thinking about filepairs and start thinking about n-parent sets, and we could use the multitree walk with all of it. Until then, there are some leftover warts: - other pickaxe operations, like -S or -G, still suffer from both problems. These would be hard to adapt because they rely on having a diff_filespec() for each path to look at content. And we'd need to define what an n-way "change" means in each case (probably easy for "-S", which can compare counts, but not so clear for -G, which is about grepping diffs). - other options besides --find-object may cause us to use the slow pairwise path, in which case we'll go back to producing a different (wrong) answer for the X/Y/Z case above. We may be able to hack around these, but I think the ultimate solution will be a larger rewrite of the diffcore code. For now, this patch improves one specific case but leaves the rest. Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 47ae905 commit 957876f

File tree

2 files changed

+96
-2
lines changed

2 files changed

+96
-2
lines changed

combine-diff.c

Lines changed: 41 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1451,6 +1451,42 @@ static struct combine_diff_path *find_paths_multitree(
14511451
return paths_head.next;
14521452
}
14531453

1454+
static int match_objfind(struct combine_diff_path *path,
1455+
int num_parent,
1456+
const struct oidset *set)
1457+
{
1458+
int i;
1459+
if (oidset_contains(set, &path->oid))
1460+
return 1;
1461+
for (i = 0; i < num_parent; i++) {
1462+
if (oidset_contains(set, &path->parent[i].oid))
1463+
return 1;
1464+
}
1465+
return 0;
1466+
}
1467+
1468+
static struct combine_diff_path *combined_objfind(struct diff_options *opt,
1469+
struct combine_diff_path *paths,
1470+
int num_parent)
1471+
{
1472+
struct combine_diff_path *ret = NULL, **tail = &ret;
1473+
struct combine_diff_path *p = paths;
1474+
1475+
while (p) {
1476+
struct combine_diff_path *next = p->next;
1477+
1478+
if (match_objfind(p, num_parent, opt->objfind)) {
1479+
p->next = NULL;
1480+
*tail = p;
1481+
tail = &p->next;
1482+
} else {
1483+
free(p);
1484+
}
1485+
p = next;
1486+
}
1487+
1488+
return ret;
1489+
}
14541490

14551491
void diff_tree_combined(const struct object_id *oid,
14561492
const struct oid_array *parents,
@@ -1506,10 +1542,10 @@ void diff_tree_combined(const struct object_id *oid,
15061542
opt->flags.follow_renames ||
15071543
opt->break_opt != -1 ||
15081544
opt->detect_rename ||
1509-
(opt->pickaxe_opts & DIFF_PICKAXE_KINDS_MASK) ||
1545+
(opt->pickaxe_opts &
1546+
(DIFF_PICKAXE_KINDS_MASK & ~DIFF_PICKAXE_KIND_OBJFIND)) ||
15101547
opt->filter;
15111548

1512-
15131549
if (need_generic_pathscan) {
15141550
/*
15151551
* NOTE generic case also handles --stat, as it computes
@@ -1523,6 +1559,9 @@ void diff_tree_combined(const struct object_id *oid,
15231559
int stat_opt;
15241560
paths = find_paths_multitree(oid, parents, &diffopts);
15251561

1562+
if (opt->pickaxe_opts & DIFF_PICKAXE_KIND_OBJFIND)
1563+
paths = combined_objfind(opt, paths, num_parent);
1564+
15261565
/*
15271566
* show stat against the first parent even
15281567
* when doing combined diff.

t/t4064-diff-oidfind.sh

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -65,4 +65,59 @@ test_expect_success 'find a submodule' '
6565
test_cmp expect actual
6666
'
6767

68+
test_expect_success 'set up merge tests' '
69+
test_commit base &&
70+
71+
git checkout -b boring base^ &&
72+
echo boring >file &&
73+
git add file &&
74+
git commit -m boring &&
75+
76+
git checkout -b interesting base^ &&
77+
echo interesting >file &&
78+
git add file &&
79+
git commit -m interesting &&
80+
81+
blob=$(git rev-parse interesting:file)
82+
'
83+
84+
test_expect_success 'detect merge which introduces blob' '
85+
git checkout -B merge base &&
86+
git merge --no-commit boring &&
87+
echo interesting >file &&
88+
git commit -am "introduce blob" &&
89+
git diff-tree --format=%s --find-object=$blob -c --name-status HEAD >actual &&
90+
cat >expect <<-\EOF &&
91+
introduce blob
92+
93+
AM file
94+
EOF
95+
test_cmp expect actual
96+
'
97+
98+
test_expect_success 'detect merge which removes blob' '
99+
git checkout -B merge interesting &&
100+
git merge --no-commit base &&
101+
echo boring >file &&
102+
git commit -am "remove blob" &&
103+
git diff-tree --format=%s --find-object=$blob -c --name-status HEAD >actual &&
104+
cat >expect <<-\EOF &&
105+
remove blob
106+
107+
MA file
108+
EOF
109+
test_cmp expect actual
110+
'
111+
112+
test_expect_success 'do not detect merge that does not touch blob' '
113+
git checkout -B merge interesting &&
114+
git merge -m "untouched blob" base &&
115+
git diff-tree --format=%s --find-object=$blob -c --name-status HEAD >actual &&
116+
cat >expect <<-\EOF &&
117+
untouched blob
118+
119+
EOF
120+
test_cmp expect actual
121+
'
122+
68123
test_done

0 commit comments

Comments
 (0)