Skip to content

Commit 10dd50e

Browse files
jacob-kellergitster
authored andcommitted
diff --no-index: support limiting by pathspec
The --no-index option of git-diff enables using the diff machinery from git while operating outside of a repository. This mode of git diff is able to compare directories and produce a diff of their contents. When operating git diff in a repository, git has the notion of "pathspecs" which can specify which files to compare. In particular, when using git to diff two trees, you might invoke: $ git diff-tree -r <treeish1> <treeish2>. where the treeish could point to a subdirectory of the repository. When invoked this way, users can limit the selected paths of the tree by using a pathspec. Either by providing some list of paths to accept, or by removing paths via a negative refspec. The git diff --no-index mode does not support pathspecs, and cannot limit the diff output in this way. Other diff programs such as GNU difftools have options for excluding paths based on a pattern match. However, using git diff as a diff replacement has several advantages over many popular diff tools, including coloring moved lines, rename detections, and similar. Teach git diff --no-index how to handle pathspecs to limit the comparisons. This will only be supported if both provided paths are directories. For comparisons where one path isn't a directory, the --no-index mode already has some DWIM shortcuts implemented in the fixup_paths() function. Modify the fixup_paths function to return 1 if both paths are directories. If this is the case, interpret any extra arguments to git diff as pathspecs via parse_pathspec. Add a new PATHSPEC_NO_REPOSITORY flag to indicate to the parser that we do not have repository. Use this to aid in error message reporting in the event that the user happens to invoke git diff --no-index from a repository. Use parse_pathspec to load the remaining arguments (if any) to git diff --no-index as pathspec items. Disable PATHSPEC_ATTR support since we do not have a repository to do attribute lookup. Disable PATHSPEC_FROMTOP since we do not have a repository root. All pathspecs are treated as rooted at the provided comparison paths. Load the pathspecs twice, once prefixed with paths[0] and once prefixed with paths[1]. I considered trying to avoid this, but don't yet have a workable solution that reuses the same pathspec objects. We need the directory paths as prefixes otherwise the match_pathspec won't compare properly. Pass the pathspec object for both paths into queue_diff, who in-turn passes these along to read_directory_contents. Modify read_directory_contents to check against the pathspecs when scanning the directory. In order to properly recurse, we need to ensure that directories match, and that we continue iterating down the nested directory structure: $ git diff --no-index a b c/d This should include all paths in a and b which match the c/d pathspec. In particular, if there was 'a/c/d' we need to match it. But to check 'a/c/d', we need to first get to that part, which requires comparing 'a/c' first. With the match_pathspec() function, 'a/c' won't match the 'a/c/d' pathspec string. However, if we always passed DO_MATCH_LEADING_PATHSPEC, then we will incorrectly match in certain cases such as matching 'a/c' against ':(glob)**/d'. The match logic will see that a matches the leading part of the **/ and accept this even tho c doesn't match. The trick seems to be setting both DO_MATCH_LEADING_PATHSPEC and DO_MATCH_DIRECTORY when checking directories, but set neither of them when checking files. Some other gotchas and open questions: 1) pathspecs must all come after the first two path arguments, you can't re-arrange them to come first. I'm treating them sort of like the treeish arguments to git diff-tree. 2) The pathspecs are interpreted relative to the provided paths, and thus will always need to be specified as relative paths, and will be interpreted as relative to the root of the search for each path separately. 3) negative pathspecs have to be fully qualified from the root, i.e. ':(exclude)file' will only exclude 'a/file' and not 'a/b/file' unless you also use '(glob)' or similar. I think this matches the other pathspec support, but I an not 100% sure. 4) I'm not certain about exposing match_pathspec_with_flags as-is, since DO_MATCH_EXCLUDE shouldn't be passed. I got the behavior I expected with DO_MATCH_LEADING_PATHSPEC, but it feels a bit of a weird API. Perhaps match_pathspec could set both flags when is_dir is true? But would that break other callers? Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 569e259 commit 10dd50e

File tree

4 files changed

+189
-19
lines changed

4 files changed

+189
-19
lines changed

Documentation/git-diff.adoc

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@ git diff [<options>] --cached [--merge-base] [<commit>] [--] [<path>...]
1414
git diff [<options>] [--merge-base] <commit> [<commit>...] <commit> [--] [<path>...]
1515
git diff [<options>] <commit>...<commit> [--] [<path>...]
1616
git diff [<options>] <blob> <blob>
17-
git diff [<options>] --no-index [--] <path> <path>
17+
git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]
1818

1919
DESCRIPTION
2020
-----------
@@ -31,14 +31,18 @@ files on disk.
3131
further add to the index but you still haven't. You can
3232
stage these changes by using linkgit:git-add[1].
3333

34-
`git diff [<options>] --no-index [--] <path> <path>`::
34+
`git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]`::
3535

3636
This form is to compare the given two paths on the
3737
filesystem. You can omit the `--no-index` option when
3838
running the command in a working tree controlled by Git and
3939
at least one of the paths points outside the working tree,
4040
or when running the command outside a working tree
41-
controlled by Git. This form implies `--exit-code`.
41+
controlled by Git. This form implies `--exit-code`. If both
42+
paths point to directories, additional pathspecs may be
43+
provided. These will limit the files included in the
44+
difference. All such pathspecs must be relative as they
45+
apply to both sides of the diff.
4246

4347
`git diff [<options>] --cached [--merge-base] [<commit>] [--] [<path>...]`::
4448

builtin/diff.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@ static const char builtin_diff_usage[] =
3535
" or: git diff [<options>] [--merge-base] <commit> [<commit>...] <commit> [--] [<path>...]\n"
3636
" or: git diff [<options>] <commit>...<commit> [--] [<path>...]\n"
3737
" or: git diff [<options>] <blob> <blob>\n"
38-
" or: git diff [<options>] --no-index [--] <path> <path>"
38+
" or: git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]"
3939
"\n"
4040
COMMON_DIFF_OPTIONS_HELP;
4141

diff-no-index.c

Lines changed: 74 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -15,20 +15,47 @@
1515
#include "gettext.h"
1616
#include "revision.h"
1717
#include "parse-options.h"
18+
#include "pathspec.h"
1819
#include "string-list.h"
1920
#include "dir.h"
2021

21-
static int read_directory_contents(const char *path, struct string_list *list)
22+
static int read_directory_contents(const char *path, struct string_list *list,
23+
const struct pathspec *pathspec)
2224
{
25+
struct strbuf match = STRBUF_INIT;
26+
int len;
2327
DIR *dir;
2428
struct dirent *e;
2529

2630
if (!(dir = opendir(path)))
2731
return error("Could not open directory %s", path);
2832

29-
while ((e = readdir_skip_dot_and_dotdot(dir)))
33+
if (pathspec) {
34+
strbuf_addstr(&match, path);
35+
strbuf_complete(&match, '/');
36+
len = match.len;
37+
}
38+
39+
while ((e = readdir_skip_dot_and_dotdot(dir))) {
40+
if (pathspec) {
41+
int flags = 0;
42+
43+
strbuf_setlen(&match, len);
44+
strbuf_addstr(&match, e->d_name);
45+
46+
if (e->d_type == DT_DIR)
47+
flags = DO_MATCH_LEADING_PATHSPEC | DO_MATCH_DIRECTORY;
48+
49+
if (!match_pathspec_with_flags(NULL, pathspec,
50+
match.buf, match.len,
51+
0, NULL, flags))
52+
continue;
53+
}
54+
3055
string_list_insert(list, e->d_name);
56+
}
3157

58+
strbuf_release(&match);
3259
closedir(dir);
3360
return 0;
3461
}
@@ -131,7 +158,8 @@ static struct diff_filespec *noindex_filespec(const struct git_hash_algo *algop,
131158
}
132159

133160
static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop,
134-
const char *name1, const char *name2, int recursing)
161+
const char *name1, const char *name2, int recursing,
162+
const struct pathspec *ps1, const struct pathspec *ps2)
135163
{
136164
int mode1 = 0, mode2 = 0;
137165
enum special special1 = SPECIAL_NONE, special2 = SPECIAL_NONE;
@@ -171,9 +199,9 @@ static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop,
171199
int i1, i2, ret = 0;
172200
size_t len1 = 0, len2 = 0;
173201

174-
if (name1 && read_directory_contents(name1, &p1))
202+
if (name1 && read_directory_contents(name1, &p1, ps1))
175203
return -1;
176-
if (name2 && read_directory_contents(name2, &p2)) {
204+
if (name2 && read_directory_contents(name2, &p2, ps2)) {
177205
string_list_clear(&p1, 0);
178206
return -1;
179207
}
@@ -218,7 +246,7 @@ static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop,
218246
n2 = buffer2.buf;
219247
}
220248

221-
ret = queue_diff(o, algop, n1, n2, 1);
249+
ret = queue_diff(o, algop, n1, n2, 1, ps1, ps2);
222250
}
223251
string_list_clear(&p1, 0);
224252
string_list_clear(&p2, 0);
@@ -258,8 +286,10 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
258286
* DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
259287
* Note that we append the basename of F to D/, so "diff a/b/file D"
260288
* becomes "diff a/b/file D/file", not "diff a/b/file D/a/b/file".
289+
*
290+
* Return 1 if both paths are directories, 0 otherwise.
261291
*/
262-
static void fixup_paths(const char **path, struct strbuf *replacement)
292+
static int fixup_paths(const char **path, struct strbuf *replacement)
263293
{
264294
struct stat st;
265295
unsigned int isdir0 = 0, isdir1 = 0;
@@ -282,25 +312,30 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
282312
if ((isdir0 && ispipe1) || (ispipe0 && isdir1))
283313
die(_("cannot compare a named pipe to a directory"));
284314

285-
if (isdir0 == isdir1)
286-
return;
315+
/* if both paths are directories, we will enable pathspecs */
316+
if (isdir0 && isdir1)
317+
return 1;
318+
287319
if (isdir0) {
288320
append_basename(replacement, path[0], path[1]);
289321
path[0] = replacement->buf;
290-
} else {
322+
} else if (isdir1) {
291323
append_basename(replacement, path[1], path[0]);
292324
path[1] = replacement->buf;
293325
}
326+
327+
return 0;
294328
}
295329

296330
static const char * const diff_no_index_usage[] = {
297-
N_("git diff --no-index [<options>] <path> <path>"),
331+
N_("git diff --no-index [<options>] <path> <path> [<pathspec>...]"),
298332
NULL
299333
};
300334

301335
int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
302336
int implicit_no_index, int argc, const char **argv)
303337
{
338+
struct pathspec pathspec1, pathspec2, *ps1 = NULL, *ps2 = NULL;
304339
int i, no_index;
305340
int ret = 1;
306341
const char *paths[2];
@@ -317,13 +352,12 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
317352
options = add_diff_options(no_index_options, &revs->diffopt);
318353
argc = parse_options(argc, argv, revs->prefix, options,
319354
diff_no_index_usage, 0);
320-
if (argc != 2) {
355+
if (argc < 2) {
321356
if (implicit_no_index)
322357
warning(_("Not a git repository. Use --no-index to "
323358
"compare two paths outside a working tree"));
324359
usage_with_options(diff_no_index_usage, options);
325360
}
326-
FREE_AND_NULL(options);
327361
for (i = 0; i < 2; i++) {
328362
const char *p = argv[i];
329363
if (!strcmp(p, "-"))
@@ -337,7 +371,28 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
337371
paths[i] = p;
338372
}
339373

340-
fixup_paths(paths, &replacement);
374+
/* TODO: should we try to catch pathspec-like paths first and warn or
375+
* error? We accepted those as valid 'paths' before so it seems
376+
* unlikely we can change that behavior.
377+
*/
378+
if (fixup_paths(paths, &replacement)) {
379+
parse_pathspec(&pathspec1, PATHSPEC_FROMTOP | PATHSPEC_ATTR,
380+
PATHSPEC_PREFER_FULL | PATHSPEC_NO_REPOSITORY,
381+
paths[0], &argv[2]);
382+
if (pathspec1.nr)
383+
ps1 = &pathspec1;
384+
385+
parse_pathspec(&pathspec2, PATHSPEC_FROMTOP | PATHSPEC_ATTR,
386+
PATHSPEC_PREFER_FULL | PATHSPEC_NO_REPOSITORY,
387+
paths[1], &argv[2]);
388+
if (pathspec2.nr)
389+
ps2 = &pathspec2;
390+
} else if (argc > 2) {
391+
warning(_("Limiting comparison with pathspecs is only "
392+
"supported if both paths are directories."));
393+
usage_with_options(diff_no_index_usage, options);
394+
}
395+
FREE_AND_NULL(options);
341396

342397
revs->diffopt.skip_stat_unmatch = 1;
343398
if (!revs->diffopt.output_format)
@@ -354,7 +409,7 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
354409
setup_diff_pager(&revs->diffopt);
355410
revs->diffopt.flags.exit_with_status = 1;
356411

357-
if (queue_diff(&revs->diffopt, algop, paths[0], paths[1], 0))
412+
if (queue_diff(&revs->diffopt, algop, paths[0], paths[1], 0, ps1, ps2))
358413
goto out;
359414
diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
360415
diffcore_std(&revs->diffopt);
@@ -370,5 +425,9 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
370425
for (i = 0; i < ARRAY_SIZE(to_free); i++)
371426
free(to_free[i]);
372427
strbuf_release(&replacement);
428+
if (ps1)
429+
clear_pathspec(ps1);
430+
if (ps2)
431+
clear_pathspec(ps2);
373432
return ret;
374433
}

t/t4053-diff-no-index.sh

Lines changed: 107 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -295,4 +295,111 @@ test_expect_success PIPE,SYMLINKS 'diff --no-index reads from pipes' '
295295
test_cmp expect actual
296296
'
297297

298+
test_expect_success 'diff --no-index F F rejects pathspecs' '
299+
test_must_fail git diff --no-index -- a/1 a/2 a 2>actual.err &&
300+
test_grep "usage: git diff --no-index" actual.err
301+
'
302+
303+
test_expect_success 'diff --no-index D F rejects pathspecs' '
304+
test_must_fail git diff --no-index -- a a/2 a 2>actual.err &&
305+
test_grep "usage: git diff --no-index" actual.err
306+
'
307+
308+
test_expect_success 'diff --no-index F D rejects pathspecs' '
309+
test_must_fail git diff --no-index -- a/1 b b 2>actual.err &&
310+
test_grep "usage: git diff --no-index" actual.err
311+
'
312+
313+
test_expect_success 'diff --no-index with pathspec' '
314+
test_expect_code 1 git diff --no-index a b 1 >actual &&
315+
cat >expect <<-EOF &&
316+
diff --git a/a/1 b/a/1
317+
deleted file mode 100644
318+
index d00491f..0000000
319+
--- a/a/1
320+
+++ /dev/null
321+
@@ -1 +0,0 @@
322+
-1
323+
EOF
324+
test_cmp expect actual
325+
'
326+
327+
test_expect_success 'diff --no-index with pathspec no matches' '
328+
test_expect_code 0 git diff --no-index a b missing
329+
'
330+
331+
test_expect_success 'diff --no-index with negative pathspec' '
332+
test_expect_code 1 git diff --no-index a b ":!2" >actual &&
333+
cat >expect <<-EOF &&
334+
diff --git a/a/1 b/a/1
335+
deleted file mode 100644
336+
index d00491f..0000000
337+
--- a/a/1
338+
+++ /dev/null
339+
@@ -1 +0,0 @@
340+
-1
341+
EOF
342+
test_cmp expect actual
343+
'
344+
345+
test_expect_success 'setup nested' '
346+
mkdir -p c/1/2 &&
347+
mkdir -p d/1/2 &&
348+
echo 1 >c/1/2/a &&
349+
echo 2 >c/1/2/b
350+
'
351+
352+
test_expect_success 'diff --no-index with pathspec nested negative pathspec' '
353+
test_expect_code 0 git diff --no-index c d ":!1"
354+
'
355+
356+
test_expect_success 'diff --no-index with pathspec nested pathspec' '
357+
test_expect_code 1 git diff --no-index c d 1/2 >actual &&
358+
cat >expect <<-EOF &&
359+
diff --git a/c/1/2/a b/c/1/2/a
360+
deleted file mode 100644
361+
index d00491f..0000000
362+
--- a/c/1/2/a
363+
+++ /dev/null
364+
@@ -1 +0,0 @@
365+
-1
366+
diff --git a/c/1/2/b b/c/1/2/b
367+
deleted file mode 100644
368+
index 0cfbf08..0000000
369+
--- a/c/1/2/b
370+
+++ /dev/null
371+
@@ -1 +0,0 @@
372+
-2
373+
EOF
374+
test_cmp expect actual
375+
'
376+
377+
test_expect_success 'diff --no-index with pathspec glob' '
378+
test_expect_code 1 git diff --no-index c d ":(glob)**/a" >actual &&
379+
cat >expect <<-EOF &&
380+
diff --git a/c/1/2/a b/c/1/2/a
381+
deleted file mode 100644
382+
index d00491f..0000000
383+
--- a/c/1/2/a
384+
+++ /dev/null
385+
@@ -1 +0,0 @@
386+
-1
387+
EOF
388+
test_cmp expect actual
389+
'
390+
391+
test_expect_success 'diff --no-index with pathspec glob and exclude' '
392+
test_expect_code 1 git diff --no-index c d ":(glob,exclude)**/a" >actual &&
393+
cat >expect <<-EOF &&
394+
diff --git a/c/1/2/b b/c/1/2/b
395+
deleted file mode 100644
396+
index 0cfbf08..0000000
397+
--- a/c/1/2/b
398+
+++ /dev/null
399+
@@ -1 +0,0 @@
400+
-2
401+
EOF
402+
test_cmp expect actual
403+
'
404+
298405
test_done

0 commit comments

Comments
 (0)