Commit aa6e1b2

newren authored and gitster committed
dir: avoid unnecessary traversal into ignored directory
The show_other_directories case in treat_directory() tried to handle
both excludes and untracked files with the same logic, and mishandled
both the excludes and the untracked files in the process, in different
ways. Split that logic apart, and then focus on the logic for the
excludes; a subsequent commit will address the logic for untracked
files.

For show_other_directories, an excluded directory means that every path
underneath that directory will also be excluded. Given that the calling
code requested to just show directories when everything under a
directory had the same state (that's what the
"DIR_SHOW_OTHER_DIRECTORIES" flag means), we generally do not need to
traverse into such directories and can just immediately mark them as
ignored (i.e. as path_excluded). The only reason we cannot just
immediately return path_excluded is the DIR_HIDE_EMPTY_DIRECTORIES flag
and the possibility that the ignored directory is an empty directory.
The code previously treated DIR_SHOW_IGNORED_TOO in most cases as an
exception as well, which was wrong. It can sometimes reduce the number
of cases where we need to recurse (namely if
DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set), but should not be able
to increase the number of cases where we need to recurse. Fix the logic
accordingly.

Some sidenotes about possible confusion with dir.c:

* "ignored" often refers to an "untracked ignore", i.e. a file which is
  not tracked which matches one of the ignore/exclusion rules. But you
  can also have a "tracked ignore", a tracked file that happens to match
  one of the ignore/exclusion rules and which dir.c has to worry about
  since "git ls-files -c -i" is supposed to list them.

* The dir code often uses "ignored" and "excluded" interchangeably,
  which you need to keep in mind while reading the code.

* "exclude" is used multiple ways in the code:

  * As noted above, "exclude" is often a synonym for "ignored".

  * The logic for parsing .gitignore files was re-used in
    .git/info/sparse-checkout, except there it is used to mark paths
    that the user wants to *keep*. This was mostly addressed by commit
    65edd96 ("treewide: rename 'exclude' methods to 'pattern'",
    2019-09-03), but every once in a while you'll find a comment about
    "exclude" referring to these patterns that might in fact be in use
    by the sparse-checkout machinery for inclusion rules.

  * The word "EXCLUDE" is also used for pathspec negation, as in
      (pathspec->items[3].magic & PATHSPEC_EXCLUDE)
    Thus if a user had a .gitignore file containing
      *~
      *.log
      !settings.log
    And then ran
      git add -- 'settings.*' ':^settings.log'
    Then :^settings.log is a pathspec negation making settings.log not
    be requested to be added even though all other settings.* files
    are being added. Also, !settings.log in the gitignore file is a
    negative exclude pattern meaning that settings.log is normally a
    file we want to track even though all other *.log files are ignored.

Sometimes it feels like dir.c needs its own glossary with its many
definitions, including the multiply-defined terms.

Reported-by: Jason Gore <[email protected]>
Signed-off-by: Elijah Newren <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
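To make the "negative exclude pattern" sidenote concrete, here is a minimal C sketch of ordered pattern matching where the last matching pattern wins and a leading '!' un-ignores a path. It is not git's implementation: model_is_ignored() is a hypothetical helper, it relies on POSIX fnmatch() rather than git's own matcher, and it ignores directory handling, anchoring, and every other gitignore rule.

```c
#include <fnmatch.h>
#include <stddef.h>

/*
 * Hypothetical, simplified model of gitignore-style matching.
 * Patterns are tried in order and the last one that matches decides
 * the result; a leading '!' turns a match into "not ignored".
 * Returns 1 if `name` ends up ignored, 0 otherwise.
 */
int model_is_ignored(const char *const *patterns, size_t n, const char *name)
{
	int ignored = 0;

	for (size_t i = 0; i < n; i++) {
		const char *p = patterns[i];
		int negated = (p[0] == '!');

		if (negated)
			p++;	/* match against the pattern without the '!' */
		if (fnmatch(p, name, 0) == 0)
			ignored = !negated;
	}
	return ignored;
}
```

With the three patterns from the message (*~, *.log, !settings.log), this model treats debug.log as ignored and settings.log as not ignored. That is separate from the :^settings.log pathspec, which only narrows what a single command was asked to operate on.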
1 parent a97c7a8 commit aa6e1b2

2 files changed: 30 additions and 16 deletions

dir.c

Lines changed: 29 additions & 15 deletions
@@ -1835,6 +1835,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir,
 	}
 
 	/* This is the "show_other_directories" case */
+	assert(dir->flags & DIR_SHOW_OTHER_DIRECTORIES);
 
 	/*
 	 * If we have a pathspec which could match something _below_ this
@@ -1845,27 +1846,40 @@ static enum path_treatment treat_directory(struct dir_struct *dir,
 	if (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)
 		return path_recurse;
 
+	/* Special cases for where this directory is excluded/ignored */
+	if (excluded) {
+		/*
+		 * In the show_other_directories case, if we're not
+		 * hiding empty directories, there is no need to
+		 * recurse into an ignored directory.
+		 */
+		if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES))
+			return path_excluded;
+
+		/*
+		 * Even if we are hiding empty directories, we can still avoid
+		 * recursing into ignored directories for DIR_SHOW_IGNORED_TOO
+		 * if DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set.
+		 */
+		if ((dir->flags & DIR_SHOW_IGNORED_TOO) &&
+		    (dir->flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING))
+			return path_excluded;
+	}
+
 	/*
-	 * Other than the path_recurse case immediately above, we only need
-	 * to recurse into untracked/ignored directories if either of the
-	 * following bits is set:
+	 * Other than the path_recurse case above, we only need to
+	 * recurse into untracked directories if either of the following
+	 * bits is set:
 	 *   - DIR_SHOW_IGNORED_TOO (because then we need to determine if
 	 *                           there are ignored entries below)
 	 *   - DIR_HIDE_EMPTY_DIRECTORIES (because we have to determine if
 	 *                                 the directory is empty)
 	 */
-	if (!(dir->flags & (DIR_SHOW_IGNORED_TOO | DIR_HIDE_EMPTY_DIRECTORIES)))
-		return excluded ? path_excluded : path_untracked;
-
-	/*
-	 * ...and even if DIR_SHOW_IGNORED_TOO is set, we can still avoid
-	 * recursing into ignored directories if the path is excluded and
-	 * DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set.
-	 */
-	if (excluded &&
-	    (dir->flags & DIR_SHOW_IGNORED_TOO) &&
-	    (dir->flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING))
-		return path_excluded;
+	if (!excluded &&
+	    !(dir->flags & (DIR_SHOW_IGNORED_TOO |
+			    DIR_HIDE_EMPTY_DIRECTORIES))) {
+		return path_untracked;
+	}
 
 	/*
 	 * Even if we don't want to know all the paths under an untracked or
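The effect of the dir.c change on excluded directories can be sketched as two standalone predicates, one mirroring the removed conditions and one mirroring the added ones. This is only a model under stated assumptions: the bit values below are illustrative rather than copied from git's dir.h, old_skips_excluded_dir() and new_skips_excluded_dir() are hypothetical names, and "skips" means "returns path_excluded before reaching the traversal code later in treat_directory()".

```c
#include <assert.h>

/* Illustrative bit values only; git's real definitions live in dir.h. */
enum {
	DIR_SHOW_IGNORED_TOO               = 1 << 0,
	DIR_SHOW_OTHER_DIRECTORIES         = 1 << 1,
	DIR_HIDE_EMPTY_DIRECTORIES         = 1 << 2,
	DIR_SHOW_IGNORED_TOO_MODE_MATCHING = 1 << 3
};

/*
 * Model of the removed logic for an excluded directory: returns 1 when
 * the old code returned path_excluded without recursing, 0 otherwise.
 */
int old_skips_excluded_dir(unsigned flags)
{
	if (!(flags & (DIR_SHOW_IGNORED_TOO | DIR_HIDE_EMPTY_DIRECTORIES)))
		return 1;
	return (flags & DIR_SHOW_IGNORED_TOO) &&
	       (flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING);
}

/*
 * Model of the added "if (excluded)" block: only
 * DIR_HIDE_EMPTY_DIRECTORIES can force a look inside, and even then
 * DIR_SHOW_IGNORED_TOO plus DIR_SHOW_IGNORED_TOO_MODE_MATCHING avoids it.
 */
int new_skips_excluded_dir(unsigned flags)
{
	assert(flags & DIR_SHOW_OTHER_DIRECTORIES);
	if (!(flags & DIR_HIDE_EMPTY_DIRECTORIES))
		return 1;
	return (flags & DIR_SHOW_IGNORED_TOO) &&
	       (flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING);
}
```

In this model the two predicates disagree only when DIR_SHOW_IGNORED_TOO is set while both DIR_HIDE_EMPTY_DIRECTORIES and DIR_SHOW_IGNORED_TOO_MODE_MATCHING are clear: the old logic recursed into the ignored directory there, and the new logic does not.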

t/t7300-clean.sh

Lines changed: 1 addition & 1 deletion
@@ -746,7 +746,7 @@ test_expect_success 'clean untracked paths by pathspec' '
 	test_must_be_empty actual
 '
 
-test_expect_failure 'avoid traversing into ignored directories' '
+test_expect_success 'avoid traversing into ignored directories' '
 	test_when_finished rm -f output error trace.* &&
 	test_create_repo avoid-traversing-deep-hierarchy &&
 	(
