Commit 38ccaf9

Merge branch 'nd/untracked-cache'
Teach the index to optionally remember already seen untracked files
to speed up "git status" in a working tree with tons of cruft.

* nd/untracked-cache: (24 commits)
  git-status.txt: advertisement for untracked cache
  untracked cache: guard and disable on system changes
  mingw32: add uname()
  t7063: tests for untracked cache
  update-index: test the system before enabling untracked cache
  update-index: manually enable or disable untracked cache
  status: enable untracked cache
  untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE
  untracked cache: mark index dirty if untracked cache is updated
  untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS
  untracked cache: avoid racy timestamps
  read-cache.c: split racy stat test to a separate function
  untracked cache: invalidate at index addition or removal
  untracked cache: load from UNTR index extension
  untracked cache: save to an index extension
  ewah: add convenient wrapper ewah_serialize_strbuf()
  untracked cache: don't open non-existent .gitignore
  untracked cache: mark what dirs should be recursed/saved
  untracked cache: record/validate dir mtime and reuse cached output
  untracked cache: make a wrapper around {open,read,close}dir()
  ...
2 parents a26d48a + aeb6f8b commit 38ccaf9

21 files changed, 1822 insertions(+), 58 deletions(-)

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -184,6 +184,7 @@
 /test-delta
 /test-dump-cache-tree
 /test-dump-split-index
+/test-dump-untracked-cache
 /test-scrap-cache-tree
 /test-genrandom
 /test-hashmap

Documentation/git-status.txt

Lines changed: 4 additions & 1 deletion
@@ -66,7 +66,10 @@ When `-u` option is not used, untracked files and directories are
 shown (i.e. the same as specifying `normal`), to help you avoid
 forgetting to add newly created files. Because it takes extra work
 to find untracked files in the filesystem, this mode may take some
-time in a large working tree. You can use `no` to have `git status`
+time in a large working tree.
+Consider enabling untracked cache and split index if supported (see
+`git update-index --untracked-cache` and `git update-index
+--split-index`), Otherwise you can use `no` to have `git status`
 return more quickly without showing untracked files.
 +
 The default can be changed using the status.showUntrackedFiles

Documentation/git-update-index.txt

Lines changed: 14 additions & 0 deletions
@@ -170,6 +170,20 @@ may not support it yet.
 	the shared index file. This mode is designed for very large
 	indexes that take a significant amount of time to read or write.

+--untracked-cache::
+--no-untracked-cache::
+	Enable or disable untracked cache extension. This could speed
+	up for commands that involve determining untracked files such
+	as `git status`. The underlying operating system and file
+	system must change `st_mtime` field of a directory if files
+	are added or deleted in that directory.
+
+--force-untracked-cache::
+	For safety, `--untracked-cache` performs tests on the working
+	directory to make sure untracked cache can be used. These
+	tests can take a few seconds. `--force-untracked-cache` can be
+	used to skip the tests.
+
 \--::
 	Do not interpret any more arguments as options.
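
The `st_mtime` requirement described for `--untracked-cache` can be probed with a few system calls. The following is a minimal sketch, not the code from this commit: the helper name `dir_mtime_changes_on_create` and the `/tmp` location are assumptions made for illustration, and the directory is backdated with `utimes()` so that no one-second sleep is needed to observe a change.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of git: returns 1 if creating a file
 * inside a fresh temporary directory changes that directory's st_mtime,
 * 0 if it does not, and -1 if any system call fails.
 */
static int dir_mtime_changes_on_create(void)
{
	char dir[] = "/tmp/mtime-probe-XXXXXX";	/* assumed location */
	char file[sizeof(dir) + 8];		/* room for "/newfile" */
	struct timeval old[2] = { { 1000000000, 0 }, { 1000000000, 0 } };
	struct stat st;
	int fd, ret = -1;

	if (!mkdtemp(dir))
		return -1;
	/* Backdate the directory so a later change is visible at once. */
	if (utimes(dir, old) || stat(dir, &st))
		goto out;
	if (st.st_mtime != old[1].tv_sec)
		goto out;

	snprintf(file, sizeof(file), "%s/newfile", dir);
	fd = open(file, O_CREAT | O_RDWR, 0644);
	if (fd < 0)
		goto out;
	close(fd);

	/* A usable file system must have bumped the directory mtime. */
	if (!stat(dir, &st))
		ret = st.st_mtime != old[1].tv_sec;
	unlink(file);
out:
	rmdir(dir);
	return ret;
}
```

A caller would treat a return value of 1 as "directory mtime reacts to new files". The test added to builtin/update-index.c in this commit checks more cases (new directories, deletions, and that file updates leave the directory mtime alone).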

Documentation/technical/index-format.txt

Lines changed: 62 additions & 0 deletions
@@ -233,3 +233,65 @@ Git index format
   The remaining index entries after replaced ones will be added to the
   final index. These added entries are also sorted by entry name then
   stage.
+
+== Untracked cache
+
+  Untracked cache saves the untracked file list and necessary data to
+  verify the cache. The signature for this extension is { 'U', 'N',
+  'T', 'R' }.
+
+  The extension starts with
+
+  - A sequence of NUL-terminated strings, preceded by the size of the
+    sequence in variable width encoding. Each string describes the
+    environment where the cache can be used.
+
+  - Stat data of $GIT_DIR/info/exclude. See "Index entry" section from
+    ctime field until "file size".
+
+  - Stat data of core.excludesfile
+
+  - 32-bit dir_flags (see struct dir_struct)
+
+  - 160-bit SHA-1 of $GIT_DIR/info/exclude. Null SHA-1 means the file
+    does not exist.
+
+  - 160-bit SHA-1 of core.excludesfile. Null SHA-1 means the file does
+    not exist.
+
+  - NUL-terminated string of per-dir exclude file name. This usually
+    is ".gitignore".
+
+  - The number of following directory blocks, variable width
+    encoding. If this number is zero, the extension ends here with a
+    following NUL.
+
+  - A number of directory blocks in depth-first-search order, each
+    consists of
+
+    - The number of untracked entries, variable width encoding.
+
+    - The number of sub-directory blocks, variable width encoding.
+
+    - The directory name terminated by NUL.
+
+    - A number of untrached file/dir names terminated by NUL.
+
+The remaining data of each directory block is grouped by type:
+
+  - An ewah bitmap, the n-th bit marks whether the n-th directory has
+    valid untracked cache entries.
+
+  - An ewah bitmap, the n-th bit records "check-only" bit of
+    read_directory_recursive() for the n-th directory.
+
+  - An ewah bitmap, the n-th bit indicates whether SHA-1 and stat data
+    is valid for the n-th directory and exists in the next data.
+
+  - An array of stat data. The n-th data corresponds with the n-th
+    "one" bit in the previous ewah bitmap.
+
+  - An array of SHA-1. The n-th SHA-1 corresponds with the n-th "one" bit
+    in the previous ewah bitmap.
+
+  - One NUL.
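
To make the directory block layout above more concrete, here is a rough sketch of reading the start of one block. It assumes that "variable width encoding" means the offset-style varint used for OFS_DELTA pack entries; the text above does not spell that out. The names `decode_varint`, `struct dir_block_header` and `read_dir_block_header` are illustrative only, and the sketch does no bounds or overflow checking.

```c
#include <stdint.h>
#include <string.h>

/*
 * Decode one varint (assumed offset-style encoding) and advance *bufp
 * past it. No overflow or end-of-buffer checks in this sketch.
 */
static uint64_t decode_varint(const unsigned char **bufp)
{
	const unsigned char *buf = *bufp;
	unsigned char c = *buf++;
	uint64_t val = c & 127;

	while (c & 128) {
		val += 1;
		c = *buf++;
		val = (val << 7) + (c & 127);
	}
	*bufp = buf;
	return val;
}

/* Hypothetical view of the first three fields of a directory block. */
struct dir_block_header {
	uint64_t untracked_nr;	/* number of untracked entries */
	uint64_t subdir_nr;	/* number of sub-directory blocks */
	const char *name;	/* NUL-terminated directory name */
};

/*
 * Fill *h from the bytes at p and return a pointer to the first
 * untracked entry name (or the next block if untracked_nr is zero).
 */
static const unsigned char *read_dir_block_header(const unsigned char *p,
						  struct dir_block_header *h)
{
	h->untracked_nr = decode_varint(&p);
	h->subdir_nr = decode_varint(&p);
	h->name = (const char *)p;
	return p + strlen(h->name) + 1;
}
```

Under the assumed encoding, the bytes `02 80 00 's' 'r' 'c' 00` would describe a directory named "src" with 2 untracked entries and 128 sub-directory blocks. A real reader would also have to check every access against the end of the extension.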

Makefile

Lines changed: 1 addition & 0 deletions
@@ -574,6 +574,7 @@ TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-split-index
+TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
 TEST_PROGRAMS_NEED_X += test-genrandom
 TEST_PROGRAMS_NEED_X += test-hashmap
 TEST_PROGRAMS_NEED_X += test-index-version

builtin/commit.c

Lines changed: 3 additions & 2 deletions
@@ -1366,13 +1366,14 @@ int cmd_status(int argc, const char **argv, const char *prefix)
 	refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, &s.pathspec, NULL, NULL);

 	fd = hold_locked_index(&index_lock, 0);
-	if (0 <= fd)
-		update_index_if_able(&the_index, &index_lock);

 	s.is_initial = get_sha1(s.reference, sha1) ? 1 : 0;
 	s.ignore_submodule_arg = ignore_submodule_arg;
 	wt_status_collect(&s);

+	if (0 <= fd)
+		update_index_if_able(&the_index, &index_lock);
+
 	if (s.relative_paths)
 		s.prefix = prefix;

builtin/update-index.c

Lines changed: 188 additions & 0 deletions
@@ -33,6 +33,7 @@ static int mark_valid_only;
 static int mark_skip_worktree_only;
 #define MARK_FLAG 1
 #define UNMARK_FLAG 2
+static struct strbuf mtime_dir = STRBUF_INIT;

 __attribute__((format (printf, 1, 2)))
 static void report(const char *fmt, ...)
@@ -48,6 +49,166 @@ static void report(const char *fmt, ...)
 	va_end(vp);
 }

+static void remove_test_directory(void)
+{
+	if (mtime_dir.len)
+		remove_dir_recursively(&mtime_dir, 0);
+}
+
+static const char *get_mtime_path(const char *path)
+{
+	static struct strbuf sb = STRBUF_INIT;
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s/%s", mtime_dir.buf, path);
+	return sb.buf;
+}
+
+static void xmkdir(const char *path)
+{
+	path = get_mtime_path(path);
+	if (mkdir(path, 0700))
+		die_errno(_("failed to create directory %s"), path);
+}
+
+static int xstat_mtime_dir(struct stat *st)
+{
+	if (stat(mtime_dir.buf, st))
+		die_errno(_("failed to stat %s"), mtime_dir.buf);
+	return 0;
+}
+
+static int create_file(const char *path)
+{
+	int fd;
+	path = get_mtime_path(path);
+	fd = open(path, O_CREAT | O_RDWR, 0644);
+	if (fd < 0)
+		die_errno(_("failed to create file %s"), path);
+	return fd;
+}
+
+static void xunlink(const char *path)
+{
+	path = get_mtime_path(path);
+	if (unlink(path))
+		die_errno(_("failed to delete file %s"), path);
+}
+
+static void xrmdir(const char *path)
+{
+	path = get_mtime_path(path);
+	if (rmdir(path))
+		die_errno(_("failed to delete directory %s"), path);
+}
+
+static void avoid_racy(void)
+{
+	/*
+	 * not use if we could usleep(10) if USE_NSEC is defined. The
+	 * field nsec could be there, but the OS could choose to
+	 * ignore it?
+	 */
+	sleep(1);
+}
+
+static int test_if_untracked_cache_is_supported(void)
+{
+	struct stat st;
+	struct stat_data base;
+	int fd, ret = 0;
+
+	strbuf_addstr(&mtime_dir, "mtime-test-XXXXXX");
+	if (!mkdtemp(mtime_dir.buf))
+		die_errno("Could not make temporary directory");
+
+	fprintf(stderr, _("Testing "));
+	atexit(remove_test_directory);
+	xstat_mtime_dir(&st);
+	fill_stat_data(&base, &st);
+	fputc('.', stderr);
+
+	avoid_racy();
+	fd = create_file("newfile");
+	xstat_mtime_dir(&st);
+	if (!match_stat_data(&base, &st)) {
+		close(fd);
+		fputc('\n', stderr);
+		fprintf_ln(stderr,_("directory stat info does not "
+				    "change after adding a new file"));
+		goto done;
+	}
+	fill_stat_data(&base, &st);
+	fputc('.', stderr);
+
+	avoid_racy();
+	xmkdir("new-dir");
+	xstat_mtime_dir(&st);
+	if (!match_stat_data(&base, &st)) {
+		close(fd);
+		fputc('\n', stderr);
+		fprintf_ln(stderr, _("directory stat info does not change "
+				     "after adding a new directory"));
+		goto done;
+	}
+	fill_stat_data(&base, &st);
+	fputc('.', stderr);
+
+	avoid_racy();
+	write_or_die(fd, "data", 4);
+	close(fd);
+	xstat_mtime_dir(&st);
+	if (match_stat_data(&base, &st)) {
+		fputc('\n', stderr);
+		fprintf_ln(stderr, _("directory stat info changes "
+				     "after updating a file"));
+		goto done;
+	}
+	fputc('.', stderr);
+
+	avoid_racy();
+	close(create_file("new-dir/new"));
+	xstat_mtime_dir(&st);
+	if (match_stat_data(&base, &st)) {
+		fputc('\n', stderr);
+		fprintf_ln(stderr, _("directory stat info changes after "
+				     "adding a file inside subdirectory"));
+		goto done;
+	}
+	fputc('.', stderr);
+
+	avoid_racy();
+	xunlink("newfile");
+	xstat_mtime_dir(&st);
+	if (!match_stat_data(&base, &st)) {
+		fputc('\n', stderr);
+		fprintf_ln(stderr, _("directory stat info does not "
+				     "change after deleting a file"));
+		goto done;
+	}
+	fill_stat_data(&base, &st);
+	fputc('.', stderr);
+
+	avoid_racy();
+	xunlink("new-dir/new");
+	xrmdir("new-dir");
+	xstat_mtime_dir(&st);
+	if (!match_stat_data(&base, &st)) {
+		fputc('\n', stderr);
+		fprintf_ln(stderr, _("directory stat info does not "
+				     "change after deleting a directory"));
+		goto done;
+	}
+
+	if (rmdir(mtime_dir.buf))
+		die_errno(_("failed to delete directory %s"), mtime_dir.buf);
+	fprintf_ln(stderr, _(" OK"));
+	ret = 1;
+
+done:
+	strbuf_release(&mtime_dir);
+	return ret;
+}
+
 static int mark_ce_flags(const char *path, int flag, int mark)
 {
 	int namelen = strlen(path);
@@ -741,6 +902,7 @@ static int reupdate_callback(struct parse_opt_ctx_t *ctx,
 int cmd_update_index(int argc, const char **argv, const char *prefix)
 {
 	int newfd, entries, has_errors = 0, line_termination = '\n';
+	int untracked_cache = -1;
 	int read_from_stdin = 0;
 	int prefix_length = prefix ? strlen(prefix) : 0;
 	int preferred_index_format = 0;
@@ -832,6 +994,10 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			N_("write index in this format")),
 		OPT_BOOL(0, "split-index", &split_index,
 			N_("enable or disable split index")),
+		OPT_BOOL(0, "untracked-cache", &untracked_cache,
+			N_("enable/disable untracked cache")),
+		OPT_SET_INT(0, "force-untracked-cache", &untracked_cache,
+			N_("enable untracked cache without testing the filesystem"), 2),
 		OPT_END()
 	};

@@ -938,6 +1104,28 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		the_index.split_index = NULL;
 		the_index.cache_changed |= SOMETHING_CHANGED;
 	}
+	if (untracked_cache > 0) {
+		struct untracked_cache *uc;
+
+		if (untracked_cache < 2) {
+			setup_work_tree();
+			if (!test_if_untracked_cache_is_supported())
+				return 1;
+		}
+		if (!the_index.untracked) {
+			uc = xcalloc(1, sizeof(*uc));
+			strbuf_init(&uc->ident, 100);
+			uc->exclude_per_dir = ".gitignore";
+			/* should be the same flags used by git-status */
+			uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
+			the_index.untracked = uc;
+		}
+		add_untracked_ident(the_index.untracked);
+		the_index.cache_changed |= UNTRACKED_CHANGED;
+	} else if (!untracked_cache && the_index.untracked) {
+		the_index.untracked = NULL;
+		the_index.cache_changed |= UNTRACKED_CHANGED;
+	}

 	if (active_cache_changed) {
 		if (newfd < 0) {

cache.h

Lines changed: 6 additions & 0 deletions
@@ -297,8 +297,11 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define RESOLVE_UNDO_CHANGED (1 << 4)
 #define CACHE_TREE_CHANGED (1 << 5)
 #define SPLIT_INDEX_ORDERED (1 << 6)
+#define UNTRACKED_CHANGED (1 << 7)

 struct split_index;
+struct untracked_cache;
+
 struct index_state {
 	struct cache_entry **cache;
 	unsigned int version;
@@ -312,6 +315,7 @@ struct index_state {
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
 	unsigned char sha1[20];
+	struct untracked_cache *untracked;
 };

 extern struct index_state the_index;
@@ -563,6 +567,8 @@ extern void fill_stat_data(struct stat_data *sd, struct stat *st);
  * INODE_CHANGED, and DATA_CHANGED.
  */
 extern int match_stat_data(const struct stat_data *sd, struct stat *st);
+extern int match_stat_data_racy(const struct index_state *istate,
+				const struct stat_data *sd, struct stat *st);

 extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
568574

compat/mingw.c

Lines changed: 11 additions & 0 deletions
@@ -2128,3 +2128,14 @@ void mingw_startup()
 	/* initialize Unicode console */
 	winansi_init();
 }
+
+int uname(struct utsname *buf)
+{
+	DWORD v = GetVersion();
+	memset(buf, 0, sizeof(*buf));
+	strcpy(buf->sysname, "Windows");
+	sprintf(buf->release, "%u.%u", v & 0xff, (v >> 8) & 0xff);
+	/* assuming NT variants only.. */
+	sprintf(buf->version, "%u", (v >> 16) & 0x7fff);
+	return 0;
+}
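
The bit arithmetic in the `uname()` shim can be tried without a Windows toolchain. The sketch below applies the same masks and shifts to a plain 32-bit value; `struct win_version` and `decode_win_version` are hypothetical names used only here, and the sample value in the note that follows is constructed by hand rather than read from a real system.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the two strings the shim derives. */
struct win_version {
	char release[16];	/* "major.minor" */
	char version[16];	/* build number */
};

/*
 * Split a GetVersion()-style value the way the shim does: low byte is
 * the major version, next byte the minor version, and the low 15 bits
 * of the high word the build number (NT variants assumed).
 */
static void decode_win_version(uint32_t v, struct win_version *out)
{
	snprintf(out->release, sizeof(out->release), "%u.%u",
		 (unsigned)(v & 0xff), (unsigned)((v >> 8) & 0xff));
	snprintf(out->version, sizeof(out->version), "%u",
		 (unsigned)((v >> 16) & 0x7fff));
}
```

For example, the value `0x1DB10106` decodes to release "6.1" and version "7601". Using `snprintf` with explicit buffer sizes is a choice made for this sketch; the shim itself uses `sprintf` into the `utsname` fields.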
