Commit 9dd5245

peff authored and gitster committed
grep: pre-load userdiff drivers when threaded
The low-level grep_source code will automatically load the userdiff driver to see whether a file is binary. However, when we are threaded, it will load the drivers in a non-deterministic order, handling each one as its assigned thread happens to be scheduled.

Meanwhile, the attribute lookup code (which underlies the userdiff driver lookup) is optimized to handle paths in sequential order (because they tend to share the same gitattributes files). Multi-threading the lookups destroys the locality and makes this optimization less effective.

We can fix this by pre-loading the userdiff driver in the main thread, before we hand off the file to a worker thread. My best-of-five for "git grep foo" on the linux-2.6 repository went from:

  real 0m0.391s
  user 0m1.708s
  sys 0m0.584s

to:

  real 0m0.360s
  user 0m1.576s
  sys 0m0.572s

Not a huge speedup, but it's quite easy to do. The only trick is that we shouldn't perform this optimization if "-a" was used, in which case we won't bother checking whether the files are binary at all.

Signed-off-by: Jeff King <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
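To make the locality argument concrete, here is a minimal sketch of why lookup order matters. It is not git's attribute code: `struct attr_cache`, `dir_of`, `lookup_attrs`, and `count_misses` are hypothetical names, and the one-entry "last directory" cache is only an assumed stand-in for the real optimization. It counts how often a lookup has to leave the directory it last visited.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical one-entry cache: remembers the directory of the last lookup. */
struct attr_cache {
	char last_dir[256];
	int valid;
	int misses;
};

/* Copy the directory part of path (everything before the last '/') to out. */
static void dir_of(const char *path, char *out, size_t outsz)
{
	const char *slash = strrchr(path, '/');
	size_t len = slash ? (size_t)(slash - path) : 0;

	if (len >= outsz)
		len = outsz - 1;
	memcpy(out, path, len);
	out[len] = '\0';
}

/* A lookup is cheap when the path shares a directory with the previous one. */
static void lookup_attrs(struct attr_cache *cache, const char *path)
{
	char dir[sizeof(cache->last_dir)];

	dir_of(path, dir, sizeof(dir));
	if (!cache->valid || strcmp(dir, cache->last_dir)) {
		cache->misses++;	/* would have to re-read attribute files */
		strcpy(cache->last_dir, dir);
		cache->valid = 1;
	}
}

/* Run the lookups in the given order and report how many missed the cache. */
static int count_misses(const char **paths, size_t n)
{
	struct attr_cache cache = { "", 0, 0 };
	size_t i;

	for (i = 0; i < n; i++)
		lookup_attrs(&cache, paths[i]);
	return cache.misses;
}
```

With the paths "fs/a.c", "fs/b.c", "mm/a.c", "mm/b.c" in that order, `count_misses` returns 2; with the same paths interleaved as "fs/a.c", "mm/a.c", "fs/b.c", "mm/b.c" it returns 4. Worker threads scheduled in arbitrary order behave more like the interleaved case.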
1 parent 0826579 commit 9dd5245

1 file changed, +6 -4 lines changed

builtin/grep.c

Lines changed: 6 additions & 4 deletions
@@ -85,8 +85,8 @@ static pthread_cond_t cond_result;
 
 static int skip_first_line;
 
-static void add_work(enum grep_source_type type, const char *name,
-		     const void *id)
+static void add_work(struct grep_opt *opt, enum grep_source_type type,
+		     const char *name, const void *id)
 {
 	grep_lock();
 
@@ -95,6 +95,8 @@ static void add_work(enum grep_source_type type, const char *name,
 	}
 
 	grep_source_init(&todo[todo_end].source, type, name, id);
+	if (opt->binary != GREP_BINARY_TEXT)
+		grep_source_load_driver(&todo[todo_end].source);
 	todo[todo_end].done = 0;
 	strbuf_reset(&todo[todo_end].out);
 	todo_end = (todo_end + 1) % ARRAY_SIZE(todo);
@@ -333,7 +335,7 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1,
 
 #ifndef NO_PTHREADS
 	if (use_threads) {
-		add_work(GREP_SOURCE_SHA1, pathbuf.buf, sha1);
+		add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, sha1);
 		strbuf_release(&pathbuf);
 		return 0;
 	} else
@@ -362,7 +364,7 @@ static int grep_file(struct grep_opt *opt, const char *filename)
 
 #ifndef NO_PTHREADS
 	if (use_threads) {
-		add_work(GREP_SOURCE_FILE, buf.buf, filename);
+		add_work(opt, GREP_SOURCE_FILE, buf.buf, filename);
 		strbuf_release(&buf);
 		return 0;
 	} else
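
The shape of the change to add_work() can be seen in isolation in the sketch below. It is a simplified stand-in and not the code in builtin/grep.c: `struct work_item`, `enum binary_mode`, `load_driver`, and `TODO_SIZE` are hypothetical, the suffix check is an assumed placeholder for the real driver lookup, and the locking and queue-full wait are omitted. The point it illustrates is that the producer, which sees paths in order, does the order-sensitive lookup before the slot becomes visible to workers, and skips it entirely in text mode.

```c
#include <stddef.h>
#include <string.h>

#define TODO_SIZE 8

/* Hypothetical stand-ins for the grep option and work-queue types. */
enum binary_mode { BINARY_DEFAULT, BINARY_TEXT };

struct work_item {
	const char *name;
	const char *driver;	/* NULL until a driver has been loaded */
	int done;
};

static struct work_item todo[TODO_SIZE];
static size_t todo_end;

/* Placeholder lookup: a real one would consult gitattributes for the path. */
static const char *load_driver(const char *name)
{
	const char *dot = strrchr(name, '.');

	if (dot && !strcmp(dot, ".png"))
		return "binary";
	return "default";
}

/*
 * Called from the main thread only. The driver is loaded here, in path
 * order, unless every file is to be treated as text (the "-a" case).
 */
static struct work_item *add_work(enum binary_mode mode, const char *name)
{
	struct work_item *w = &todo[todo_end];

	w->name = name;
	w->driver = NULL;
	if (mode != BINARY_TEXT)
		w->driver = load_driver(name);
	w->done = 0;
	todo_end = (todo_end + 1) % TODO_SIZE;
	return w;
}
```

A worker that later picks up the item finds `driver` already set and never triggers the lookup itself; in text mode `driver` stays NULL because nothing will ask whether the file is binary.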
