Commit 163ee5e

derrickstolee authored and gitster committed
sha1_file: use strbuf_add() instead of strbuf_addf()
Replace use of strbuf_addf() with strbuf_add() when enumerating loose objects in for_each_file_in_obj_subdir(). Since we already check the length and hex-values of the string before consuming the path, we can prevent extra computation by using the lower-level method.

One consumer of for_each_file_in_obj_subdir() is the abbreviation code. OID abbreviations use a cached list of loose objects (per object subdirectory) to make repeated queries fast, but there is significant cache load time when there are many loose objects.

Most repositories do not have many loose objects before repacking, but in the GVFS case the repos can grow to have millions of loose objects. Profiling 'git log' performance in GitForWindows on a GVFS-enabled repo with ~2.5 million loose objects revealed 12% of the CPU time was spent in strbuf_addf().

Add a new performance test to p4211-line-log.sh that is more sensitive to this cache-loading. By limiting to 1000 commits, we more closely resemble user wait time when reading history into a pager.

For a copy of the Linux repo with two ~512 MB packfiles and ~572K loose objects, running 'git log --oneline --parents --raw -1000' had the following performance:

    HEAD~1            HEAD
    ----------------------------------------
    7.70(7.15+0.54)   7.44(7.09+0.29) -3.4%

Signed-off-by: Derrick Stolee <[email protected]>
Reviewed-by: Jeff King <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
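
The cost difference described in the message can be illustrated with a minimal sketch. It does not use Git's strbuf: `struct buf`, `buf_addf()`, `buf_add()` and `buf_setlen()` are hypothetical stand-ins with a fixed capacity, assumed here only to contrast a formatted append (which has to interpret a format string on every call) with a raw append of a name whose length is already known.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer; Git's real strbuf grows dynamically. */
struct buf {
	char data[256];
	size_t len;
};

/* Formatted append: vsnprintf() must parse the format string on every call. */
static void buf_addf(struct buf *b, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, sizeof(b->data) - b->len, fmt, ap);
	va_end(ap);
	/* This sketch simply refuses output that would not fit. */
	assert(n >= 0 && (size_t)n < sizeof(b->data) - b->len);
	b->len += (size_t)n;
}

/* Raw append: one bounds check and one memcpy() for a known length. */
static void buf_add(struct buf *b, const void *src, size_t n)
{
	assert(n < sizeof(b->data) - b->len);
	memcpy(b->data + b->len, src, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Truncate back to a previously recorded length, e.g. the directory prefix. */
static void buf_setlen(struct buf *b, size_t len)
{
	assert(len <= b->len);
	b->len = len;
	b->data[len] = '\0';
}
```

Both append functions produce the same bytes for a plain name. The raw variant skips the format parsing, which is the per-entry work that becomes noticeable when a directory scan visits millions of entries.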
1 parent 1a4e40a commit 163ee5e

2 files changed: +11 -5 lines changed

sha1_file.c

Lines changed: 7 additions & 5 deletions
@@ -1903,7 +1903,6 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
 	origlen = path->len;
 	strbuf_complete(path, '/');
 	strbuf_addf(path, "%02x", subdir_nr);
-	baselen = path->len;

 	dir = opendir(path->buf);
 	if (!dir) {
@@ -1914,15 +1913,18 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
 	}

 	oid.hash[0] = subdir_nr;
+	strbuf_addch(path, '/');
+	baselen = path->len;

 	while ((de = readdir(dir))) {
+		size_t namelen;
 		if (is_dot_or_dotdot(de->d_name))
 			continue;

+		namelen = strlen(de->d_name);
 		strbuf_setlen(path, baselen);
-		strbuf_addf(path, "/%s", de->d_name);
-
-		if (strlen(de->d_name) == GIT_SHA1_HEXSZ - 2 &&
+		strbuf_add(path, de->d_name, namelen);
+		if (namelen == GIT_SHA1_HEXSZ - 2 &&
 		    !hex_to_bytes(oid.hash + 1, de->d_name,
 				  GIT_SHA1_RAWSZ - 1)) {
 			if (obj_cb) {
@@ -1941,7 +1943,7 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
 	}
 	closedir(dir);

-	strbuf_setlen(path, baselen);
+	strbuf_setlen(path, baselen - 1);
 	if (!r && subdir_cb)
 		r = subdir_cb(subdir_nr, path->buf, data);
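
The check that decides whether a directory entry is a loose object (a name of exactly GIT_SHA1_HEXSZ - 2 characters that decodes as hex) can be sketched in isolation. The following is an illustrative approximation, not Git's code: `HEXSZ`, `RAWSZ`, `hexval()`, `hex_to_bytes_sketch()` and `loose_name_to_hash()` are hypothetical names, and the only behavior taken from the diff is that the decoder returns 0 on success.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEXSZ 40 /* stands in for GIT_SHA1_HEXSZ */
#define RAWSZ 20 /* stands in for GIT_SHA1_RAWSZ */

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode len bytes from 2 * len hex digits.
 * Returns 0 on success and -1 on the first invalid digit; a NUL byte
 * counts as invalid, so the loop never reads past the end of the string.
 */
static int hex_to_bytes_sketch(unsigned char *out, const char *hex, size_t len)
{
	for (; len; len--, out++, hex += 2) {
		int hi = hexval(hex[0]);
		int lo = hi < 0 ? -1 : hexval(hex[1]);

		if (lo < 0)
			return -1;
		*out = (unsigned char)((hi << 4) | lo);
	}
	return 0;
}

/*
 * Fill hash[0] from the subdirectory number and hash[1..] from the
 * file name. Returns 0 if the name looks like a loose object, else -1.
 */
static int loose_name_to_hash(unsigned int subdir_nr, const char *name,
			      unsigned char *hash)
{
	size_t namelen = strlen(name); /* computed once, as in the patch */

	assert(subdir_nr < 256);
	hash[0] = (unsigned char)subdir_nr;
	if (namelen != HEXSZ - 2)
		return -1;
	return hex_to_bytes_sketch(hash + 1, name, RAWSZ - 1);
}
```

Because the length is computed once up front, the same value can serve both the append and the validity check, which mirrors how the patch reuses `namelen`.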

t/perf/p4211-line-log.sh

Lines changed: 4 additions & 0 deletions
@@ -35,4 +35,8 @@ test_perf 'git log --oneline --raw --parents' '
 	git log --oneline --raw --parents >/dev/null
 '

+test_perf 'git log --oneline --raw --parents -1000' '
+	git log --oneline --raw --parents -1000 >/dev/null
+'
+
 test_done
