Skip to content

Commit 0bfb127

Browse files
pks-tgitster
authored andcommitted
packfile: avoid access(3p) calls for missing packs
The function `add_packed_git()` adds a packfile to the known list of packs in case it exists. In case the packfile doesn't look like a pack though, or in case it doesn't exist, the function will simply return a `NULL` pointer. The ordering in which those checks are done is somewhat curious though: we first perform a couple of access(3p) syscalls for auxiliary files like ".keep" before we determine whether the ".pack" file itself exists. And if we see that the ".pack" file does not exist, we bail out and thus effectively discard all the information. This means that the access(3p) calls were a complete waste of time. The reason why we do this is likely because we reuse `p->pack_name` to derive the other files' names, as well, so that we only have to allocate this buffer once. As such, we have to compute the packfile name last so that it doesn't get overwritten by the other names. All of this likely doesn't matter in the general case: Git shouldn't end up looking too many nonexistent packfiles anyway. But there are edge cases where it _does_ matter. One such edge case that we have hit in our production systems was a stale "multi-pack-index" file that contained references to nonexistent packfiles. As there is no negative lookup cache this causes us to repeatedly look up the same packfiles via the multi-pack index, only to realize that they don't exist. This translated into hundreds of thousands of syscalls to access(3p) and stat(3p). While the issue was entirely self-made because the multi-pack index should have been regenerated, we can still reduce the number of syscalls by 75% in the case of nonexistent packfiles by reordering these calls. This requires us to restore the final packfile name after the access(3p) calls. But given that this is a mere memory copy it is very unlikely to matter in comparison to the four syscalls we performed beforehand. Another mitigation would be to introduce a negative lookup cache so that we don't try to add missing packfiles repeatedly. But that refactoring would be a bit more involved for dubious gains, so for now the author has decided against it. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent cb96e16 commit 0bfb127

File tree

1 file changed

+7
-4
lines changed

1 file changed

+7
-4
lines changed

packfile.c

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -737,6 +737,12 @@ struct packed_git *add_packed_git(struct repository *r, const char *path,
737737
p = alloc_packed_git(r, alloc);
738738
memcpy(p->pack_name, path, path_len);
739739

740+
xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
741+
if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
742+
free(p);
743+
return NULL;
744+
}
745+
740746
xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
741747
if (!access(p->pack_name, F_OK))
742748
p->pack_keep = 1;
@@ -749,11 +755,8 @@ struct packed_git *add_packed_git(struct repository *r, const char *path,
749755
if (!access(p->pack_name, F_OK))
750756
p->is_cruft = 1;
751757

758+
/* Restore the final packfile name. */
752759
xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
753-
if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
754-
free(p);
755-
return NULL;
756-
}
757760

758761
/* ok, it looks sane as far as we can check without
759762
* actually mapping the pack file.

0 commit comments

Comments
 (0)