Skip to content

Commit 8b5763e

Browse files
peffttaylorr
authored andcommitted
midx: avoid duplicate packed_git entries
When we scan a pack directory to load the idx entries we find into the packed_git list, we skip any of them that are contained in a midx. We then load them later lazily if we actually need to access the corresponding pack, referencing them both from the midx struct and the packed_git list. The lazy-load in the midx code checks to see if the midx already mentions the pack, but doesn't otherwise check the packed_git list. This makes sense, since we should have added any pack to both lists. But there's a loophole! If we call close_object_store(), that frees the midx entirely, but _not_ the packed_git structs, which we must keep around for Reasons[1]. If we then try to look up more objects, we'll auto-load the midx again, which won't realize that we've already loaded those packs, and will create duplicate entries in the packed_git list. This is possibly inefficient, because it means we may open and map the pack redundantly. But it can also lead to weird user-visible behavior. The case I found is in "git repack", which closes and reopens the midx after repacking and then calls update_server_info(). We end up writing the duplicate entries into objects/info/packs. We could obviously de-dup them while writing that file, but it seems like a violation of more core assumptions that we end up with these duplicate structs at all. We can avoid the duplicates reasonably efficiently by checking their names in the pack_map hash. This annoyingly does require a little more than a straight hash lookup due to the naming conventions, but it should only happen when we are about to actually open a pack. I don't think one extra malloc will be noticeable there. [1] I'm not entirely sure of all the details, except that we generally assume the packed_git structs never go away. We noted this restriction in the comment added by 6f1e939 (object: fix leaking packfiles when closing object store, 2024-08-08), but it's somewhat vague. At any rate, if you try freeing the structs in close_object_store(), you can observe segfaults all over the test suite. So it might be fixable, but it's not trivial. Signed-off-by: Jeff King <[email protected]> Signed-off-by: Taylor Blau <[email protected]>
1 parent 6a11438 commit 8b5763e

File tree

2 files changed

+25
-3
lines changed

2 files changed

+25
-3
lines changed

midx.c

Lines changed: 17 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -445,6 +445,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
445445
uint32_t pack_int_id)
446446
{
447447
struct strbuf pack_name = STRBUF_INIT;
448+
struct strbuf key = STRBUF_INIT;
448449
struct packed_git *p;
449450

450451
pack_int_id = midx_for_pack(&m, pack_int_id);
@@ -455,16 +456,29 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
455456
strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
456457
m->pack_names[pack_int_id]);
457458

458-
p = add_packed_git(pack_name.buf, pack_name.len, m->local);
459+
/* pack_map holds the ".pack" name, but we have the .idx */
460+
strbuf_addbuf(&key, &pack_name);
461+
strbuf_strip_suffix(&key, ".idx");
462+
strbuf_addstr(&key, ".pack");
463+
p = hashmap_get_entry_from_hash(&r->objects->pack_map,
464+
strhash(key.buf), key.buf,
465+
struct packed_git, packmap_ent);
466+
if (!p) {
467+
p = add_packed_git(pack_name.buf, pack_name.len, m->local);
468+
if (p) {
469+
install_packed_git(r, p);
470+
list_add_tail(&p->mru, &r->objects->packed_git_mru);
471+
}
472+
}
473+
459474
strbuf_release(&pack_name);
475+
strbuf_release(&key);
460476

461477
if (!p)
462478
return 1;
463479

464480
p->multi_pack_index = 1;
465481
m->packs[pack_int_id] = p;
466-
install_packed_git(r, p);
467-
list_add_tail(&p->mru, &r->objects->packed_git_mru);
468482

469483
return 0;
470484
}

t/t5200-update-server-info.sh

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,4 +39,12 @@ test_expect_success 'info/refs updates when changes are made' '
3939
! test_cmp a b
4040
'
4141

42+
test_expect_success 'midx does not create duplicate pack entries' '
43+
git repack -d --write-midx &&
44+
git repack -d &&
45+
grep ^P .git/objects/info/packs >packs &&
46+
uniq -d <packs >dups &&
47+
test_must_be_empty dups
48+
'
49+
4250
test_done

0 commit comments

Comments
 (0)