Skip to content

Commit 125c326

Browse files
ttaylorrgitster
authored andcommitted
builtin/pack-objects.c: translate bit positions during pack-reuse
When reusing chunks verbatim from an existing source pack, the function write_reused_pack() first attempts to reuse whole words (via the function `write_reused_pack_verbatim()`), and then individual bits (via `write_reused_pack_one()`). In the non-MIDX case, all of this code works fine. Likewise, in the MIDX case, processing bits individually from the first (preferred) pack works fine. However, processing subsequent packs in the MIDX case is broken when there are duplicate objects among the set of MIDX'd packs. This is because we treat the individual bit positions as valid pack positions within the source pack(s), which does not account for gaps in the source pack, like we see when the MIDX must break ties between duplicate objects which appear in multiple packs. The broken code looks like: for (; i < reuse_packfile_bitmap->word_alloc; i++) { for (offset = 0; offset < BITS_IN_EWORD, offset++) { /* ... */ write_reused_pack_one(reuse_packfile->p, pos + offset - reuse_packfile->bitmap_pos, f, pack_start, &w_curs); } } , where the second argument is incorrect and does not account for gaps. Instead, make sure that we translate bit positions in the MIDX's pseudo-pack order to pack positions in the respective source packs by: - Translating the bit position (pseudo-pack order) to a MIDX position (lexical order). - Use the MIDX position to obtain the offset at which the given object occurs in the source pack. - Then translate that offset back into a pack relative position within the source pack by calling offset_to_pack_pos(). After doing this, then we can safely use the result as a pack position. Note that when doing single-pack reuse, as well as reusing objects from the MIDX's preferred pack, such translation is not necessary, since either ties are broken in favor of the preferred pack, or there are no ties to break at all (in the case of non-MIDX bitmaps). Failing to do this can result in strange failure modes. One example that can occur when misinterpreting bits in the above fashion is that Git thinks it's supposed to send a delta that the caller does not want. Under this (incorrect) assumption, we try to look up the delta's base (so that we can patch any OFS_DELTAs if necessary). We do this using find_reused_offset(). But if we try and call that function for an offset belonging to an object we did not send, we'll get back garbage. This can result in us computing a negative fixup value, which results in memory corruption when trying to write the (patched) OFS_DELTA header. Signed-off-by: Taylor Blau <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 41cd4b4 commit 125c326

File tree

2 files changed

+63
-12
lines changed

2 files changed

+63
-12
lines changed

builtin/pack-objects.c

Lines changed: 36 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1191,6 +1191,7 @@ static void write_reused_pack(struct bitmapped_pack *reuse_packfile,
11911191
size_t pos = (i * BITS_IN_EWORD);
11921192

11931193
for (offset = 0; offset < BITS_IN_EWORD; ++offset) {
1194+
uint32_t pack_pos;
11941195
if ((word >> offset) == 0)
11951196
break;
11961197

@@ -1199,14 +1200,41 @@ static void write_reused_pack(struct bitmapped_pack *reuse_packfile,
11991200
continue;
12001201
if (pos + offset >= reuse_packfile->bitmap_pos + reuse_packfile->bitmap_nr)
12011202
goto done;
1202-
/*
1203-
* Can use bit positions directly, even for MIDX
1204-
* bitmaps. See comment in try_partial_reuse()
1205-
* for why.
1206-
*/
1207-
write_reused_pack_one(reuse_packfile->p,
1208-
pos + offset - reuse_packfile->bitmap_pos,
1209-
f, pack_start, &w_curs);
1203+
1204+
if (reuse_packfile->bitmap_pos) {
1205+
/*
1206+
* When doing multi-pack reuse on a
1207+
* non-preferred pack, translate bit positions
1208+
* from the MIDX pseudo-pack order back to their
1209+
* pack-relative positions before attempting
1210+
* reuse.
1211+
*/
1212+
struct multi_pack_index *m = reuse_packfile->from_midx;
1213+
uint32_t midx_pos;
1214+
off_t pack_ofs;
1215+
1216+
if (!m)
1217+
BUG("non-zero bitmap position without MIDX");
1218+
1219+
midx_pos = pack_pos_to_midx(m, pos + offset);
1220+
pack_ofs = nth_midxed_offset(m, midx_pos);
1221+
1222+
if (offset_to_pack_pos(reuse_packfile->p,
1223+
pack_ofs, &pack_pos) < 0)
1224+
BUG("could not find expected object at offset %"PRIuMAX" in pack %s",
1225+
(uintmax_t)pack_ofs,
1226+
pack_basename(reuse_packfile->p));
1227+
} else {
1228+
/*
1229+
* Can use bit positions directly, even for MIDX
1230+
* bitmaps. See comment in try_partial_reuse()
1231+
* for why.
1232+
*/
1233+
pack_pos = pos + offset;
1234+
}
1235+
1236+
write_reused_pack_one(reuse_packfile->p, pack_pos, f,
1237+
pack_start, &w_curs);
12101238
display_progress(progress_state, ++written);
12111239
}
12121240
}

t/t5332-multi-pack-reuse.sh

Lines changed: 27 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,7 @@ test_expect_success 'multi-pack reuse is disabled by default' '
6969
test_pack_objects_reused_all 3 1
7070
'
7171

72-
test_expect_failure 'feature.experimental implies multi-pack reuse' '
72+
test_expect_success 'feature.experimental implies multi-pack reuse' '
7373
test_config feature.experimental true &&
7474
7575
test_pack_objects_reused_all 6 2
@@ -86,7 +86,7 @@ test_expect_success 'enable multi-pack reuse' '
8686
git config pack.allowPackReuse multi
8787
'
8888

89-
test_expect_failure 'reuse all objects from subset of bitmapped packs' '
89+
test_expect_success 'reuse all objects from subset of bitmapped packs' '
9090
test_commit C &&
9191
git repack -d &&
9292
@@ -100,7 +100,7 @@ test_expect_failure 'reuse all objects from subset of bitmapped packs' '
100100
test_pack_objects_reused 6 2 <in
101101
'
102102

103-
test_expect_failure 'reuse all objects from all packs' '
103+
test_expect_success 'reuse all objects from all packs' '
104104
test_pack_objects_reused_all 9 3
105105
'
106106

@@ -194,7 +194,7 @@ test_expect_success 'omit delta with uninteresting base (same pack)' '
194194
test_pack_objects_reused 3 1 <in
195195
'
196196

197-
test_expect_failure 'omit delta from uninteresting base (cross pack)' '
197+
test_expect_success 'omit delta from uninteresting base (cross pack)' '
198198
cat >in <<-EOF &&
199199
$(git rev-parse $base)
200200
^$(git rev-parse $delta)
@@ -236,4 +236,27 @@ test_expect_success 'non-omitted delta in MIDX preferred pack' '
236236
test_pack_objects_reused_all $(wc -l <expect) 1
237237
'
238238

239+
test_expect_success 'duplicate objects' '
240+
git init duplicate-objects &&
241+
(
242+
cd duplicate-objects &&
243+
244+
git config pack.allowPackReuse multi &&
245+
246+
test_commit base &&
247+
248+
git repack -a &&
249+
250+
git rev-parse HEAD^{tree} >in &&
251+
p="$(git pack-objects $packdir/pack <in)" &&
252+
253+
git multi-pack-index write --bitmap --preferred-pack=pack-$p.idx &&
254+
255+
objects_nr="$(git rev-list --count --all --objects)" &&
256+
packs_nr="$(find $packdir -type f -name "pack-*.pack" | wc -l)" &&
257+
258+
test_pack_objects_reused_all $objects_nr $packs_nr
259+
)
260+
'
261+
239262
test_done

0 commit comments

Comments
 (0)