Skip to content

Commit aab7bea

Browse files
ttaylorrgitster
authored andcommitted
t7703: demonstrate object corruption with pack.packSizeLimit
When doing a `--geometric=<d>` repack, `git repack` determines a splitting point among packs ordered by their object count such that: - each pack above the split has at least `<d>` times as many objects as the next-largest pack by object count, and - the first pack above the split has at least `<d>` times as many object as the sum of all packs below the split line combined `git repack` then creates a pack containing all of the objects contained in packs below the split line by running `git pack-objects --stdin-packs` underneath. Once packs are moved into place, then any packs below the split line are removed, since their objects were just combined into a new pack. But `git repack` tries to be careful to avoid removing a pack that it just wrote, by checking: struct packed_git *p = geometry->pack[i]; if (string_list_has_string(&names, hash_to_hex(p->hash))) continue; in the `delete_redundant` and `geometric` conditional towards the end of `cmd_repack`. But it's possible to trick `git repack` into not recognizing a pack that it just wrote when `names` is out-of-order (which violates `string_list_has_string()`'s assumption that the list is sorted and thus binary search-able). When this happens in just the right circumstances, it is possible to remove a pack that we just wrote, leading to object corruption. Luckily, this is quite difficult to provoke in practice (for a couple of reasons): - we ordinarily write just one pack, so `names` usually contains just one entry, and is thus sorted - when we do write more than one pack (e.g., due to `--max-pack-size`) we have to: (a) write a pack identical to one that already exists, (b) have that pack be below the split line, and (c) have the set of packs written by `pack-objects` occur in an order which tricks `string_list_has_string()`. Demonstrate the above scenario in a failing test, which causes `git repack --geometric` to write a pack which occurs below the split line, _and_ fail to recognize that it wrote that pack. The following patch will fix this bug. Signed-off-by: Taylor Blau <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 4b5a808 commit aab7bea

File tree

1 file changed

+47
-0
lines changed

1 file changed

+47
-0
lines changed

t/t7703-repack-geometric.sh

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ test_description='git repack --geometric works correctly'
77
GIT_TEST_MULTI_PACK_INDEX=0
88

99
objdir=.git/objects
10+
packdir=$objdir/pack
1011
midx=$objdir/pack/multi-pack-index
1112

1213
test_expect_success '--geometric with no packs' '
@@ -230,4 +231,50 @@ test_expect_success '--geometric chooses largest MIDX preferred pack' '
230231
)
231232
'
232233

234+
test_expect_failure '--geometric with pack.packSizeLimit' '
235+
git init pack-rewrite &&
236+
test_when_finished "rm -fr pack-rewrite" &&
237+
(
238+
cd pack-rewrite &&
239+
240+
test-tool genrandom foo 1048576 >foo &&
241+
test-tool genrandom bar 1048576 >bar &&
242+
243+
git add foo bar &&
244+
test_tick &&
245+
git commit -m base &&
246+
247+
git rev-parse HEAD:foo HEAD:bar >p1.objects &&
248+
git rev-parse HEAD HEAD^{tree} >p2.objects &&
249+
250+
# These two packs each contain two objects, so the following
251+
# `--geometric` repack will try to combine them.
252+
p1="$(git pack-objects $packdir/pack <p1.objects)" &&
253+
p2="$(git pack-objects $packdir/pack <p2.objects)" &&
254+
255+
# Remove any loose objects in packs, since we do not want extra
256+
# copies around (which would mask over potential object
257+
# corruption issues).
258+
git prune-packed &&
259+
260+
# Both p1 and p2 will be rolled up, but pack-objects will write
261+
# three packs:
262+
#
263+
# - one containing object "foo",
264+
# - another containing object "bar",
265+
# - a final pack containing the commit and tree objects
266+
# (identical to p2 above)
267+
git repack --geometric 2 -d --max-pack-size=1048576 &&
268+
269+
# Ensure `repack` can detect that the third pack it wrote
270+
# (containing just the tree and commit objects) was identical to
271+
# one that was below the geometric split, so that we can save it
272+
# from deletion.
273+
#
274+
# If `repack` fails to do that, we will incorrectly delete p2,
275+
# causing object corruption.
276+
git fsck
277+
)
278+
'
279+
233280
test_done

0 commit comments

Comments
 (0)