Commit 1eb22c7

derrickstolee authored and gitster committed
multi-pack-index: repack batches below --batch-size
The --batch-size=<size> option of 'git multi-pack-index repack' is intended to limit the amount of work done by the repack. In the case of a large repository, this command should repack a number of small pack-files but leave the large pack-files alone. Most often, the repository has one large pack-file from a 'git clone' operation and a number of smaller pack-files from incremental 'git fetch' operations.

The issue with '--batch-size' is that it also _prevents_ the repack from happening if the expected size of the resulting pack-file is too small. This was intended as a way to avoid frequent churn of small pack-files, but it has mostly caused confusion when a repository is of "medium" size. That is, not enormous like the Windows OS repository, but also not so small that this incremental repack isn't valuable.

The solution presented here is to collect pack-files for repack if their expected size is smaller than the batch-size parameter until either the total expected size exceeds the batch-size or all pack-files are considered. If there are at least two pack-files, then these are combined to a new pack-file whose size should not be too much larger than the batch-size.

This new strategy should succeed in keeping the number of pack-files small in these "medium" size repositories. The concern about churn is likely not interesting, as the real control over that is the frequency with which the repack command is run.

Signed-off-by: Derrick Stolee <[email protected]>
Reviewed-by: Taylor Blau <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
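The selection strategy described in the message can be sketched in C as follows. This is an illustration only, not the code in midx.c: the function names `select_batch`, `should_repack_old`, and `should_repack_new` are hypothetical, the expected sizes are taken as given, and the packs are visited in array order, which is an assumption. The special case `--batch-size=0` is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from midx.c): walk the packs in array order and
 * batch every pack whose expected size is below batch_size, stopping once
 * the running total reaches batch_size or all packs have been considered.
 * Returns the number of packs selected; the total goes to *total_out.
 */
size_t select_batch(const uint64_t *expected, size_t nr,
		    uint64_t batch_size, uint64_t *total_out)
{
	uint64_t total = 0;
	size_t selected = 0;

	assert(total_out != NULL);

	for (size_t i = 0; i < nr && total < batch_size; i++) {
		if (expected[i] >= batch_size)
			continue; /* large packs are left alone */
		total += expected[i];
		selected++;
	}

	*total_out = total;
	return selected;
}

/* Old rule: the batch must also reach batch_size, else nothing happens. */
int should_repack_old(size_t selected, uint64_t total, uint64_t batch_size)
{
	return !(total < batch_size || selected < 2);
}

/* New rule: any batch of two or more packs is repacked. */
int should_repack_new(size_t selected)
{
	return selected >= 2;
}
```

With made-up expected sizes of 100, 200, 300, and 5000000 bytes and a batch size of 2000000, the sketch selects the three small packs for a total of 600 bytes. The old rule declines to repack because 600 is below the batch size, while the new rule repacks because more than one pack was selected.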
1 parent 4f0a8be commit 1eb22c7

3 files changed: +25 -6 lines changed


Documentation/git-multi-pack-index.txt

Lines changed: 6 additions & 5 deletions
@@ -51,11 +51,12 @@ repack::
 	multi-pack-index, then divide by the total number of objects in
 	the pack and multiply by the pack size. We select packs with
 	expected size below the batch size until the set of packs have
-	total expected size at least the batch size. If the total size
-	does not reach the batch size, then do nothing. If a new pack-
-	file is created, rewrite the multi-pack-index to reference the
-	new pack-file. A later run of 'git multi-pack-index expire' will
-	delete the pack-files that were part of this batch.
+	total expected size at least the batch size, or all pack-files
+	are considered. If only one pack-file is selected, then do
+	nothing. If a new pack-file is created, rewrite the
+	multi-pack-index to reference the new pack-file. A later run of
+	'git multi-pack-index expire' will delete the pack-files that
+	were part of this batch.
 +
 If `repack.packKeptObjects` is `false`, then any pack-files with an
 associated `.keep` file will not be selected for the batch to repack.
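
The "expected size" arithmetic in the documentation text above can be illustrated with a small C sketch. The function name `expected_pack_size` and its parameters are hypothetical. The numerator of the ratio falls outside the hunk context shown here; the sketch assumes it is the number of objects in the pack that the multi-pack-index references.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from midx.c): estimate how much of a pack would
 * end up in a repacked batch.
 *
 * referenced_objects: assumed numerator, the objects in this pack that the
 *                     multi-pack-index references
 * total_objects:      total number of objects in the pack
 * pack_size:          on-disk size of the pack in bytes
 */
uint64_t expected_pack_size(uint64_t referenced_objects,
			    uint64_t total_objects,
			    uint64_t pack_size)
{
	assert(total_objects > 0);
	assert(referenced_objects <= total_objects);

	/*
	 * Multiply before dividing so integer division does not discard the
	 * ratio; this assumes the product fits in 64 bits.
	 */
	return pack_size * referenced_objects / total_objects;
}
```

For example, a 1000000-byte pack in which 250 of 1000 objects are referenced has an expected size of 250000 bytes, so it counts as a small pack for any batch size above that value.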

midx.c

Lines changed: 1 addition & 1 deletion
@@ -1371,7 +1371,7 @@ static int fill_included_packs_batch(struct repository *r,
 
 	free(pack_info);
 
-	if (total_size < batch_size || packs_to_repack < 2)
+	if (packs_to_repack < 2)
 		return 1;
 
 	return 0;

t/t5319-multi-pack-index.sh

Lines changed: 18 additions & 0 deletions
@@ -643,6 +643,7 @@ test_expect_success 'expire respects .keep files' '
 '
 
 test_expect_success 'repack --batch-size=0 repacks everything' '
+	cp -r dup dup2 &&
 	(
 		cd dup &&
 		rm .git/objects/pack/*.keep &&
@@ -662,4 +663,21 @@ test_expect_success 'repack --batch-size=0 repacks everything' '
 	)
 '
 
+test_expect_success 'repack --batch-size=<large> repacks everything' '
+	(
+		cd dup2 &&
+		rm .git/objects/pack/*.keep &&
+		ls .git/objects/pack/*idx >idx-list &&
+		test_line_count = 2 idx-list &&
+		git multi-pack-index repack --batch-size=2000000 &&
+		ls .git/objects/pack/*idx >idx-list &&
+		test_line_count = 3 idx-list &&
+		test-tool read-midx .git/objects | grep idx >midx-list &&
+		test_line_count = 3 midx-list &&
+		git multi-pack-index expire &&
+		ls -al .git/objects/pack/*idx >idx-list &&
+		test_line_count = 1 idx-list
+	)
+'
+
 test_done
