Skip to content

Commit a643157

Browse files
raffsgitster
authored andcommitted
repack: avoid loosening promisor objects in partial clones
When `git repack -A -d` is run in a partial clone, `pack-objects` is invoked twice: once to repack all promisor objects, and once to repack all non-promisor objects. The latter `pack-objects` invocation is with --exclude-promisor-objects and --unpack-unreachable, which loosens all objects unused during this invocation. Unfortunately, this includes promisor objects. Because the -d argument to `git repack` subsequently deletes all loose objects also in packs, these just-loosened promisor objects will be immediately deleted. However, this extra disk churn is unnecessary in the first place. For example, in a newly-cloned partial repo that filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up unpacking all trees and commits into the filesystem because every object, in this particular case, is a promisor object. Depending on the repo size, this increases the disk usage considerably: In my copy of the linux.git, the object directory peaked 26GB of more disk usage. In order to avoid this extra disk churn, pass the names of the promisor packfiles as --keep-pack arguments to the second invocation of `pack-objects`. This informs `pack-objects` that the promisor objects are already in a safe packfile and, therefore, do not need to be loosened. For testing, we need to validate whether any object was loosened. However, the "evidence" (loosened objects) is deleted during the process which prevents us from inspecting the object directory. Instead, let's teach `pack-objects` to count loosened objects and emit via trace2 thus allowing inspecting the debug events after the process is finished. This new event is used on the added regression test. Lastly, add a new perf test to evaluate the performance impact made by this changes (tested on git.git): Test HEAD^ HEAD ---------------------------------------------------------- 5600.3: gc 134.38(41.93+90.95) 7.80(6.72+1.35) -94.2% For a bigger repository, such as linux.git, the improvement is even bigger: Test HEAD^ HEAD ------------------------------------------------------------------- 5600.3: gc 6833.00(918.07+3162.74) 268.79(227.02+39.18) -96.1% These improvements are particular big because every object in the newly-cloned partial repository is a promisor object. Reported-by: SZEDER Gábor <[email protected]> Helped-by: Jeff King <[email protected]> Helped-by: Jonathan Tan <[email protected]> Signed-off-by: Rafael Silva <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent c1fa951 commit a643157

File tree

4 files changed

+26
-3
lines changed

4 files changed

+26
-3
lines changed

builtin/pack-objects.c

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3479,6 +3479,7 @@ static void loosen_unused_packed_objects(void)
34793479
{
34803480
struct packed_git *p;
34813481
uint32_t i;
3482+
uint32_t loosened_objects_nr = 0;
34823483
struct object_id oid;
34833484

34843485
for (p = get_all_packs(the_repository); p; p = p->next) {
@@ -3492,11 +3493,16 @@ static void loosen_unused_packed_objects(void)
34923493
nth_packed_object_id(&oid, p, i);
34933494
if (!packlist_find(&to_pack, &oid) &&
34943495
!has_sha1_pack_kept_or_nonlocal(&oid) &&
3495-
!loosened_object_can_be_discarded(&oid, p->mtime))
3496+
!loosened_object_can_be_discarded(&oid, p->mtime)) {
34963497
if (force_object_loose(&oid, p->mtime))
34973498
die(_("unable to force loose object"));
3499+
loosened_objects_nr++;
3500+
}
34983501
}
34993502
}
3503+
3504+
trace2_data_intmax("pack-objects", the_repository,
3505+
"loosen_unused_packed_objects/loosened", loosened_objects_nr);
35003506
}
35013507

35023508
/*

builtin/repack.c

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@ static int delta_base_offset = 1;
2020
static int pack_kept_objects = -1;
2121
static int write_bitmaps = -1;
2222
static int use_delta_islands;
23-
static char *packdir, *packtmp;
23+
static char *packdir, *packtmp_name, *packtmp;
2424

2525
static const char *const git_repack_usage[] = {
2626
N_("git repack [<options>]"),
@@ -530,7 +530,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
530530
}
531531

532532
packdir = mkpathdup("%s/pack", get_object_directory());
533-
packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid());
533+
packtmp_name = xstrfmt(".tmp-%d-pack", (int)getpid());
534+
packtmp = mkpathdup("%s/%s", packdir, packtmp_name);
534535

535536
sigchain_push_common(remove_pack_on_signal);
536537

@@ -573,6 +574,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
573574
repack_promisor_objects(&po_args, &names);
574575

575576
if (existing_packs.nr && delete_redundant) {
577+
for_each_string_list_item(item, &names) {
578+
strvec_pushf(&cmd.args, "--keep-pack=%s-%s.pack",
579+
packtmp_name, item->string);
580+
}
576581
if (unpack_unreachable) {
577582
strvec_pushf(&cmd.args,
578583
"--unpack-unreachable=%s",

t/perf/p5600-partial-clone.sh

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,4 +35,8 @@ test_perf 'count non-promisor commits' '
3535
git -C bare.git rev-list --all --count --exclude-promisor-objects
3636
'
3737

38+
test_perf 'gc' '
39+
git -C bare.git gc
40+
'
41+
3842
test_done

t/t5616-partial-clone.sh

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -548,6 +548,14 @@ test_expect_success 'fetch from a partial clone, protocol v2' '
548548
grep "version 2" trace
549549
'
550550

551+
test_expect_success 'repack does not loosen promisor objects' '
552+
rm -rf client trace &&
553+
git clone --bare --filter=blob:none "file://$(pwd)/srv.bare" client &&
554+
test_when_finished "rm -rf client trace" &&
555+
GIT_TRACE2_PERF="$(pwd)/trace" git -C client repack -A -d &&
556+
grep "loosen_unused_packed_objects/loosened:0" trace
557+
'
558+
551559
. "$TEST_DIRECTORY"/lib-httpd.sh
552560
start_httpd
553561

0 commit comments

Comments
 (0)