Commit 71c5aec

chriscool authored and gitster committed
repack: implement --filter-to for storing filtered out objects
A previous commit has implemented `git repack --filter=<filter-spec>` to allow users to filter out some objects from the main pack and move them into a new, separate pack.

It would be nice if this new pack could be created in a different directory than the regular pack. This would make it possible to move large blobs into a pack on a different kind of storage, for example cheaper storage.

Even in a different directory, this pack can be accessible if, for example, the Git alternates mechanism is used to point to it. In fact, not using the Git alternates mechanism can corrupt a repo, as the generated pack containing the filtered objects might not be accessible from the repo any more. So the Git alternates mechanism should be set up before using this feature if the user wants the repo to be fully usable while this feature is used.

In some cases, like when a repo has just been cloned or when there is no other activity in the repo, it's OK to set up the Git alternates mechanism afterwards though. It's also OK to just inspect the generated packfile containing the filtered objects and then move it into the '.git/objects/pack/' directory manually. That's why it's not necessary for this command to check that the Git alternates mechanism has already been set up.

While at it, as an example to show that `--filter` and `--filter-to` work well with other options, let's also add a test to check that these options work well with `--max-pack-size`.

Signed-off-by: Christian Couder <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
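The workflow described in the commit message can be sketched as a short shell session. This is only an illustration and not part of the commit: the directory names `main` and `store.git` and the file `file1` are hypothetical, and the `--filter-to` step is guarded because it assumes a Git build that already contains this change.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Hypothetical bare repository whose object directory will hold the filtered pack.
git init -q --bare store.git

# Hypothetical working repository containing a single blob.
git init -q main
cd main
git config user.name "Example"
git config user.email "example@example.com"
echo content1 >file1
git add file1
git commit -q -m "add file1"
blob=$(git rev-parse HEAD:file1)

# Set up the alternates mechanism *before* repacking, as the commit
# message recommends, so the repo stays usable throughout.
echo "$work/store.git/objects" >.git/objects/info/alternates

# Only run the new option if this Git build advertises it (assumption:
# the option shows up in the usage text printed by `git repack -h`).
if git repack -h 2>&1 | grep -q -e "--filter-to"
then
	git repack -a -d --filter=blob:none \
		--filter-to="$work/store.git/objects/pack/pack"
fi

# The blob must remain readable, either locally or through the alternate.
content=$(git cat-file -p "$blob")
echo "$content"
```

An absolute path is passed to `--filter-to` here to sidestep any question about which directory a relative path would be resolved against.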
1 parent 1cd43a9 commit 71c5aec

3 files changed: +82 -1 lines changed
Documentation/git-repack.txt

Lines changed: 11 additions & 0 deletions
@@ -155,6 +155,17 @@ depth is 4095.
 	a single packfile containing all the objects. See
 	linkgit:git-rev-list[1] for valid `<filter-spec>` forms.
 
+--filter-to=<dir>::
+	Write the pack containing filtered out objects to the
+	directory `<dir>`. Only useful with `--filter`. This can be
+	used for putting the pack on a separate object directory that
+	is accessed through the Git alternates mechanism. **WARNING:**
+	If the packfile containing the filtered out objects is not
+	accessible, the repo can become corrupt as it might not be
+	possible to access the objects in that packfile. See the
+	`objects` and `objects/info/alternates` sections of
+	linkgit:gitrepository-layout[5].
+
 -b::
 --write-bitmap-index::
 	Write a reachability bitmap index as part of the repack. This

builtin/repack.c

Lines changed: 9 additions & 1 deletion
@@ -977,6 +977,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	int write_midx = 0;
 	const char *cruft_expiration = NULL;
 	const char *expire_to = NULL;
+	const char *filter_to = NULL;
 
 	struct option builtin_repack_options[] = {
 		OPT_BIT('a', NULL, &pack_everything,
@@ -1029,6 +1030,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 			 N_("write a multi-pack index of the resulting packs")),
 		OPT_STRING(0, "expire-to", &expire_to, N_("dir"),
 			   N_("pack prefix to store a pack containing pruned objects")),
+		OPT_STRING(0, "filter-to", &filter_to, N_("dir"),
+			   N_("pack prefix to store a pack containing filtered out objects")),
 		OPT_END()
 	};
 
@@ -1177,6 +1180,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	if (po_args.filter_options.choice)
 		strvec_pushf(&cmd.args, "--filter=%s",
 			     expand_list_objects_filter_spec(&po_args.filter_options));
+	else if (filter_to)
+		die(_("option '%s' can only be used along with '%s'"), "--filter-to", "--filter");
 
 	if (geometry.split_factor)
 		cmd.in = -1;
@@ -1265,8 +1270,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	}
 
 	if (po_args.filter_options.choice) {
+		if (!filter_to)
+			filter_to = packtmp;
+
 		ret = write_filtered_pack(&po_args,
-					  packtmp,
+					  filter_to,
 					  find_pack_prefix(packdir, packtmp),
 					  &existing,
 					  &names);
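
The new `else if (filter_to)` branch means that `--filter-to` on its own is refused. A minimal way to observe this from the command line might look like the sketch below. The repository name `repo` and the directory `side` are hypothetical. Only the exit status is checked, because the exact error text depends on the Git version: a build with this change would die with the message in the hunk above, while an older build would reject the option as unknown.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"
git init -q repo
mkdir "$work/side"	# hypothetical destination for the filtered pack
cd repo

# --filter-to without --filter: the command is expected to fail
# before any pack is written.
if git repack -a -d --filter-to="$work/side/pack" >/dev/null 2>&1
then
	outcome=accepted
else
	outcome=rejected
fi
echo "$outcome"
```

Because the check happens while the `pack-objects` arguments are still being assembled, no files should appear in the destination directory.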

t/t7700-repack.sh

Lines changed: 62 additions & 0 deletions
@@ -462,6 +462,68 @@ test_expect_success '--filter works with --pack-kept-objects and .keep packs' '
 	)
 '
 
+test_expect_success '--filter-to stores filtered out objects' '
+	git -C bare.git repack -a -d &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+
+	git init --bare filtered.git &&
+	git -C bare.git -c repack.writebitmaps=false repack -a -d \
+		--filter=blob:none \
+		--filter-to=../filtered.git/objects/pack/pack &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/pack-*.pack &&
+	test_stdout_line_count = 1 ls filtered.git/objects/pack/pack-*.pack &&
+
+	commit_pack=$(test-tool -C bare.git find-pack -c 1 HEAD) &&
+	blob_pack=$(test-tool -C bare.git find-pack -c 0 HEAD:file1) &&
+	blob_hash=$(git -C bare.git rev-parse HEAD:file1) &&
+	test -n "$blob_hash" &&
+	blob_pack=$(test-tool -C filtered.git find-pack -c 1 $blob_hash) &&
+
+	echo $(pwd)/filtered.git/objects >bare.git/objects/info/alternates &&
+	blob_pack=$(test-tool -C bare.git find-pack -c 1 HEAD:file1) &&
+	blob_content=$(git -C bare.git show $blob_hash) &&
+	test "$blob_content" = "content1"
+'
+
+test_expect_success '--filter works with --max-pack-size' '
+	rm -rf filtered.git &&
+	git init --bare filtered.git &&
+	git init max-pack-size &&
+	(
+		cd max-pack-size &&
+		test_commit base &&
+		# two blobs which exceed the maximum pack size
+		test-tool genrandom foo 1048576 >foo &&
+		git hash-object -w foo &&
+		test-tool genrandom bar 1048576 >bar &&
+		git hash-object -w bar &&
+		git add foo bar &&
+		git commit -m "adding foo and bar"
+	) &&
+	git clone --no-local --bare max-pack-size max-pack-size.git &&
+	(
+		cd max-pack-size.git &&
+		git -c repack.writebitmaps=false repack -a -d --filter=blob:none \
+			--max-pack-size=1M \
+			--filter-to=../filtered.git/objects/pack/pack &&
+		echo $(cd .. && pwd)/filtered.git/objects >objects/info/alternates &&
+
+		# Check that the 3 blobs are in different packfiles in filtered.git
+		test_stdout_line_count = 3 ls ../filtered.git/objects/pack/pack-*.pack &&
+		test_stdout_line_count = 1 ls objects/pack/pack-*.pack &&
+		foo_pack=$(test-tool find-pack -c 1 HEAD:foo) &&
+		bar_pack=$(test-tool find-pack -c 1 HEAD:bar) &&
+		base_pack=$(test-tool find-pack -c 1 HEAD:base.t) &&
+		test "$foo_pack" != "$bar_pack" &&
+		test "$foo_pack" != "$base_pack" &&
+		test "$bar_pack" != "$base_pack" &&
+		for pack in "$foo_pack" "$bar_pack" "$base_pack"
+		do
+			case "$pack" in */filtered.git/objects/pack/*) true ;; *) return 1 ;; esac
+		done
+	)
+'
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index
