Skip to content

Commit 3079026

Browse files
blanetgitster
authored andcommitted
bundle-uri: verify oid before writing refs
When using the bundle-uri mechanism with a bundle list containing multiple interrelated bundles, we encountered a bug where tips from downloaded bundles were not discovered, thus resulting in rather slow clones. This was particularly problematic when employing the "creationTokens" heuristic. To reproduce this issue, consider a repository with a single branch "main" pointing to commit "A". Firstly, create a base bundle with: git bundle create base.bundle main Then, add a new commit "B" on top of "A", and create an incremental bundle for "main": git bundle create incr.bundle A..main Now, generate a bundle list with the following content: [bundle] version = 1 mode = all heuristic = creationToken [bundle "base"] uri = base.bundle creationToken = 1 [bundle "incr"] uri = incr.bundle creationToken = 2 A fresh clone with the bundle list above should result in a reference "refs/bundles/main" pointing to "B" in the new repository. However, git would still download everything from the server, as if it had fetched nothing locally. So why the "refs/bundles/main" is not discovered? After some digging I found that: 1. Bundles in bundle list are downloaded to local files via `bundle-uri.c:download_bundle_list` or via `bundle-uri.c:fetch_bundles_by_token` for the "creationToken" heuristic. 2. Each bundle is unbundled via `bundle-uri.c:unbundle_from_file`, which is called by `bundle-uri.c:unbundle_all_bundles` or called within `bundle-uri.c:fetch_bundles_by_token` for the "creationToken" heuristic. 3. To get all prerequisites of the bundle, the bundle header is read inside `bundle-uri.c:unbundle_from_file` to by calling `bundle.c:read_bundle_header`. 4. Then it calls `bundle.c:unbundle`, which calls `bundle.c:verify_bundle` to ensure the repository contains all the prerequisites. 5. `bundle.c:verify_bundle` calls `parse_object`, which eventually invokes `packfile.c:prepare_packed_git` or `packfile.c:reprepare_packed_git`, filling `raw_object_store->packed_git` and setting `packed_git_initialized`. 6. If `bundle.c:unbundle` succeeds, it writes refs via `refs.c:refs_update_ref` with `REF_SKIP_OID_VERIFICATION` set. Here bundle refs which can target arbitrary objects are written to the repository. 7. Finally, in `fetch-pack.c:do_fetch_pack_v2`, the functions `fetch-pack.c:mark_complete_and_common_ref` and `fetch-pack.c:mark_tips` are called with `OBJECT_INFO_QUICK` set to find local tips for negotiation. The `OBJECT_INFO_QUICK` flag prevents `packfile.c:reprepare_packed_git` from being called, resulting in failures to parse OIDs that reside only in the latest bundle. In the example above, when unbunding "incr.bundle", "base.pack" is added to `packed_git` due to prerequisites verification. However, "B" cannot be found for negotiation because it exists in "incr.pack", which is not included in `packed_git`. Fix the bug by removing `REF_SKIP_OID_VERIFICATION` flag when writing bundle refs. When `refs.c:refs_update_ref` is called to write the corresponding bundle refs, it triggers `refs.c:ref_transaction_commit`. This, in turn, invokes `refs.c:ref_transaction_prepare`, which calls `transaction_prepare` of the refs storage backend. For files backend, it is `files-backend.c:files_transaction_prepare`, and for reftable backend, it is `reftable-backend.c:reftable_be_transaction_prepare`. Both functions eventually call `object.c:parse_object`, which can invoke `packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures that bundle refs point to valid objects and that all tips from bundle refs are correctly parsed during subsequent negotiations. A set of negotiation-related tests for cloning with bundle-uri has been included to demonstrate that downloaded bundles are utilized to accelerate fetching. Additionally, another test has been added to show that bundles with incorrect headers, where refs point to non-existent objects, do not result in any bundle refs being created in the repository. Reviewed-by: Karthik Nayak <[email protected]> Reviewed-by: Patrick Steinhardt <[email protected]> Signed-off-by: Xing Xin <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 8d94cfb commit 3079026

File tree

2 files changed

+152
-6
lines changed

2 files changed

+152
-6
lines changed

bundle-uri.c

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -400,8 +400,7 @@ static int unbundle_from_file(struct repository *r, const char *file)
400400
refs_update_ref(get_main_ref_store(the_repository),
401401
"fetched bundle", bundle_ref.buf, oid,
402402
has_old ? &old_oid : NULL,
403-
REF_SKIP_OID_VERIFICATION,
404-
UPDATE_REFS_MSG_ON_ERR);
403+
0, UPDATE_REFS_MSG_ON_ERR);
405404
}
406405

407406
bundle_header_release(&header);

t/t5558-clone-bundle-uri.sh

Lines changed: 151 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@
33
test_description='test fetching bundles with --bundle-uri'
44

55
. ./test-lib.sh
6+
. "$TEST_DIRECTORY"/lib-bundle.sh
67

78
test_expect_success 'fail to clone from non-existent file' '
89
test_when_finished rm -rf test &&
@@ -19,10 +20,24 @@ test_expect_success 'fail to clone from non-bundle file' '
1920

2021
test_expect_success 'create bundle' '
2122
git init clone-from &&
22-
git -C clone-from checkout -b topic &&
23-
test_commit -C clone-from A &&
24-
test_commit -C clone-from B &&
25-
git -C clone-from bundle create B.bundle topic
23+
(
24+
cd clone-from &&
25+
git checkout -b topic &&
26+
27+
test_commit A &&
28+
git bundle create A.bundle topic &&
29+
30+
test_commit B &&
31+
git bundle create B.bundle topic &&
32+
33+
# Create a bundle with reference pointing to non-existent object.
34+
commit_a=$(git rev-parse A) &&
35+
commit_b=$(git rev-parse B) &&
36+
sed -e "/^$/q" -e "s/$commit_a /$commit_b /" \
37+
<A.bundle >bad-header.bundle &&
38+
convert_bundle_to_pack \
39+
<A.bundle >>bad-header.bundle
40+
)
2641
'
2742

2843
test_expect_success 'clone with path bundle' '
@@ -33,6 +48,16 @@ test_expect_success 'clone with path bundle' '
3348
test_cmp expect actual
3449
'
3550

51+
test_expect_success 'clone with bundle that has bad header' '
52+
# Write bundle ref fails, but clone can still proceed.
53+
git clone --bundle-uri="clone-from/bad-header.bundle" \
54+
clone-from clone-bad-header 2>err &&
55+
commit_b=$(git -C clone-from rev-parse B) &&
56+
test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err &&
57+
git -C clone-bad-header for-each-ref --format="%(refname)" >refs &&
58+
test_grep ! "refs/bundles/" refs
59+
'
60+
3661
test_expect_success 'clone with path bundle and non-default hash' '
3762
test_when_finished "rm -rf clone-path-non-default-hash" &&
3863
GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="clone-from/B.bundle" \
@@ -259,6 +284,128 @@ test_expect_success 'clone bundle list (file, any mode, all failures)' '
259284
! grep "refs/bundles/" refs
260285
'
261286

287+
test_expect_success 'negotiation: bundle with part of wanted commits' '
288+
test_when_finished "rm -f trace*.txt" &&
289+
GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
290+
git clone --no-local --bundle-uri="clone-from/A.bundle" \
291+
clone-from nego-bundle-part &&
292+
git -C nego-bundle-part for-each-ref --format="%(refname)" >refs &&
293+
grep "refs/bundles/" refs >actual &&
294+
test_write_lines refs/bundles/topic >expect &&
295+
test_cmp expect actual &&
296+
# Ensure that refs/bundles/topic are sent as "have".
297+
tip=$(git -C clone-from rev-parse A) &&
298+
test_grep "clone> have $tip" trace-packet.txt
299+
'
300+
301+
test_expect_success 'negotiation: bundle with all wanted commits' '
302+
test_when_finished "rm -f trace*.txt" &&
303+
GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
304+
git clone --no-local --single-branch --branch=topic --no-tags \
305+
--bundle-uri="clone-from/B.bundle" \
306+
clone-from nego-bundle-all &&
307+
git -C nego-bundle-all for-each-ref --format="%(refname)" >refs &&
308+
grep "refs/bundles/" refs >actual &&
309+
test_write_lines refs/bundles/topic >expect &&
310+
test_cmp expect actual &&
311+
# We already have all needed commits so no "want" needed.
312+
test_grep ! "clone> want " trace-packet.txt
313+
'
314+
315+
test_expect_success 'negotiation: bundle list (no heuristic)' '
316+
test_when_finished "rm -f trace*.txt" &&
317+
cat >bundle-list <<-EOF &&
318+
[bundle]
319+
version = 1
320+
mode = all
321+
322+
[bundle "bundle-1"]
323+
uri = file://$(pwd)/clone-from/bundle-1.bundle
324+
325+
[bundle "bundle-2"]
326+
uri = file://$(pwd)/clone-from/bundle-2.bundle
327+
EOF
328+
329+
GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
330+
git clone --no-local --bundle-uri="file://$(pwd)/bundle-list" \
331+
clone-from nego-bundle-list-no-heuristic &&
332+
333+
git -C nego-bundle-list-no-heuristic for-each-ref --format="%(refname)" >refs &&
334+
grep "refs/bundles/" refs >actual &&
335+
cat >expect <<-\EOF &&
336+
refs/bundles/base
337+
refs/bundles/left
338+
EOF
339+
test_cmp expect actual &&
340+
tip=$(git -C nego-bundle-list-no-heuristic rev-parse refs/bundles/left) &&
341+
test_grep "clone> have $tip" trace-packet.txt
342+
'
343+
344+
test_expect_success 'negotiation: bundle list (creationToken)' '
345+
test_when_finished "rm -f trace*.txt" &&
346+
cat >bundle-list <<-EOF &&
347+
[bundle]
348+
version = 1
349+
mode = all
350+
heuristic = creationToken
351+
352+
[bundle "bundle-1"]
353+
uri = file://$(pwd)/clone-from/bundle-1.bundle
354+
creationToken = 1
355+
356+
[bundle "bundle-2"]
357+
uri = file://$(pwd)/clone-from/bundle-2.bundle
358+
creationToken = 2
359+
EOF
360+
361+
GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
362+
git clone --no-local --bundle-uri="file://$(pwd)/bundle-list" \
363+
clone-from nego-bundle-list-heuristic &&
364+
365+
git -C nego-bundle-list-heuristic for-each-ref --format="%(refname)" >refs &&
366+
grep "refs/bundles/" refs >actual &&
367+
cat >expect <<-\EOF &&
368+
refs/bundles/base
369+
refs/bundles/left
370+
EOF
371+
test_cmp expect actual &&
372+
tip=$(git -C nego-bundle-list-heuristic rev-parse refs/bundles/left) &&
373+
test_grep "clone> have $tip" trace-packet.txt
374+
'
375+
376+
test_expect_success 'negotiation: bundle list with all wanted commits' '
377+
test_when_finished "rm -f trace*.txt" &&
378+
cat >bundle-list <<-EOF &&
379+
[bundle]
380+
version = 1
381+
mode = all
382+
heuristic = creationToken
383+
384+
[bundle "bundle-1"]
385+
uri = file://$(pwd)/clone-from/bundle-1.bundle
386+
creationToken = 1
387+
388+
[bundle "bundle-2"]
389+
uri = file://$(pwd)/clone-from/bundle-2.bundle
390+
creationToken = 2
391+
EOF
392+
393+
GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
394+
git clone --no-local --single-branch --branch=left --no-tags \
395+
--bundle-uri="file://$(pwd)/bundle-list" \
396+
clone-from nego-bundle-list-all &&
397+
398+
git -C nego-bundle-list-all for-each-ref --format="%(refname)" >refs &&
399+
grep "refs/bundles/" refs >actual &&
400+
cat >expect <<-\EOF &&
401+
refs/bundles/base
402+
refs/bundles/left
403+
EOF
404+
test_cmp expect actual &&
405+
# We already have all needed commits so no "want" needed.
406+
test_grep ! "clone> want " trace-packet.txt
407+
'
408+
262409
#########################################################################
263410
# HTTP tests begin here
264411

0 commit comments

Comments
 (0)