Skip to content

Commit 9159029

Browse files
pks-tgitster
authored andcommitted
builtin/clone: fix bundle URIs with mismatching object formats
We create the reference database in git-clone(1) quite early before connecting to the remote repository. Given that we do not yet know about the object format that the remote repository uses at that point in time the consequence is that the refdb may be initialized with the wrong object format. This is not a problem in the context of the files backend as we do not encode the object format anywhere, and furthermore the only reference that we write between initializing the refdb and learning about the object format is the "HEAD" symref. It will become a problem though once we land the reftable backend, which indeed does require to know about the proper object format at the time of creation. We thus need to rearrange the logic in git-clone(1) so that we only initialize the refdb once we have learned about the actual object format. As a first step, move listing of remote references to happen earlier, which also allow us to set up the hash algorithm of the repository earlier now. While we aim to execute this logic as late as possible until after most of the setup has happened already, detection of the object format and thus later the setup of the reference database must happen before any other logic that may spawn Git commands or otherwise these Git commands may not recognize the repository as such. The first Git step where we expect the repository to be fully initalized is when we fetch bundles via bundle URIs. Funny enough, the comments there also state that "the_repository must match the cloned repo", which is indeed not necessarily the case for the hash algorithm right now. So in practice it is the right thing to detect the remote's object format before downloading bundle URIs anyway, and not doing so causes clones with bundle URIs to fail when the local default object format does not match the remote repository's format. Unfortunately though, this creates a new issue: downloading bundles may take a long time, so if we list refs beforehand they might've grown stale meanwhile. It is not clear how to solve this issue except for a second reference listing though after we have downloaded the bundles, which may be an expensive thing to do. Arguably though, it's preferable to have a staleness issue compared to being unable to clone a repository altogether. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent bab2283 commit 9159029

File tree

2 files changed

+41
-25
lines changed

2 files changed

+41
-25
lines changed

builtin/clone.c

Lines changed: 23 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -1266,6 +1266,26 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
12661266
if (transport->smart_options && !deepen && !filter_options.choice)
12671267
transport->smart_options->check_self_contained_and_connected = 1;
12681268

1269+
strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
1270+
refspec_ref_prefixes(&remote->fetch,
1271+
&transport_ls_refs_options.ref_prefixes);
1272+
if (option_branch)
1273+
expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
1274+
option_branch);
1275+
if (!option_no_tags)
1276+
strvec_push(&transport_ls_refs_options.ref_prefixes,
1277+
"refs/tags/");
1278+
1279+
refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
1280+
1281+
/*
1282+
* Now that we know what algorithm the remote side is using, let's set
1283+
* ours to the same thing.
1284+
*/
1285+
hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
1286+
initialize_repository_version(hash_algo, 1);
1287+
repo_set_hash_algo(the_repository, hash_algo);
1288+
12691289
/*
12701290
* Before fetching from the remote, download and install bundle
12711291
* data from the --bundle-uri option.
@@ -1281,24 +1301,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
12811301
bundle_uri);
12821302
else if (has_heuristic)
12831303
git_config_set_gently("fetch.bundleuri", bundle_uri);
1284-
}
1285-
1286-
strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
1287-
refspec_ref_prefixes(&remote->fetch,
1288-
&transport_ls_refs_options.ref_prefixes);
1289-
if (option_branch)
1290-
expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
1291-
option_branch);
1292-
if (!option_no_tags)
1293-
strvec_push(&transport_ls_refs_options.ref_prefixes,
1294-
"refs/tags/");
1295-
1296-
refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
1297-
1298-
if (refs)
1299-
mapped_refs = wanted_peer_refs(refs, &remote->fetch);
1300-
1301-
if (!bundle_uri) {
1304+
} else {
13021305
/*
13031306
* Populate transport->got_remote_bundle_uri and
13041307
* transport->bundle_uri. We might get nothing.
@@ -1319,13 +1322,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
13191322
}
13201323
}
13211324

1322-
/*
1323-
* Now that we know what algorithm the remote side is using,
1324-
* let's set ours to the same thing.
1325-
*/
1326-
hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
1327-
initialize_repository_version(hash_algo, 1);
1328-
repo_set_hash_algo(the_repository, hash_algo);
1325+
if (refs)
1326+
mapped_refs = wanted_peer_refs(refs, &remote->fetch);
13291327

13301328
if (mapped_refs) {
13311329
/*

t/t5558-clone-bundle-uri.sh

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,15 @@ test_expect_success 'clone with path bundle' '
3333
test_cmp expect actual
3434
'
3535

36+
test_expect_success 'clone with path bundle and non-default hash' '
37+
test_when_finished "rm -rf clone-path-non-default-hash" &&
38+
GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="clone-from/B.bundle" \
39+
clone-from clone-path-non-default-hash &&
40+
git -C clone-path-non-default-hash rev-parse refs/bundles/topic >actual &&
41+
git -C clone-from rev-parse topic >expect &&
42+
test_cmp expect actual
43+
'
44+
3645
test_expect_success 'clone with file:// bundle' '
3746
git clone --bundle-uri="file://$(pwd)/clone-from/B.bundle" \
3847
clone-from clone-file &&
@@ -284,6 +293,15 @@ test_expect_success 'clone HTTP bundle' '
284293
test_config -C clone-http log.excludedecoration refs/bundle/
285294
'
286295

296+
test_expect_success 'clone HTTP bundle with non-default hash' '
297+
test_when_finished "rm -rf clone-http-non-default-hash" &&
298+
GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="$HTTPD_URL/B.bundle" \
299+
"$HTTPD_URL/smart/fetch.git" clone-http-non-default-hash &&
300+
git -C clone-http-non-default-hash rev-parse refs/bundles/topic >actual &&
301+
git -C clone-from rev-parse topic >expect &&
302+
test_cmp expect actual
303+
'
304+
287305
test_expect_success 'clone bundle list (HTTP, no heuristic)' '
288306
test_when_finished rm -f trace*.txt &&
289307

0 commit comments

Comments
 (0)