Skip to content

Commit 68d090a

Browse files
pks-tgitster
authored andcommitted
builtin/remote: rework how remote refs get renamed
It was recently reported [1] that renaming a remote that has dangling symrefs is broken. This issue can be trivially reproduced: $ git init repo Initialized empty Git repository in /tmp/repo/.git/ $ cd repo/ $ git remote add origin /dev/null $ git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/master $ git remote rename origin renamed $ git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/master $ git symbolic-ref refs/remotes/renamed/HEAD fatal: ref refs/remotes/renamed/HEAD is not a symbolic ref As one can see, the "HEAD" reference did not get renamed but stays in the same place. There are two issues here: - We use `refs_resolve_ref_unsafe()` to resolve references, but we don't pass the `RESOLVE_REF_NO_RECURSE` flag. Consequently, if the reference does not resolve, the function will fail and we thus ignore this branch. - We use `refs_for_each_ref()` to iterate through the old remote's references, but that function ignores broken references. Both of these issues are easy to fix. But having a closer look at the logic that renames remote references surfaces that it leaves a lot to be desired overall. The problem is that we're using O(|refs| + |symrefs| * 2) many reference transactions to perform the renames. We first delete all symrefs, then individually rename every direct reference and finally we recreate the symrefs. On the one hand this isn't even remotely an atomic operation, so if we hit any error we'll already have deleted all references. But more importantly it is also extremely inefficient. The number of transactions for symrefs doesn't really bother us too much, as there should generally only be a single symref anyway ("HEAD"). But the renames are very expensive: - For the "reftable" backend we perform auto-compaction after every single rename, which does add up. - For the "files" backend we potentially have to rewrite the "packed-refs" file on every single rename in case they are packed. The consequence here is quadratic runtime performance. Renaming a 100k references takes hours to complete. Refactor the code to use a single transaction to perform all the reference updates atomically, which speeds up the transaction quite significantly: Benchmark 1: rename remote (refformat = files, revision = HEAD~) Time (mean ± σ): 238.770 s ± 13.857 s [User: 91.473 s, System: 143.793 s] Range (min … max): 204.863 s … 247.699 s 10 runs Benchmark 2: rename remote (refformat = files, revision = HEAD) Time (mean ± σ): 2.103 s ± 0.036 s [User: 0.360 s, System: 1.313 s] Range (min … max): 2.011 s … 2.141 s 10 runs Summary rename remote (refformat = files, revision = HEAD) ran 113.53 ± 6.87 times faster than rename remote (refformat = files, revision = HEAD~) For the "reftable" backend we see a significant speedup, as well, but given that we don't have quadratic runtime behaviour there it's way less extreme: Benchmark 1: rename remote (refformat = reftable, revision = HEAD~) Time (mean ± σ): 8.604 s ± 0.539 s [User: 4.985 s, System: 2.368 s] Range (min … max): 7.880 s … 9.556 s 10 runs Benchmark 2: rename remote (refformat = reftable, revision = HEAD) Time (mean ± σ): 1.177 s ± 0.103 s [User: 0.446 s, System: 0.270 s] Range (min … max): 1.023 s … 1.410 s 10 runs Summary rename remote (refformat = reftable, revision = HEAD) ran 7.31 ± 0.79 times faster than rename remote (refformat = reftable, revision = HEAD~) There is one issue though with using atomic transactions: when nesting a remote into itself it can happen that renamed references conflict with the old referencse. For example, when we have a reference "refs/remotes/origin/foo" and we rename "origin" to "origin/foo", then we'll end up with an F/D conflict when we try to create the renamed reference "refs/remotes/origin/foo/foo". This situation is overall quite unlikely to happen: people tend to not use nested remotes, and if they do they must at the same time also have a conflicting refname. But the end result would be that the old remote references stay intact whereas all the other parts of the repository have been adjusted for the new remote name. Address this by queueing and preparing the reference update before we touch any other part of the repository. Like this we can make sure that the reference update will go through before rewriting the configuration. Otherwise, if the transaction fails to prepare we can gracefully abort the whole operation without any changes having been performed in the repository yet. Furthermore, we can detect the conflict and print some helpful advice for how the user can resolve this situation. So overall, the tradeoff is that: - Reference transactions are now all-or-nothing. This is a significant improvement over the previous state where we may have ended up with partially-renamed references. - Rewriting references is now significantly faster. - We only rewrite the configuration in case we know that all references can be updated. - But we may refuse to rename a remote in case references conflict. Overall this seems like an acceptable tradeoff. While at it, fix the handling of symbolic/broken references by using `refs_for_each_rawref()`. Add tests that cover both this reported issue and tests that exercise nesting of remotes. One thing to note: with this change we cannot provide a proper progress monitor anymore as we queue the references into the transactions as we iterate through them. Consequently, as we don't know yet how many refs there are in total, we cannot report how many percent of the operation is done anymore. But that's a small price to pay considering that you now shouldn't need the progress monitor in most situations at all anymore. [1]: <CANrWfmQWa=RJnm7d3C7ogRX6Tth2eeuGwvwrNmzS2gr+eP0OpA@mail.gmail.com> Reported-by: Han Jiang <[email protected]> Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 08e6a7a commit 68d090a

File tree

2 files changed

+270
-99
lines changed

2 files changed

+270
-99
lines changed

builtin/remote.c

Lines changed: 197 additions & 99 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,11 @@
11
#define USE_THE_REPOSITORY_VARIABLE
22

33
#include "builtin.h"
4+
#include "advice.h"
45
#include "config.h"
6+
#include "date.h"
57
#include "gettext.h"
8+
#include "ident.h"
69
#include "parse-options.h"
710
#include "path.h"
811
#include "transport.h"
@@ -610,36 +613,161 @@ static int add_branch_for_removal(const char *refname,
610613
struct rename_info {
611614
const char *old_name;
612615
const char *new_name;
613-
struct string_list *remote_branches;
614-
uint32_t symrefs_nr;
616+
struct ref_transaction *transaction;
617+
struct progress *progress;
618+
struct strbuf *err;
619+
uint32_t progress_nr;
620+
uint64_t index;
615621
};
616622

617-
static int read_remote_branches(const char *refname, const char *referent UNUSED,
618-
const struct object_id *oid UNUSED,
619-
int flags UNUSED, void *cb_data)
623+
static void compute_renamed_ref(struct rename_info *rename,
624+
const char *refname,
625+
struct strbuf *out)
626+
{
627+
strbuf_reset(out);
628+
strbuf_addstr(out, refname);
629+
strbuf_splice(out, strlen("refs/remotes/"), strlen(rename->old_name),
630+
rename->new_name, strlen(rename->new_name));
631+
}
632+
633+
static int rename_one_reflog_entry(const char *old_refname,
634+
struct object_id *old_oid,
635+
struct object_id *new_oid,
636+
const char *committer,
637+
timestamp_t timestamp, int tz,
638+
const char *msg, void *cb_data)
620639
{
621640
struct rename_info *rename = cb_data;
622-
struct strbuf buf = STRBUF_INIT;
623-
struct string_list_item *item;
624-
int flag;
625-
const char *symref;
626-
627-
strbuf_addf(&buf, "refs/remotes/%s/", rename->old_name);
628-
if (starts_with(refname, buf.buf)) {
629-
item = string_list_append(rename->remote_branches, refname);
630-
symref = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
631-
refname, RESOLVE_REF_READING,
632-
NULL, &flag);
633-
if (symref && (flag & REF_ISSYMREF)) {
634-
item->util = xstrdup(symref);
635-
rename->symrefs_nr++;
636-
} else {
637-
item->util = NULL;
638-
}
641+
struct strbuf new_refname = STRBUF_INIT;
642+
struct strbuf identity = STRBUF_INIT;
643+
struct strbuf name = STRBUF_INIT;
644+
struct strbuf mail = STRBUF_INIT;
645+
struct ident_split ident;
646+
const char *date;
647+
int error;
648+
649+
compute_renamed_ref(rename, old_refname, &new_refname);
650+
651+
if (split_ident_line(&ident, committer, strlen(committer)) < 0) {
652+
error = -1;
653+
goto out;
639654
}
640-
strbuf_release(&buf);
641655

642-
return 0;
656+
strbuf_add(&name, ident.name_begin, ident.name_end - ident.name_begin);
657+
strbuf_add(&mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
658+
659+
date = show_date(timestamp, tz, DATE_MODE(NORMAL));
660+
strbuf_addstr(&identity, fmt_ident(name.buf, mail.buf,
661+
WANT_BLANK_IDENT, date, 0));
662+
663+
error = ref_transaction_update_reflog(rename->transaction, new_refname.buf,
664+
new_oid, old_oid, identity.buf, msg,
665+
rename->index++, rename->err);
666+
667+
out:
668+
strbuf_release(&new_refname);
669+
strbuf_release(&identity);
670+
strbuf_release(&name);
671+
strbuf_release(&mail);
672+
return error;
673+
}
674+
675+
static int rename_one_reflog(const char *old_refname,
676+
const struct object_id *old_oid,
677+
struct rename_info *rename)
678+
{
679+
struct strbuf new_refname = STRBUF_INIT;
680+
struct strbuf message = STRBUF_INIT;
681+
int error;
682+
683+
if (!refs_reflog_exists(get_main_ref_store(the_repository), old_refname))
684+
return 0;
685+
686+
error = refs_for_each_reflog_ent(get_main_ref_store(the_repository),
687+
old_refname, rename_one_reflog_entry, rename);
688+
if (error < 0)
689+
goto out;
690+
691+
compute_renamed_ref(rename, old_refname, &new_refname);
692+
693+
/*
694+
* Manually write the reflog entry for the now-renamed ref. We cannot
695+
* rely on `rename_one_ref()` to do this for us as that would screw
696+
* over order in which reflog entries are being written.
697+
*
698+
* Furthermore, we only append the entry in case the reference
699+
* resolves. Missing references shouldn't have reflogs anyway.
700+
*/
701+
strbuf_addf(&message, "remote: renamed %s to %s", old_refname,
702+
new_refname.buf);
703+
704+
error = ref_transaction_update_reflog(rename->transaction, new_refname.buf,
705+
old_oid, old_oid, git_committer_info(0),
706+
message.buf, rename->index++, rename->err);
707+
if (error < 0)
708+
return error;
709+
710+
out:
711+
strbuf_release(&new_refname);
712+
strbuf_release(&message);
713+
return error;
714+
}
715+
716+
static int rename_one_ref(const char *old_refname, const char *referent,
717+
const struct object_id *oid,
718+
int flags, void *cb_data)
719+
{
720+
struct strbuf new_referent = STRBUF_INIT;
721+
struct strbuf new_refname = STRBUF_INIT;
722+
struct rename_info *rename = cb_data;
723+
const char *ptr = old_refname;
724+
int error;
725+
726+
if (!skip_prefix(ptr, "refs/remotes/", &ptr) ||
727+
!skip_prefix(ptr, rename->old_name, &ptr) ||
728+
!skip_prefix(ptr, "/", &ptr)) {
729+
error = 0;
730+
goto out;
731+
}
732+
733+
compute_renamed_ref(rename, old_refname, &new_refname);
734+
735+
if (flags & REF_ISSYMREF) {
736+
/*
737+
* Stupidly enough `referent` is not pointing to the immediate
738+
* target of a symref, but it's the recursively resolved value.
739+
* So symrefs pointing to symrefs would be misresolved, and
740+
* unborn symrefs don't have any value for the `referent` at all.
741+
*/
742+
referent = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
743+
old_refname, RESOLVE_REF_NO_RECURSE,
744+
NULL, NULL);
745+
compute_renamed_ref(rename, referent, &new_referent);
746+
oid = NULL;
747+
}
748+
749+
error = ref_transaction_delete(rename->transaction, old_refname,
750+
oid, referent, REF_NO_DEREF, NULL, rename->err);
751+
if (error < 0)
752+
goto out;
753+
754+
error = ref_transaction_update(rename->transaction, new_refname.buf, oid, null_oid(the_hash_algo),
755+
(flags & REF_ISSYMREF) ? new_referent.buf : NULL, NULL,
756+
REF_SKIP_CREATE_REFLOG | REF_NO_DEREF | REF_SKIP_OID_VERIFICATION,
757+
NULL, rename->err);
758+
if (error < 0)
759+
goto out;
760+
761+
error = rename_one_reflog(old_refname, oid, rename);
762+
if (error < 0)
763+
goto out;
764+
765+
display_progress(rename->progress, ++rename->progress_nr);
766+
767+
out:
768+
strbuf_release(&new_referent);
769+
strbuf_release(&new_refname);
770+
return error;
643771
}
644772

645773
static int migrate_file(struct remote *remote)
@@ -727,6 +855,14 @@ static void handle_push_default(const char* old_name, const char* new_name)
727855
strbuf_release(&push_default.origin);
728856
}
729857

858+
static const char conflicting_remote_refs_advice[] = N_(
859+
"The remote you are trying to rename has conflicting references in the\n"
860+
"new target refspec. This is most likely caused by you trying to nest\n"
861+
"a remote into itself, e.g. by renaming 'parent' into 'parent/child'\n"
862+
"or by unnesting a remote, e.g. the other way round.\n"
863+
"\n"
864+
"If that is the case, you can address this by first renaming the\n"
865+
"remote to a different name.\n");
730866

731867
static int mv(int argc, const char **argv, const char *prefix,
732868
struct repository *repo UNUSED)
@@ -738,11 +874,11 @@ static int mv(int argc, const char **argv, const char *prefix,
738874
};
739875
struct remote *oldremote, *newremote;
740876
struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT, buf3 = STRBUF_INIT,
741-
old_remote_context = STRBUF_INIT;
742-
struct string_list remote_branches = STRING_LIST_INIT_DUP;
743-
struct rename_info rename;
744-
int refs_renamed_nr = 0, refspecs_need_update = 0;
745-
struct progress *progress = NULL;
877+
old_remote_context = STRBUF_INIT, err = STRBUF_INIT;
878+
struct rename_info rename = {
879+
.err = &err,
880+
};
881+
int refspecs_need_update = 0;
746882
int result = 0;
747883

748884
argc = parse_options(argc, argv, prefix, options,
@@ -753,8 +889,6 @@ static int mv(int argc, const char **argv, const char *prefix,
753889

754890
rename.old_name = argv[0];
755891
rename.new_name = argv[1];
756-
rename.remote_branches = &remote_branches;
757-
rename.symrefs_nr = 0;
758892

759893
oldremote = remote_get(rename.old_name);
760894
if (!remote_is_configured(oldremote, 1)) {
@@ -788,6 +922,30 @@ static int mv(int argc, const char **argv, const char *prefix,
788922
refspecs_need_update = !!strstr(oldremote->fetch.items[i].raw,
789923
old_remote_context.buf);
790924

925+
if (refspecs_need_update) {
926+
rename.transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
927+
0, &err);
928+
if (!rename.transaction)
929+
goto out;
930+
931+
if (show_progress)
932+
rename.progress = start_delayed_progress(the_repository,
933+
_("Renaming remote references"), 0);
934+
935+
result = refs_for_each_rawref(get_main_ref_store(the_repository),
936+
rename_one_ref, &rename);
937+
if (result < 0)
938+
die(_("queueing remote ref renames failed: %s"), rename.err->buf);
939+
940+
result = ref_transaction_prepare(rename.transaction, &err);
941+
if (result < 0) {
942+
error("renaming remote references failed: %s", err.buf);
943+
if (result == REF_TRANSACTION_ERROR_NAME_CONFLICT)
944+
advise(conflicting_remote_refs_advice);
945+
die(NULL);
946+
}
947+
}
948+
791949
if (oldremote->fetch.nr) {
792950
strbuf_reset(&buf);
793951
strbuf_addf(&buf, "remote.%s.fetch", rename.new_name);
@@ -829,83 +987,23 @@ static int mv(int argc, const char **argv, const char *prefix,
829987
}
830988
}
831989

832-
if (!refspecs_need_update)
833-
goto out;
834-
835-
/*
836-
* First remove symrefs, then rename the rest, finally create
837-
* the new symrefs.
838-
*/
839-
refs_for_each_ref(get_main_ref_store(the_repository),
840-
read_remote_branches, &rename);
841-
if (show_progress) {
842-
/*
843-
* Count symrefs twice, since "renaming" them is done by
844-
* deleting and recreating them in two separate passes.
845-
*/
846-
progress = start_progress(the_repository,
847-
_("Renaming remote references"),
848-
rename.remote_branches->nr + rename.symrefs_nr);
849-
}
850-
for (size_t i = 0; i < remote_branches.nr; i++) {
851-
struct string_list_item *item = remote_branches.items + i;
852-
struct strbuf referent = STRBUF_INIT;
853-
854-
if (refs_read_symbolic_ref(get_main_ref_store(the_repository), item->string,
855-
&referent))
856-
continue;
857-
if (refs_delete_ref(get_main_ref_store(the_repository), NULL, item->string, NULL, REF_NO_DEREF))
858-
die(_("deleting '%s' failed"), item->string);
859-
860-
strbuf_release(&referent);
861-
display_progress(progress, ++refs_renamed_nr);
862-
}
863-
for (size_t i = 0; i < remote_branches.nr; i++) {
864-
struct string_list_item *item = remote_branches.items + i;
990+
if (refspecs_need_update) {
991+
result = ref_transaction_commit(rename.transaction, &err);
992+
if (result < 0)
993+
die(_("renaming remote refs failed: %s"), rename.err->buf);
865994

866-
if (item->util)
867-
continue;
868-
strbuf_reset(&buf);
869-
strbuf_addstr(&buf, item->string);
870-
strbuf_splice(&buf, strlen("refs/remotes/"), strlen(rename.old_name),
871-
rename.new_name, strlen(rename.new_name));
872-
strbuf_reset(&buf2);
873-
strbuf_addf(&buf2, "remote: renamed %s to %s",
874-
item->string, buf.buf);
875-
if (refs_rename_ref(get_main_ref_store(the_repository), item->string, buf.buf, buf2.buf))
876-
die(_("renaming '%s' failed"), item->string);
877-
display_progress(progress, ++refs_renamed_nr);
878-
}
879-
for (size_t i = 0; i < remote_branches.nr; i++) {
880-
struct string_list_item *item = remote_branches.items + i;
995+
stop_progress(&rename.progress);
881996

882-
if (!item->util)
883-
continue;
884-
strbuf_reset(&buf);
885-
strbuf_addstr(&buf, item->string);
886-
strbuf_splice(&buf, strlen("refs/remotes/"), strlen(rename.old_name),
887-
rename.new_name, strlen(rename.new_name));
888-
strbuf_reset(&buf2);
889-
strbuf_addstr(&buf2, item->util);
890-
strbuf_splice(&buf2, strlen("refs/remotes/"), strlen(rename.old_name),
891-
rename.new_name, strlen(rename.new_name));
892-
strbuf_reset(&buf3);
893-
strbuf_addf(&buf3, "remote: renamed %s to %s",
894-
item->string, buf.buf);
895-
if (refs_update_symref(get_main_ref_store(the_repository), buf.buf, buf2.buf, buf3.buf))
896-
die(_("creating '%s' failed"), buf.buf);
897-
display_progress(progress, ++refs_renamed_nr);
997+
handle_push_default(rename.old_name, rename.new_name);
898998
}
899-
stop_progress(&progress);
900-
901-
handle_push_default(rename.old_name, rename.new_name);
902999

9031000
out:
904-
string_list_clear(&remote_branches, 1);
1001+
ref_transaction_free(rename.transaction);
9051002
strbuf_release(&old_remote_context);
9061003
strbuf_release(&buf);
9071004
strbuf_release(&buf2);
9081005
strbuf_release(&buf3);
1006+
strbuf_release(&err);
9091007
return result;
9101008
}
9111009

0 commit comments

Comments
 (0)