Commit 0e358de

KarthikNayak authored and gitster committed
fetch: use batched reference updates
The reference updates performed as part of 'git-fetch(1)' take place one at a time. For each reference update, a new transaction is created and committed. This is necessary to ensure we can allow individual updates to fail without failing the entire command. The command also supports an '--atomic' mode, which uses a single transaction to update all of the references. But this mode has an all-or-nothing approach, where if a single update fails, all updates would fail.

In 23fc8e4 (refs: implement batch reference update support, 2025-04-08), we introduced a new mechanism to batch reference updates. Under the hood, this uses a single transaction to perform a batch of reference updates, while allowing only individual updates to fail. Utilize this newly introduced batch update mechanism in 'git-fetch(1)'. This provides a significant bump in performance, especially when dealing with repositories with a large number of references.

Adding support for batched updates simply requires modifying the flow to also create a batch update transaction in the non-atomic flow.

With the reftable backend there is a 22x performance improvement when performing 'git-fetch(1)' with 10000 refs:

  Benchmark 1: fetch: many refs (refformat = reftable, refcount = 10000, revision = master)
    Time (mean ± σ):      3.403 s ±  0.775 s    [User: 1.875 s, System: 1.417 s]
    Range (min … max):    2.454 s …  4.529 s    10 runs

  Benchmark 2: fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD)
    Time (mean ± σ):     154.3 ms ±  17.6 ms    [User: 102.5 ms, System: 56.1 ms]
    Range (min … max):   145.2 ms … 220.5 ms    18 runs

  Summary
    fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran
     22.06 ± 5.62 times faster than fetch: many refs (refformat = reftable, refcount = 10000, revision = master)

In similar conditions, the files backend sees a 1.25x performance improvement:

  Benchmark 1: fetch: many refs (refformat = files, refcount = 10000, revision = master)
    Time (mean ± σ):     605.5 ms ±   9.4 ms    [User: 117.8 ms, System: 483.3 ms]
    Range (min … max):   595.6 ms … 621.5 ms    10 runs

  Benchmark 2: fetch: many refs (refformat = files, refcount = 10000, revision = HEAD)
    Time (mean ± σ):     485.8 ms ±   4.3 ms    [User: 91.1 ms, System: 396.7 ms]
    Range (min … max):   477.6 ms … 494.3 ms    10 runs

  Summary
    fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) ran
      1.25 ± 0.02 times faster than fetch: many refs (refformat = files, refcount = 10000, revision = master)

With this we'll either be using a regular transaction or a batch update transaction. This helps clean up some code which is no longer needed, as we'll now always have some type of 'ref_transaction' object being propagated.

One big change is that earlier, each individual update would propagate a failure. Now, the `ref_transaction_for_each_rejected_update` function is called at the end of the flow to capture the exit status for 'git-fetch(1)' and also to print F/D conflict errors. This does change the order of the errors being printed, but the behavior stays the same.

Since transaction errors are now explicitly defined as part of 76e760b (refs: introduce enum-based transaction error types, 2025-04-08), utilize them and get rid of custom errors defined within 'builtin/fetch.c'.

Signed-off-by: Karthik Nayak <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
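
To make the difference between the '--atomic' mode and the batched mode concrete, here is a minimal, self-contained toy model in C. It is not Git's refs API: every name in it (`toy_transaction`, `toy_queue`, `toy_commit`, `TOY_ALLOW_FAILURE`) is hypothetical, and the only failure it models is a file/directory (F/D) name conflict against a list of existing refs. It only sketches the idea that a single commit either fails as a whole or records rejections per update.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, not Git's refs API. */
enum toy_error { TOY_OK = 0, TOY_NAME_CONFLICT };

#define TOY_ALLOW_FAILURE 1 /* stand-in for a "batched" transaction flag */
#define TOY_MAX_UPDATES 16

struct toy_update {
	const char *refname;
	enum toy_error rejected; /* set by toy_commit() */
};

struct toy_transaction {
	struct toy_update updates[TOY_MAX_UPDATES];
	size_t nr;
	unsigned flags;
};

/* Queue an update; returns -1 when the fixed-size toy buffer is full. */
static int toy_queue(struct toy_transaction *tx, const char *refname)
{
	if (tx->nr >= TOY_MAX_UPDATES)
		return -1;
	tx->updates[tx->nr].refname = refname;
	tx->updates[tx->nr].rejected = TOY_OK;
	tx->nr++;
	return 0;
}

/* "a" and "a/b" cannot coexist: "a" would be both a file and a directory. */
static int toy_fd_conflict(const char *a, const char *b)
{
	size_t la = strlen(a), lb = strlen(b);

	if (la < lb)
		return !strncmp(a, b, la) && b[la] == '/';
	if (lb < la)
		return !strncmp(a, b, lb) && a[lb] == '/';
	return 0;
}

/*
 * Check every queued update against the existing refs in one pass.
 * Without TOY_ALLOW_FAILURE a single conflict fails the whole commit;
 * with it, conflicting updates are marked rejected and the rest apply.
 * Returns 0 on success, -1 if the transaction as a whole failed.
 */
static int toy_commit(struct toy_transaction *tx, const char **existing,
		      size_t nr_existing, size_t *applied)
{
	size_t i, j, rejected = 0;

	assert(tx && applied);

	for (i = 0; i < tx->nr; i++) {
		for (j = 0; j < nr_existing; j++) {
			if (toy_fd_conflict(tx->updates[i].refname, existing[j])) {
				tx->updates[i].rejected = TOY_NAME_CONFLICT;
				rejected++;
				break;
			}
		}
	}

	if (rejected && !(tx->flags & TOY_ALLOW_FAILURE)) {
		*applied = 0;
		return -1;
	}
	*applied = tx->nr - rejected;
	return 0;
}
```

With an existing ref "refs/heads/foo" and queued updates for "refs/heads/foo/bar" and "refs/heads/main", the toy transaction without the flag fails as a whole. With `TOY_ALLOW_FAILURE` it succeeds, applies one update and marks the other as rejected, which is the shape of behaviour the commit message describes for the non-atomic flow.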
1 parent b3de383 commit 0e358de
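
The speedup figures quoted in the commit message can be sanity-checked from the reported means and standard deviations. The sketch below assumes the "±" on the speedup comes from ordinary first-order error propagation for a ratio of two independent means; that assumption is not stated in the commit, but it reproduces the reported numbers to within rounding. The helper names (`speedup`, `speedup_sigma`, `newton_sqrt`) are made up for this example, and the square root is computed by hand only to avoid depending on libm.

```c
#include <assert.h>

/* Square root by Newton's method; avoids linking against libm. */
static double newton_sqrt(double x)
{
	double guess = x > 1.0 ? x : 1.0;
	int i;

	if (x <= 0.0)
		return 0.0;
	for (i = 0; i < 60; i++)
		guess = 0.5 * (guess + x / guess);
	return guess;
}

/* How many times faster the "fast" run is, from the two mean times. */
static double speedup(double mean_slow, double mean_fast)
{
	assert(mean_fast > 0.0);
	return mean_slow / mean_fast;
}

/*
 * Uncertainty of the ratio, assuming independent measurements:
 * sigma_ratio = ratio * sqrt((sigma_slow/mean_slow)^2 + (sigma_fast/mean_fast)^2)
 */
static double speedup_sigma(double mean_slow, double sigma_slow,
			    double mean_fast, double sigma_fast)
{
	double rel_slow = sigma_slow / mean_slow;
	double rel_fast = sigma_fast / mean_fast;

	return speedup(mean_slow, mean_fast) *
	       newton_sqrt(rel_slow * rel_slow + rel_fast * rel_fast);
}
```

Feeding in the reftable numbers in milliseconds (3403 ± 775 against 154.3 ± 17.6) gives roughly 22.05 ± 5.62, and the files numbers (605.5 ± 9.4 against 485.8 ± 4.3) give roughly 1.25 ± 0.02. The small difference from the reported 22.06 is consistent with the benchmark tool working from unrounded means.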

1 file changed, +73 -54 lines changed

builtin/fetch.c

Lines changed: 73 additions & 54 deletions
@@ -640,17 +640,13 @@ static struct ref *get_ref_map(struct remote *remote,
 	return ref_map;
 }
 
-#define STORE_REF_ERROR_OTHER 1
-#define STORE_REF_ERROR_DF_CONFLICT 2
-
 static int s_update_ref(const char *action,
 			struct ref *ref,
 			struct ref_transaction *transaction,
 			int check_old)
 {
 	char *msg;
 	char *rla = getenv("GIT_REFLOG_ACTION");
-	struct ref_transaction *our_transaction = NULL;
 	struct strbuf err = STRBUF_INIT;
 	int ret;
 
@@ -660,43 +656,10 @@ static int s_update_ref(const char *action,
 		rla = default_rla.buf;
 	msg = xstrfmt("%s: %s", rla, action);
 
-	/*
-	 * If no transaction was passed to us, we manage the transaction
-	 * ourselves. Otherwise, we trust the caller to handle the transaction
-	 * lifecycle.
-	 */
-	if (!transaction) {
-		transaction = our_transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
-									     0, &err);
-		if (!transaction) {
-			ret = STORE_REF_ERROR_OTHER;
-			goto out;
-		}
-	}
-
 	ret = ref_transaction_update(transaction, ref->name, &ref->new_oid,
 				     check_old ? &ref->old_oid : NULL,
 				     NULL, NULL, 0, msg, &err);
-	if (ret) {
-		ret = STORE_REF_ERROR_OTHER;
-		goto out;
-	}
-
-	if (our_transaction) {
-		switch (ref_transaction_commit(our_transaction, &err)) {
-		case 0:
-			break;
-		case REF_TRANSACTION_ERROR_NAME_CONFLICT:
-			ret = STORE_REF_ERROR_DF_CONFLICT;
-			goto out;
-		default:
-			ret = STORE_REF_ERROR_OTHER;
-			goto out;
-		}
-	}
 
-out:
-	ref_transaction_free(our_transaction);
 	if (ret)
 		error("%s", err.buf);
 	strbuf_release(&err);
@@ -1139,7 +1102,6 @@ N_("it took %.2f seconds to check forced updates; you can use\n"
    "to avoid this check\n");
 
 static int store_updated_refs(struct display_state *display_state,
-			      const char *remote_name,
 			      int connectivity_checked,
 			      struct ref_transaction *transaction, struct ref *ref_map,
 			      struct fetch_head *fetch_head,
@@ -1277,11 +1239,6 @@ static int store_updated_refs(struct display_state *display_state,
 		}
 	}
 
-	if (rc & STORE_REF_ERROR_DF_CONFLICT)
-		error(_("some local refs could not be updated; try running\n"
-		      " 'git remote prune %s' to remove any old, conflicting "
-		      "branches"), remote_name);
-
 	if (advice_enabled(ADVICE_FETCH_SHOW_FORCED_UPDATES)) {
 		if (!config->show_forced_updates) {
 			warning(_(warn_show_forced_updates));
@@ -1365,9 +1322,8 @@ static int fetch_and_consume_refs(struct display_state *display_state,
 	}
 
 	trace2_region_enter("fetch", "consume_refs", the_repository);
-	ret = store_updated_refs(display_state, transport->remote->name,
-				 connectivity_checked, transaction, ref_map,
-				 fetch_head, config);
+	ret = store_updated_refs(display_state, connectivity_checked,
+				 transaction, ref_map, fetch_head, config);
 	trace2_region_leave("fetch", "consume_refs", the_repository);
 
 out:
@@ -1687,6 +1643,36 @@ static int set_head(const struct ref *remote_refs, struct remote *remote)
 	return result;
 }
 
+struct ref_rejection_data {
+	int *retcode;
+	int conflict_msg_shown;
+	const char *remote_name;
+};
+
+static void ref_transaction_rejection_handler(const char *refname,
+					      const struct object_id *old_oid UNUSED,
+					      const struct object_id *new_oid UNUSED,
+					      const char *old_target UNUSED,
+					      const char *new_target UNUSED,
+					      enum ref_transaction_error err,
+					      void *cb_data)
+{
+	struct ref_rejection_data *data = cb_data;
+
+	if (err == REF_TRANSACTION_ERROR_NAME_CONFLICT && !data->conflict_msg_shown) {
+		error(_("some local refs could not be updated; try running\n"
+			" 'git remote prune %s' to remove any old, conflicting "
+			"branches"), data->remote_name);
+		data->conflict_msg_shown = 1;
+	} else {
+		const char *reason = ref_transaction_error_msg(err);
+
+		error(_("fetching ref %s failed: %s"), refname, reason);
+	}
+
+	*data->retcode = 1;
+}
+
 static int do_fetch(struct transport *transport,
 		    struct refspec *rs,
 		    const struct fetch_config *config)
@@ -1807,6 +1793,24 @@ static int do_fetch(struct transport *transport,
 			retcode = 1;
 	}
 
+	/*
+	 * If not atomic, we can still use batched updates, which would be much
+	 * more performant. We don't initiate the transaction before pruning,
+	 * since pruning must be an independent step, to avoid F/D conflicts.
+	 *
+	 * TODO: if reference transactions gain logical conflict resolution, we
+	 * can delete and create refs (with F/D conflicts) in the same transaction
+	 * and this can be moved above the 'prune_refs()' block.
+	 */
+	if (!transaction) {
+		transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
+							  REF_TRANSACTION_ALLOW_FAILURE, &err);
+		if (!transaction) {
+			retcode = -1;
+			goto cleanup;
+		}
+	}
+
 	if (fetch_and_consume_refs(&display_state, transport, transaction, ref_map,
 				   &fetch_head, config)) {
 		retcode = 1;
@@ -1838,16 +1842,31 @@ static int do_fetch(struct transport *transport,
 		free_refs(tags_ref_map);
 	}
 
-	if (transaction) {
-		if (retcode)
-			goto cleanup;
+	if (retcode)
+		goto cleanup;
 
-		retcode = ref_transaction_commit(transaction, &err);
+	retcode = ref_transaction_commit(transaction, &err);
+	if (retcode) {
+		/*
+		 * Explicitly handle transaction cleanup to avoid
+		 * aborting an already closed transaction.
+		 */
+		ref_transaction_free(transaction);
+		transaction = NULL;
+		goto cleanup;
+	}
+
+	if (!atomic_fetch) {
+		struct ref_rejection_data data = {
+			.retcode = &retcode,
+			.conflict_msg_shown = 0,
+			.remote_name = transport->remote->name,
+		};
+
+		ref_transaction_for_each_rejected_update(transaction,
+							 ref_transaction_rejection_handler,
+							 &data);
 		if (retcode) {
-			/*
-			 * Explicitly handle transaction cleanup to avoid
-			 * aborting an already closed transaction.
-			 */
 			ref_transaction_free(transaction);
 			transaction = NULL;
 			goto cleanup;
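
The reporting pattern in the last two hunks (commit once, then walk the rejected updates with a callback that sets the exit status and prints the prune hint only once) can be tried in isolation. The following is a self-contained sketch, not Git code: `toy_update`, `toy_for_each_rejected`, `toy_rejection_handler` and the `hints`/`errors` counters are hypothetical stand-ins added so the behaviour can be observed, and the messages are shortened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; not Git's types or functions. */
enum toy_error { TOY_OK = 0, TOY_NAME_CONFLICT, TOY_OTHER };

struct toy_update {
	const char *refname;
	enum toy_error rejected;
};

typedef void toy_rejected_fn(const char *refname, enum toy_error err,
			     void *cb_data);

struct toy_rejection_data {
	int *retcode;
	int conflict_msg_shown;
	int hints;  /* toy-only counter: prune hints printed */
	int errors; /* toy-only counter: per-ref errors printed */
	const char *remote_name;
};

/* Call fn for every update that was rejected at commit time. */
static void toy_for_each_rejected(const struct toy_update *updates, size_t nr,
				  toy_rejected_fn *fn, void *cb_data)
{
	size_t i;

	assert(fn);
	for (i = 0; i < nr; i++)
		if (updates[i].rejected != TOY_OK)
			fn(updates[i].refname, updates[i].rejected, cb_data);
}

/*
 * Same branch structure as the handler in the diff: the first name
 * conflict prints the prune hint; every other rejection, including a
 * later name conflict, prints a per-ref error. Any rejection makes
 * the overall exit status non-zero.
 */
static void toy_rejection_handler(const char *refname, enum toy_error err,
				  void *cb_data)
{
	struct toy_rejection_data *data = cb_data;

	if (err == TOY_NAME_CONFLICT && !data->conflict_msg_shown) {
		fprintf(stderr, "error: some local refs could not be updated; "
			"try running 'git remote prune %s'\n",
			data->remote_name);
		data->conflict_msg_shown = 1;
		data->hints++;
	} else {
		fprintf(stderr, "error: fetching ref %s failed\n", refname);
		data->errors++;
	}

	*data->retcode = 1;
}
```

Because the callback only runs after the commit, the messages appear in one group at the end rather than interleaved with each update, which matches the commit message's note that the order of printed errors changes while the behavior stays the same.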
