Skip to content

Commit 9d2962a

Browse files
KarthikNayakgitster
authored andcommitted
receive-pack: use batched reference updates
The reference updates performed as a part of 'git-receive-pack(1)', take place one at a time. For each reference update, a new transaction is created and committed. This is necessary to ensure we can allow individual updates to fail without failing the entire command. The command also supports an 'atomic' mode, which uses a single transaction to update all of the references. But this mode has an all-or-nothing approach, where if a single update fails, all updates would fail. In 23fc8e4 (refs: implement batch reference update support, 2025-04-08), we introduced a new mechanism to batch reference updates. Under the hood, this uses a single transaction to perform a batch of reference updates, while allowing only individual updates to fail. Utilize this newly introduced batch update mechanism in 'git-receive-pack(1)'. This provides a significant bump in performance, especially when dealing with repositories with large number of references. With the reftable backend there is a 18x performance improvement, when performing receive-pack with 10000 refs: Benchmark 1: receive: many refs (refformat = reftable, refcount = 10000, revision = master) Time (mean ± σ): 4.276 s ± 0.078 s [User: 0.796 s, System: 3.318 s] Range (min … max): 4.185 s … 4.430 s 10 runs Benchmark 2: receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD) Time (mean ± σ): 235.4 ms ± 6.9 ms [User: 75.4 ms, System: 157.3 ms] Range (min … max): 228.5 ms … 254.2 ms 11 runs Summary receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran 18.16 ± 0.63 times faster than receive: many refs (refformat = reftable, refcount = 10000, revision = master) In similar conditions, the files backend sees a 1.21x performance improvement: Benchmark 1: receive: many refs (refformat = files, refcount = 10000, revision = master) Time (mean ± σ): 1.121 s ± 0.021 s [User: 0.128 s, System: 0.975 s] Range (min … max): 1.097 s … 1.156 s 10 runs Benchmark 2: receive: many refs (refformat = files, refcount = 10000, revision = HEAD) Time (mean ± σ): 927.9 ms ± 22.6 ms [User: 99.0 ms, System: 815.2 ms] Range (min … max): 903.1 ms … 978.0 ms 10 runs Summary receive: many refs (refformat = files, refcount = 10000, revision = HEAD) ran 1.21 ± 0.04 times faster than receive: many refs (refformat = files, refcount = 10000, revision = master) As using batched updates requires the error handling to be moved to the end of the flow, create and use a 'struct strset' to track the failed refs and attribute the correct errors to them. This change also uncovers an issue when a client provides multiple updates to the same reference. For example: $ git send-pack remote.git A:foo B:foo Enumerating objects: 3, done. Counting objects: 100% (3/3), done. Delta compression using up to 20 threads Compressing objects: 100% (2/2), done. Writing objects: 100% (3/3), 226 bytes | 226.00 KiB/s, done. Total 3 (delta 1), reused 0 (delta 0), pack-reused 0 (from 0) remote: error: cannot lock ref 'refs/heads/foo': reference already exists To remote.git ! [remote rejected] A -> foo (failed to update ref) ! [remote failure] B -> foo (remote failed to report status) As you can see, the remote runs into an error because it cannot lock the target reference for the second update. Furthermore, the remote complains that the first update has been rejected whereas the second update didn't receive any status update because we failed to lock it. Reading this status message alone a user would probably expect that `foo` has not been updated at all. But that's not the case: while we claim that the ref wasn't updated, it surprisingly points to `A` now. One could argue that this is merely an error in how we report the result of this push. But ultimately, the user's request itself is already broken and doesn't make any sense in the first place and cannot ever lead to a sensible outcome that honors the full request. The conversion to batched transactions fixes the issue because we now try to queue both updates in the same transaction. As such, the transaction itself will notice this conflict and refuse the update altogether before we commit any of the values. Note that this requires changes to a couple of tests in t5408 that happened to exercise this behaviour. Given that the generated output is misleading and given that the user request cannot ever be fully honored this really feels more like a bug than properly designed behaviour. As such, changing the behaviour feels like the right thing to do. Since now reference updates are batched, the 'reference-transaction' hook will be invoked with all updates together. Currently git will 'die' when the hook returns with a non-zero exit status in the 'prepared' stage. For 'git-receive-pack(1)', this allowed users to reject an individual reference update, git would have applied previous updates but immediately abort further execution. This is definitely an incorrect usage of this hook, since the right place to do this would be the 'update' hook. This patch retains the latter behavior, but 'reference-transaction' hook now changes to a all-or-nothing behavior when a non-zero exit status is returned in the 'prepared' stage, since batch updates use a transaction under the hood. This explains the change in 't1416'. Helped-by: Jeff King <[email protected]> Helped-by: Patrick Steinhardt <[email protected]> Signed-off-by: Karthik Nayak <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 77188b5 commit 9d2962a

File tree

3 files changed

+56
-23
lines changed

3 files changed

+56
-23
lines changed

builtin/receive-pack.c

Lines changed: 48 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1845,35 +1845,67 @@ static void BUG_if_skipped_connectivity_check(struct command *commands,
18451845
BUG_if_bug("connectivity check skipped???");
18461846
}
18471847

1848+
static void ref_transaction_rejection_handler(const char *refname,
1849+
const struct object_id *old_oid UNUSED,
1850+
const struct object_id *new_oid UNUSED,
1851+
const char *old_target UNUSED,
1852+
const char *new_target UNUSED,
1853+
enum ref_transaction_error err,
1854+
void *cb_data)
1855+
{
1856+
struct strmap *failed_refs = cb_data;
1857+
1858+
strmap_put(failed_refs, refname, (char *)ref_transaction_error_msg(err));
1859+
}
1860+
18481861
static void execute_commands_non_atomic(struct command *commands,
18491862
struct shallow_info *si)
18501863
{
18511864
struct command *cmd;
18521865
struct strbuf err = STRBUF_INIT;
1866+
const char *reported_error = NULL;
1867+
struct strmap failed_refs = STRMAP_INIT;
1868+
1869+
transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
1870+
REF_TRANSACTION_ALLOW_FAILURE, &err);
1871+
if (!transaction) {
1872+
rp_error("%s", err.buf);
1873+
strbuf_reset(&err);
1874+
reported_error = "transaction failed to start";
1875+
goto failure;
1876+
}
18531877

18541878
for (cmd = commands; cmd; cmd = cmd->next) {
18551879
if (!should_process_cmd(cmd) || cmd->run_proc_receive)
18561880
continue;
18571881

1858-
transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
1859-
0, &err);
1860-
if (!transaction) {
1861-
rp_error("%s", err.buf);
1862-
strbuf_reset(&err);
1863-
cmd->error_string = "transaction failed to start";
1864-
continue;
1865-
}
1866-
18671882
cmd->error_string = update(cmd, si);
1883+
}
18681884

1869-
if (!cmd->error_string
1870-
&& ref_transaction_commit(transaction, &err)) {
1871-
rp_error("%s", err.buf);
1872-
strbuf_reset(&err);
1873-
cmd->error_string = "failed to update ref";
1874-
}
1875-
ref_transaction_free(transaction);
1885+
if (ref_transaction_commit(transaction, &err)) {
1886+
rp_error("%s", err.buf);
1887+
reported_error = "failed to update refs";
1888+
goto failure;
1889+
}
1890+
1891+
ref_transaction_for_each_rejected_update(transaction,
1892+
ref_transaction_rejection_handler,
1893+
&failed_refs);
1894+
1895+
if (strmap_empty(&failed_refs))
1896+
goto cleanup;
1897+
1898+
failure:
1899+
for (cmd = commands; cmd; cmd = cmd->next) {
1900+
if (reported_error)
1901+
cmd->error_string = reported_error;
1902+
else if (strmap_contains(&failed_refs, cmd->ref_name))
1903+
cmd->error_string = strmap_get(&failed_refs, cmd->ref_name);
18761904
}
1905+
1906+
cleanup:
1907+
ref_transaction_free(transaction);
1908+
strmap_clear(&failed_refs, 0);
18771909
strbuf_release(&err);
18781910
}
18791911

t/t1416-ref-transaction-hooks.sh

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -120,8 +120,6 @@ test_expect_success 'interleaving hook calls succeed' '
120120
121121
cat >expect <<-EOF &&
122122
hooks/update refs/tags/PRE $ZERO_OID $PRE_OID
123-
hooks/reference-transaction prepared
124-
hooks/reference-transaction committed
125123
hooks/update refs/tags/POST $ZERO_OID $POST_OID
126124
hooks/reference-transaction prepared
127125
hooks/reference-transaction committed

t/t5408-send-pack-stdin.sh

Lines changed: 8 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -69,21 +69,24 @@ test_expect_success 'stdin mixed with cmdline' '
6969

7070
test_expect_success 'cmdline refs written in order' '
7171
clear_remote &&
72-
test_must_fail git send-pack remote.git A:foo B:foo &&
73-
verify_push A foo
72+
test_must_fail git send-pack remote.git A:foo B:foo 2>err &&
73+
test_grep "multiple updates for ref ${SQ}refs/heads/foo${SQ} not allowed" err &&
74+
test_must_fail git --git-dir=remote.git rev-parse foo
7475
'
7576

7677
test_expect_success 'cmdline refs with multiple duplicates' '
7778
clear_remote &&
78-
test_must_fail git send-pack remote.git A:foo B:foo C:foo &&
79-
verify_push A foo
79+
test_must_fail git send-pack remote.git A:foo B:foo C:foo 2>err &&
80+
test_grep "multiple updates for ref ${SQ}refs/heads/foo${SQ} not allowed" err &&
81+
test_must_fail git --git-dir=remote.git rev-parse foo
8082
'
8183

8284
test_expect_success '--stdin refs come after cmdline' '
8385
clear_remote &&
8486
echo A:foo >input &&
8587
test_must_fail git send-pack remote.git --stdin B:foo <input &&
86-
verify_push B foo
88+
test_grep "multiple updates for ref ${SQ}refs/heads/foo${SQ} not allowed" err &&
89+
test_must_fail git --git-dir=remote.git rev-parse foo
8790
'
8891

8992
test_expect_success 'refspecs and --mirror do not mix (cmdline)' '

0 commit comments

Comments
 (0)