Commit ab6eea6

peff authored and gitster committed
receive-pack: use oidset to de-duplicate .have lines
If you have an alternate object store with a very large number of refs, the peak memory usage of the sha1_array can grow high, even if most of them are duplicates that end up not being printed at all.

The similar for_each_alternate_ref() code-paths in fetch-pack solve this by using flags in "struct object" to de-duplicate (and so are relying on obj_hash at the core).

But we don't have a "struct object" at all in this case. We could call lookup_unknown_object() to get one, but if our goal is reducing memory footprint, it's not great:

  - an unknown object is as large as the largest object type (a commit), which is bigger than an oidset entry

  - we can free the memory after our ref advertisement, but "struct object" entries persist forever (and the receive-pack may hang around for a long time, as the bottleneck is often client upload bandwidth).

So let's use an oidset. Note that unlike a sha1-array it doesn't sort the output as a side effect. However, our output is at least stable, because for_each_alternate_ref() will give us the sha1s in ref-sorted order.

In one particularly pathological case with an alternate that has 60,000 unique refs out of 80 million total, this reduced the peak heap usage of "git receive-pack . </dev/null" from 13GB to 14MB.

Signed-off-by: Jeff King <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 29c2bd5 commit ab6eea6

1 file changed, +12 -14 lines changed
builtin/receive-pack.c

Lines changed: 12 additions & 14 deletions
@@ -21,6 +21,7 @@
 #include "sigchain.h"
 #include "fsck.h"
 #include "tmp-objdir.h"
+#include "oidset.h"
 
 static const char * const receive_pack_usage[] = {
 	N_("git receive-pack <git-dir>"),
@@ -271,27 +272,24 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	return 0;
 }
 
-static int show_one_alternate_sha1(const unsigned char sha1[20], void *unused)
+static void show_one_alternate_ref(const char *refname,
+				   const struct object_id *oid,
+				   void *data)
 {
-	show_ref(".have", sha1);
-	return 0;
-}
+	struct oidset *seen = data;
 
-static void collect_one_alternate_ref(const char *refname,
-				      const struct object_id *oid,
-				      void *data)
-{
-	struct sha1_array *sa = data;
-	sha1_array_append(sa, oid->hash);
+	if (oidset_insert(seen, oid))
+		return;
+
+	show_ref(".have", oid->hash);
 }
 
 static void write_head_info(void)
 {
-	struct sha1_array sa = SHA1_ARRAY_INIT;
+	static struct oidset seen = OIDSET_INIT;
 
-	for_each_alternate_ref(collect_one_alternate_ref, &sa);
-	sha1_array_for_each_unique(&sa, show_one_alternate_sha1, NULL);
-	sha1_array_clear(&sa);
+	for_each_alternate_ref(show_one_alternate_ref, &seen);
+	oidset_clear(&seen);
 	for_each_ref(show_ref_cb, NULL);
 	if (!sent_capabilities)
 		show_ref("capabilities^{}", null_sha1);
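The pattern in the new callback (insert into a set, and emit only when the insert reports the id as new) can be tried outside of git with a toy set. The sketch below is an illustration only: `toy_idset`, `advert_state`, `advertise_have` and `demo_count_unique` are hypothetical names, the fixed-capacity open-addressing table is a stand-in rather than git's actual oidset implementation, and the counter replaces the real `show_ref(".have", ...)` call. It assumes the insert function follows the convention the diff relies on, returning nonzero when the id was already present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20 /* raw SHA-1 length, matching oid->hash in the diff */

/* Hypothetical stand-in for a set of object ids (not git's oidset). */
struct toy_idset {
	unsigned char (*slots)[ID_LEN];
	unsigned char *used;
	size_t cap;   /* must be a power of two */
	size_t count; /* invariant: count < cap, so probing always ends */
};

static int toy_idset_init(struct toy_idset *s, size_t cap)
{
	s->slots = calloc(cap, sizeof(*s->slots));
	s->used = calloc(cap, 1);
	s->cap = cap;
	s->count = 0;
	return (s->slots && s->used) ? 0 : -1;
}

static void toy_idset_clear(struct toy_idset *s)
{
	free(s->slots);
	free(s->used);
	memset(s, 0, sizeof(*s));
}

/* Returns 1 if id was already in the set, 0 if it was newly added. */
static int toy_idset_insert(struct toy_idset *s, const unsigned char *id)
{
	size_t i;

	memcpy(&i, id, sizeof(i)); /* reuse the leading id bytes as the hash */
	for (i &= s->cap - 1; s->used[i]; i = (i + 1) & (s->cap - 1))
		if (!memcmp(s->slots[i], id, ID_LEN))
			return 1;
	assert(s->count + 1 < s->cap); /* toy table never grows */
	memcpy(s->slots[i], id, ID_LEN);
	s->used[i] = 1;
	s->count++;
	return 0;
}

struct advert_state {
	struct toy_idset seen;
	size_t lines_emitted;
};

/* Same shape as show_one_alternate_ref(): skip ids we have seen. */
static void advertise_have(const unsigned char *id, void *data)
{
	struct advert_state *st = data;

	if (toy_idset_insert(&st->seen, id))
		return;
	st->lines_emitted++; /* a real server would write the line here */
}

/* Streams n packed ids through the callback; returns lines emitted. */
static size_t demo_count_unique(const unsigned char *ids, size_t n)
{
	struct advert_state st = { 0 };
	size_t cap = 16, i, emitted;

	while (cap < 2 * n)
		cap <<= 1;
	if (toy_idset_init(&st.seen, cap) < 0) {
		toy_idset_clear(&st.seen);
		return 0;
	}
	for (i = 0; i < n; i++)
		advertise_have(ids + i * ID_LEN, &st);
	emitted = st.lines_emitted;
	toy_idset_clear(&st.seen);
	return emitted;
}
```

Because duplicates are dropped as they arrive, the memory held at any point is proportional to the number of distinct ids rather than the number of refs visited, which is the property the commit message is after. Output order follows the order in which ids are fed in, since nothing is sorted.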
