Skip to content

Commit 01639ec

Browse files
pks-tgitster
authored andcommitted
reftable/record: avoid copying author info
Each reflog entry contains information regarding the authorship of who has made the change. This authorship information is not the same as that of any of the commits that the reflog entry references, but instead corresponds to the local user that has executed the command. Thus, it is almost always the case that all reflog entries have the same author. We can make use of this fact when decoding reftable records: instead of freeing and then reallocating the authorship information of log records, we can special-case when the next record during an iteration has the exact same authorship as the preceding record. If so, then there is no need to reallocate the respective fields. This change results in two allocations less per log record that we're iterating over in the most common case. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 4,068,487 allocs, 4,068,365 frees, 332,011,793 bytes allocated An alternative would be to store the capacity of both name and email and then use `REFTABLE_ALLOC_GROW()` to conditionally reallocate the array. But reftable records are copied around quite a lot, and thus we need to be a bit mindful of the overall record size. Furthermore, a memory comparison should also be more efficient than having to copy over memory even if we wouldn't have to allocate a new array every time. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 87ff723 commit 01639ec

File tree

1 file changed

+21
-8
lines changed

1 file changed

+21
-8
lines changed

reftable/record.c

Lines changed: 21 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -896,21 +896,34 @@ static int reftable_log_record_decode(void *rec, struct strbuf key,
896896
goto done;
897897
string_view_consume(&in, n);
898898

899-
r->value.update.name =
900-
reftable_realloc(r->value.update.name, dest.len + 1);
901-
memcpy(r->value.update.name, dest.buf, dest.len);
902-
r->value.update.name[dest.len] = 0;
899+
/*
900+
* In almost all cases we can expect the reflog name to not change for
901+
* reflog entries as they are tied to the local identity, not to the
902+
* target commits. As an optimization for this common case we can thus
903+
* skip copying over the name in case it's accurate already.
904+
*/
905+
if (!r->value.update.name ||
906+
strcmp(r->value.update.name, dest.buf)) {
907+
r->value.update.name =
908+
reftable_realloc(r->value.update.name, dest.len + 1);
909+
memcpy(r->value.update.name, dest.buf, dest.len);
910+
r->value.update.name[dest.len] = 0;
911+
}
903912

904913
strbuf_reset(&dest);
905914
n = decode_string(&dest, in);
906915
if (n < 0)
907916
goto done;
908917
string_view_consume(&in, n);
909918

910-
r->value.update.email =
911-
reftable_realloc(r->value.update.email, dest.len + 1);
912-
memcpy(r->value.update.email, dest.buf, dest.len);
913-
r->value.update.email[dest.len] = 0;
919+
/* Same as above, but for the reflog email. */
920+
if (!r->value.update.email ||
921+
strcmp(r->value.update.email, dest.buf)) {
922+
r->value.update.email =
923+
reftable_realloc(r->value.update.email, dest.len + 1);
924+
memcpy(r->value.update.email, dest.buf, dest.len);
925+
r->value.update.email[dest.len] = 0;
926+
}
914927

915928
ts = 0;
916929
n = get_var_int(&ts, &in);

0 commit comments

Comments
 (0)