Commit f8c1a8e

pks-t authored and gitster committed
reftable/merged: circumvent pqueue with single subiter
The merged iterator uses a priority queue to order records so that we can yield them in the expected order. This priority queue of course comes with some overhead as we need to add, compare and remove entries in that priority queue.

In the general case, that overhead cannot really be avoided. But when we have a single subiter left then there is no need to use the priority queue anymore because the order is exactly the same as what that subiter would return.

While having a single subiter may sound like an edge case, it happens more frequently than one might think. In the most common scenario, you can expect a repository to have a single large table that contains most of the records and then a set of smaller tables which contain later additions to the reftable stack. In this case it is quite likely that we exhaust subiters of those smaller stacks before exhausting the large table.

Special-case this and return records directly from the remaining subiter. This results in a sizeable speedup when iterating over 1m refs in a repository with a single table:

    Benchmark 1: show-ref: single matching ref (revision = HEAD~)
      Time (mean ± σ):     135.4 ms ±   4.4 ms    [User: 132.5 ms, System: 2.8 ms]
      Range (min … max):   131.0 ms … 166.3 ms    1000 runs

    Benchmark 2: show-ref: single matching ref (revision = HEAD)
      Time (mean ± σ):     126.3 ms ±   3.9 ms    [User: 123.3 ms, System: 2.8 ms]
      Range (min … max):   122.7 ms … 157.0 ms    1000 runs

    Summary
      show-ref: single matching ref (revision = HEAD) ran
        1.07 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 3b6dd6a commit f8c1a8e

1 file changed: +22 -2 lines changed

reftable/merged.c

Lines changed: 22 additions & 2 deletions
@@ -87,16 +87,36 @@ static int merged_iter_next_entry(struct merged_iter *mi,
 				  struct reftable_record *rec)
 {
 	struct pq_entry entry = { 0 };
-	int err = 0;
+	int err = 0, empty;
+
+	empty = merged_iter_pqueue_is_empty(mi->pq);
 
 	if (mi->advance_index >= 0) {
+		/*
+		 * When there are no pqueue entries then we only have a single
+		 * subiter left. There is no need to use the pqueue in that
+		 * case anymore as we know that the subiter will return entries
+		 * in the correct order already.
+		 *
+		 * While this may sound like a very specific edge case, it may
+		 * happen more frequently than you think. Most repositories
+		 * will end up having a single large base table that contains
+		 * most of the refs. It's thus likely that we exhaust all
+		 * subiters but the one from that base ref.
+		 */
+		if (empty)
+			return iterator_next(&mi->subiters[mi->advance_index].iter,
+					     rec);
+
 		err = merged_iter_advance_subiter(mi, mi->advance_index);
 		if (err < 0)
 			return err;
+		if (!err)
+			empty = 0;
 		mi->advance_index = -1;
 	}
 
-	if (merged_iter_pqueue_is_empty(mi->pq))
+	if (empty)
 		return 1;
 
 	entry = merged_iter_pqueue_remove(&mi->pq);
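
To see the effect of the fast path in isolation, here is a minimal standalone sketch of the same idea. It is not the reftable code: `struct sub_iter`, `struct merged`, `pq_add()`, `pq_remove()`, `merged_next()` and the `pq_ops` counter are all hypothetical names invented for this illustration, the "records" are plain sorted ints, the queue is a small unsorted array rather than a real heap, and duplicate keys across subiters are not handled. Only the 0 / 1 return convention and the `advance_index` bookkeeping are modelled on the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a subiterator: a cursor over a sorted int array. */
struct sub_iter {
	const int *vals;
	size_t len, pos;
};

/* Returns 0 and stores the next value, or 1 once the subiter is exhausted. */
static int sub_iter_next(struct sub_iter *it, int *out)
{
	if (it->pos >= it->len)
		return 1;
	*out = it->vals[it->pos++];
	return 0;
}

#define MAX_SUBS 8

struct pq_entry {
	int val;
	size_t idx; /* which subiter produced val */
};

struct merged {
	struct sub_iter *subs;
	struct pq_entry pq[MAX_SUBS]; /* toy queue: unsorted array, linear-scan min */
	size_t pq_len;
	int advance_index;            /* subiter that yielded the last value, or -1 */
	size_t pq_ops;                /* number of queue adds and removes, for illustration */
};

static void pq_add(struct merged *m, int val, size_t idx)
{
	assert(m->pq_len < MAX_SUBS);
	m->pq[m->pq_len].val = val;
	m->pq[m->pq_len].idx = idx;
	m->pq_len++;
	m->pq_ops++;
}

static struct pq_entry pq_remove(struct merged *m)
{
	size_t min = 0, i;
	struct pq_entry e;

	for (i = 1; i < m->pq_len; i++)
		if (m->pq[i].val < m->pq[min].val)
			min = i;
	e = m->pq[min];
	m->pq[min] = m->pq[--m->pq_len];
	m->pq_ops++;
	return e;
}

/* Seed the queue with the first value of every non-empty subiter. */
static void merged_init(struct merged *m, struct sub_iter *subs, size_t n)
{
	size_t i;
	int v;

	m->subs = subs;
	m->pq_len = 0;
	m->advance_index = -1;
	m->pq_ops = 0;
	for (i = 0; i < n; i++)
		if (!sub_iter_next(&subs[i], &v))
			pq_add(m, v, i);
}

static int merged_next(struct merged *m, int *out)
{
	struct pq_entry e;
	int v;

	if (m->advance_index >= 0) {
		/*
		 * Fast path: nothing is queued, so the subiter we yielded
		 * from last is the only one left. Read from it directly.
		 */
		if (!m->pq_len)
			return sub_iter_next(&m->subs[m->advance_index], out);

		/* Slow path: refill the queue from the subiter we consumed. */
		if (!sub_iter_next(&m->subs[m->advance_index], &v))
			pq_add(m, v, (size_t)m->advance_index);
		m->advance_index = -1;
	}

	if (!m->pq_len)
		return 1;

	e = pq_remove(m);
	*out = e.val;
	m->advance_index = (int)e.idx;
	return 0;
}

/* Pull up to cap values out of the merged iterator; returns how many were read. */
static size_t merged_drain(struct merged *m, int *out, size_t cap)
{
	size_t n = 0;

	while (n < cap && !merged_next(m, &out[n]))
		n++;
	return n;
}
```

Calling `merged_init()` on one long array plus one short array and then `merged_drain()` yields the values in sorted order. Once the short array runs dry and the queue is empty, every remaining value comes straight from the long array and `pq_ops` stops growing, which is the overhead the commit avoids.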
