Commit 4950aca

pks-t authored and gitster committed
reftable: document reading and writing indices
The way the index gets written and read is not trivial at all and requires the reader to piece together a bunch of parts to figure out how it works. Add some documentation to hopefully make this easier to understand for the next reader.

Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent e748560 commit 4950aca

2 files changed: 50 additions, 0 deletions


reftable/reader.c

Lines changed: 27 additions & 0 deletions
@@ -508,11 +508,38 @@ static int reader_seek_indexed(struct reftable_reader *r,
 	if (err < 0)
 		goto done;
 
+	/*
+	 * The index may consist of multiple levels, where each level may have
+	 * multiple index blocks. We start by doing a linear search in the
+	 * highest layer that identifies the relevant index block as well as
+	 * the record inside that block that corresponds to our wanted key.
+	 */
 	err = reader_seek_linear(&index_iter, &want_index);
 	if (err < 0)
 		goto done;
 
+	/*
+	 * Traverse down the levels until we find a non-index entry.
+	 */
 	while (1) {
+		/*
+		 * In case we seek a record that does not exist the index iter
+		 * will tell us that the iterator is over. This works because
+		 * the last index entry of the current level will contain the
+		 * last key it knows about. So in case our seeked key is larger
+		 * than the last indexed key we know that it won't exist.
+		 *
+		 * There is one subtlety in the layout of the index section
+		 * that makes this work as expected: the highest-level index is
+		 * at end of the section and will point backwards and thus we
+		 * start reading from the end of the index section, not the
+		 * beginning.
+		 *
+		 * If that wasn't the case and the order was reversed then the
+		 * linear seek would seek into the lower levels and traverse
+		 * all levels of the index only to find out that the key does
+		 * not exist.
+		 */
 		err = table_iter_next(&index_iter, &index_result);
 		table_iter_block_done(&index_iter);
 		if (err != 0)

reftable/writer.c

Lines changed: 23 additions & 0 deletions
@@ -391,6 +391,24 @@ static int writer_finish_section(struct reftable_writer *w)
 	if (err < 0)
 		return err;
 
+	/*
+	 * When the section we are about to index has a lot of blocks then the
+	 * index itself may span across multiple blocks, as well. This would
+	 * require a linear scan over index blocks only to find the desired
+	 * indexed block, which is inefficient. Instead, we write a multi-level
+	 * index where index records of level N+1 will refer to index blocks of
+	 * level N. This isn't constant time, either, but at least logarithmic.
+	 *
+	 * This loop handles writing this multi-level index. Note that we write
+	 * the lowest-level index pointing to the indexed blocks first. We then
+	 * continue writing additional index levels until the current level has
+	 * less blocks than the threshold so that the highest level will be at
+	 * the end of the index section.
+	 *
+	 * Readers are thus required to start reading the index section from
+	 * its end, which is why we set `index_start` to the beginning of the
+	 * last index section.
+	 */
 	while (w->index_len > threshold) {
 		struct reftable_index_record *idx = NULL;
 		size_t i, idx_len;
@@ -427,6 +445,11 @@ static int writer_finish_section(struct reftable_writer *w)
 		reftable_free(idx);
 	}
 
+	/*
+	 * The index may still contain a number of index blocks lower than the
+	 * threshold. Clear it so that these entries don't leak into the next
+	 * index section.
+	 */
 	writer_clear_index(w);
 
 	bstats = writer_reftable_block_stats(w, typ);
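
The level-by-level write order described in the writer comment can be sketched with a toy planner. It is a simplified model rather than the code in `writer_finish_section()`: it assumes every index block holds exactly `fanout` records, it counts blocks instead of bytes, and `struct index_plan` and `plan_index` are hypothetical names. The point it illustrates is that the lowest level is laid out first, so the highest level ends up at the end of the index section, and `index_start` refers to the beginning of that last level.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical and not part of reftable. */
enum { MAX_LEVELS = 16 };

struct index_plan {
	size_t levels;                  /* number of index levels written */
	size_t blocks[MAX_LEVELS];      /* blocks[0] is the lowest level */
	size_t first_block[MAX_LEVELS]; /* where each level starts in the section */
	size_t index_start;             /* first block of the highest level */
};

/*
 * Plan the index section for `indexed_blocks` blocks. A new level is
 * written for as long as more than `threshold` records are pending,
 * which mirrors the shape of the `while (w->index_len > threshold)` loop.
 */
static void plan_index(size_t indexed_blocks, size_t fanout,
		       size_t threshold, struct index_plan *plan)
{
	size_t pending = indexed_blocks; /* records waiting to be indexed */
	size_t next_block = 0;           /* next free block in the section */

	/* Needed so that every level is strictly smaller than the last. */
	assert(fanout >= 2 && threshold >= 1);

	plan->levels = 0;
	plan->index_start = 0;

	while (pending > threshold && plan->levels < MAX_LEVELS) {
		/* One index block per `fanout` pending records, rounded up. */
		size_t nblocks = (pending + fanout - 1) / fanout;

		plan->blocks[plan->levels] = nblocks;
		plan->first_block[plan->levels] = next_block;
		plan->index_start = next_block;
		plan->levels++;

		next_block += nblocks;
		/* Each block just written needs a record on the next level. */
		pending = nblocks;
	}
}
```

For 1000 indexed blocks, a fan-out of 10 and a threshold of 3, this plans three levels of 100, 10 and 1 blocks. The single highest-level block is block 110 of the section, so a reader in this model would start there and follow the records backwards into the lower levels.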
