Commit 93869f4
fastq: Add RecordIndex and IndexedReader for O(1) random access
RecordIndex builds an offset index in a single scan, replacing the
separate count_lines, build_record_index, and estimate_avg_record_size
functions. IndexedReader wraps a seekable reader + index and provides:
- read_record_at(i): seek to known offset, no boundary scanning
- read_records_at(indices): batch reads in sorted index order
- skip_ahead_sample(target, rng): geometric jumps in record space
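For illustration, the index-plus-seek idea can be sketched roughly as below. This is not the code from this commit: the field layout, the strict four-lines-per-record assumption, and the `Vec<u8>` return type are all assumptions made for the sketch; only the names RecordIndex, IndexedReader, and read_record_at come from the message above.

```rust
use std::io::{self, BufRead, Read, Seek, SeekFrom};

/// Hypothetical layout: offsets[i] is the byte offset where record i starts,
/// and the final entry is the offset just past the last complete record.
pub struct RecordIndex {
    offsets: Vec<u64>,
}

impl RecordIndex {
    /// Build the index in one pass. Assumes strict 4-line FASTQ records;
    /// a trailing partial record is ignored.
    pub fn build<R: BufRead>(mut reader: R) -> io::Result<Self> {
        let mut offsets = vec![0u64];
        let mut pos = 0u64;
        let mut line = Vec::new();
        let mut lines_in_record = 0;
        loop {
            line.clear();
            let n = reader.read_until(b'\n', &mut line)?;
            if n == 0 {
                break; // end of input
            }
            pos += n as u64;
            lines_in_record += 1;
            if lines_in_record == 4 {
                offsets.push(pos); // boundary after every fourth line
                lines_in_record = 0;
            }
        }
        Ok(Self { offsets })
    }

    /// Number of complete records seen during the scan.
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }
}

/// Seekable reader paired with its index (sketch only).
pub struct IndexedReader<R> {
    inner: R,
    index: RecordIndex,
}

impl<R: Read + Seek> IndexedReader<R> {
    pub fn new(inner: R, index: RecordIndex) -> Self {
        Self { inner, index }
    }

    /// Seek straight to record i and return its raw bytes.
    /// No boundary scanning: both ends come from the index.
    pub fn read_record_at(&mut self, i: usize) -> io::Result<Vec<u8>> {
        if i >= self.index.len() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "record index out of range",
            ));
        }
        let start = self.index.offsets[i];
        let end = self.index.offsets[i + 1];
        self.inner.seek(SeekFrom::Start(start))?;
        let mut buf = vec![0u8; (end - start) as usize];
        self.inner.read_exact(&mut buf)?;
        Ok(buf)
    }
}
```

Storing one extra offset past the last record means each lookup is two array reads and one seek, which is where the O(1) access comes from in this sketch.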
This eliminates the byte-level boundary detection (find_record_after)
that was unreliable for paired-end data. Skip-ahead now jumps between
index entries instead of byte positions.
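A rough sketch of what jumping in record space can look like, assuming the skip-ahead is Bernoulli sampling with geometrically distributed gaps. The function name skip_ahead_indices, the XorShift64 stand-in generator, and the choice of p = target / n_records are assumptions for this example and may not match the actual skip_ahead_sample.

```rust
/// Tiny xorshift64 generator, used here only so the sketch has no external
/// dependencies. Assumption: the real code takes an external `rng`.
/// The seed must be nonzero.
pub struct XorShift64(pub u64);

impl XorShift64 {
    /// Uniform f64 in (0, 1].
    pub fn next_unit(&mut self) -> f64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        // Keep the top 53 bits and shift into (0, 1] so ln() stays finite.
        ((x >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }
}

/// Choose record indices so that each record is kept with probability
/// p = target / n_records, by drawing the gap to the next kept record
/// instead of testing every record. Returns strictly increasing indices.
pub fn skip_ahead_indices(
    n_records: usize,
    target: usize,
    rng: &mut XorShift64,
) -> Vec<usize> {
    if n_records == 0 || target == 0 {
        return Vec::new();
    }
    if target >= n_records {
        return (0..n_records).collect(); // nothing to skip
    }
    let p = target as f64 / n_records as f64;
    let log_q = (-p).ln_1p(); // ln(1 - p), negative because 0 < p < 1
    let mut picked = Vec::with_capacity(target);
    let mut i = 0usize;
    loop {
        // Records skipped before the next kept one: Geometric(p).
        let gap = (rng.next_unit().ln() / log_q).floor();
        if gap >= (n_records - i) as f64 {
            break; // the jump would land past the last record
        }
        i += gap as usize;
        picked.push(i);
        i += 1;
        if i >= n_records {
            break;
        }
    }
    picked
}
```

Because the jumps are over index entries, every chosen position is a record start by construction, so no byte-level resynchronisation is needed. Under this scheme the number of picked records is random with a mean close to target; hitting target exactly would need an extra step not shown here.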
Removed from subsample.rs: skip_ahead_sample_file, build_record_index,
read_record_at, find_record_after, exponential_sample,
estimate_avg_record_size, count_records_from_index.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 8a6a5c9 commit 93869f4
4 files changed (+570 -489 lines)
File tree:
- src
- commands
- fastq
- io