iterv2: introduce a new iterator stack (off by default) by RaduBerinde · Pull Request #5851 · cockroachdb/pebble

RaduBerinde · 2026-03-20T17:11:57Z

We introduce a new iterator abstraction along with a new merging
iterator, level iterator, interleaving iterator stack. This
abstraction allows presenting the next span boundary at any point,
which allows the merging iterator to (conceptually) divide the key
space into "slabs". The slab approach will make the addition of new
rangedel-type keys much easier. The boundary also naturally allows
"stopping points" in the level iterator without a special contract
with the merging iterator.

The new stack can be used by setting iterv2.Enabled. The stack
functions correctly (it passed a few hours of the metamorphic test).
It is still a prototype in that it does not yet have many of the
optimizations the existing stack has (like try-seek-using-next). The
integration of the v2 stack also needs some optimization in terms of
allocations.

Most of the code inside the iterators was written by me. Claude helped
with comments and with limited bits and pieces but it was generating
much more complicated code for the core logic. In all fairness, I also
had to go back and rewrite most of it a few times.

Suggestion on what to review:

iterv2.Iter semantics;
iterv2.TestIter implementation. While this is only used for
testing, it serves as a formal codification of the semantics (with
a fairly simple implementation). All the new iterators are tested
against the TestIter, so it's less important to review the
actual iterator implementations;
iterv2.InterleavingIter struct and Init() comments;
levelIterV2 struct comment;
mergingIterV2 struct comment;
slabState interface (BuildForward).

cockroach-teamcity · 2026-03-20T17:12:06Z

This change is

RaduBerinde · 2026-03-20T17:15:05Z

CC @jbowens - in case you are interested; this continues and is inspired by your work in this area over the last few years.

We introduce a new iterator abstraction along with a new merging iterator, level iterator, interleaving iterator stack. This abstraction allows presenting the next span boundary at any point, which allows the merging iterator to (conceptually) divide the key space into "slabs". The slab approach will make the addition of new rangedel-type keys much easier. The boundary also naturally allows "stopping points" in the level iterator without a special contract with the merging iterator. The new stack can be used by setting `iterv2.Enabled`. The stack functions correctly (it passed a few hours of the metamorphic test). It is still a prototype in that it does not yet have many of the optimizations the existing stack has (like try-seek-using-next). The integration of the v2 stack also needs some optimization in terms of allocations. Most of the code inside the iterators was written by me. Claude helped with comments and with limited bits and pieces but it was generating much more complicated code for the core logic. In all fairness, I also had to go back and rewrite most of it a few times. Suggestion on what to review: - `iterv2.Iter` semantics; - `iterv2.TestIter` implementation. While this is only used for testing, it serves as a formal codification of the semantics (with a fairly simple implementation). All the new iterators are tested against the `TestIter`, so it's less important to review the actual iterator implementations; - `iterv2.InterleavingIter` struct and `Init()` comments; - `levelIterV2` struct comment; - `mergingIterV2` struct comment; - `slabState` interface (`BuildForward`).

sumeerbhola

@sumeerbhola made 5 comments.
Reviewable status: 0 of 33 files reviewed, 5 unresolved discussions (waiting on annrpom and RaduBerinde).

internal/iterv2/iter.go line 43 at r3 (raw file):

// # Boundary keys
//
// When the iterator is about to leave the current Span, it returns a synthetic

And the Span() for this key is still the same span, yes?

internal/iterv2/iter.go line 51 at r3 (raw file):

// same user key. When a point key coincides with a span boundary (e.g. point
// key "c" at the start of span [c, g)), the boundary key for the previous span
// is returned first, followed by the point key (now in the new span).

This example is the forward iteration case.
Worth clarifying that.

And adding an example for the reverse case too.

Also, are we side-stepping the issue of how InternalKeyKindSpanBoundary compares with the rangedel or the rangekey key kinds?
We currently have these boundary/sentinel/fake/synthetic keys that use I think the rangedel and other rangekey kind. I suppose we no longer need to expose them. So we won't have a situation where both a rangedel end is exposed as a point and so is this InternalKeyKindSpanBoundary and they are in the wrong order.

internal/iterv2/iter.go line 54 at r3 (raw file):

//
// At a boundary key, Span() still returns the Span being exited; it is updated
// to the adjacent Span on the subsequent Next or Prev call.

This answers my question about the Span().

internal/iterv2/iter.go line 100 at r3 (raw file):

// nil, the caller would not discover any span keys (like RANGEDEL) in this
// region.
//

I'm a bit lost here. Could use some more examples.
For example with the same SeekPrefixGE(a) and the only keys are b keys and a RANGEDEL [b, ...) we won't return anything, yes?

Or is this saying that it will not return the b keys but will go all the way to the b boundary, and stop before showing the RANGEDEL [b, ...)?

The nuanced example doesn't have any point keys that don't match the prefix that are skipped over until the boundary is reached.

internal/iterv2/iter.go line 165 at r3 (raw file):

	//
	// Boundary can be nil when the iteration range is unbounded in the
	// corresponding direction: nil for BoundaryEnd means no upper bound, nil for

Is this unbounded case for dealing with memtable and batch iterators? I am assuming any iterator on a file will always expose a non-nil Boundary.

sumeerbhola

just flushing comments as I read. This is neat!

@sumeerbhola made 1 comment.
Reviewable status: 0 of 33 files reviewed, 5 unresolved discussions (waiting on annrpom and RaduBerinde).

sumeerbhola

@sumeerbhola made 5 comments.
Reviewable status: 0 of 33 files reviewed, 10 unresolved discussions (waiting on annrpom and RaduBerinde).

internal/iterv2/test_iter.go line 70 at r3 (raw file):

//   - points: point keys to include.
//   - spans: span definitions. Note that a span with Start=End can be used to
//     induce a spurious boundary inside a gap between spans.

I didn't understand this part about spurious boundary.

Are these spans already fragmented to be non-overlapping?

internal/iterv2/test_iter.go line 71 at r3 (raw file):

//   - spans: span definitions. Note that a span with Start=End can be used to
//     induce a spurious boundary inside a gap between spans.
//   - startKey: the overall start key of the iterator's range (can be nil).

what is an "iterator range"? Unlike bounds, is this immutable?
Why is the range not simply a function of points and spans?

Oh, I see now that the bounds can't be changed, which is then more confusing since we shouldn't need two things to express the same idea of bounds. What am I missing?

Hmm, maybe the startKey, endKey are simulating file bounds.

internal/iterv2/test_iter.go line 160 at r3 (raw file):

	t.boundaries = t.boundaries[:0]
	if lower != nil {
		t.boundaries = append(t.boundaries, lower)

The comment just above says effectiveLower, effectiveUpper.

internal/iterv2/test_iter.go line 189 at r3 (raw file):

	t.spanKeys = make([][]keyspan.Key, numRegions)
	for i := range numRegions {
		if t.boundaries[i] == nil || t.boundaries[i+1] == nil {

does this only happen when there are no spans and we added nil to both sides?

internal/iterv2/test_iter.go line 194 at r3 (raw file):

		for _, s := range filteredSpans {
			if keyCmp(s.Start, t.boundaries[i]) <= 0 && keyCmp(s.End, t.boundaries[i+1]) >= 0 {
				t.spanKeys[i] = s.Keys

So the filteredSpans don't overlap? Otherwise we'd need to append here, yes?

RaduBerinde requested a review from sumeerbhola March 20, 2026 17:11

RaduBerinde requested a review from a team as a code owner March 20, 2026 17:11

RaduBerinde requested a review from annrpom March 20, 2026 17:15

RaduBerinde force-pushed the iterv2-off branch 5 times, most recently from d30996a to 3863bec Compare March 26, 2026 15:39

RaduBerinde force-pushed the iterv2-off branch from 3863bec to 1f25a51 Compare March 26, 2026 16:12

sumeerbhola requested changes Mar 26, 2026

View reviewed changes

sumeerbhola reviewed Mar 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

iterv2: introduce a new iterator stack (off by default)#5851

iterv2: introduce a new iterator stack (off by default)#5851
RaduBerinde wants to merge 1 commit intocockroachdb:masterfrom
RaduBerinde:iterv2-off

RaduBerinde commented Mar 20, 2026

Uh oh!

cockroach-teamcity commented Mar 20, 2026

Uh oh!

RaduBerinde commented Mar 20, 2026

Uh oh!

sumeerbhola left a comment

Uh oh!

sumeerbhola left a comment

Uh oh!

sumeerbhola left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

RaduBerinde commented Mar 20, 2026

Uh oh!

cockroach-teamcity commented Mar 20, 2026

Uh oh!

RaduBerinde commented Mar 20, 2026

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants