Skip to content

iterv2: introduce a new iterator stack (off by default)#5851

Open
RaduBerinde wants to merge 1 commit intocockroachdb:masterfrom
RaduBerinde:iterv2-off
Open

iterv2: introduce a new iterator stack (off by default)#5851
RaduBerinde wants to merge 1 commit intocockroachdb:masterfrom
RaduBerinde:iterv2-off

Conversation

@RaduBerinde
Copy link
Copy Markdown
Member

We introduce a new iterator abstraction along with a new merging
iterator, level iterator, interleaving iterator stack. This
abstraction allows presenting the next span boundary at any point,
which allows the merging iterator to (conceptually) divide the key
space into "slabs". The slab approach will make the addition of new
rangedel-type keys much easier. The boundary also naturally allows
"stopping points" in the level iterator without a special contract
with the merging iterator.

The new stack can be used by setting iterv2.Enabled. The stack
functions correctly (it passed a few hours of the metamorphic test).
It is still a prototype in that it does not yet have many of the
optimizations the existing stack has (like try-seek-using-next). The
integration of the v2 stack also needs some optimization in terms of
allocations.

Most of the code inside the iterators was written by me. Claude helped
with comments and with limited bits and pieces but it was generating
much more complicated code for the core logic. In all fairness, I also
had to go back and rewrite most of it a few times.

Suggestion on what to review:

  • iterv2.Iter semantics;
  • iterv2.TestIter implementation. While this is only used for
    testing, it serves as a formal codification of the semantics (with
    a fairly simple implementation). All the new iterators are tested
    against the TestIter, so it's less important to review the
    actual iterator implementations;
  • iterv2.InterleavingIter struct and Init() comments;
  • levelIterV2 struct comment;
  • mergingIterV2 struct comment;
  • slabState interface (BuildForward).

@RaduBerinde RaduBerinde requested a review from sumeerbhola March 20, 2026 17:11
@RaduBerinde RaduBerinde requested a review from a team as a code owner March 20, 2026 17:11
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@RaduBerinde
Copy link
Copy Markdown
Member Author

CC @jbowens - in case you are interested; this continues and is inspired by your work in this area over the last few years.

@RaduBerinde RaduBerinde requested a review from annrpom March 20, 2026 17:15
@RaduBerinde RaduBerinde force-pushed the iterv2-off branch 5 times, most recently from d30996a to 3863bec Compare March 26, 2026 15:39
We introduce a new iterator abstraction along with a new merging
iterator, level iterator, interleaving iterator stack. This
abstraction allows presenting the next span boundary at any point,
which allows the merging iterator to (conceptually) divide the key
space into "slabs". The slab approach will make the addition of new
rangedel-type keys much easier. The boundary also naturally allows
"stopping points" in the level iterator without a special contract
with the merging iterator.

The new stack can be used by setting `iterv2.Enabled`. The stack
functions correctly (it passed a few hours of the metamorphic test).
It is still a prototype in that it does not yet have many of the
optimizations the existing stack has (like try-seek-using-next). The
integration of the v2 stack also needs some optimization in terms of
allocations.

Most of the code inside the iterators was written by me. Claude helped
with comments and with limited bits and pieces but it was generating
much more complicated code for the core logic. In all fairness, I also
had to go back and rewrite most of it a few times.

Suggestion on what to review:
 - `iterv2.Iter` semantics;
 - `iterv2.TestIter` implementation. While this is only used for
   testing, it serves as a formal codification of the semantics (with
   a fairly simple implementation). All the new iterators are tested
   against the `TestIter`, so it's less important to review the
   actual iterator implementations;
 - `iterv2.InterleavingIter` struct and `Init()` comments;
 - `levelIterV2` struct comment;
 - `mergingIterV2` struct comment;
 - `slabState` interface (`BuildForward`).
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumeerbhola made 5 comments.
Reviewable status: 0 of 33 files reviewed, 5 unresolved discussions (waiting on annrpom and RaduBerinde).


internal/iterv2/iter.go line 43 at r3 (raw file):

// # Boundary keys
//
// When the iterator is about to leave the current Span, it returns a synthetic

And the Span() for this key is still the same span, yes?


internal/iterv2/iter.go line 51 at r3 (raw file):

// same user key. When a point key coincides with a span boundary (e.g. point
// key "c" at the start of span [c, g)), the boundary key for the previous span
// is returned first, followed by the point key (now in the new span).

This example is the forward iteration case.
Worth clarifying that.

And adding an example for the reverse case too.

Also, are we side-stepping the issue of how InternalKeyKindSpanBoundary compares with the rangedel or the rangekey key kinds?
We currently have these boundary/sentinel/fake/synthetic keys that use I think the rangedel and other rangekey kind. I suppose we no longer need to expose them. So we won't have a situation where both a rangedel end is exposed as a point and so is this InternalKeyKindSpanBoundary and they are in the wrong order.


internal/iterv2/iter.go line 54 at r3 (raw file):

//
// At a boundary key, Span() still returns the Span being exited; it is updated
// to the adjacent Span on the subsequent Next or Prev call.

This answers my question about the Span().


internal/iterv2/iter.go line 100 at r3 (raw file):

// nil, the caller would not discover any span keys (like RANGEDEL) in this
// region.
//

I'm a bit lost here. Could use some more examples.
For example with the same SeekPrefixGE(a) and the only keys are b keys and a RANGEDEL [b, ...) we won't return anything, yes?

Or is this saying that it will not return the b keys but will go all the way to the b boundary, and stop before showing the RANGEDEL [b, ...)?

The nuanced example doesn't have any point keys that don't match the prefix that are skipped over until the boundary is reached.


internal/iterv2/iter.go line 165 at r3 (raw file):

	//
	// Boundary can be nil when the iteration range is unbounded in the
	// corresponding direction: nil for BoundaryEnd means no upper bound, nil for

Is this unbounded case for dealing with memtable and batch iterators? I am assuming any iterator on a file will always expose a non-nil Boundary.

Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just flushing comments as I read. This is neat!

@sumeerbhola made 1 comment.
Reviewable status: 0 of 33 files reviewed, 5 unresolved discussions (waiting on annrpom and RaduBerinde).

Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumeerbhola made 5 comments.
Reviewable status: 0 of 33 files reviewed, 10 unresolved discussions (waiting on annrpom and RaduBerinde).


internal/iterv2/test_iter.go line 70 at r3 (raw file):

//   - points: point keys to include.
//   - spans: span definitions. Note that a span with Start=End can be used to
//     induce a spurious boundary inside a gap between spans.

I didn't understand this part about spurious boundary.

Are these spans already fragmented to be non-overlapping?


internal/iterv2/test_iter.go line 71 at r3 (raw file):

//   - spans: span definitions. Note that a span with Start=End can be used to
//     induce a spurious boundary inside a gap between spans.
//   - startKey: the overall start key of the iterator's range (can be nil).

what is an "iterator range"? Unlike bounds, is this immutable?
Why is the range not simply a function of points and spans?


Oh, I see now that the bounds can't be changed, which is then more confusing since we shouldn't need two things to express the same idea of bounds. What am I missing?

Hmm, maybe the startKey, endKey are simulating file bounds.


internal/iterv2/test_iter.go line 160 at r3 (raw file):

	t.boundaries = t.boundaries[:0]
	if lower != nil {
		t.boundaries = append(t.boundaries, lower)

The comment just above says effectiveLower, effectiveUpper.


internal/iterv2/test_iter.go line 189 at r3 (raw file):

	t.spanKeys = make([][]keyspan.Key, numRegions)
	for i := range numRegions {
		if t.boundaries[i] == nil || t.boundaries[i+1] == nil {

does this only happen when there are no spans and we added nil to both sides?


internal/iterv2/test_iter.go line 194 at r3 (raw file):

		for _, s := range filteredSpans {
			if keyCmp(s.Start, t.boundaries[i]) <= 0 && keyCmp(s.End, t.boundaries[i+1]) >= 0 {
				t.spanKeys[i] = s.Keys

So the filteredSpans don't overlap? Otherwise we'd need to append here, yes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants