Skip to content

Conversation

@yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Jan 3, 2026

This PR contains several commits that address some edge cases around disabling the streamer due to features that are not supported by DistSQL. See each commit for details.

Epic: None
Release note: None

@blathers-crl
Copy link

blathers-crl bot commented Jan 3, 2026

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the distsql-streamer branch 9 times, most recently from da16a48 to b0f3f6d Compare January 4, 2026 20:26
@yuzefovich yuzefovich changed the title sql: extract distsql check code into a separate file sql: audit DistSQL supportability for streamer Jan 4, 2026
@yuzefovich yuzefovich force-pushed the distsql-streamer branch 3 times, most recently from 971c63b to 33c48f1 Compare January 5, 2026 04:49
@yuzefovich yuzefovich force-pushed the distsql-streamer branch 2 times, most recently from 5a01250 to 3c19a73 Compare January 5, 2026 19:39
@yuzefovich yuzefovich requested a review from DrewKimball January 5, 2026 19:40
@yuzefovich yuzefovich marked this pull request as ready for review January 5, 2026 19:40
@yuzefovich yuzefovich requested review from a team as code owners January 5, 2026 19:40
@yuzefovich yuzefovich removed request for a team January 5, 2026 19:40
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice improvement. I just have some nits you can take or leave. Also, it seems like there's some opportunity to call this set of properties something more general now that we're using it for both DistSQL and streamer decisions. What do you think?

@DrewKimball reviewed 23 files and all commit messages, and made 4 comments.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich).


pkg/sql/distsql_check.go line 38 at r4 (raw file):

// It is separate from distSQLProhibitedCauses so that we can generate the
// stringer for each cause.
type distSQLProhibitedCause uint64

Alternate names (though feel free to keep the current one):

  • distSQLBlocker
  • distSQLRestriction
  • distSQLConstraint

pkg/sql/distsql_check.go line 64 at r4 (raw file):

// distSQLProhibitedCauses is a bit-map describing all reasons for why a query
// couldn't be distributed.
type distSQLProhibitedCauses uint64

nit: worth having an Empty() or HasNone() method for readability?


pkg/sql/distsql_check.go line 222 at r4 (raw file):

	sd *sessiondata.SessionData,
	txnHasBufferedWrites bool,
) (retRec distRecommendation, retCauses distSQLProhibitedCauses) {

nit: What if we instead had a method on distSQLProhibitedCauses mapping to a distRecommendation, and had this function only return the causes?

In f6038a1 we made it so that if
LocalPlanNodeSpec handles `scanBufferNode` or `bufferNode`, that doesn't
prevent usage of the streamer. However - theoretically - it's possible
that a single LocalPlanNodeSpec handles multiple planNodes. This is the
case since we traverse the tree of planNodes until we find first
DistSQL-plannable one ("firstNotWrapped"), and all planNodes before that
are handled via the single processor. Thus we could've applied an
exception to more cases than we initially wanted (but in practice it
should be unlikely if possible), so this commit hardens that change by
tracking whether a single planNode is being wrapped or not and applying
the exception only if so.

Release note: None
This commit fixes a data race that was exposed by the following commit
(but has existed since a4ed1a1 where we
started reusing the same planner's visitor for DistSQL-supportability
check). Namely, if we have concurrent parallel checks, each will run
`checkSupportForPlanNode` which might internally use
`distSQLExprCheckVisitor` from the planner. In order to fix the race we
now will allocate a fresh visitor for all worker goroutines but keep on
using the planner's one in all other contexts.

The impact of the bug is minor if any - we don't expect the parallel
checks to actually have any distsql-unsupported features, so `err` field
might never be modified to have a race.

Release note: None
This commit extracts the code for checking distsql supportability into
a separate file. This is done because the following commit will be
extending the state we're storing during the distsql check.

It's purely a mechanical move, with a couple of tweaks:
- `checkSupportForInvertedFilterNode` is inlined in its single call
site (to get symmetry with all other planNodes).
- all unsupported errors have been extracted into global singletons. The
following commit will take advantage of this by replacing errors with
bit-map of causes - the loss of some extra context should be negligible.

Release note: None
In c17591e we made so that if we have
a plan that cannot be distributed for any reason, we would disable the
streamer out of caution. This was known to be a super-set of problematic
scenarios, and this commit addresses that oversight.

In particular, it revamps `checkSupportForPlanNode` function so that we
accumulate all possible blockers for DistSQL, and then we examine each
one to decide whether it's safe to use the streamer given the blockers
found. The blockers are accumulated into a bit-map, and the methods have
been adjusted to do a full walk over the planNode tree (as well as over
all expressions) to capture all blockers - this should have negligible
overhead.

An additional improvement is that we now replace errors with different
bits and we can log all blockers for DistSQL not being supported.

What prompted this work was a customer query where we didn't use the
streamer because of "unsupported planNode" distsql error. This was
attempted to be fixed in f6038a1, but
the fix was incomplete, and now it is.

Release note: None
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question. I feel like if we were to find the common denominator, we'd have to call it something like "execution blockers" or "execution properties", which seems rather generic. I'll keep it as distSQLBlocker since it's clearly related to the distsql exec engine, and the way we examine blockers for the streamer - to decide whether it's ok to use the leaf txn - is also somewhat related to distsql exec engine.

Thanks for the quick review!

bors r+

@yuzefovich made 4 comments.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball).


pkg/sql/distsql_check.go line 38 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Alternate names (though feel free to keep the current one):

  • distSQLBlocker
  • distSQLRestriction
  • distSQLConstraint

Thanks for the ideas, I like distSQLBlocker the most. Done.


pkg/sql/distsql_check.go line 64 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

nit: worth having an Empty() or HasNone() method for readability?

Hm, I went back and forth on this and didn't quite land on anything I liked. It seems like comparing against zero is simple enough and perhaps the most clear.


pkg/sql/distsql_check.go line 222 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

nit: What if we instead had a method on distSQLProhibitedCauses mapping to a distRecommendation, and had this function only return the causes?

Single return value is not enough - if we have any distsql blockers, then it must be cannotDistribute, but otherwise it could either be canDistribute or shouldDistribute.

@craig
Copy link
Contributor

craig bot commented Jan 5, 2026

@craig craig bot merged commit 9458ee0 into cockroachdb:master Jan 5, 2026
24 of 25 checks passed
@yuzefovich yuzefovich deleted the distsql-streamer branch January 5, 2026 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants