support WHERE predicate in indexes #2743

Merged
jennifersp merged 10 commits into main from jennifer/index-where on May 27, 2026
Conversation

@jennifersp

No description provided.

@jennifersp jennifersp requested a review from Hydrocharged May 20, 2026 23:48
github-actions Bot commented May 20, 2026

| Benchmark | Main | PR | Change |
| --- | --- | --- | --- |
| covering_index_scan_postgres | 1309.09/s | 1305.97/s | -0.3% |
| index_join_postgres | 187.30/s | 186.43/s | -0.5% |
| index_join_scan_postgres | 196.22/s | 196.02/s | -0.2% |
| index_scan_postgres | 12.38/s | 12.30/s | -0.7% |
| oltp_point_select | 2394.96/s | 2396.06/s | 0.0% |
| oltp_read_only | 1786.59/s | 1793.06/s | +0.3% |
| select_random_points | 128.48/s | 128.35/s | -0.2% |
| select_random_ranges | 866.86/s | 862.93/s | -0.5% |
| table_scan_postgres | 12.00/s | 12.04/s | +0.3% |
| types_table_scan_postgres | 5.59/s | 5.51/s | -1.5% |

@Hydrocharged Hydrocharged left a comment

LGTM!

Comment thread testing/go/index_test.go Outdated
github-actions Bot commented May 21, 2026

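| | Main | PR |
| --- | --- | --- |
| Total | 42090 | 42090 |
| Successful | 18197 | 18250 |
| Failures | 23893 | 23840 |
| Partial Successes¹ | 5392 | 5385 |

| | Main | PR |
| --- | --- | --- |
| Successful | 43.2335% | 43.3595% |
| Failures | 56.7665% | 56.6405% |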

Progressions (60)

aggregates

QUERY: create index minmaxtest3i on minmaxtest3(f1) where f1 is not null;

alter_table

QUERY: alter table attbl replica identity using index pk_attbl;

create_index

QUERY: SELECT pg_get_indexdef('unique_idx3'::regclass);
QUERY: ALTER TABLE concur_replident REPLICA IDENTITY
  USING INDEX concur_replident_i_idx;

create_table

QUERY: create index part_column_drop_b_pred on part_column_drop(b) where b = 1;
QUERY: create index part_column_drop_d_pred on part_column_drop(d) where d = 2;

create_table_like

QUERY: CREATE UNIQUE INDEX inhz_xx_idx on inhz (xx) WHERE xx <> 'test';

generated

QUERY: CREATE INDEX gtest22c_pred_idx ON gtest22c (a) WHERE b > 0;

indexing

QUERY: create index on idxpart1 (a) where b > 1;
QUERY: create index idxparti3 on idxpart ((a+b)) where d = true;
QUERY: create index idxpart_2_idx on only idxpart ((b + a)) where a > 1;
QUERY: create index idxpart1_2_idx on idxpart1 ((b + a)) where a > 1;
QUERY: create index idxpart1_2b_idx on idxpart1 ((a + b)) where a > 1;
QUERY: create index idxpart1_2c_idx on idxpart1 ((b + a)) where b > 1;
QUERY: create index on idxpart (a) where b > 1000;
QUERY: alter table only parted_replica_tab_1 replica identity
  using index parted_replica_idx_1;
QUERY: alter table only parted_replica_tab_1 replica identity
  using index parted_replica_idx_1;

insert_conflict

QUERY: create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
QUERY: drop index part_comp_key_index;
QUERY: create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
QUERY: drop index partial_key_index;
QUERY: create unique index insertconflicti2 on insertconflict(b)
  where coalesce(a, 1) > 0;

join

QUERY: create index j1_id1_idx on j1 (id1) where id1 % 1000 = 1;
QUERY: create index j2_id1_idx on j2 (id1) where id1 % 1000 = 1;

publication

QUERY: ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
QUERY: ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
QUERY: ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
QUERY: ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
QUERY: ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
QUERY: ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
QUERY: ALTER TABLE testpub_tbl5 REPLICA IDENTITY USING INDEX testpub_tbl5_b_key;
QUERY: ALTER TABLE testpub_tbl5 REPLICA IDENTITY USING INDEX testpub_tbl5_b_key;
QUERY: ALTER TABLE testpub_tbl6 REPLICA IDENTITY FULL;
QUERY: ALTER TABLE testpub_tbl8_0 REPLICA IDENTITY USING INDEX testpub_tbl8_0_pkey;
QUERY: ALTER TABLE testpub_tbl8_1 REPLICA IDENTITY USING INDEX testpub_tbl8_1_pkey;
QUERY: ALTER TABLE testpub_tbl8_1 REPLICA IDENTITY FULL;
QUERY: ALTER TABLE testpub_tbl8_1 REPLICA IDENTITY USING INDEX testpub_tbl8_1_pkey;
QUERY: ALTER TABLE testpub_tbl8_1 REPLICA IDENTITY FULL;
QUERY: ALTER TABLE testpub_tbl8_1 REPLICA IDENTITY USING INDEX testpub_tbl8_1_pkey;
QUERY: ALTER TABLE testpub_tbl8_0 REPLICA IDENTITY USING INDEX testpub_tbl8_0_pkey;

Footnotes

  1. These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct.

@jennifersp jennifersp marked this pull request as ready for review May 27, 2026 17:06
itoqa Bot commented May 27, 2026

Ito Test Report ❌

13 test cases ran. 2 failed, 11 passed.

Overall, 11 of 13 tests passed, confirming expected behavior for partial-index creation and enforcement, planner predicate-aware index usage, migration-batch failure visibility, concurrent duplicate-name index creation, and most catalog and replica-identity compatibility checks under real SQL engine paths (with only auth/bootstrap bypassed for deterministic local setup). The two validated regressions introduced by this PR are: a medium-severity catalog DDL rendering bug that truncates dotted/qualified index expressions in pg_get_indexdef/pg_indexes, and a high-severity ALTER TABLE REPLICA IDENTITY bug where commands return success but never persist state (relreplident remains default 'd'), risking incorrect schema diffing and replication assumptions.

❌ Failed (2)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Catalog | 🟠 Index introspection DDL rendering truncates expressions by splitting on '.' and keeping only split[1], corrupting qualified and dotted expressions in pg_get_indexdef/pg_indexes output. | CATALOG-4 |
| Replica | ⚠️ REPLICA IDENTITY USING INDEX returned success, but catalog metadata remained at default replica identity (relreplident='d'). | REPLICA-2 |

🟠 Dotted index expressions are truncated in catalog DDL
  • What failed: The generated CREATE INDEX text drops expression segments by splitting on '.' and taking only the second token, so introspection output no longer preserves original index expressions and cannot be reliably round-tripped.
  • Impact: Schema tooling that relies on catalog DDL text can mis-parse or diff indexes incorrectly. Teams may apply incorrect migrations or fail drift checks when qualified/expression indexes are present.
  • Steps to reproduce:
    1. Create indexes using qualified and dotted expressions.
    2. Query pg_catalog.pg_get_indexdef() and pg_catalog.pg_indexes.indexdef for those indexes.
    3. Compare rendered expressions against original CREATE INDEX statements and observe truncated expression text.
  • Stub / mock context: Authentication bootstrap and SCRAM checks were bypassed to keep a deterministic local superuser session during SQL execution. The failed behavior comes from index-definition rendering code, not a mocked catalog payload.
  • Code analysis: I reviewed the index-definition builders used by both catalog surfaces and found deterministic token-truncation logic in production code paths.
  • Why this is likely a bug: The rendering logic is deterministic and strips valid expression content, directly causing incorrect catalog DDL output independent of test harness behavior.

Relevant code:

server/functions/pg_get_indexdef.go (lines 67-75)

cols := make([]string, len(index.Expressions()))
for i, expr := range index.Expressions() {
	split := strings.Split(expr, ".")
	if len(split) > 1 {
		cols[i] = split[1]
	} else {
		cols[i] = expr
	}
}

server/tables/pgcatalog/pg_indexes.go (lines 119-127)

cols := make([]string, len(index.Expressions()))
for i, expr := range index.Expressions() {
	split := strings.Split(expr, ".")
	if len(split) > 1 {
		cols[i] = split[1]
	} else {
		cols[i] = expr
	}
}
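
To make the truncation concrete, here is a minimal sketch that reproduces the quoted split logic next to one possible guard. `lossyColumn`, `isIdent`, and `stripQualifier` are hypothetical names used only for illustration, and the sample inputs are assumptions about how expression strings might look; this is not the fix adopted in the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// lossyColumn mirrors the logic quoted in the report: split on "." and keep
// only the second segment, whatever the expression looks like.
func lossyColumn(expr string) string {
	split := strings.Split(expr, ".")
	if len(split) > 1 {
		return split[1]
	}
	return expr
}

// isIdent reports whether s is a plain unquoted ASCII identifier.
func isIdent(s string) bool {
	if s == "" {
		return false
	}
	for i, r := range s {
		isLetter := r == '_' || (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		isDigit := r >= '0' && r <= '9'
		if !isLetter && !(i > 0 && isDigit) {
			return false
		}
	}
	return true
}

// stripQualifier (hypothetical) drops a table qualifier only when expr is
// exactly `table.column`; every other expression is returned unchanged.
func stripQualifier(expr string) string {
	table, col, found := strings.Cut(expr, ".")
	if found && isIdent(table) && isIdent(col) {
		return col
	}
	return expr
}

func main() {
	for _, expr := range []string{"t.a", "a", "(t.a + 1.5)", "s.t.a"} {
		fmt.Printf("%q: lossy=%q guarded=%q\n", expr, lossyColumn(expr), stripQualifier(expr))
	}
}
```

With these inputs the lossy version turns `(t.a + 1.5)` into `a + 1`, while the guarded version leaves anything that is not a simple two-part identifier untouched. A real fix would more likely render the expression from its parsed form than from its string.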
⚠️ Replica identity command succeeds without persisting state
  • What failed: The command succeeds as if replica identity was configured, but the stored metadata stays at default (d) instead of reflecting a persisted replica identity mode.
  • Impact: Replication-oriented migration scripts can report success while silently leaving replica identity unset. Teams relying on this DDL behavior can ship incorrect replication assumptions with no runtime warning.
  • Steps to reproduce:
    1. Connect to localhost:5432 and create table t_replica_noop(id INT PRIMARY KEY, c1 INT).
    2. Run ALTER TABLE t_replica_noop REPLICA IDENTITY USING INDEX some_idx.
    3. Query pg_class.relreplident for t_replica_noop and verify it remains d.
  • Stub / mock context: Authentication was bypassed for the bootstrap postgres user to keep local startup deterministic. No SQL response stubs or route interception were used for this test logic.
  • Code analysis: server/ast/alter_table.go explicitly treats AlterTableReplicaIdentity as unsupported-and-ignored, so no state mutation occurs. server/tables/pgcatalog/pg_class.go always emits relreplident as "d", which confirms the command path cannot persist the requested identity state.
  • Why this is likely a bug: The implementation guarantees replica-identity ALTER commands are no-ops while still returning success semantics, which creates a real behavior mismatch for production DDL workflows.

Relevant code:

server/ast/alter_table.go (lines 164-170)

case *tree.AlterTableSetStatistics:
	// is unsupported and ignored
case *tree.AlterTableRowLevelSecurity:
	// is unsupported and ignored
case *tree.AlterTableReplicaIdentity:
	// is unsupported and ignored
default:
	return nil, nil, errors.Errorf("ALTER TABLE with unsupported command type %T", cmd)

server/tables/pgcatalog/pg_class.go (lines 450-454)

false,            // relrowsecurity
false,            // relforcerowsecurity
true,             // relispopulated
"d",              // relreplident
false,            // relispartition
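
For reference, PostgreSQL's pg_class.relreplident uses four single-character codes: `d` (default), `n` (nothing), `f` (full), and `i` (index). The sketch below shows the mapping a persisted implementation would need to produce instead of the hard-coded `"d"`. `relreplidentFor` and its string-based mode argument are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// relreplident codes as documented for PostgreSQL's pg_class catalog.
const (
	replIdentDefault = "d"
	replIdentNothing = "n"
	replIdentFull    = "f"
	replIdentIndex   = "i"
)

// relreplidentFor (hypothetical) maps the mode named in
// ALTER TABLE ... REPLICA IDENTITY <mode> to its catalog code.
func relreplidentFor(mode string) (string, error) {
	switch strings.ToUpper(strings.TrimSpace(mode)) {
	case "DEFAULT":
		return replIdentDefault, nil
	case "NOTHING":
		return replIdentNothing, nil
	case "FULL":
		return replIdentFull, nil
	case "USING INDEX":
		return replIdentIndex, nil
	}
	return "", fmt.Errorf("unknown replica identity mode %q", mode)
}

func main() {
	for _, mode := range []string{"DEFAULT", "NOTHING", "FULL", "USING INDEX"} {
		code, err := relreplidentFor(mode)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("REPLICA IDENTITY %s -> relreplident=%q\n", mode, code)
	}
}
```

Under this mapping, the reproduction step above would be expected to show `i` rather than `d` once the requested state is actually stored.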
✅ Passed (11)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Catalog | pg_catalog.pg_indexes returned CREATE INDEX text with a WHERE predicate for idx_catalog_partial. | N/A |
| Catalog | pg_get_indexdef(oid) returned a concrete CREATE INDEX statement for idx_catalog_fn. | N/A |
| Catalog | Full and partial index metadata agreed across pg_get_indexdef, pg_indexes.indexdef, and pg_index.indpred for tested indexes. | N/A |
| Predicate | CREATE UNIQUE INDEX ... WHERE is_active allowed inactive duplicate user_id but rejected second active duplicate with duplicate unique key error. | PREDICATE-1 |
| Predicate | On localhost:5432, CREATE INDEX idx_partial_ok ON t_partial_ok (a) WHERE a > 1 executed successfully and pg_catalog.pg_indexes returned the idx_partial_ok metadata row. | PREDICATE-2 |
| Predicate | CREATE INDEX with predicate WHERE (a ^ 2) > 1 failed with conversion error "the power operator is not yet supported", and idx_partial_badexpr was absent from pg_catalog.pg_indexes (count=0). | PREDICATE-3 |
| Predicate | Planner uses the partial index only when predicate implication is valid. | PREDICATE-4 |
| Predicate | Batch failure is explicit and leaves detectable partial deployment state. | PREDICATE-5 |
| Predicate | Concurrent duplicate-name index creation produced one winner with clean catalog state. | PREDICATE-6 |
| Replica | REPLICA IDENTITY FULL executed without unsupported-command errors, and a follow-up ALTER TABLE ADD COLUMN succeeded. | REPLICA-1 |
| Replica | A mixed ALTER TABLE failed on a later invalid clause, and schema inspection confirmed no partial column addition remained. | REPLICA-3 |
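
The PREDICATE-1 result reflects the general rule for a unique partial index: uniqueness is enforced only among rows that satisfy the predicate. A minimal in-memory sketch of that rule follows. The `row` and `partialUniqueIndex` types are hypothetical and model only the `WHERE is_active` case from the test; they say nothing about how Doltgres stores or checks indexes.

```go
package main

import "fmt"

// row is a hypothetical table row with the two columns used in PREDICATE-1.
type row struct {
	userID   int
	isActive bool
}

// partialUniqueIndex models UNIQUE (user_id) WHERE is_active.
type partialUniqueIndex struct {
	seen map[int]bool // user_id values of rows that satisfy the predicate
}

func newPartialUniqueIndex() *partialUniqueIndex {
	return &partialUniqueIndex{seen: map[int]bool{}}
}

// insert rejects a row only if it satisfies the predicate and its key is
// already present; rows outside the predicate are never indexed.
func (idx *partialUniqueIndex) insert(r row) error {
	if !r.isActive {
		return nil // not covered by the index, so no uniqueness check
	}
	if idx.seen[r.userID] {
		return fmt.Errorf("duplicate unique key: user_id=%d", r.userID)
	}
	idx.seen[r.userID] = true
	return nil
}

func main() {
	idx := newPartialUniqueIndex()
	rows := []row{{1, false}, {1, false}, {1, true}, {1, true}}
	for _, r := range rows {
		fmt.Printf("insert %+v -> err=%v\n", r, idx.insert(r))
	}
}
```

Only the fourth insert fails here, because it is the second active row with `userID` 1; the two inactive duplicates are accepted.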

Commit: 6da430f

@jennifersp jennifersp enabled auto-merge (squash) May 27, 2026 19:28
itoqa Bot commented May 27, 2026

Ito Diff Report ❌

Tested: 6da430f → 8aa469c
10 test cases ran this commit: 5 passed ✅, 2 failed ❌, 3 additional findings ⚠️.
↪️ Carried forward from prior run (not retested this commit): 11 passing.

Across 10 test cases, the unified run had mixed results with 5 passing and 5 confirmed medium-severity failures: core tuple/record NULL handling now behaved consistently in the covered record-expression scenarios, but significant SQL and introspection defects remain. The key issues were lossy index-definition rendering in pg_get_indexdef/pg_indexes that truncates dotted expressions, ALTER TABLE ... REPLICA IDENTITY succeeding as a silent no-op with no relreplident change, and tuple IN path inconsistencies/unsafety where malformed operands can hit internal interface-conversion failure paths and equivalent subquery or VALUES forms fail with bool-comparison errors while direct tuple forms do not.

❌ Failures (2)
| Origin | Category | Severity | Summary | Screenshot |
| --- | --- | --- | --- | --- |
| 🔻 Still broken (verified) | Catalog | 🟠 Medium | pg_get_indexdef and pg_indexes drop dotted expression segments, producing invalid index definitions. | CATALOG-4 |
| 🔻 Still broken (verified) | Replica | 🟠 Medium | Replica identity ALTER TABLE command succeeds but does not persist any replica identity metadata. | REPLICA-2 |

🟠 Index introspection truncates dotted expressions
  • What failed: Generated index definitions rewrite or truncate dotted expressions by splitting on '.' and keeping only one segment, so introspection output no longer preserves original semantics and round-trip parsing fails.
  • Impact: Schema introspection and migration tooling can consume corrupted index definitions and fail round-trip operations. This undermines reliability of automated diff/import workflows even when index creation itself succeeded.
  • Steps to reproduce:
    1. Create a table and define indexes using mixed expression forms, including qualified columns and expressions containing dots.
    2. Query pg_catalog.pg_get_indexdef(index_oid) and pg_catalog.pg_indexes.indexdef for those indexes.
    3. Compare the generated definition text to the original CREATE INDEX statements and attempt to replay the generated SQL.
  • Stub / mock context: Authentication bootstrap and SCRAM checks were bypassed to establish a deterministic local session, and the confirmed defect is in catalog index-definition formatting logic that is independent of authentication flow.
  • Code analysis: I inspected the production catalog rendering paths in server/functions/pg_get_indexdef.go and server/tables/pgcatalog/pg_indexes.go; both rebuild expression text by strings.Split(expr, ".") and keep split[1], which discards additional segments and alters qualified/function-style expressions.
  • Why this is likely a bug: Both production introspection paths apply the same lossy split logic to expression text, directly explaining corrupted output independent of test harness behavior.

Relevant code:

server/functions/pg_get_indexdef.go (lines 67-75)

cols := make([]string, len(index.Expressions()))
for i, expr := range index.Expressions() {
	split := strings.Split(expr, ".")
	if len(split) > 1 {
		cols[i] = split[1]
	} else {
		cols[i] = expr
	}
}

server/tables/pgcatalog/pg_indexes.go (lines 119-127)

cols := make([]string, len(index.Expressions()))
for i, expr := range index.Expressions() {
	split := strings.Split(expr, ".")
	if len(split) > 1 {
		cols[i] = split[1]
	} else {
		cols[i] = expr
	}
}
🟠 Replica identity command silently performs no operation
  • What failed: The command returns success, but no replica identity configuration is persisted (relreplident remains default d) and no durable change is visible afterward.
  • Impact: Operators can believe replication identity was configured when it was not, leading to incorrect migration assumptions and follow-on replication/debugging issues. There is no in-command signal in this path that the operation was ignored.
  • Steps to reproduce:
    1. Connect to localhost:5432 and create table t_replica_noop(id INT PRIMARY KEY, c1 INT).
    2. Execute ALTER TABLE t_replica_noop REPLICA IDENTITY USING INDEX some_idx.
    3. Query pg_class.relreplident and related catalog outputs for the table to verify whether replica identity state changed.
  • Stub / mock context: Authentication was intentionally bypassed for this run by short-circuiting authorization and SCRAM checks so SQL behavior could be tested with deterministic local credentials. This means the test exercised database DDL and catalog behavior directly without real auth handshakes.
  • Code analysis: I inspected ALTER TABLE command conversion and no-op handling in the AST layer. AlterTableReplicaIdentity is explicitly ignored, and this path returns a normal alter-table result without adding warning text for this command.
  • Why this is likely a bug: The production command handler explicitly drops AlterTableReplicaIdentity while still returning successful execution, which matches the observed no-op behavior and indicates a real logic gap rather than test setup noise.

Relevant code:

server/ast/alter_table.go (lines 159-170)

case *tree.AlterTableOwner:
	unsupportedWarnings = append(unsupportedWarnings, fmt.Sprintf("ALTER TABLE %s OWNER TO %s", tableName.String(), cmd.Owner))
case *tree.AlterTableComputed:
	return nil, nil, errors.New("This command does not currently support multiple actions in one statement")
case *tree.AlterTableSetStatistics:
	// is unsupported and ignored
case *tree.AlterTableRowLevelSecurity:
	// is unsupported and ignored
case *tree.AlterTableReplicaIdentity:
	// is unsupported and ignored
default:
	return nil, nil, errors.Errorf("ALTER TABLE with unsupported command type %T", cmd)

server/ast/alter_table.go (lines 53-69)

// If there are no valid statements return a no-op statement
if len(noOps) > 0 && len(statements) == 0 {
	return NewNoOp(noOps...), nil
}

// Otherwise emit warnings now, then return an AlterTable statement
// TODO: we don't have a way to send or store the warnings alongside a valid AlterTable statement. We could either
//  get a *sql.Context here and emit warnings, or we could store the warnings in the Context and make the caller
//  emit them before it sends |ReadyForQuery|

return &vitess.AlterTable{
	Table:      tableName,
	Statements: statements,
}, nil
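
The quoted conversion code records a warning string for ALTER TABLE ... OWNER TO but nothing for the three ignored commands. The sketch below shows one way that pattern could be extended so ignored commands at least leave a message behind. `alterCmd`, `convertAlterCmds`, and the kind strings are hypothetical stand-ins rather than Doltgres types, and the sketch does not address the TODO about delivering warnings to the client.

```go
package main

import (
	"errors"
	"fmt"
)

// alterCmd is a hypothetical stand-in for a parsed ALTER TABLE sub-command.
type alterCmd struct {
	kind   string // e.g. "ADD COLUMN", "REPLICA IDENTITY"
	detail string // remaining text of the sub-command
}

// Hypothetical classification of sub-commands for this sketch.
var supportedKinds = map[string]bool{"ADD COLUMN": true, "DROP COLUMN": true}
var ignoredKinds = map[string]bool{
	"SET STATISTICS":     true,
	"ROW LEVEL SECURITY": true,
	"REPLICA IDENTITY":   true,
}

// convertAlterCmds returns the commands to apply plus one warning per
// accepted-but-ignored command; unknown kinds are a hard error.
func convertAlterCmds(table string, cmds []alterCmd) ([]alterCmd, []string, error) {
	var applied []alterCmd
	var warnings []string
	for _, cmd := range cmds {
		switch {
		case supportedKinds[cmd.kind]:
			applied = append(applied, cmd)
		case ignoredKinds[cmd.kind]:
			warnings = append(warnings, fmt.Sprintf(
				"ALTER TABLE %s %s %s is not yet supported and was ignored",
				table, cmd.kind, cmd.detail))
		default:
			return nil, nil, errors.New("ALTER TABLE with unsupported command type " + cmd.kind)
		}
	}
	return applied, warnings, nil
}

func main() {
	applied, warnings, err := convertAlterCmds("t_replica_noop", []alterCmd{
		{kind: "ADD COLUMN", detail: "c2 INT"},
		{kind: "REPLICA IDENTITY", detail: "USING INDEX some_idx"},
	})
	fmt.Println(len(applied), warnings, err)
}
```

Whether a warning, a notice, or an error is the right response to an ignored REPLICA IDENTITY command is a product decision; the sketch only shows that the information can be captured at the same point where the command is dropped.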
✅ Verified Passing (5)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Record | Tuple IN with NULL literal member executed successfully and returned rows consistent with SQL NULL semantics (only (2,2) matched). | RECORD-1 |
| Record | Row comparison expressions containing NULL members executed without tuple-type assertion failures and returned PostgreSQL-consistent NULL results. | RECORD-2 |
| Record | Created stress_t with 120 mixed NULL/non-NULL rows and executed a large repeated NULL-containing tuple IN predicate three times in one session; all runs returned stable count=5 with consistent ordering and no crashes/type errors. | RECORD-3 |
| Record | Created table r, inserted (NULL,NULL),(1,NULL),(2,2), and executed tuple null-check query successfully. Query returned row-wise outcomes (both_null,both_not_null) = (t,f),(f,f),(f,t) with no tuple typing/runtime error. | RECORD-4 |
| Record | Comparison and IS NULL / IS NOT NULL behavior stayed consistent for NULL-containing tuples. | RECORD-7 |

↪️ Inherited from Prior Run (11)

Tests that passed in the prior run. c3 judged them unaffected by the diff and did not retest.

| Category | Summary | Screenshot |
| --- | --- | --- |
| Catalog | pg_catalog.pg_indexes returned CREATE INDEX text with a WHERE predicate for idx_catalog_partial. | N/A |
| Catalog | pg_get_indexdef(oid) returned a concrete CREATE INDEX statement for idx_catalog_fn. | N/A |
| Catalog | Full and partial index metadata agreed across pg_get_indexdef, pg_indexes.indexdef, and pg_index.indpred for tested indexes. | N/A |
| Predicate | CREATE UNIQUE INDEX ... WHERE is_active allowed inactive duplicate user_id but rejected second active duplicate with duplicate unique key error. | N/A |
| Predicate | On localhost:5432, CREATE INDEX idx_partial_ok ON t_partial_ok (a) WHERE a > 1 executed successfully and pg_catalog.pg_indexes returned the idx_partial_ok metadata row. | N/A |
| Predicate | CREATE INDEX with predicate WHERE (a ^ 2) > 1 failed with conversion error "the power operator is not yet supported", and idx_partial_badexpr was absent from pg_catalog.pg_indexes (count=0). | N/A |
| Predicate | Planner uses the partial index only when predicate implication is valid. | N/A |
| Predicate | Batch failure is explicit and leaves detectable partial deployment state. | N/A |
| Predicate | Concurrent duplicate-name index creation produced one winner with clean catalog state. | N/A |
| Replica | REPLICA IDENTITY FULL executed without unsupported-command errors, and a follow-up ALTER TABLE ADD COLUMN succeeded. | N/A |
| Replica | A mixed ALTER TABLE failed on a later invalid clause, and schema inspection confirmed no partial column addition remained. | N/A |

⚠️ Additional Findings (3)

These findings are unrelated to the current changes but were observed during testing.

| Origin | Category | Severity | Summary | Screenshot |
| --- | --- | --- | --- | --- |
| 🆕 New | Record | 🟠 Medium | Unsupported IN right operand shape triggers an internal interface-conversion failure path instead of a controlled validation error. | RECORD-5 |
| 🆕 New | Record | 🟠 Medium | Equivalent tuple IN forms diverge: baseline tuple returns NULL, while IN-subquery variant fails with a comparison-bool error. | RECORD-6 |
| 🆕 New | Record | 🟠 Medium | Baseline and ROW tuple forms succeed, but VALUES-based equivalent form fails with an IN-subquery comparison-bool error. | RECORD-8 |

🟠 Unsafe composite cast on unsupported IN operand
  • What failed: The unsupported-shape query should fail with a controlled operand-validation error, but instead enters a panic-style interface conversion path; a follow-up query still succeeds.
  • Impact: Invalid tuple IN input can trigger internal-type failure behavior instead of predictable SQL-level validation errors. This degrades reliability and makes client-side error handling less safe for malformed or generated SQL.
  • Steps to reproduce:
    1. Connect to Doltgres and open a SQL session.
    2. Run SELECT (1, NULL) IN (1);.
    3. Run SELECT (1,NULL) IN ((1,NULL)); to confirm the session continues after the first failure path.
  • Stub / mock context: The run used a local Doltgres instance with deterministic test credentials and an auth-bypass environment setting for stable startup; SQL behavior was exercised against real engine code paths without API route mocking.
  • Code analysis: I traced tuple IN execution through InTuple evaluation into composite value comparison and found an unchecked type assertion in composite compare. The right operand can be a non-record value in this path, and the current compare implementation does not defensively reject it before assertion.
  • Why this is likely a bug: Production code allows a scalar right value to proceed and then performs an unchecked []RecordValue cast, which directly explains the observed internal interface-conversion failure path.

Relevant code:

server/types/type.go (lines 404-415)

case []RecordValue:
	if !t.IsCompositeType() {
		return 0, errors.New("record value received in Compare for non composite type")
	}
	bb := v2.([]RecordValue)
	minLength := utils.Min(len(ab), len(bb))
	for i := 0; i < minLength; i++ {
		dgType, isDgType1 := ab[i].Type.(*DoltgresType)
		otherDgType, isDgType2 := bb[i].Type.(*DoltgresType)
		if !isDgType1 || !isDgType2 {

server/expression/in_tuple.go (lines 107-114)

rightValues, ok := rightInterface.([]any)
if !ok {
	// Tuples will return the value directly if it has a length of one, so we'll check for that first
	if len(it.rightExpr) == 1 {
		rightValues = []any{rightInterface}
	} else {
		return nil, errors.Errorf("%T: expected right child to return `%T` but returned `%T`", it, []any{}, rightInterface)
	}
}
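
The unchecked assertion `v2.([]RecordValue)` panics in Go whenever `v2` holds any other dynamic type. The two-value ("comma ok") form turns that into an ordinary error. The sketch below illustrates the difference with a simplified, hypothetical `RecordValue` and `compareRecords`; the length-only comparison is a placeholder and not the real field-by-field logic.

```go
package main

import (
	"errors"
	"fmt"
)

// RecordValue is a simplified, hypothetical stand-in for the real type.
type RecordValue struct {
	Value any
}

// compareRecords validates both operands with comma-ok assertions before
// using them, so a scalar operand yields an error instead of a panic.
func compareRecords(v1, v2 any) (int, error) {
	ab, ok := v1.([]RecordValue)
	if !ok {
		return 0, fmt.Errorf("left operand has type %T, expected a record", v1)
	}
	bb, ok := v2.([]RecordValue)
	if !ok {
		return 0, fmt.Errorf("right operand has type %T, expected a record", v2)
	}
	if len(ab) != len(bb) {
		return 0, errors.New("unequal number of entries in row expressions")
	}
	// Placeholder: field-by-field comparison is omitted in this sketch.
	return 0, nil
}

func main() {
	left := []RecordValue{{Value: 1}, {Value: nil}}

	// Shape of SELECT (1, NULL) IN (1): the right side is a scalar.
	_, err := compareRecords(left, 1)
	fmt.Println("scalar right operand:", err)

	// Shape of SELECT (1, NULL) IN ((1, NULL)): both sides are records.
	_, err = compareRecords(left, []RecordValue{{Value: 1}, {Value: nil}})
	fmt.Println("record right operand:", err)
}
```

The first call returns an error that names the offending type, which is the kind of controlled validation failure the finding asks for.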
🟠 Equivalent tuple IN forms diverge by evaluation path
  • What failed: Equivalent query intent diverges by path: one form returns NULL while the IN-subquery form errors with found equality comparison that does not return a bool.
  • Impact: Users can get inconsistent behavior for logically equivalent tuple filters depending on query form. This breaks SQL predictability for query generators that normalize predicates into subqueries.
  • Steps to reproduce:
    1. Execute a NULL-containing baseline tuple IN query and note the result.
    2. Execute a logically equivalent query that routes through an IN subquery form.
    3. Compare outcomes and observe divergence between NULL result and comparison-bool error.
  • Stub / mock context: The run used a local Doltgres instance with deterministic test credentials and an auth-bypass environment setting for stable startup; SQL behavior was exercised against real engine code paths without API route mocking.
  • Code analysis: The AST path forces subquery RHS forms into InSubquery, and InSubquery.WithChildren fails whenever tuple equality does not compile to a bool-returning operator. Unlike InTuple, this path does not apply the explicit unknown-type cast fallback, so equivalent expressions can fail only in the subquery-normalized route.
  • Why this is likely a bug: Production code routes equivalent IN syntax into different evaluator implementations and one implementation rejects valid tuple-comparison semantics with a hard error, causing deterministic, code-path-dependent divergence.

Relevant code:

server/ast/expr.go (lines 375-387)

case tree.In, tree.NotIn:
	var innerExpression vitess.InjectedExpr
	switch right := right.(type) {
	case vitess.ValTuple:
		innerExpression = vitess.InjectedExpr{
			Expression: pgexprs.NewInTuple(),
			Children:   vitess.Exprs{left, right},
		}
	case *vitess.Subquery:
		innerExpression = vitess.InjectedExpr{
			Expression: pgexprs.NewInSubquery(),
			Children:   vitess.Exprs{left, right},
		}

server/expression/in_subquery.go (lines 214-222)

rightLiterals[i] = expression.NewLiteral(nil, rightType)
compFuncs[i] = framework.GetBinaryFunction(framework.Operator_BinaryEqual).Compile(ctx, "internal_in_comparison", leftLiteral, rightLiterals[i])
if compFuncs[i] == nil {
	return nil, errors.Errorf("operator does not exist: %s = %s", leftType.String(), rightType.String())
}
if compFuncs[i].Type(ctx).(*pgtypes.DoltgresType).ID != pgtypes.Bool.ID {
	// This should never happen, but this is just to be safe
	return nil, errors.Errorf("%T: found equality comparison that does not return a bool", in)
}
🟠 VALUES tuple form fails via subquery path
  • What failed: Baseline and ROW variants complete with NULL semantics, but VALUES-based equivalent fails with IN-subquery bool-comparison error before consistent evaluation.
  • Impact: SQL clients that emit VALUES-based or normalized subquery tuple forms can fail even when equivalent tuple predicates work in direct form. This creates compatibility gaps across query-builder output styles.
  • Steps to reproduce:
    1. Run a baseline tuple IN query with NULL literals and confirm behavior.
    2. Run a ROW constructor variant and confirm it remains consistent.
    3. Run a VALUES-based equivalent IN form and observe the IN-subquery comparison-bool failure.
  • Stub / mock context: The run used a local Doltgres instance with deterministic test credentials and an auth-bypass environment setting for stable startup; SQL behavior was exercised against real engine code paths without API route mocking.
  • Code analysis: VALUES-based forms are treated as subquery right operands and go through InSubquery compilation checks that require bool-returning equality without the tuple-friendly fallback used by direct tuple IN handling. That implementation mismatch explains why semantically equivalent query forms diverge.
  • Why this is likely a bug: Equivalent tuple predicates should not fail solely due to VALUES/subquery representation, but production path selection currently causes that exact form-dependent failure.

Relevant code:

server/ast/expr.go (lines 383-387)

case *vitess.Subquery:
	innerExpression = vitess.InjectedExpr{
		Expression: pgexprs.NewInSubquery(),
		Children:   vitess.Exprs{left, right},
	}

server/expression/in_subquery.go (lines 219-222)

if compFuncs[i].Type(ctx).(*pgtypes.DoltgresType).ID != pgtypes.Bool.ID {
	// This should never happen, but this is just to be safe
	return nil, errors.Errorf("%T: found equality comparison that does not return a bool", in)
}

@jennifersp jennifersp merged commit 6d2c063 into main May 27, 2026
22 checks passed
@jennifersp jennifersp deleted the jennifer/index-where branch May 27, 2026 20:10