
Logical tables MSE test#17697

Open
krishan1390 wants to merge 5 commits into apache:master from krishan1390:logical_tables_mse_test

krishan1390 (Contributor) commented Feb 13, 2026

  1. Add multi-stage engine tests for logical tables.
  2. Refactor a few test classes (BaseLogicalTableIntegrationTest) to avoid duplicated code.

The main gap is in the "testBetween" test. The explain plan for a "not-between" query differs depending on whether the query runs against a logical table or a physical table. The query results are consistent, but the explain plans are not. For now, I am skipping the explain-plan validation for logical tables.

Explain plan for "not-between" query for logical table

PinotLogicalAggregate(group=[{}], agg#0=[COUNT($0)], aggType=[FINAL])
        PlanWithNoSegments(table=[mytable])

Explain plan for "not-between" query for physical table

  PinotLogicalAggregate(group=[{}], agg#0=[COUNT($0)], aggType=[FINAL])
    PinotLogicalExchange(distribution=[hash])
      LeafStageCombineOperator(table=[mytable])
        StreamingInstanceResponse
          CombineAggregate
            Aggregate(aggregations=[[count(*)]])
              Project(columns=[[]])
                DocIdSet(maxDocs=[120000])
                  FilterNot
                    FilterFullScan(predicate=[RandomAirports BETWEEN 'GTR' AND 'SUN'], operator=[RANGE])

codecov-commenter commented Feb 13, 2026

❌ 3 Tests Failed:

Tests completed: 9381 | Failed: 3 | Passed: 9378 | Skipped: 51
View the full list of 3 ❄️ flaky test(s)
org.apache.pinot.integration.tests.KafkaConfluentSchemaRegistryAvroMessageDecoderRealtimeClusterIntegrationTest::setUp

Flake rate in main: 100.00% (Passed 0 times, Failed 76 times)

Stack Traces | 12.4s run time
Could not find a valid Docker environment. Please see logs and check configuration
org.apache.pinot.plugin.inputformat.json.confluent.JsonConfluentSchemaTest::@BeforeClass setup

Flake rate in main: 100.00% (Passed 0 times, Failed 49 times)

Stack Traces | 0.491s run time
Could not find a valid Docker environment. Please see logs and check configuration
org.apache.pinot.plugin.inputformat.json.confluent.JsonConfluentSchemaTest::setup

Flake rate in main: 100.00% (Passed 0 times, Failed 98 times)

Stack Traces | 0.491s run time
Could not find a valid Docker environment. Please see logs and check configuration


krishan1390 (Contributor Author)

The test failures are caused by a Docker environment issue:

2026-02-13T11:57:03.9083814Z [ERROR]   KafkaConfluentSchemaRegistryAvroMessageDecoderRealtimeClusterIntegrationTest.setUp:292->BaseRealtimeClusterIntegrationTest.setUp:67->startKafka:88->startSchemaRegistry:99 » IllegalState Could not find a valid Docker environment. Please see logs and check configuration

This is unrelated to this PR.

return getTableConfigBuilder(TableType.REALTIME).build();
}

private List<String> getTimeBoundaryTable(List<String> offlineTables) {
Contributor Author


These methods were moved from BaseLogicalTableIntegrationTest to this class so that LogicalTableMultiStageEngineIntegrationTest can also use them.

krishan1390 changed the title from "Logical tables mse test" to "Logical tables MSE test" on Feb 13, 2026
yashmayya added the testing, multi-stage (related to the multi-stage query engine), and logical-tables labels on Feb 13, 2026

Copilot AI left a comment


Pull request overview

Adds multi-stage engine (MSE) integration coverage for logical tables and refactors shared logical-table test setup to reduce duplication, including factoring common “upload offline tables” and logical-table-config creation into the base integration test utilities.

Changes:

  • Add LogicalTableMultiStageEngineIntegrationTest to run MSE test suite against a logical table backed by multiple offline physical tables.
  • Refactor logical table integration test setup to reuse shared helpers for offline-table data upload and logical-table schema creation.
  • Refactor MultiStageEngineIntegrationTest setup/teardown into overridable hooks and make testBetween explain-plan validation configurable.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Summary per file (file: description)

  • pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/logicaltable/LogicalTableMultiStageEngineIntegrationTest.java: New MSE logical-table integration test; overrides setup, logical-table config, and selectively disables validations for logical tables.
  • pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/logicaltable/BaseLogicalTableIntegrationTest.java: Refactors logical-table test setup to reuse shared base helpers and separates logical-table schema creation from table creation.
  • pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/TlsIntegrationTest.java: Adjusts logical-table config creation to support realtime-only logical table in TLS test setup.
  • pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/MultiStageEngineIntegrationTest.java: Refactors setup into initCluster/initTables/initOtherDependencies, adds getOfflineTablesCreated(), and makes NOT BETWEEN plan validation optional.
  • pinot-integration-test-base/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTest.java: Adds shared helpers: offline-table config overload, logical-table-config builder for custom physical sets, logical-table+schema helper, and offline upload/distribution utilities.

Comment on lines +440 to +441
return timeBoundaryTable != null ? List.of(TableNameBuilder.OFFLINE.tableNameWithType(timeBoundaryTable))
: List.of();

Copilot AI Feb 13, 2026


getTimeBoundaryTable() iterates over offlineTables and sets timeBoundaryTable = tableName, but then returns TableNameBuilder.OFFLINE.tableNameWithType(timeBoundaryTable). In current call sites (createLogicalTableConfig() and BaseLogicalTableIntegrationTest), offlineTables are already table-name-with-type (e.g. o_1_OFFLINE), so this produces an invalid name like o_1_OFFLINE_OFFLINE and breaks the includedTables in TimeBoundaryConfig.

Suggested change
- return timeBoundaryTable != null ? List.of(TableNameBuilder.OFFLINE.tableNameWithType(timeBoundaryTable))
-     : List.of();
+ return timeBoundaryTable != null ? List.of(timeBoundaryTable) : List.of();
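
Whether the double suffix actually appears depends on whether the suffixing helper appends unconditionally or only when the suffix is missing. The following is a minimal sketch of both behaviors, not Pinot code: `TableNameSuffixDemo`, `appendSuffix`, `appendSuffixIfMissing`, and `timeBoundaryTables` are hypothetical names, and the real `TableNameBuilder` may already guard against a repeated suffix.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Pinot's actual classes.
final class TableNameSuffixDemo {
  static final String OFFLINE_SUFFIX = "_OFFLINE";

  // Appends the type suffix unconditionally (the behavior the review comment assumes).
  static String appendSuffix(String tableName) {
    return tableName + OFFLINE_SUFFIX;
  }

  // Appends the type suffix only when it is missing, so repeated calls are harmless.
  static String appendSuffixIfMissing(String tableName) {
    return tableName.endsWith(OFFLINE_SUFFIX) ? tableName : tableName + OFFLINE_SUFFIX;
  }

  // Shape of the suggested fix: pass an already-typed name through untouched.
  static List<String> timeBoundaryTables(String tableNameWithType) {
    return tableNameWithType != null ? List.of(tableNameWithType) : List.of();
  }

  public static void main(String[] args) {
    String alreadyTyped = "o_1_OFFLINE";
    System.out.println(appendSuffix(alreadyTyped));          // o_1_OFFLINE_OFFLINE
    System.out.println(appendSuffixIfMissing(alreadyTyped)); // o_1_OFFLINE
    System.out.println(timeBoundaryTables(alreadyTyped));    // [o_1_OFFLINE]
  }
}
```

If the real helper behaves like `appendSuffixIfMissing`, the original code is harmless and the suggested change is only a simplification; checking the helper's implementation would settle which case applies.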

protected void createLogicalTableAndSchema()
throws IOException {
Schema schema = createSchema(getSchemaFileName());
schema.setSchemaName(getTableName());

Copilot AI Feb 13, 2026


createLogicalTableAndSchema() sets the schema name to getTableName(), but the schema for a logical table should be created with the logical table name (i.e., getLogicalTableName()). If a subclass uses a logical table name different from the physical table name, the controller will not find the expected schema for the logical table.

Suggested change
- schema.setSchemaName(getTableName());
+ schema.setSchemaName(getLogicalTableName());

Comment on lines +98 to +113
@Ignore
public void testValidateQueryApiBatchMixedResults() {
}

@Override
@Ignore
public void testValidateQueryApiSuccessfulQueries() {
}

@Override
@Ignore
public void testValidateQueryApiUnsuccessfulQueries() {
}

@Override
@Ignore

Copilot AI Feb 13, 2026


These overrides are annotated with @Ignore but not @Test. In TestNG, overriding a @Test method without @Test removes it from the test run entirely (it won't show as skipped/ignored). If the intent is to skip these inherited tests explicitly, add @Test (or @Test(enabled = false)) alongside @Ignore on each override.

Suggested change
- @Ignore
- public void testValidateQueryApiBatchMixedResults() {
- }
- @Override
- @Ignore
- public void testValidateQueryApiSuccessfulQueries() {
- }
- @Override
- @Ignore
- public void testValidateQueryApiUnsuccessfulQueries() {
- }
- @Override
- @Ignore
+ @Ignore
+ @Test(enabled = false)
+ public void testValidateQueryApiBatchMixedResults() {
+ }
+ @Override
+ @Ignore
+ @Test(enabled = false)
+ public void testValidateQueryApiSuccessfulQueries() {
+ }
+ @Override
+ @Ignore
+ @Test(enabled = false)
+ public void testValidateQueryApiUnsuccessfulQueries() {
+ }
+ @Override
+ @Ignore
+ @Test(enabled = false)
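
The Java-level mechanism behind this comment is that an annotation on a method does not carry over to a method that overrides it. The sketch below shows only that reflection behavior using a hypothetical `@MarkedTest` annotation (not TestNG's `@Test`); how TestNG itself resolves annotations on overrides is as described in the comment above and is not verified here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class OverrideAnnotationDemo {
  // Hypothetical marker standing in for a test annotation; this is not TestNG's @Test.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface MarkedTest { }

  static class BaseSuite {
    @MarkedTest
    public void testSomething() { }
  }

  static class DerivedSuite extends BaseSuite {
    // The override does not repeat @MarkedTest, so the annotation is absent on this method.
    @Override
    public void testSomething() { }
  }

  // Returns true if the public method resolved on the given type carries @MarkedTest.
  static boolean isMarked(Class<?> type, String methodName) {
    try {
      Method method = type.getMethod(methodName);
      return method.isAnnotationPresent(MarkedTest.class);
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(isMarked(BaseSuite.class, "testSomething"));    // true
    System.out.println(isMarked(DerivedSuite.class, "testSomething")); // false
  }
}
```

A framework that discovers tests by inspecting the annotations on the resolved method would therefore see nothing on the override unless the annotation is repeated there.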

Comment on lines +37 to +45
@BeforeClass
@Override
public void setUp()
throws Exception {
initCluster();
List<File> avroFiles = unpackAvroData(_tempDir);
initTables(avroFiles);
initOtherDependencies(avroFiles);
}

Copilot AI Feb 13, 2026


This setUp() override duplicates the exact initialization sequence already implemented in MultiStageEngineIntegrationTest.setUp(). Since the base setUp() calls overridable hooks (initTables, etc.), this override can be removed (or replaced with super.setUp()) to reduce duplication and the risk of future drift.

Suggested change
- @BeforeClass
- @Override
- public void setUp()
- throws Exception {
- initCluster();
- List<File> avroFiles = unpackAvroData(_tempDir);
- initTables(avroFiles);
- initOtherDependencies(avroFiles);
- }
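
The point about overridable hooks can be sketched in isolation. This is a simplified illustration under assumptions, not the PR's code: `SetupHooksDemo`, `BaseSuite`, and `LogicalTableSuite` are hypothetical classes, the hooks take no arguments (the real ones receive the Avro files), and each hook only records that it ran.

```java
import java.util.ArrayList;
import java.util.List;

final class SetupHooksDemo {
  // Hypothetical base suite: setUp() fixes the order and delegates each step to a hook.
  static class BaseSuite {
    protected final List<String> steps = new ArrayList<>();

    public void setUp() {
      initCluster();
      initTables();
      initOtherDependencies();
    }

    protected void initCluster() {
      steps.add("cluster");
    }

    protected void initTables() {
      steps.add("physical tables");
    }

    protected void initOtherDependencies() {
      steps.add("other dependencies");
    }
  }

  // The subclass overrides only the hook that differs and inherits setUp() unchanged.
  static class LogicalTableSuite extends BaseSuite {
    @Override
    protected void initTables() {
      steps.add("logical table");
    }
  }

  public static void main(String[] args) {
    LogicalTableSuite suite = new LogicalTableSuite();
    suite.setUp();
    System.out.println(suite.steps); // [cluster, logical table, other dependencies]
  }
}
```

Because the inherited `setUp()` dispatches to the overridden hook, the subclass gets its own table initialization without restating the sequence, which is why the duplicated override can be dropped.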

Comment on lines +84 to +92
@Override
@Test
public void testBetween()
throws Exception {
testBetween(false);
// TODO - Explain plan result is different for logical table vs physical table.
// That is why we're overriding the test and avoiding explain plan validation.
// This needs to be fixed
}

Copilot AI Feb 13, 2026


testBetween() in the logical-table suite skips the NOT BETWEEN explain-plan assertions entirely. This reduces coverage for an optimizer/planner-sensitive case and may allow regressions to slip in unnoticed. Consider validating a logical-table-appropriate invariant for the NOT BETWEEN plan (even if different from physical tables), or making the assertion accept both known plan shapes instead of disabling it.

Copilot generated this review using guidance from repository custom instructions.
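
One way to read the "accept both known plan shapes" option is a small predicate over the plan text. The sketch below is hypothetical: `NotBetweenPlanCheck` and `isKnownNotBetweenPlan` are illustrative names, and the substrings are taken from the two explain plans quoted in the PR description. Note that accepting `PlanWithNoSegments` could hide a genuine problem with segment-level explain for logical tables, so this shows the shape of such a check rather than recommending it.

```java
final class NotBetweenPlanCheck {
  // Returns true if the plan text matches either of the two shapes quoted in the PR description.
  static boolean isKnownNotBetweenPlan(String plan) {
    // Physical-table shape: a FilterNot wrapping a BETWEEN range predicate.
    boolean physicalShape = plan.contains("FilterNot") && plan.contains("BETWEEN");
    // Logical-table shape currently observed: no segment-level plan at all.
    boolean logicalShape = plan.contains("PlanWithNoSegments");
    return physicalShape || logicalShape;
  }

  public static void main(String[] args) {
    String physicalPlan = "FilterNot\n  FilterFullScan(predicate=[RandomAirports BETWEEN 'GTR' AND 'SUN'], operator=[RANGE])";
    String logicalPlan = "PinotLogicalAggregate(group=[{}])\n  PlanWithNoSegments(table=[mytable])";
    System.out.println(isKnownNotBetweenPlan(physicalPlan));      // true
    System.out.println(isKnownNotBetweenPlan(logicalPlan));       // true
    System.out.println(isKnownNotBetweenPlan("FilterMatchAll")); // false
  }
}
```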
Assert.assertFalse(plan.contains(">="));
Assert.assertFalse(plan.contains("<="));
Assert.assertFalse(plan.contains("Sarg"));
if (testExplainPlanForNotBetween) {
Contributor


I don't get this, why is NOT BETWEEN a special case here? Or is this the only one where the logical plan is being compared with the segment level plan and segment level explain plan for MSE with logical tables is broken?
