Skip to content

Conversation

@anlowee
Copy link

@anlowee anlowee commented Aug 15, 2025

Description

This PR updates enables the end-to-end testing functionality to CLP connector. The end-to-end testing feature is bundled for all native connectors such as Hive, Iceberge, and TPCH, so it is good for CLP to support it as well. Another important benefit is that it can save a lot of time setup the end-to-end testing environment and also ensure the we don't break functionality for each PR.

The first change is instead of ClpMetadataDbSetUp we implemented another ClpMockMetadataDatabase. This class provides a builder style way to setup the metadata database. For consistency, besides ArchivesTableRows we also defined ColumnMetadataTableRows and DatasetsTableRows to represent the three tables that will be used by CLP connector, and ClpMockMetadataDatabase has some methods to register the necessary information by the data whose types are these three classes.

The second change is in the presto-native-execution module. Similar to TestPrestoNativeHiveGeneralQueries and TestPrestoNativeIcebergGeneralQueries, we implemented TestPrestoNativeClpGeneralQueries. The major difference between the last one and the former two is that the entire framework is designed for those have both Java worker implementation and Pretissimo implementation, so they can use a Java coordinator and a Java worker (expected query runner) to compare with the Java coordinator and a Pretissimo worker (actual query runner) to do the unit testing. Also the native worker creation is hardcoded to create hive connector, so we also create a function getExternalClpWorkerLauncher to create our own worker.

This PR only adds a select * query for sanity check. Later, we should integrate the TestClpMetadata, TestClpSplit and TestClpQueryBase. Eventually, we should get rid of ClpMetadataDbSetUp. The tests related to the metadata database should be all integrated into the end-to-end testing framework. The end-to-end testing framework should also cover some basic queries to validate the functionality. For presto-clp tests, we should only keep those testing without the need of metadata database, for example the pushdown generation etc.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Passed the CI (the unit test class will be automatically added into CI):
image

Summary by CodeRabbit

  • Tests

    • Added end-to-end CLP integration tests and test utilities for native execution, including a mock metadata DB, table-population helpers, CLP query-runner harness, and an external CLP worker launcher; validates query execution and exact result values.
  • Chores

    • Centralized CLP dependency management and added runtime/test-jar/test-scoped CLP dependencies across modules to support testing.
  • Notes

    • No user-visible API or runtime behavior changes.

@coderabbitai
Copy link

coderabbitai bot commented Aug 15, 2025

Walkthrough

Adds CLP test infrastructure and integration: dependencyManagement entries for presto-clp, module dependencies, H2-backed mock metadata DB test helpers and table-row inserters, a CLP test harness (ClpQueryRunner), native external CLP worker launcher wiring, and an end-to-end Presto Native CLP integration test.

Changes

Cohort / File(s) Summary
Build: dependency management
pom.xml
Adds com.facebook.presto:presto-clp entries in dependencyManagement for the runtime artifact and the test-jar variant using ${project.version}.
Build: module dependencies
presto-native-execution/pom.xml, presto-clp/pom.xml
Adds presto-clp as a runtime dependency and a test-jar variant in presto-native-execution; adds test-scoped com.facebook.presto:presto-tests in presto-clp.
CLP mock DB test utilities
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java, presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java, .../mockdb/table/ColumnMetadataTableRows.java, .../mockdb/table/DatasetsTableRows.java
Adds an H2 file-backed mock metadata DB builder with teardown and JDBC helpers to create/insert rows into Archives, ColumnMetadata, and Datasets tables; provides fluent Builder and row-insert helper classes.
CLP test harness
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java
Adds test helper to create a DistributedQueryRunner configured for the CLP connector/catalog with configurable worker count and optional external worker launcher.
Native execution: CLP launcher and E2E test
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java, presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java
Adds getExternalClpWorkerLauncher(...) returning a factory that writes per-worker config/node/catalog files and starts external CLP workers; adds an end-to-end test that seeds the mock DB, launches workers, runs a query, and validates results and schema.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Test as E2E Test
  participant MDB as ClpMockMetadataDatabase
  participant QR as DistributedQueryRunner
  participant Launcher as External CLP Launcher (factory)
  participant Worker as Presto Worker Process
  participant Coord as Coordinator/Discovery

  rect rgb(245,245,255)
  note over Test,MDB: Prepare mock metadata DB
  Test->>MDB: builder() -> create tables & insert rows
  MDB-->>Test: JDBC URL, creds, tablePrefix
  end

  rect rgb(245,255,245)
  note over Test,QR: Start runner with external CLP workers
  Test->>QR: createQueryRunner(config referencing MDB)
  QR->>Launcher: request launcher factory
  Launcher-->>QR: BiFunction(workerIndex, discoveryUri)
  loop spawn workers
    QR->>Worker: invoke BiFunction -> writes configs & starts process
    Worker->>Coord: register via discoveryUri
  end
  end

  rect rgb(255,245,245)
  note over Test,QR: Execute query and validate
  Test->>QR: submit SELECT ...
  QR->>Worker: dispatch fragments
  Worker-->>QR: return results
  QR-->>Test: aggregated results
  Test->>Test: assert row count & types
  end

  rect rgb(250,250,250)
  note over Test,MDB: Cleanup
  Test->>MDB: teardown()
  MDB-->>Test: drop objects & remove files
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • wraymo
  • kirkrodrigues

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7c60cee and 8abeb89.

📒 Files selected for processing (1)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (5)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
  • ClpMockMetadataDatabase (49-287)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (26-72)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (30-63)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)
  • PrestoNativeQueryRunnerUtils (74-752)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
  • ClpQueryRunner (28-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
🔇 Additional comments (3)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (3)

101-103: Good: node-count assertion derives from configured WORKER_COUNT.

This prevents flakes when WORKER_COUNT is overridden and accounts for the coordinator (+1).


71-76: Confirm intentional duplicate column name with differing types.

Column "msg" appears twice (ClpString and VarString). If this mirrors CLP’s schema-typing semantics, fine; otherwise it may introduce ambiguous projections later.


108-117: Verify nested column shape for dollar_sign_date
If the native execution connector exposes t as a ROW with a dollar_sign_date field, keep the current t.dollar_sign_date dereference; if it instead flattens into a literal column named t.dollar_sign_date, quote it as "t.dollar_sign_date". Confirm which schema shape the connector produces.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@anlowee anlowee marked this pull request as ready for review August 25, 2025 16:36
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 23

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b00627f and bd25ca8.

📒 Files selected for processing (8)
  • pom.xml (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1 hunks)
  • presto-native-execution/pom.xml (1 hunks)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1 hunks)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (5)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
  • ClpMockMetadataDatabase (40-288)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (27-94)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (31-84)
presto-tests/src/main/java/com/facebook/presto/tests/DistributedQueryRunner.java (1)
  • DistributedQueryRunner (108-1224)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (2)
  • PrestoNativeQueryRunnerUtils (72-733)
  • NativeQueryRunnerParameters (686-698)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (3)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (27-94)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (31-84)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1)
  • DatasetsTableRows (30-83)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/split/ClpMySqlSplitProvider.java (1)
  • ClpMySqlSplitProvider (35-108)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMySqlMetadataProvider.java (1)
  • ClpMySqlMetadataProvider (35-133)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMySqlMetadataProvider.java (1)
  • ClpMySqlMetadataProvider (35-133)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)
presto-native-execution/presto_cpp/main/common/Configs.cpp (2)
  • discoveryUri (349-351)
  • discoveryUri (349-349)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: test (17.0.13, :presto-tests -P presto-tests-general)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-resource-manager)
  • GitHub Check: test (8.0.442, :presto-main)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (17.0.13, :presto-tests -P presto-tests-execution-memory)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
  • GitHub Check: maven-checks (8.0.442)
  • GitHub Check: maven-checks (17.0.13)
  • GitHub Check: prestocpp-linux-build-for-test
  • GitHub Check: prestissimo-worker-images-build
🔇 Additional comments (4)
presto-native-execution/pom.xml (1)

257-267: LGTM: CLP runtime and test-jar dependencies are correctly wired to dependencyManagement.

The additions follow the existing pattern (e.g., Iceberg/Hive test-jar usage) and will resolve to ${project.version}. No issues from here.

pom.xml (1)

2423-2434: LGTM: Centralized com.facebook.presto:presto-clp artifacts in dependencyManagement.

Both the main artifact and the test-jar are versioned via ${project.version}, aligning with the rest of the BOM. This enables clean, versionless declarations downstream.

presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1)

199-206: No additional MySQL JDBC dependency needed in tests

The presto-clp module already includes the MySQL driver as a compile-scope dependency in its POM (artifactId: mysql-connector-j).
The presto-native-execution module brings in presto-clp both at runtime and as a test-jar dependency, so the driver is transitively on the test classpath.

• presto-clp/pom.xml declares <artifactId>mysql-connector-j</artifactId> as a regular dependency.
• presto-native-execution/pom.xml depends on presto-clp (runtime scope) and its test-jar.

presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1)

51-72: UPSERT syntax is portable: H2 runs in MySQL mode

I’ve verified that the H2 URL template explicitly includes MODE=MySQL in ClpMockMetadataDatabase.java (line 47), so the MySQL-specific ON DUPLICATE KEY UPDATE syntax will work as intended. No further changes are needed.

• File: presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java
– Line 47: jdbc:h2:file:%s;MODE=MySQL;DATABASE_TO_UPPER=FALSE

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

♻️ Duplicate comments (9)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (3)

82-86: Good fix: fail fast and avoid URL-encoding pitfalls for test resource path.

Using requireNonNull with toURI() → Paths.get(...).toString() is robust and cross-platform enough for the test assets path.


134-136: Node-count assertion now derives from runner parameters.

Deriving expected workers from getNativeQueryRunnerParameters() and adding +1 for the coordinator makes the assertion resilient to WORKER_COUNT overrides.


163-169: Null-safe teardown.

Guarding teardown with a null-check prevents NPEs if startup fails before initialisation.

presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)

39-49: Quote the table identifier to match production and avoid reserved-word issues.

CREATE TABLE currently leaves the table name unquoted while columns are quoted. Quote the table name for consistency and resilience.

-        String createTableSql = format(
-                "CREATE TABLE IF NOT EXISTS %s (" +
+        String createTableSql = format(
+                "CREATE TABLE IF NOT EXISTS `%s` (" +
                 "`%s` VARCHAR(512) NOT NULL, " +
                 "`%s` TINYINT NOT NULL, " +
                 "PRIMARY KEY (`%s`, `%s`))",
             format("%s%s%s", tablePrefix, tableName, COLUMN_METADATA_TABLE_SUFFIX),
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)

71-72: Avoid NPE from auto-unboxing Long; allow nullable timestamps.

Use setObject with explicit SQL type so nulls are handled safely in tests.

-                pstmt.setLong(2, beginTimestamps.get(i));
-                pstmt.setLong(3, endTimestamps.get(i));
+                pstmt.setObject(2, beginTimestamps.get(i), java.sql.Types.BIGINT);
+                pstmt.setObject(3, endTimestamps.get(i), java.sql.Types.BIGINT);
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (4)

110-112: Add context to failure in addTables.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to create per-table structures or upsert datasets for %s: %s [SQLState=%s, ErrorCode=%s]",
+                    tableNames, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

123-124: Add context to failure in addColumnMetadata.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to insert column metadata for tables %s: %s [SQLState=%s, ErrorCode=%s]",
+                    clpFields.keySet(), e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

135-136: Add context to failure in addSplits.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to insert archive splits for tables %s: %s [SQLState=%s, ErrorCode=%s]",
+                    splits.keySet(), e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

73-75: Improve teardown error context.

Augment failure with DB URL and JDBC diagnostics for easier triage.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to teardown mock metadata DB at %s: %s [SQLState=%s, ErrorCode=%s]",
+                    url, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bd25ca8 and ec7ded5.

📒 Files selected for processing (4)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1 hunks)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMySqlMetadataProvider.java (1)
  • ClpMySqlMetadataProvider (35-133)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (3)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (27-94)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (31-84)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1)
  • DatasetsTableRows (30-83)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/split/ClpMySqlSplitProvider.java (1)
  • ClpMySqlSplitProvider (35-108)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (2)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
  • ClpMockMetadataDatabase (39-278)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (2)
  • PrestoNativeQueryRunnerUtils (72-733)
  • NativeQueryRunnerParameters (686-698)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: test (17.0.13, :presto-main)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (17.0.13, :presto-tests -P presto-tests-general)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (8.0.442, :presto-main)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-resource-manager)
  • GitHub Check: test (8.0.442, :presto-main-base)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
  • GitHub Check: prestocpp-linux-build-for-test
  • GitHub Check: maven-checks (8.0.442)
  • GitHub Check: maven-checks (17.0.13)
  • GitHub Check: prestissimo-worker-images-build
🔇 Additional comments (1)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1)

200-204: Resolved: setNodeCount configures worker count; getNodeCount() returns coordinator + worker count

Verification shows in this branch’s DistributedQueryRunner.Builder:

  • setNodeCount(int nodeCount) sets the number of worker nodes (field nodeCount) and setCoordinatorCount(int) sets the number of coordinators (default 1).
  • In build(), both values are passed into the non-deprecated constructor:
    new DistributedQueryRunner(
        …,
        defaultSession,
        nodeCount,        // workers
        coordinatorCount, // coordinators
        …,
        externalWorkerLauncher,
        …);
  • The implementation of getNodeCount() returns servers.size() + externalWorkers.size(), which under the hood is coordinatorCount + nodeCount when an external-worker launcher is used (or worker-server count otherwise).
  • Thus the test’s assertion
    assertEquals(
        queryRunner.getNodeCount(),
        workerCount.orElse(DEFAULT_NUM_OF_WORKERS) + 1);
    correctly validates workers + 1 coordinator.

No changes required.

* Provides a builder-driven setup and a single-call teardown that drops all objects and deletes
* files.
*/
public class ClpMockMetadataDatabase
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a lot of duplication with ClpMetadataDbSetUp?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a refactored ClpMetadataDbSetup to provide interface which can be used for E2E testing. I iniitally directly modified ClpMetadataDbSetUp but gave up because it needed to modify the unit tests using it as well. To make this PR clean and don't touch existing unit tests, I rewrote a new one. In the next PR we can use this class to refactor the exsiting unit tests and finally get rid of ClpMetadataDbSetup.

import static org.testng.Assert.assertEquals;
import static org.testng.Assert.fail;

public class ArchivesTableRows
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also this class

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as above.

private final List<Long> endTimestamps;
private final int numberOfRows;

public static void createTableIfNotExist(Connection connection, String tablePrefix, String tableName)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, this should be implemented as a method in the place where it’s being called, and the same applies to the other two classes.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way is to have an ArchivesTable class and an internal class called Rows

Comment on lines 558 to 559
"register-test-functions=false%n" +
"runtime-metrics-collection-enabled=true%n", discoveryUri);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two seem unnecessary?

import static org.testng.Assert.assertTrue;
import static org.testng.Assert.fail;

public class TestPrestoNativeClpGeneralQueries
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a ClpQueryRunner and a builder to encapsulate the logic and out createQueryRuner in PrestoNativeQueryRunnerUtils?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean by "out createQueryRuner in PrestoNativeQueryRunnerUtils"?

We can have the ClpQueryRunner


public static final class Builder
{
private final ClpMockMetadataDatabase mockMetadataDatabase;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the builder itself shouldn't hold this instance

Comment on lines 132 to 147
public void test()
{
QueryRunner queryRunner = getQueryRunner();
assertEquals(queryRunner.getNodeCount(), getNativeQueryRunnerParameters().workerCount.orElse(DEFAULT_NUM_OF_WORKERS) + 1);
assertTrue(queryRunner.tableExists(getSession(), TABLE_NAME));

// Test select star
assertClpQuery(
queryRunner,
getSession(),
format("SELECT * FROM %s LIMIT 1", TABLE_NAME),
ImmutableList.of(
VARCHAR,
RowType.from(ImmutableList.of(RowType.field("dollar_sign_date", TIMESTAMP))),
BIGINT,
RowType.from(ImmutableList.of(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we directly use assertQuery and add more test cases?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertQuery is not usable because it needs we provide an "expected" query runner, which we must implment Java worker implementation for CLP connector. This is why I wrote assertClpQuery here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mechnism of assertQuery it it runs the same query in the Pretissimo query runner (Java coordinator + Pretissimo worker) and a Presto query runner (Java coordiantor + Java worker), then compare if the result are the same. It seems they assume the Presto query runner is correct.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More test cases will be in coming PR due to too many lines of this PR.

Copy link

@wraymo wraymo Aug 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so? The expected one is actually an in-memory h2 database, not the java worker

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not using H2 by looking this function and the IcebergQueryRunner. The database it uses seems from Hive metastore.

    public static QueryRunner createJavaIcebergQueryRunner(Optional<Path> baseDataDirectory, String storageFormat, boolean addStorageFormatToPath)
            throws Exception
    {
        ImmutableMap.Builder<String, String> icebergPropertiesBuilder = new ImmutableMap.Builder<>();
        icebergPropertiesBuilder.put("hive.parquet.writer.version", "PARQUET_1_0");

        return IcebergQueryRunner.builder()
                .setExtraProperties(ImmutableMap.of(
                        "regex-library", "RE2J",
                        "offset-clause-enabled", "true",
                        "query.max-stage-count", "110"))
                .setExtraConnectorProperties(icebergPropertiesBuilder.build())
                .setAddJmxPlugin(false)
                .setCreateTpchTables(false)
                .setDataDirectory(baseDataDirectory)
                .setAddStorageFormatToPath(addStorageFormatToPath)
                .setFormat(FileFormat.valueOf(storageFormat))
                .setTpcdsProperties(getNativeWorkerTpcdsProperties())
                .build().getQueryRunner();
    }

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's because it overrides the default expectedQueryRunner by providing an iceberg one. We can use either query runner to do the result check.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the assertQuery I don't find anyways to by pass this:

        long expectedStart = System.nanoTime();
        MaterializedResult expectedResults = null;
        try {
            expectedResults = expectedQueryRunner.execute(expectedSession, expected, actualResults.getTypes());
        }
        catch (RuntimeException ex) {
            fail("Execution of 'expected' query failed: " + expected, ex);
        }
        Duration totalTime = nanosSince(start);
        if (totalTime.compareTo(Duration.succinctDuration(1, SECONDS)) > 0) {
            log.info("FINISHED in presto: %s, expected: %s, total: %s", actualTime, nanosSince(expectedStart), totalTime);
        }

What I can come up with is that we implement a fake Java query runner that and manually construct the MaterializedResult according to different query that we test. But I think this is essentially the same as we implement a assertClpQuery because anyways we are hardcoding the expected results rather than using a real Java worker to execute the same query.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try using the default java one by passing VALUES... to the expected sql string

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createJavaQueryRunner for expected one and use assertQuery("SELECT * FROM %s LIMIT 1", "VALUES (...)");

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

♻️ Duplicate comments (9)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)

541-583: Reduce drift vs. generic launcher; clean up temp dirs; minor robustness

This CLP-specific launcher largely duplicates getExternalWorkerLauncher(...) and misses a couple of hygiene items already discussed.

  • Temp dirs: mark for deletion or add a shutdown hook so they don’t accumulate under /tmp.
  • Robustness: create catalog dir with createDirectories in case of re-entry; keep parity with generic launcher toggles when needed (e.g., runtime-metrics-collection).
  • Duplication: consider extracting shared config/node/catalog file creation into a private helper used by both launchers to avoid divergence.

Suggested minimal improvements without changing call sites:

@@
-                    Path tempDirectoryPath = Files.createTempDirectory(dir, "worker");
+                    Path tempDirectoryPath = Files.createTempDirectory(dir, "worker");
+                    // Best-effort cleanup on JVM exit (keeps logs during run)
+                    tempDirectoryPath.toFile().deleteOnExit();
@@
-                        Path catalogDirectoryPath = tempDirectoryPath.resolve("catalog");
-                        Files.createDirectory(catalogDirectoryPath);
+                        Path catalogDirectoryPath = tempDirectoryPath.resolve("catalog");
+                        Files.createDirectories(catalogDirectoryPath);

If you want parity with the generic launcher, add an overload that accepts enableRuntimeMetricsCollection and conditionally appends runtime-metrics-collection-enabled=true to config.properties. I can draft the helper extraction if you’d like.

presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (2)

38-58: Null-safe JDBC bindings and richer diagnostics

  • setLong on List will NPE if any timestamp is null. Use setObject with Types.BIGINT to allow nulls in test data.
  • Improve failure messages with SQLState/ErrorCode for CI triage.
@@
-                pstmt.setLong(2, beginTimestamps.get(i));
-                pstmt.setLong(3, endTimestamps.get(i));
+                pstmt.setObject(2, beginTimestamps.get(i), java.sql.Types.BIGINT);
+                pstmt.setObject(3, endTimestamps.get(i), java.sql.Types.BIGINT);
@@
-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to insert rows into `%s%s%s`: %s [SQLState=%s, ErrorCode=%s]",
+                    tablePrefix, tableName, ARCHIVES_TABLE_SUFFIX, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

60-71: Constructor should validate inputs before size checks

Add requireNonNull guards and helpful assertion messages.

@@
     public ArchivesTableRows(
             List<String> ids,
             List<Long> beginTimestamps,
             List<Long> endTimestamps)
     {
-        assertEquals(ids.size(), beginTimestamps.size());
-        assertEquals(beginTimestamps.size(), endTimestamps.size());
+        requireNonNull(ids, "ids is null");
+        requireNonNull(beginTimestamps, "beginTimestamps is null");
+        requireNonNull(endTimestamps, "endTimestamps is null");
+        assertEquals(ids.size(), beginTimestamps.size(), "ids and beginTimestamps size mismatch");
+        assertEquals(beginTimestamps.size(), endTimestamps.size(), "beginTimestamps and endTimestamps size mismatch");
         this.ids = ids;
         this.beginTimestamps = beginTimestamps;
         this.endTimestamps = endTimestamps;
         this.numberOfRows = ids.size();
     }

Add import at top of file:

import static java.util.Objects.requireNonNull;
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1)

27-28: Prefer @afterclass(alwaysRun = true) and null-safe teardown

Guard against clpQueryRunner being null if setup fails, and make teardown run once per class.

-import org.testng.annotations.AfterTest;
+import org.testng.annotations.AfterClass;
@@
-    @AfterTest
+    @AfterClass(alwaysRun = true)
     public void teardown()
     {
-        if (null != clpQueryRunner.getMockMetadataDatabase()) {
-            clpQueryRunner.getMockMetadataDatabase().teardown();
-        }
+        if (clpQueryRunner != null && clpQueryRunner.getMockMetadataDatabase() != null) {
+            clpQueryRunner.getMockMetadataDatabase().teardown();
+        }
     }

Also applies to: 107-113

presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (2)

51-53: Strengthen diagnostics on SQL failure.

Plain fail(e.getMessage()) loses key triage data in CI.

Apply:

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to insert column metadata into `%s%s%s`: %s [SQLState=%s, ErrorCode=%s]",
+                    tablePrefix, tableName, COLUMN_METADATA_TABLE_SUFFIX, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

18-26: Defensive null checks and immutable copies for value-object semantics.

Make constructor inputs non-null and copy to immutable lists to prevent outside mutation after construction.

Apply:

@@
 import java.sql.SQLException;
 import java.util.List;
+import com.google.common.collect.ImmutableList;
@@
 import static com.facebook.presto.plugin.clp.metadata.ClpMySqlMetadataProvider.COLUMN_METADATA_TABLE_SUFFIX;
 import static java.lang.String.format;
+import static java.util.Objects.requireNonNull;
 import static org.testng.Assert.assertEquals;
 import static org.testng.Assert.fail;
@@
-    public ColumnMetadataTableRows(List<String> names, List<ClpSchemaTreeNodeType> types)
-    {
-        assertEquals(names.size(), types.size());
-        this.names = names;
-        this.types = types;
-        numOfRows = names.size();
-    }
+    public ColumnMetadataTableRows(List<String> names, List<ClpSchemaTreeNodeType> types)
+    {
+        requireNonNull(names, "names is null");
+        requireNonNull(types, "types is null");
+        assertEquals(names.size(), types.size(), "names and types size mismatch");
+        this.names = ImmutableList.copyOf(names);
+        this.types = ImmutableList.copyOf(types);
+        this.numOfRows = this.names.size();
+    }

Also applies to: 56-62

presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (3)

82-86: Include DB context and SQLState on teardown errors.

Surface URL and JDBC diagnostics to speed up CI triage.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to teardown mock metadata DB at %s: %s [SQLState=%s, ErrorCode=%s]",
+                    url, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

231-235: Ensure parent directory exists for file-backed H2 URL.

Avoid CI flakes when a custom path is provided.

         public Builder setDatabaseUrl(String databaseFilePath)
         {
-            this.url = format(MOCK_METADATA_DB_URL_TEMPLATE, databaseFilePath);
+            try {
+                java.nio.file.Path dbPath = java.nio.file.Paths.get(databaseFilePath).toAbsolutePath();
+                java.nio.file.Path parent = dbPath.getParent();
+                if (parent != null) {
+                    java.nio.file.Files.createDirectories(parent);
+                }
+                this.url = format(MOCK_METADATA_DB_URL_TEMPLATE, dbPath.toString());
+            }
+            catch (java.io.IOException e) {
+                fail(format("Failed to create directory for mock metadata DB at %s: %s", databaseFilePath, e.getMessage()));
+            }
             return this;
         }

268-272: Null-check inputs to builder mutators.

Fail fast with actionable messages.

         public Builder addTables(List<String> tableNames)
         {
-            tableNamesBuilder.addAll(tableNames);
+            requireNonNull(tableNames, "tableNames is null");
+            tableNamesBuilder.addAll(tableNames);
             return this;
         }
@@
         public Builder addColumnMetadata(Map<String, ColumnMetadataTableRows> clpFields)
         {
-            clpFieldsBuilder.putAll(clpFields);
+            requireNonNull(clpFields, "clpFields is null");
+            clpFieldsBuilder.putAll(clpFields);
             return this;
         }
@@
         public Builder addSplits(Map<String, ArchivesTableRows> splits)
         {
-            splitsBuilder.putAll(splits);
+            requireNonNull(splits, "splits is null");
+            splitsBuilder.putAll(splits);
             return this;
         }

Also applies to: 281-285, 293-297

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ec7ded5 and 7b8a9cd.

📒 Files selected for processing (8)
  • presto-clp/pom.xml (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1 hunks)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1 hunks)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)
presto-native-execution/presto_cpp/main/common/Configs.cpp (2)
  • discoveryUri (349-351)
  • discoveryUri (349-349)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/split/ClpMySqlSplitProvider.java (1)
  • ClpMySqlSplitProvider (35-108)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMySqlMetadataProvider.java (1)
  • ClpMySqlMetadataProvider (35-133)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (4)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
  • ClpMockMetadataDatabase (50-334)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (26-72)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (30-63)
presto-tests/src/main/java/com/facebook/presto/tests/DistributedQueryRunner.java (1)
  • DistributedQueryRunner (108-1224)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMySqlMetadataProvider.java (1)
  • ClpMySqlMetadataProvider (35-133)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (5)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (26-72)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (30-63)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1)
  • DatasetsTableRows (28-66)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMySqlMetadataProvider.java (1)
  • ClpMySqlMetadataProvider (35-133)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/split/ClpMySqlSplitProvider.java (1)
  • ClpMySqlSplitProvider (35-108)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (3)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
  • ClpQueryRunner (43-170)
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueryFramework.java (1)
  • AbstractTestQueryFramework (80-647)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (2)
  • PrestoNativeQueryRunnerUtils (72-731)
  • NativeQueryRunnerParameters (684-696)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-resource-manager)
  • GitHub Check: test (8.0.442, :presto-main-base)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: prestocpp-linux-build-for-test
  • GitHub Check: maven-checks (17.0.13)
  • GitHub Check: maven-checks (8.0.442)
  • GitHub Check: prestissimo-worker-images-build
🔇 Additional comments (4)
presto-clp/pom.xml (1)

132-136: Add test-scoped presto-tests dependency — OK

This aligns the module with Presto’s test utilities and keeps it test-only. No concerns from me.

presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (2)

61-73: Resource resolution and runner wiring — looks solid

Using requireNonNull + URL.toURI() prevents NPEs and encoding mishaps; passing the CLP worker launcher from PrestoNativeQueryRunnerUtils keeps the test fully E2E.


75-105: Node-count assertion and sanity query — good coverage for a first pass

Deriving expected nodes from getNativeQueryRunnerParameters() avoids flakes when WORKER_COUNT changes. The SELECT * shape/type assertions validate decoding across nested structures.

presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)

67-87: Duplicate “msg” column in default metadata: intentional?

There are two entries named “msg” with different types (ClpString, VarString). If this is intended to exercise polymorphic schema, all good. If not, it will create two rows in the column-metadata table and may impact handle collection.

Would you like me to adjust the sample defaults or add a comment clarifying intent?

@anlowee anlowee requested a review from wraymo August 27, 2025 00:41
public static final String DEFAULT_SCHEMA = "default";
public static final String DEFAULT_TABLE_NAME = "test_e2e";

private DistributedQueryRunner actualQueryRunner;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private DistributedQueryRunner actualQueryRunner;
private DistributedQueryRunner queryRunner;

import static com.facebook.presto.spi.security.SelectedRole.Type.ROLE;
import static com.facebook.presto.testing.TestingSession.testSessionBuilder;

public class ClpQueryRunner
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably make it more general since there's a chance the udf test will use this QueryRunner

Comment on lines 141 to 146
Optional<SelectedRole> role = Optional.of(new SelectedRole(ROLE, Optional.of("admin")));
return testSessionBuilder()
.setIdentity(new Identity(
"clp",
Optional.empty(),
role.map(selectedRole -> ImmutableMap.of(CLP_CATALOG, selectedRole))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed


public class TestPrestoNativeClpGeneralQueries
extends AbstractTestQueryFramework
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not hard code a lot of stuff in ClpQueryRunner but put it here (createTable)?

private ClpQueryRunner clpQueryRunner;

@Override
protected QueryRunner createQueryRunner()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And is there a chance we put the impl in PrestoNativeQueryRunnerUtils like other connectors

@anlowee anlowee requested a review from wraymo August 27, 2025 15:53
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (10)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1)

142-147: Use @afterclass(alwaysRun = true) for one-time teardown.

Teardown is per-class, not per-test; switch to @afterclass to avoid duplicate invocations across groups/suites.

-import org.testng.annotations.AfterTest;
+import org.testng.annotations.AfterClass;
@@
-    @AfterTest
+    @AfterClass(alwaysRun = true)
     public void teardown()
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)

562-604: Tidy temp dirs and align behaviour with generic launcher.

  • Mark temp workspace for deletion on JVM exit.
  • Enable runtime metrics collection (if desired) like the generic launcher.
  • Consider extracting shared file-creation to avoid drift with getExternalWorkerLauncher.
                     Path dir = Paths.get("/tmp", PrestoNativeQueryRunnerUtils.class.getSimpleName());
                     Files.createDirectories(dir);
-                    Path tempDirectoryPath = Files.createTempDirectory(dir, "worker");
+                    Path tempDirectoryPath = Files.createTempDirectory(dir, "worker");
+                    tempDirectoryPath.toFile().deleteOnExit();
@@
-                    String configProperties = format("discovery.uri=%s%n" +
+                    String configProperties = format("discovery.uri=%s%n" +
                             "presto.version=testversion%n" +
                             "system-memory-gb=4%n" +
-                            "http-server.http.port=0%n", discoveryUri);
+                            "http-server.http.port=0%n" +
+                            "runtime-metrics-collection-enabled=true%n", discoveryUri);
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)

67-70: Defend against invalid workerCount (clamp to ≥1).

Avoid accidental zero/negative node counts.

-        DistributedQueryRunner clpQueryRunner = DistributedQueryRunner.builder(session)
-                        .setNodeCount(workerCount.orElse(DEFAULT_NUM_OF_WORKERS))
+        DistributedQueryRunner clpQueryRunner = DistributedQueryRunner.builder(session)
+                        .setNodeCount(Math.max(1, workerCount.orElse(DEFAULT_NUM_OF_WORKERS)))
                         .setExternalWorkerLauncher(externalWorkerLauncher)
                         .build();
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (7)

155-171: Improve diagnostics on archives table creation.

Quote identifiers already; just enrich failure context.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to create archives table `%s%s%s`: %s [SQLState=%s, ErrorCode=%s]",
+                    tablePrefix, tableName, ARCHIVES_TABLE_SUFFIX, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

137-147: Add context to failures in addSplits.

Same rationale as above.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to insert archive splits for tables %s: %s [SQLState=%s, ErrorCode=%s]",
+                    splits.keySet(), e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

80-86: Include richer context on teardown failure.

Makes CI triage faster.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to teardown mock metadata DB at %s: %s [SQLState=%s, ErrorCode=%s]",
+                    url, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

225-229: Ensure parent directory exists for custom DB paths.

Prevents H2 from failing to create the file.

-        public Builder setDatabaseUrl(String databaseFilePath)
-        {
-            this.url = format(MOCK_METADATA_DB_URL_TEMPLATE, databaseFilePath);
-            return this;
-        }
+        public Builder setDatabaseUrl(String databaseFilePath)
+        {
+            try {
+                java.nio.file.Path dbPath = java.nio.file.Paths.get(databaseFilePath).toAbsolutePath();
+                java.nio.file.Path parent = dbPath.getParent();
+                if (parent != null) {
+                    java.nio.file.Files.createDirectories(parent);
+                }
+                this.url = format(MOCK_METADATA_DB_URL_TEMPLATE, dbPath.toString());
+            }
+            catch (java.io.IOException e) {
+                fail(format("Failed to create directory for mock metadata DB at %s: %s", databaseFilePath, e.getMessage()));
+            }
+            return this;
+        }

125-135: Add context to failures in addColumnMetadata.

Helpful when a single table causes issues.

-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to insert column metadata for tables %s: %s [SQLState=%s, ErrorCode=%s]",
+                    clpFields.keySet(), e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

196-207: Quote datasets table/columns and improve failure context.

Keeps quoting consistent with reader side.

-        final String createTableSql = format(
-                "CREATE TABLE IF NOT EXISTS %s (%s VARCHAR(255) PRIMARY KEY, %s VARCHAR(4096) NOT NULL)",
-                format("%s%s", tablePrefix, DATASETS_TABLE_SUFFIX),
-                DATASETS_TABLE_COLUMN_NAME,
-                DATASETS_TABLE_COLUMN_ARCHIVE_STORAGE_DIRECTORY);
+        final String createTableSql = format(
+                "CREATE TABLE IF NOT EXISTS `%s` (`%s` VARCHAR(255) PRIMARY KEY, `%s` VARCHAR(4096) NOT NULL)",
+                format("%s%s", tablePrefix, DATASETS_TABLE_SUFFIX),
+                DATASETS_TABLE_COLUMN_NAME,
+                DATASETS_TABLE_COLUMN_ARCHIVE_STORAGE_DIRECTORY);
@@
-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to create datasets table `%s%s`: %s [SQLState=%s, ErrorCode=%s]",
+                    tablePrefix, DATASETS_TABLE_SUFFIX, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }

176-191: Quote column-metadata table identifier and improve failure context.

Aligns with other statements and avoids edge cases.

-        String createTableSql = format(
-                "CREATE TABLE IF NOT EXISTS %s (" +
+        String createTableSql = format(
+                "CREATE TABLE IF NOT EXISTS `%s` (" +
                         "`%s` VARCHAR(512) NOT NULL, " +
                         "`%s` TINYINT NOT NULL, " +
                         "PRIMARY KEY (`%s`, `%s`))",
                 format("%s%s%s", tablePrefix, tableName, COLUMN_METADATA_TABLE_SUFFIX),
                 COLUMN_METADATA_TABLE_COLUMN_NAME,
                 COLUMN_METADATA_TABLE_COLUMN_TYPE,
                 COLUMN_METADATA_TABLE_COLUMN_NAME,
                 COLUMN_METADATA_TABLE_COLUMN_TYPE);
@@
-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to create column metadata table `%s%s%s`: %s [SQLState=%s, ErrorCode=%s]",
+                    tablePrefix, tableName, COLUMN_METADATA_TABLE_SUFFIX, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7b8a9cd and 96e7bd7.

📒 Files selected for processing (4)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1 hunks)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (4 hunks)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (2)
presto-main-base/src/main/java/com/facebook/presto/Session.java (1)
  • Session (76-950)
presto-tests/src/main/java/com/facebook/presto/tests/DistributedQueryRunner.java (1)
  • DistributedQueryRunner (108-1224)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (6)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
  • ClpMockMetadataDatabase (49-287)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (26-72)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (30-63)
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueryFramework.java (1)
  • AbstractTestQueryFramework (80-647)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)
  • PrestoNativeQueryRunnerUtils (74-752)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
  • ClpQueryRunner (28-101)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
  • ClpQueryRunner (28-101)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (5)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (26-72)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (30-63)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/DatasetsTableRows.java (1)
  • DatasetsTableRows (28-66)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMySqlMetadataProvider.java (1)
  • ClpMySqlMetadataProvider (35-133)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/split/ClpMySqlSplitProvider.java (1)
  • ClpMySqlSplitProvider (35-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (17.0.13, :presto-main-base)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (17.0.13, :presto-tests -P presto-tests-general)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P presto-tests-execution-memory)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-main-base)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: prestocpp-linux-build-for-test
  • GitHub Check: maven-checks (8.0.442)
  • GitHub Check: maven-checks (17.0.13)
  • GitHub Check: prestissimo-worker-images-build
🔇 Additional comments (4)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (2)

67-71: Good fix: robust resource resolution.

Null-safe getResource + URI-to-Path conversion avoids encoding pitfalls.


113-116: Node-count assertion derives from config.

Asserting workers + coordinator via getNativeQueryRunnerParameters() removes flakes with WORKER_COUNT overrides.

presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)

222-239: CLP native runner wiring looks correct.

Delegates workerCount and launcher into ClpQueryRunner; matches existing patterns.

presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)

71-83: LGTM: catalog properties and plugin install.

Properties align with ClpMySqlMetadataProvider expectations; using H2 in MySQL mode via URL is fine.

Comment on lines +108 to +123
public void addTableToDatasetsTableIfNotExist(List<String> tableNames)
{
try (Connection connection = DriverManager.getConnection(url, username, password)) {
ImmutableList.Builder<String> repeatedArchiveStorageDirectory = ImmutableList.builder();
for (String tableName : tableNames) {
createArchivesTableIfNotExist(connection, tableName);
createColumnMetadataTableIfNotExist(connection, tableName);
repeatedArchiveStorageDirectory.add(archiveStorageDirectory);
}
DatasetsTableRows datasetsTableRows = new DatasetsTableRows(tableNames, repeatedArchiveStorageDirectory.build());
datasetsTableRows.insertToTable(connection, tablePrefix);
}
catch (SQLException e) {
fail(e.getMessage());
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Normalise archiveStorageDirectory and validate input; improve diagnostics.

Missing trailing “/” yields bad tablePath; add null-check and normalise once.

-    public void addTableToDatasetsTableIfNotExist(List<String> tableNames)
+    public void addTableToDatasetsTableIfNotExist(List<String> tableNames)
     {
         try (Connection connection = DriverManager.getConnection(url, username, password)) {
             ImmutableList.Builder<String> repeatedArchiveStorageDirectory = ImmutableList.builder();
-            for (String tableName : tableNames) {
+            requireNonNull(tableNames, "tableNames is null");
+            final String normalizedArchiveDir = archiveStorageDirectory.endsWith("/") ? archiveStorageDirectory : archiveStorageDirectory + "/";
+            for (String tableName : tableNames) {
                 createArchivesTableIfNotExist(connection, tableName);
                 createColumnMetadataTableIfNotExist(connection, tableName);
-                repeatedArchiveStorageDirectory.add(archiveStorageDirectory);
+                repeatedArchiveStorageDirectory.add(normalizedArchiveDir);
             }
             DatasetsTableRows datasetsTableRows = new DatasetsTableRows(tableNames, repeatedArchiveStorageDirectory.build());
             datasetsTableRows.insertToTable(connection, tablePrefix);
         }
-        catch (SQLException e) {
-            fail(e.getMessage());
-        }
+        catch (SQLException e) {
+            fail(format("Failed to upsert datasets or create per-table structures for %s: %s [SQLState=%s, ErrorCode=%s]",
+                    tableNames, e.getMessage(), e.getSQLState(), e.getErrorCode()));
+        }
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public void addTableToDatasetsTableIfNotExist(List<String> tableNames)
{
try (Connection connection = DriverManager.getConnection(url, username, password)) {
ImmutableList.Builder<String> repeatedArchiveStorageDirectory = ImmutableList.builder();
for (String tableName : tableNames) {
createArchivesTableIfNotExist(connection, tableName);
createColumnMetadataTableIfNotExist(connection, tableName);
repeatedArchiveStorageDirectory.add(archiveStorageDirectory);
}
DatasetsTableRows datasetsTableRows = new DatasetsTableRows(tableNames, repeatedArchiveStorageDirectory.build());
datasetsTableRows.insertToTable(connection, tablePrefix);
}
catch (SQLException e) {
fail(e.getMessage());
}
}
public void addTableToDatasetsTableIfNotExist(List<String> tableNames)
{
try (Connection connection = DriverManager.getConnection(url, username, password)) {
ImmutableList.Builder<String> repeatedArchiveStorageDirectory = ImmutableList.builder();
requireNonNull(tableNames, "tableNames is null");
final String normalizedArchiveDir = archiveStorageDirectory.endsWith("/")
? archiveStorageDirectory
: archiveStorageDirectory + "/";
for (String tableName : tableNames) {
createArchivesTableIfNotExist(connection, tableName);
createColumnMetadataTableIfNotExist(connection, tableName);
repeatedArchiveStorageDirectory.add(normalizedArchiveDir);
}
DatasetsTableRows datasetsTableRows = new DatasetsTableRows(
tableNames,
repeatedArchiveStorageDirectory.build());
datasetsTableRows.insertToTable(connection, tablePrefix);
}
catch (SQLException e) {
fail(format(
"Failed to upsert datasets or create per-table structures for %s: %s [SQLState=%s, ErrorCode=%s]",
tableNames,
e.getMessage(),
e.getSQLState(),
e.getErrorCode()));
}
}
🤖 Prompt for AI Agents
In
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java
around lines 108 to 123, normalize and validate archiveStorageDirectory before
use: add a null-check (throw or fail early with a diagnostic including the
provided value), trim it once and ensure it ends with a single "/" (append if
missing) prior to the loop so tablePath is built correctly, and improve error
diagnostics by including the archiveStorageDirectory value and full exception
details in the catch/fail call.

Comment on lines 160 to 183
private static void assertClpQuery(
QueryRunner queryRunner,
Session session,
String query,
List<Type> expectedTypes,
int expectedNumOfRows)
{
try {
long start = System.nanoTime();
QueryRunner.MaterializedResultWithPlan resultWithPlan = queryRunner.executeWithPlan(session, query, WarningCollector.NOOP);
MaterializedResult actualResults = resultWithPlan.getMaterializedResult().toTestTypes();
assertEquals(actualResults.getRowCount(), expectedNumOfRows);
List<Type> actualTypes = actualResults.getTypes();
assertEquals(actualTypes.size(), expectedTypes.size());
for (int i = 0; i < actualTypes.size(); ++i) {
assertEquals(actualTypes.get(i), expectedTypes.get(i));
}
long end = System.nanoTime();
log.info("Query [%s] end-to-end latency: %s s", query, (end - start) / 1e9);
}
catch (RuntimeException ex) {
fail("Execution of 'actual' query failed: ", ex);
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Optional: assert column names too (not only types).

If stability matters, also validate column names from MaterializedResult.

-            List<Type> actualTypes = actualResults.getTypes();
+            List<Type> actualTypes = actualResults.getTypes();
             assertEquals(actualTypes.size(), expectedTypes.size());
             for (int i = 0; i < actualTypes.size(); ++i) {
                 assertEquals(actualTypes.get(i), expectedTypes.get(i));
             }
+            // Optionally:
+            // assertEquals(actualResults.getMaterializedRows().get(0).getFieldCount(), expectedTypes.size());

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java
around lines 160-183, add optional validation of column names: introduce an
overloaded assertClpQuery that accepts a List<String> expectedColumnNames (or
add a new parameter defaulted via an overload to preserve existing calls), and
after obtaining MaterializedResult actualResults, fetch actual column names via
actualResults.getColumnNames() and assert equality with expectedColumnNames when
it is non-null (check size and content order). Update any test call sites that
need name checks to call the new overload with the expected names.

Comment on lines +219 to +223
setDatabaseUrl(format("/tmp/%s", UUID.randomUUID()));
setUsername(MOCK_METADATA_DB_DEFAULT_USERNAME);
setPassword(MOCK_METADATA_DB_DEFAULT_PASSWORD);
setTablePrefix(MOCK_METADATA_DB_DEFAULT_TABLE_PREFIX);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can have use default values for this variables?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are already the default;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean in the class variable declaration part.

assertTrue(queryRunner.tableExists(getSession(), DEFAULT_TABLE_NAME));

// Test select star
assertClpQuery(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We didn't compare the results?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (2)

101-103: LGTM: node-count assertion now follows WORKER_COUNT.

Deriving the expected total from getNativeQueryRunnerParameters() removes flakiness. ✅


24-24: Prefer one-time, always-run teardown.

Use @afterclass(alwaysRun = true) for class-scoped resources; it avoids multiple invocations and ensures cleanup even when groups/suites shift.

-import org.testng.annotations.AfterTest;
+import org.testng.annotations.AfterClass;
@@
-    @AfterTest
+    @AfterClass(alwaysRun = true)
     public void teardown()

Also applies to: 122-122

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 96e7bd7 and 7c60cee.

📒 Files selected for processing (1)
  • presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (5)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
  • ClpMockMetadataDatabase (49-287)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ArchivesTableRows.java (1)
  • ArchivesTableRows (26-72)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (30-63)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java (1)
  • PrestoNativeQueryRunnerUtils (74-752)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
  • ClpQueryRunner (28-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (17.0.13, :presto-main-base)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-resource-manager)
  • GitHub Check: test (17.0.13, :presto-tests -P presto-tests-general)
  • GitHub Check: test (8.0.442, :presto-main-base)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-resource-manager)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
  • GitHub Check: maven-checks (8.0.442)
  • GitHub Check: prestocpp-linux-build-for-test
  • GitHub Check: maven-checks (17.0.13)
  • GitHub Check: prestissimo-worker-images-build
🔇 Additional comments (2)
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java (2)

71-91: Duplicate column names and NullValue: confirm intended schema mapping.

You register "msg" twice with different types (ClpString, VarString) and a NullValue-only column ("attr.build_u_u_i_d"). Confirm the connector’s schema builder merges/handles these as expected and doesn’t emit duplicate column names or unexpected null-only fields.


125-127: Null-safe teardown looks good.

Guard prevents NPEs if startup fails. ✅

Comment on lines +55 to +59
URL resource = requireNonNull(
getClass().getClassLoader().getResource("clp-archives"),
"Test resource 'clp-archives' not found on classpath");
String archiveStorageDirectory = format("%s/", Paths.get(resource.toURI()));
mockMetadataDatabase = ClpMockMetadataDatabase.builder().setArchiveStorageDirectory(archiveStorageDirectory).build();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Good fix on resource handling; consider OS‑portable trailing separator.

Nice use of requireNonNull and URL→URI→Path. Minor: avoid mixing separators by using File.separator instead of "/".

-        String archiveStorageDirectory = format("%s/", Paths.get(resource.toURI()));
+        String archiveStorageDirectory = Paths.get(resource.toURI()).toString() + File.separator;

Add the missing import:

+import java.io.File;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
URL resource = requireNonNull(
getClass().getClassLoader().getResource("clp-archives"),
"Test resource 'clp-archives' not found on classpath");
String archiveStorageDirectory = format("%s/", Paths.get(resource.toURI()));
mockMetadataDatabase = ClpMockMetadataDatabase.builder().setArchiveStorageDirectory(archiveStorageDirectory).build();
// Add at top of file
import java.io.File;
URL resource = requireNonNull(
getClass().getClassLoader().getResource("clp-archives"),
"Test resource 'clp-archives' not found on classpath");
String archiveStorageDirectory = Paths.get(resource.toURI()).toString() + File.separator;
mockMetadataDatabase = ClpMockMetadataDatabase.builder().setArchiveStorageDirectory(archiveStorageDirectory).build();
🤖 Prompt for AI Agents
In
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java
around lines 55 to 59, avoid hardcoding the POSIX "/" trailing separator when
building archiveStorageDirectory; change the format call to use File.separator
(e.g. format("%s%s", Paths.get(resource.toURI()), File.separator)) or append
File.separator to the path string, and add the missing import for java.io.File
at the top of the file.

Comment on lines +98 to +100
@Test
public void test()
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Name the test descriptively.

A clearer method name helps future triage.

-    public void test()
+    public void testSelect_basicFields_formatsDate_andMatchesExpectedRow()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@Test
public void test()
{
@Test
public void testSelect_basicFields_formatsDate_andMatchesExpectedRow()
{
🤖 Prompt for AI Agents
In
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java
around lines 98 to 100, the test method is named generically as "test" which is
not descriptive; rename the method to a clear, descriptive name that reflects
the scenario it verifies (e.g., testBasicSelectQueries or
testGeneralQueryExecution), update any references or annotations accordingly,
and run/adjust test runners/imports if necessary to ensure the renamed test
compiles and executes.

Comment on lines 108 to 119
assertQuery(format("SELECT msg, format_datetime(t.dollar_sign_date, 'yyyy-MM-dd HH:mm:ss.SSS'), id, attr, tags FROM %s ORDER BY t.dollar_sign_date LIMIT 1", DEFAULT_TABLE_NAME),
"SELECT\n" +
" 'Initialized wire specification',\n" +
" TIMESTAMP '2023-03-22 12:34:54.576',\n" +
" 4915701,\n" +
" ARRAY[\n" +
" NULL,\n" +
" ARRAY[ARRAY[ARRAY[ARRAY[ARRAY[NULL]]]]],\n" +
" NULL,\n" +
" ARRAY[ARRAY[NULL]]\n" +
" ],\n" +
" NULL");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Stabilize type coercion for the formatted timestamp.

Since the actual column is VARCHAR (format_datetime), make the expected literal a string to avoid runner-dependent TIMESTAMP→VARCHAR formatting.

-                        "  TIMESTAMP '2023-03-22 12:34:54.576',\n" +
+                        "  '2023-03-22 12:34:54.576',\n" +
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assertQuery(format("SELECT msg, format_datetime(t.dollar_sign_date, 'yyyy-MM-dd HH:mm:ss.SSS'), id, attr, tags FROM %s ORDER BY t.dollar_sign_date LIMIT 1", DEFAULT_TABLE_NAME),
"SELECT\n" +
" 'Initialized wire specification',\n" +
" TIMESTAMP '2023-03-22 12:34:54.576',\n" +
" 4915701,\n" +
" ARRAY[\n" +
" NULL,\n" +
" ARRAY[ARRAY[ARRAY[ARRAY[ARRAY[NULL]]]]],\n" +
" NULL,\n" +
" ARRAY[ARRAY[NULL]]\n" +
" ],\n" +
" NULL");
assertQuery(
format(
"SELECT msg, format_datetime(t.dollar_sign_date, 'yyyy-MM-dd HH:mm:ss.SSS'), id, attr, tags FROM %s ORDER BY t.dollar_sign_date LIMIT 1",
DEFAULT_TABLE_NAME),
"SELECT\n" +
" 'Initialized wire specification',\n" +
" '2023-03-22 12:34:54.576',\n" +
" 4915701,\n" +
" ARRAY[\n" +
" NULL,\n" +
" ARRAY[ARRAY[ARRAY[ARRAY[ARRAY[NULL]]]]],\n" +
" NULL,\n" +
" ARRAY[ARRAY[NULL]]\n" +
" ],\n" +
" NULL");
🤖 Prompt for AI Agents
In
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java
around lines 108 to 119, the test expects a TIMESTAMP literal for a column
produced by format_datetime (which is VARCHAR); change the expected value to a
string literal (or explicitly CAST the timestamp to VARCHAR) so the assertion
compares VARCHAR to VARCHAR consistently across runners; update the expected
query result to use '2023-03-22 12:34:54.576' (or equivalent varchar cast)
instead of TIMESTAMP '2023-03-22 12:34:54.576'.

assertEquals(queryRunner.getNodeCount(), getNativeQueryRunnerParameters().workerCount.orElse(DEFAULT_NUM_OF_WORKERS) + 1);
assertTrue(queryRunner.tableExists(getSession(), DEFAULT_TABLE_NAME));

// H2QueryRunner currently can't change the timestamp format, and the default timestamp
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we should use H2QueryRunner or create our own dummy query runner because a lot of tables are created and a lot of rows are inserted for H2QueryRunner, which adds unnecessary overehead

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, but the overhead won't affect our query because it's just for returning the expected result. I thought of writing our customized H2QueryRunner before, but we had to copy a lot of code in the H2QueryRunner because the helper functions that used by H2QueryRunner to implement the ExpectedQueryRunner interface are private (and there is a critical member variable hanlde is private final), so it is not very convenient to do so. At lease this will increase this PR's size.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the Java one instead of h2?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and even for h2, why do we need to copy a lot of code? We just need a few lines of code to initialize the db (no need to create tables and insert rows)

Copy link
Author

@anlowee anlowee Sep 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a rowMapper not sure it is necessary. It is used in the overrided execute function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that rowMapper might be necessary because it is responsible for producing the results of type ResultIterable<MaterializedRow>, which is used to check the results (QueryAssertions.java:209-223)

@anlowee anlowee requested a review from wraymo September 1, 2025 16:17
// H2QueryRunner currently can't change the timestamp format, and the default timestamp
// format of Presto is different, so for now we have to manually format the timestamp
// field.
assertQuery(format("SELECT msg, format_datetime(t.dollar_sign_date, 'yyyy-MM-dd HH:mm:ss.SSS'), id, attr, tags FROM %s ORDER BY t.dollar_sign_date LIMIT 1", DEFAULT_TABLE_NAME),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we shorten this line a bit?

@anlowee anlowee requested a review from wraymo September 2, 2025 23:05
@anlowee anlowee merged commit f9653b2 into y-scope:release-0.293-clp-connector Sep 3, 2025
57 of 58 checks passed
@anlowee anlowee deleted the xwei/presto-e2e-unit-test branch September 3, 2025 02:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants