-
Notifications
You must be signed in to change notification settings - Fork 3
feat: Add Prestissimo support for the CLP connector and update to the latest y-scope/velox commit. #34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add Prestissimo support for the CLP connector and update to the latest y-scope/velox commit. #34
Conversation
## Walkthrough
A new "clp" connector is integrated into Presto Native Execution. This includes protocol definitions, C++ classes, JSON (de)serialization, registration logic, and build system updates. The changes introduce new source, header, schema, and template files, update the protocol generation workflow, and ensure the "clp" connector is available in both production and test builds.
## Changes
| File(s) | Change Summary |
|---------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------|
| presto-native-execution/CMakeLists.txt | Added `-Wno-restrict` to disabled compiler warnings with explanatory comment. |
| presto-native-execution/presto_cpp/main/CMakeLists.txt<br>presto_cpp/main/connectors/CMakeLists.txt<br>presto_cpp/main/tests/CMakeLists.txt<br>presto_cpp/main/types/tests/CMakeLists.txt | Linked `velox_clp_connector` library to server, connector, and test targets. |
| presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp<br>presto_cpp/main/connectors/PrestoToVeloxConnector.h | Implemented and declared `ClpPrestoToVeloxConnector`, with methods for split, column, and table handle conversion.|
| presto-native-execution/presto_cpp/main/connectors/Registration.cpp | Registered the "clp" connector and its factory in the connector registry. |
| presto-native-execution/presto_cpp/main/types/tests/PrestoToVeloxConnectorTest.cpp | Added "clp" connector to test registration and verification. |
| presto-native-execution/presto_cpp/presto_protocol/Makefile | Integrated "clp" connector into protocol code and schema generation workflow. |
| presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h | Added type alias for `ClpConnectorProtocol` specialization. |
| presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h<br>presto_protocol_clp.cpp | Defined and implemented CLP connector protocol structs and their JSON (de)serialization. |
| presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.json<br>presto_protocol_clp.yml | Added schema and YAML config for CLP protocol entities and subclass mappings. |
| presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache<br>presto_protocol-json-hpp.mustache | Added Mustache templates for generating JSON (de)serialization for CLP protocol types. |
| presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc<br>special/ClpTransactionHandle.hpp.inc<br>special/ClpTransactionHandle.cpp.inc | Added special-case header and source for CLP column and transaction handle (de)serialization. |
| presto-native-execution/presto_cpp/presto_protocol/java-to-struct-json.py | Updated license header removal script path. |
| presto-native-execution/presto_cpp/presto_protocol/presto_protocol.cpp<br>presto_protocol.h | Included CLP protocol implementation and header. |
| presto-native-execution/velox | Updated subproject commit reference. |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant PrestoServer
participant ConnectorRegistry
participant ClpPrestoToVeloxConnector
participant VeloxCLPConnector
participant Protocol
PrestoServer->>ConnectorRegistry: registerConnectors()
ConnectorRegistry->>ClpPrestoToVeloxConnector: Instantiate
ClpPrestoToVeloxConnector->>VeloxCLPConnector: Register with factory
PrestoServer->>ClpPrestoToVeloxConnector: toVeloxSplit/ColumnHandle/TableHandle
ClpPrestoToVeloxConnector->>Protocol: Convert Presto protocol objects to Velox objects
Protocol-->>ClpPrestoToVeloxConnector: Return Velox connector objects
ClpPrestoToVeloxConnector-->>PrestoServer: Return Velox objectsLearnt from: wraymo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 14
🔭 Outside diff range comments (1)
presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt (1)
20-22: Unconditionally linkingvelox_clp_connectorcan break builds where CLP isn’t compiledUnlike Arrow-Flight, this connector isn’t guarded with a feature flag. If a consumer builds Velox without
velox_clp_connector, the link will fail.Suggested adjustment:
-if(PRESTO_ENABLE_ARROW_FLIGHT_CONNECTOR) +if(PRESTO_ENABLE_CLP_CONNECTOR) + target_link_libraries(presto_connectors velox_clp_connector) +endif() + +if(PRESTO_ENABLE_ARROW_FLIGHT_CONNECTOR) add_subdirectory(arrow_flight) target_link_libraries(presto_connectors presto_flight_connector) endif() - -target_link_libraries(presto_connectors presto_velox_expr_conversion - velox_type_fbhive velox_clp_connector) +target_link_libraries(presto_connectors presto_velox_expr_conversion + velox_type_fbhive)This mirrors the existing Arrow-Flight pattern and keeps optional builds working.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (24)
presto-native-execution/CMakeLists.txt(1 hunks)presto-native-execution/presto_cpp/main/CMakeLists.txt(1 hunks)presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt(1 hunks)presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp(3 hunks)presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.h(2 hunks)presto-native-execution/presto_cpp/main/connectors/Registration.cpp(3 hunks)presto-native-execution/presto_cpp/main/tests/CMakeLists.txt(1 hunks)presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt(3 hunks)presto-native-execution/presto_cpp/main/types/tests/PrestoToVeloxConnectorTest.cpp(1 hunks)presto-native-execution/presto_cpp/presto_protocol/Makefile(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-hpp.mustache(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.json(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.cpp.inc(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.hpp.inc(1 hunks)presto-native-execution/presto_cpp/presto_protocol/java-to-struct-json.py(1 hunks)presto-native-execution/presto_cpp/presto_protocol/presto_protocol.cpp(1 hunks)presto-native-execution/presto_cpp/presto_protocol/presto_protocol.h(1 hunks)presto-native-execution/velox(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h (1)
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml (1)
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
🧬 Code Graph Analysis (2)
presto-native-execution/presto_cpp/main/connectors/Registration.cpp (2)
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.h (1)
registerPrestoToVeloxConnector(30-31)presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp (2)
registerPrestoToVeloxConnector(49-59)registerPrestoToVeloxConnector(49-50)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp (1)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (4)
ClpColumnHandle(43-43)ClpSplit(56-56)ClpTableHandle(66-66)ClpTableLayoutHandle(77-77)
🪛 YAMLlint (1.37.1)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml
[error] 18-18: too many spaces inside braces
(braces)
[error] 18-18: too many spaces after comma
(commas)
[error] 18-18: too many spaces inside braces
(braces)
[error] 23-23: too many spaces inside braces
(braces)
[error] 23-23: too many spaces after comma
(commas)
[error] 23-23: too many spaces inside braces
(braces)
[error] 28-28: too many spaces inside braces
(braces)
[error] 28-28: too many spaces after comma
(commas)
[error] 28-28: too many spaces inside braces
(braces)
[error] 33-33: too many spaces inside braces
(braces)
[error] 33-33: too many spaces after comma
(commas)
[error] 33-33: too many spaces inside braces
(braces)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: test (8.0.442, :presto-main)
- GitHub Check: test (17.0.13, :presto-main-base)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-resource-manager)
- GitHub Check: test (17.0.13, :presto-tests -P presto-tests-execution-memory)
- GitHub Check: test (8.0.442, :presto-main-base)
- GitHub Check: test (17.0.13, :presto-main)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
- GitHub Check: maven-checks (17.0.13)
- GitHub Check: maven-checks (8.0.442)
- GitHub Check: prestocpp-linux-build-for-test
🔇 Additional comments (32)
presto-native-execution/presto_cpp/main/CMakeLists.txt (1)
42-59: Verifyvelox_clp_connectoris a proper library target and its dependencies are wired in
- File:
presto-native-execution/presto_cpp/main/CMakeLists.txt(lines 42–59)- There’s no local
add_library(velox_clp_connector …)in this repo—this target must come from Velox’s exported CMake config.- Confirm in Velox that
velox_clp_connectoris exported as a STATIC or SHARED library (not an OBJECT/INTERFACE target).- Ensure all of its transitive dependencies (e.g.
velox_dwio_*,velox_core) are pulled in.- If it turns out to be an OBJECT or INTERFACE target, either add
(as you do foradd_dependencies(presto_server_lib velox_clp_connector)velox_tpch_connector) or update Velox’s CMake to export it properly.presto-native-execution/velox (1)
1-1: Floating Velox commit not on any tag or branch – please verifyNo tags or remote branches contain commit 749d28944723ad746b60ca7073f42b3df9896ff4. Before merging, confirm that:
- This commit is reachable via an immutable tag or long-lived branch expected by downstream builds.
- CI passes for all existing connectors (not just CLP) against this Velox revision.
- Any new APIs used by the native CLP connector were introduced in this commit or later, so older tags remain compatible.
presto-native-execution/presto_cpp/presto_protocol/presto_protocol.cpp (1)
18-22: Addition is fine, but remember this compilation unit is marked deprecated
presto_protocol.cppis explicitly scheduled for removal. Adding the CLP.cppinclude here keeps the build green today, but the connector will silently vanish when this file is finally dropped—unless the generation template that replaces it also includes CLP.Please double-check:
- The new code-gen path (the
chevrontemplate) contains the same include.- There are no duplicate translation units for
presto_protocol_clp.cpp, which could trigger ODR violations.presto-native-execution/presto_cpp/presto_protocol/presto_protocol.h (1)
19-22: Same caveat as the.cppaggregatorGood to see the header aggregator updated, but make sure the upcoming replacement header (once this deprecated file is removed) also exports
presto_protocol_clp.h. Otherwise downstream code will break.presto-native-execution/presto_cpp/main/types/tests/PrestoToVeloxConnectorTest.cpp (1)
36-37: LGTM! Consistent test coverage for the new CLP connector.The addition follows the established pattern for connector testing and ensures the CLP connector is properly covered by the existing test infrastructure.
presto-native-execution/presto_cpp/presto_protocol/java-to-struct-json.py (1)
173-173: Verified: license-header script path is correctThe updated path (
../../velox/scripts/checks/license-header.py) resolves topresto-native-execution/velox/scripts/checks/license-header.py, which exists in the repo. No further changes are needed.presto-native-execution/presto_cpp/main/connectors/Registration.cpp (3)
22-22: LGTM! Proper header inclusion for CLP connector.The include follows the established pattern for connector headers.
49-53: LGTM! Consistent connector factory registration pattern.The CLP connector factory registration follows the same pattern as other connectors, including the proper duplicate check using
hasConnectorFactory().
84-85: LGTM! Proper PrestoToVeloxConnector registration.The ClpPrestoToVeloxConnector registration follows the established pattern, using the factory's connector name constant for consistency.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.cpp.inc (1)
20-24: LGTM! JSON serialization follows the expected enum pattern.The
to_jsonfunction correctly serializes the ClpTransactionHandle as a JSON array containing the type and instance, which is appropriate for enum-like structures.presto-native-execution/presto_cpp/presto_protocol/Makefile (3)
55-60: LGTM! Consistent build integration for CLP connector.The build rules for CLP connector structs follow the established pattern used by other connectors, ensuring consistency in the build process.
68-68: LGTM! Proper JSON generation integration.The CLP connector is correctly integrated into the JSON generation process, maintaining consistency with other connectors.
76-76: LGTM! Proper protobuf generation integration.The CLP connector is correctly integrated into the protobuf generation process, maintaining consistency with other connectors.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.hpp.inc (3)
15-16: Good documentation of special handling.The comment clearly explains why this handle requires special treatment due to its Java counterpart being an enum.
20-22: LGTM! Proper struct definition.The struct correctly inherits from ConnectorTransactionHandle and has appropriate member initialization.
24-26: LGTM! Proper JSON serialization declarations.The JSON serialization function declarations follow the established pattern used throughout the codebase.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc (3)
19-24: LGTM! Well-structured column handle definition.The struct members are appropriately defined with proper types and default initialization.
31-32: LGTM! Proper JSON serialization declarations.The JSON serialization function declarations follow the established pattern used throughout the codebase.
27-29: No change needed for dynamic_cast in ClpColumnHandle::operator<()All connector implementations (Arrow, Hive, Iceberg, Tpch and Clp) use the same
dynamic_cast-based pattern in theiroperator<()overrides, so this is a consistent design choice rather than an isolated issue. No modifications are required here.presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h (1)
19-28: LGTM! Proper connector protocol definition.The ClpConnectorProtocol type alias correctly specifies CLP-specific types and appropriately uses NotImplemented placeholders for unsupported features, following the established pattern.
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.h (2)
16-16: LGTM! Include reordering for consistency.The include reordering improves code organization and consistency.
227-252: LGTM! Well-structured CLP connector class.The ClpPrestoToVeloxConnector class properly follows the established pattern of other connector classes, implementing all required virtual methods and correctly forwarding constructor parameters to the base class.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-hpp.mustache (1)
29-31: Ignore hard-codedClpTransactionHandlein this template
The JSON header templates for all connectors (Hive, TPCH, ArrowFlight, etc.) likewise include their transaction‐handle definitions directly. HavingClpTransactionHandlehere is intentional and aligns with the existing pattern needed for correct JSON (de)serialization. No change required.Likely an incorrect or invalid review comment.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.json (1)
1-114: LGTM! Well-structured JSON schema for CLP connector protocol.The JSON schema definitions are well-formed and follow established patterns. The field types and inheritance relationships are appropriately defined for the CLP connector protocol structures.
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp (6)
18-18: LGTM! Proper inclusion of CLP connector headers.The new includes for CLP protocol and Velox connector components are correctly added following the established pattern of other connectors.
Also applies to: 24-26
1558-1568: LGTM! CLP split conversion implementation follows established patterns.The
toVeloxSplitmethod correctly converts CLP protocol splits to Velox connector splits with proper error handling and type validation.
1570-1582: LGTM! CLP column handle conversion is well-implemented.The
toVeloxColumnHandlemethod properly converts CLP protocol column handles to Velox column handles, using the type parser as expected.
1584-1603: LGTM! CLP table handle conversion follows connector patterns.The
toVeloxTableHandlemethod correctly extracts the table layout handle and creates the appropriate Velox table handle with proper validation.
1605-1608: LGTM! CLP connector protocol creation is consistent.The
createConnectorProtocolmethod follows the standard pattern established by other connectors.
1419-1424: LGTM! Improved formatting in Iceberg connector.The formatting improvement with proper variable declaration and initialization enhances code readability.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp (1)
1-158: LGTM! Generated JSON serialization code follows established patterns.The generated C++ code provides consistent JSON serialization/deserialization for all CLP protocol types. The default constructors properly set the type discriminator, and the serialization functions correctly handle all fields using the established helper function patterns.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (1)
1-2: Ensure manualoperator<implementations are correctly preserved by the generatorThe generated header contains inlined manual code for
operator<(lines 34 and 45), and there’s a separate inc file underspecial/ClpColumnHandle.hpp.inc(lines 16 and 27) holding the same implementation. However,presto_protocol_clp.hdoes not reference or include that inc file at build time.Please verify that your Chevron-based generation pipeline actually inlines or merges
special/ClpColumnHandle.hpp.incwhen regenerating this header to avoid losing manual changes. If it does not, consider:
- Moving all manual
operator<code intospecial/ClpColumnHandle.hpp.inconly- Updating the Mustache template or generation script to explicitly pull in that inc file
Files to check:
- presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (manual code at lines 34, 45)
- presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc (manual code at lines 16, 27)
| velox_hive_connector | ||
| velox_tpch_connector | ||
| velox_clp_connector | ||
| velox_presto_serializer | ||
| velox_functions_prestosql |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Same optional-build issue for test target
presto_server_test now always links against velox_clp_connector. Add the same if(PRESTO_ENABLE_CLP_CONNECTOR) guard to avoid link failures when CLP isn’t present.
🤖 Prompt for AI Agents
In presto-native-execution/presto_cpp/main/tests/CMakeLists.txt around lines 48
to 52, the test target presto_server_test always links against
velox_clp_connector, which causes link failures if the CLP connector is not
enabled. Wrap the addition of velox_clp_connector in an
if(PRESTO_ENABLE_CLP_CONNECTOR) conditional block to ensure it is only linked
when the CLP connector is present, preventing build errors.
| velox_hive_connector | ||
| velox_tpch_connector | ||
| velox_clp_connector | ||
| velox_exec | ||
| velox_dwio_common_exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Guard CLP linkage in presto_velox_split_test
Add a feature flag guard around velox_clp_connector just like the main connector target.
🤖 Prompt for AI Agents
In presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt around
lines 28 to 32, the dependency on velox_clp_connector in the
presto_velox_split_test target is not guarded by a feature flag. Add a
conditional check for the CLP feature flag around the inclusion of
velox_clp_connector, similar to how it is done in the main connector target, to
ensure the linkage only occurs when the feature is enabled.
| velox_hive_connector | ||
| velox_tpch_connector | ||
| velox_clp_connector | ||
| GTest::gtest | ||
| GTest::gtest_main) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Guard CLP linkage in presto_to_velox_connector_test
Mirror the conditional linkage pattern so builds that exclude CLP stay healthy.
🤖 Prompt for AI Agents
In presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt around
lines 99 to 103, the target_link_libraries for presto_to_velox_connector_test
always includes velox_clp_connector, which can cause build failures if CLP is
excluded. Modify the CMakeLists.txt to conditionally link velox_clp_connector
only if CLP is enabled, mirroring the existing pattern used elsewhere to guard
CLP linkage and ensure builds without CLP remain healthy.
| velox_hive_connector | ||
| velox_tpch_connector | ||
| velox_clp_connector | ||
| velox_hive_partition_function | ||
| velox_presto_serializer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Guard CLP linkage in presto_expressions_test
Same optional-build concern as above.
🤖 Prompt for AI Agents
In presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt around
lines 66 to 70, the linkage of the velox_clp_connector in the
presto_expressions_test target is not guarded for optional builds. Modify the
CMakeLists.txt to conditionally link velox_clp_connector only if the CLP
component is enabled, using appropriate CMake conditional statements to prevent
build errors when CLP is not included.
| "-Wno-nullability-completeness \ | ||
| -Wno-deprecated-declarations \ | ||
| -Wno-restrict") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify the necessity of suppressing the -Wrestrict warning.
The addition of -Wno-restrict to the disabled warnings list suggests that the new CLP connector code triggers this warning. Please verify that this warning suppression is necessary and doesn't hide important issues.
🏁 Script executed:
#!/bin/bash
# Description: Check if the CLP connector code triggers -Wrestrict warnings
# Expected: Find instances where this warning would be triggered
# Search for patterns that commonly trigger -Wrestrict warnings in CLP-related files
echo "Searching for potential -Wrestrict triggers in CLP connector code..."
rg -A 3 -B 3 "strcpy|strncpy|sprintf|memcpy|memmove" --type cpp --glob "*clp*" || true
rg -A 3 -B 3 "restrict|__restrict" --type cpp --glob "*clp*" || trueLength of output: 304
Remove unnecessary -Wno-restrict suppression
No usage patterns (e.g. strcpy, memcpy, or pointer qualifiers like restrict) were found in the CLP connector code that would trigger a -Wrestrict warning. To avoid hiding potential issues, please remove the -Wno-restrict entry.
• File to update: presto-native-execution/CMakeLists.txt (around lines 34–36)
• Suggested diff:
- "-Wno-nullability-completeness \
- -Wno-deprecated-declarations \
- -Wno-restrict")
+ "-Wno-nullability-completeness \
+ -Wno-deprecated-declarations")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "-Wno-nullability-completeness \ | |
| -Wno-deprecated-declarations \ | |
| -Wno-restrict") | |
| "-Wno-nullability-completeness \ | |
| -Wno-deprecated-declarations") |
🤖 Prompt for AI Agents
In presto-native-execution/CMakeLists.txt around lines 34 to 36, remove the
compiler flag '-Wno-restrict' from the list of warning suppressions since no
code in the CLP connector triggers this warning. This will prevent hiding
potential issues related to restrict usage.
| {{/struct}} | ||
| {{#enum}} | ||
| namespace facebook::presto::protocol::clp { | ||
| //Loosly copied this here from NLOHMANN_JSON_SERIALIZE_ENUM() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Fix typo in comment.
Minor typo in the comment.
- //Loosly copied this here from NLOHMANN_JSON_SERIALIZE_ENUM()
+ //Loosely copied this here from NLOHMANN_JSON_SERIALIZE_ENUM()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| //Loosly copied this here from NLOHMANN_JSON_SERIALIZE_ENUM() | |
| //Loosely copied this here from NLOHMANN_JSON_SERIALIZE_ENUM() |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache
at line 75, fix the typo in the comment by changing "Loosly" to "Loosely" to
correct the spelling.
| j[0].get_to(p._type); | ||
| j[1].get_to(p.instance); | ||
| } | ||
| } // namespace facebook::presto::protocol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix incorrect namespace in closing comment.
The namespace closing comment doesn't match the actual namespace.
-} // namespace facebook::presto::protocol
+} // namespace facebook::presto::protocol::clp📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } // namespace facebook::presto::protocol | |
| } // namespace facebook::presto::protocol::clp |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache
at line 39, the closing comment for the namespace is incorrect. Update the
comment to match the actual namespace being closed to ensure clarity and
correctness.
| void from_json(const json& j, ClpTransactionHandle& p) { | ||
| j[0].get_to(p._type); | ||
| j[1].get_to(p.instance); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add bounds checking for JSON array deserialization.
The from_json function assumes the JSON array has at least 2 elements without validation, which could cause out-of-bounds access.
void from_json(const json& j, ClpTransactionHandle& p) {
+ if (!j.is_array() || j.size() < 2) {
+ throw ParseError("ClpTransactionHandle requires JSON array with at least 2 elements");
+ }
j[0].get_to(p._type);
j[1].get_to(p.instance);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| void from_json(const json& j, ClpTransactionHandle& p) { | |
| j[0].get_to(p._type); | |
| j[1].get_to(p.instance); | |
| } | |
| void from_json(const json& j, ClpTransactionHandle& p) { | |
| if (!j.is_array() || j.size() < 2) { | |
| throw ParseError("ClpTransactionHandle requires JSON array with at least 2 elements"); | |
| } | |
| j[0].get_to(p._type); | |
| j[1].get_to(p.instance); | |
| } |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache
around lines 35 to 38, the from_json function accesses JSON array elements
without checking if they exist, risking out-of-bounds errors. Add bounds
checking to verify the JSON array has at least two elements before accessing
j[0] and j[1], and handle the case where the array is too short, such as by
throwing an exception or returning an error.
| const auto* it = std::find_if(std::begin({{&class_name}}_enum_table), std::end({{&class_name}}_enum_table), | ||
| [e](const std::pair<{{&class_name}}, json>& ej_pair) -> bool | ||
| { | ||
| return ej_pair.first == e; | ||
| }); | ||
| j = ((it != std::end({{&class_name}}_enum_table)) ? it : std::begin({{&class_name}}_enum_table))->second; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider throwing exception for invalid enum values.
The current implementation silently falls back to the first enum value when an invalid value is encountered, which could mask deserialization errors.
void to_json(json& j, const {{&class_name}}& e)
{
static_assert(std::is_enum<{{&class_name}}>::value, "{{&class_name}} must be an enum!");
const auto* it = std::find_if(std::begin({{&class_name}}_enum_table), std::end({{&class_name}}_enum_table),
[e](const std::pair<{{&class_name}}, json>& ej_pair) -> bool
{
return ej_pair.first == e;
});
- j = ((it != std::end({{&class_name}}_enum_table)) ? it : std::begin({{&class_name}}_enum_table))->second;
+ if (it != std::end({{&class_name}}_enum_table)) {
+ j = it->second;
+ } else {
+ throw std::invalid_argument("Invalid {{&class_name}} enum value: " + std::to_string(static_cast<int>(e)));
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const auto* it = std::find_if(std::begin({{&class_name}}_enum_table), std::end({{&class_name}}_enum_table), | |
| [e](const std::pair<{{&class_name}}, json>& ej_pair) -> bool | |
| { | |
| return ej_pair.first == e; | |
| }); | |
| j = ((it != std::end({{&class_name}}_enum_table)) ? it : std::begin({{&class_name}}_enum_table))->second; | |
| } | |
| void to_json(json& j, const {{&class_name}}& e) | |
| { | |
| static_assert(std::is_enum<{{&class_name}}>::value, "{{&class_name}} must be an enum!"); | |
| const auto* it = std::find_if( | |
| std::begin({{&class_name}}_enum_table), | |
| std::end({{&class_name}}_enum_table), | |
| [e](const std::pair<{{&class_name}}, json>& ej_pair) -> bool { | |
| return ej_pair.first == e; | |
| }); | |
| if (it != std::end({{&class_name}}_enum_table)) { | |
| j = it->second; | |
| } else { | |
| throw std::invalid_argument( | |
| "Invalid {{&class_name}} enum value: " + | |
| std::to_string(static_cast<int>(e))); | |
| } | |
| } |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache
around lines 87 to 93, the code currently falls back to the first enum value if
an invalid enum is found, which hides errors. Modify the code to check if the
iterator equals the end of the enum table, and if so, throw an exception
indicating an invalid enum value instead of defaulting to the first element.
This will ensure deserialization errors are properly surfaced.
| bool operator<(const ColumnHandle& o) const override { | ||
| return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete comparison operator may cause issues in sorted containers.
The operator< only compares columnName, ignoring other fields like originalColumnName, columnType, and nullable. This could lead to incorrect behavior when ClpColumnHandle objects are used in sorted containers or algorithms, as two handles with the same columnName but different other fields would be considered equal.
Consider implementing a complete comparison that includes all relevant fields:
bool operator<(const ColumnHandle& o) const override {
- return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName;
+ const auto& other = dynamic_cast<const ClpColumnHandle&>(o);
+ if (columnName != other.columnName) return columnName < other.columnName;
+ if (originalColumnName != other.originalColumnName) return originalColumnName < other.originalColumnName;
+ if (columnType != other.columnType) return columnType < other.columnType;
+ return nullable < other.nullable;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bool operator<(const ColumnHandle& o) const override { | |
| return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName; | |
| } | |
| bool operator<(const ColumnHandle& o) const override { | |
| const auto& other = dynamic_cast<const ClpColumnHandle&>(o); | |
| if (columnName != other.columnName) return columnName < other.columnName; | |
| if (originalColumnName != other.originalColumnName) return originalColumnName < other.originalColumnName; | |
| if (columnType != other.columnType) return columnType < other.columnType; | |
| return nullable < other.nullable; | |
| } |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h
around lines 45 to 47, the operator< only compares columnName, ignoring other
important fields like originalColumnName, columnType, and nullable. To fix this,
update the operator< to perform a lexicographical comparison of all these fields
in a consistent order, ensuring that all relevant members are compared to
correctly distinguish different ClpColumnHandle objects in sorted containers.
The unit test can be passed from my end. |
kirkrodrigues
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned offline, shouldn't we guard the CLP connector with an option like PRESTO_ENABLE_ARROW_FLIGHT_CONNECTOR?
| target_link_libraries(presto_connectors presto_velox_expr_conversion | ||
| velox_type_fbhive) | ||
| velox_type_fbhive velox_clp_connector) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this? If so, let's alphabetize the list of libraries to link against.
presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt
Outdated
Show resolved
Hide resolved
| connectorList.emplace_back( | ||
| std::pair("clp", std::make_unique<ClpPrestoToVeloxConnector>("clp"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To alphabetize, can we move this above line 27?
Co-authored-by: kirkrodrigues <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
presto-native-execution/presto_cpp/main/tests/CMakeLists.txt (1)
48-50: Conditionally linkvelox_clp_connectorso non-CLP builds don’t breakSame feedback as the earlier review: wrap the link line in an
if(TARGET velox_clp_connector)(or a dedicated feature flag) to prevent link-time failures when the CLP connector is not built.presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt (3)
24-31: Guard CLP linkage inpresto_velox_split_testThe test unconditionally links against
velox_clp_connector; add a target/option guard to keep optional builds functional.
64-70: Guard CLP linkage inpresto_expressions_testSame optional-build concern—enclose the CLP connector dependency in a conditional block.
96-102: Guard CLP linkage inpresto_to_velox_connector_testRepeat of the above: protect the link line with a feature-flag or
if(TARGET …)check.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
presto-native-execution/CMakeLists.txt(1 hunks)presto-native-execution/presto_cpp/main/tests/CMakeLists.txt(1 hunks)presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt(3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-resource-manager)
- GitHub Check: test (17.0.13, :presto-main-base)
- GitHub Check: test (8.0.442, :presto-main-base)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (17.0.13, :presto-tests -P presto-tests-general)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-resource-manager)
- GitHub Check: test (8.0.442, :presto-main)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
- GitHub Check: maven-checks (8.0.442)
- GitHub Check: maven-checks (17.0.13)
- GitHub Check: prestocpp-linux-build-for-test
| # Known warnings that are benign can be disabled: | ||
| # - `restrict` since it triggers a bug in gcc 12. See | ||
| # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105651 | ||
| set(DISABLED_WARNINGS | ||
| "-Wno-nullability-completeness -Wno-deprecated-declarations") | ||
| "-Wno-nullability-completeness \ | ||
| -Wno-deprecated-declarations \ | ||
| -Wno-restrict") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Gate -Wno-restrict to GCC 12 only (avoid masking real issues under other tool-chains)
-Wno-restrict suppresses a real warning class and should be disabled only for the compiler/versions that mis-fire (GCC 12, bug 105651). Applying it unconditionally hides legitimate misuse of restrict for Clang, GCC 13+, etc.
- -Wno-deprecated-declarations \
- -Wno-restrict")
+ -Wno-deprecated-declarations")
+
+# Suppress -Wrestrict only for the GCC 12 series (see
+# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105651). Other
+# compilers/versions still benefit from the diagnostic.
+if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU"
+ AND CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 12
+ AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 13)
+ list(APPEND DISABLED_WARNINGS "-Wno-restrict")
+endif()This keeps the workaround while preserving the warning where it is still valuable.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Known warnings that are benign can be disabled: | |
| # - `restrict` since it triggers a bug in gcc 12. See | |
| # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105651 | |
| set(DISABLED_WARNINGS | |
| "-Wno-nullability-completeness -Wno-deprecated-declarations") | |
| "-Wno-nullability-completeness \ | |
| -Wno-deprecated-declarations \ | |
| -Wno-restrict") | |
| # Known warnings that are benign can be disabled: | |
| # - `restrict` since it triggers a bug in gcc 12. See | |
| # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105651 | |
| set(DISABLED_WARNINGS | |
| "-Wno-nullability-completeness \ | |
| -Wno-deprecated-declarations") | |
| # Suppress -Wrestrict only for the GCC 12 series (see | |
| # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105651). Other | |
| # compilers/versions still benefit from the diagnostic. | |
| if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU" | |
| AND CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 12 | |
| AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 13) | |
| list(APPEND DISABLED_WARNINGS "-Wno-restrict") | |
| endif() |
🤖 Prompt for AI Agents
In presto-native-execution/CMakeLists.txt around lines 32 to 38, the
-Wno-restrict warning suppression is applied unconditionally, which hides
legitimate warnings on compilers other than GCC 12. Modify the CMakeLists.txt to
apply the -Wno-restrict flag only when the compiler is GCC version 12 by adding
a conditional check for the compiler and version before including this flag in
DISABLED_WARNINGS.
kirkrodrigues
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the PR title, how about:
feat: Add Prestissimo support for the CLP connector and update to the latest y-scope/velox commit.
20d8f9c
into
y-scope:release-0.293-clp-connector
Description
This PR adds Prestissimo support for the CLP connector. The native connector is now the default option and is responsible for deserializing classes from the Java coordinator, which are then used by Velox for query execution.
This PR also disables the gcc
restrictwarning since it's triggering a false positive with gcc 12:gcc 12 -Wrestrict false positive
This PR also modifies
java-to-struct-json.pywhere we changed the path oflicense-header.py, since y-scope/velox#11 refactored Velox's pre-commit check.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Chores
Style