-
Notifications
You must be signed in to change notification settings - Fork 3
feat: Add CLP connector native code #24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add CLP connector native code #24
Conversation
## Walkthrough
This change introduces support for a new "clp" connector in the Presto native execution codebase. It adds protocol definitions, serialization logic, connector registration, and build integration for the CLP connector, including updates to CMake, protocol generation scripts, and tests. No existing connector logic is altered.
## Changes
| File(s) / Path(s) | Change Summary |
|-------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
| .../main/CMakeLists.txt<br>.../main/tests/CMakeLists.txt<br>.../main/types/tests/CMakeLists.txt | Added `velox_clp_connector` as a library dependency for server and test executables. |
| .../main/connectors/PrestoToVeloxConnector.{h,cpp} | Added `ClpPrestoToVeloxConnector` class and its methods for split, column, table handle conversion, and protocol instantiation. |
| .../main/connectors/Registration.cpp | Registered the CLP connector factory and the `ClpPrestoToVeloxConnector` in the connector registry. |
| .../main/types/tests/PrestoToVeloxConnectorTest.cpp | Included `"clp"` connector in connector registration test. |
| .../presto_protocol/Makefile | Integrated CLP connector into protocol code and JSON generation workflow. |
| .../presto_protocol/connector/clp/presto_protocol_clp.{h,cpp,json,yml} | Added CLP connector protocol: structs, JSON schema, YAML config, and serialization/deserialization logic. |
| .../presto_protocol/connector/clp/ClpConnectorProtocol.h | Added type alias for `ClpConnectorProtocol` using protocol template. |
| .../presto_protocol/connector/clp/presto_protocol-json-{cpp,hpp}.mustache | Added mustache templates for CLP connector protocol C++ code and header generation. |
| .../presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc | Added `ClpColumnHandle` struct with comparison operator and JSON functions. |
| .../presto_protocol/connector/clp/special/ClpTransactionHandle.{hpp,cpp}.inc | Added `ClpTransactionHandle` struct and its JSON serialization/deserialization. |
| .../presto_protocol/java-to-struct-json.py | Updated license header removal script path for protocol generation. |
| .../presto_protocol/presto_protocol.{h,cpp} | Included CLP connector protocol header and source in main protocol files. |
| .../velox | Updated submodule reference to a newer commit. |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant PrestoServer
participant ConnectorRegistry
participant ClpConnectorFactory
participant ClpPrestoToVeloxConnector
PrestoServer->>ConnectorRegistry: registerConnectors()
ConnectorRegistry->>ClpConnectorFactory: isRegistered()?
alt Not registered
ConnectorRegistry->>ClpConnectorFactory: registerFactory()
end
ConnectorRegistry->>ClpPrestoToVeloxConnector: register (with factory name "clp")sequenceDiagram
participant PrestoServer
participant ClpPrestoToVeloxConnector
participant Protocol
participant VeloxClpConnector
PrestoServer->>ClpPrestoToVeloxConnector: toVeloxSplit/ColumnHandle/TableHandle
ClpPrestoToVeloxConnector->>Protocol: Parse protocol object (split/handle)
Protocol-->>ClpPrestoToVeloxConnector: Return parsed object
ClpPrestoToVeloxConnector->>VeloxClpConnector: Construct Velox object (split/handle)
VeloxClpConnector-->>ClpPrestoToVeloxConnector: Return Velox object
ClpPrestoToVeloxConnector-->>PrestoServer: Return Velox object
Suggested reviewers
Learnt from: wraymo |
a592410 to
754026b
Compare
| velox::connector::registerConnectorFactory( | ||
| std::make_shared<velox::connector::clp::ClpConnectorFactory>()); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to modify the CMakeLists.txt under this connectors dir:
target_link_libraries(presto_connectors presto_velox_expr_conversion
velox_type_fbhive velox_clp_connector)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otherwise it cannot find the ClpConnectorFactory() when building
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is we haven't updated velox code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
velox_clp_connector was already added to another tareget.
| # build clp connector related structs | ||
| echo "// DO NOT EDIT : This file is generated by chevron" > connector/clp/presto_protocol_clp.cpp | ||
| chevron -d connector/clp/presto_protocol_clp.json connector/clp/presto_protocol-json-cpp.mustache >> connector/clp/presto_protocol_clp.cpp | ||
| echo "// DO NOT EDIT : This file is generated by chevron" > connector/clp/presto_protocol_clp.h | ||
| chevron -d connector/clp/presto_protocol_clp.json connector/clp/presto_protocol-json-hpp.mustache >> connector/clp/presto_protocol_clp.h | ||
| clang-format -style=file -i connector/clp/presto_protocol_clp.h connector/clp/presto_protocol_clp.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indent seems incorrect, refer to the lines above, they start with a tab not 4 spaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the ide again auto-formats it to spaces. I'll take a took
...to-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
♻️ Duplicate comments (3)
presto-native-execution/presto_cpp/main/tests/CMakeLists.txt (1)
50-50: Same conditional-link concern for test binary.
UseTARGET_EXISTSor an option guard to avoid test-time link failures when the CLP lib is absent.presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt (1)
30-30: Repeat of unconditionalvelox_clp_connectorlinkage.
Please apply the same conditional-link strategy here to keep CI passing across environments that do not yet ship the connector.Also applies to: 68-68, 101-101
presto-native-execution/presto_cpp/presto_protocol/Makefile (1)
55-55: Fix indentation to use tabs consistently.Based on past review feedback, ensure consistent indentation using tabs rather than spaces to match the existing lines in the file.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (22)
presto-native-execution/presto_cpp/main/CMakeLists.txt(1 hunks)presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp(3 hunks)presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.h(2 hunks)presto-native-execution/presto_cpp/main/connectors/Registration.cpp(3 hunks)presto-native-execution/presto_cpp/main/tests/CMakeLists.txt(1 hunks)presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt(3 hunks)presto-native-execution/presto_cpp/main/types/tests/PrestoToVeloxConnectorTest.cpp(1 hunks)presto-native-execution/presto_cpp/presto_protocol/Makefile(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-hpp.mustache(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.json(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.cpp.inc(1 hunks)presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.hpp.inc(1 hunks)presto-native-execution/presto_cpp/presto_protocol/java-to-struct-json.py(1 hunks)presto-native-execution/presto_cpp/presto_protocol/presto_protocol.cpp(1 hunks)presto-native-execution/presto_cpp/presto_protocol/presto_protocol.h(1 hunks)presto-native-execution/velox(1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
presto-native-execution/presto_cpp/main/connectors/Registration.cpp (1)
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h (1)
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml (1)
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
🧬 Code Graph Analysis (2)
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.h (2)
presto-native-execution/presto_cpp/main/connectors/SystemConnector.h (3)
catalogId(184-187)column(189-191)tableHandle(193-200)presto-native-execution/presto_cpp/main/connectors/arrow_flight/ArrowPrestoToVeloxConnector.h (3)
catalogId(25-28)column(30-32)tableHandle(34-41)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp (1)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (4)
ClpColumnHandle(43-43)ClpSplit(56-56)ClpTableHandle(66-66)ClpTableLayoutHandle(77-77)
🪛 YAMLlint (1.37.1)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml
[error] 18-18: too many spaces inside braces
(braces)
[error] 18-18: too many spaces after comma
(commas)
[error] 18-18: too many spaces inside braces
(braces)
[error] 23-23: too many spaces inside braces
(braces)
[error] 23-23: too many spaces after comma
(commas)
[error] 23-23: too many spaces inside braces
(braces)
[error] 28-28: too many spaces inside braces
(braces)
[error] 28-28: too many spaces after comma
(commas)
[error] 28-28: too many spaces inside braces
(braces)
[error] 33-33: too many spaces inside braces
(braces)
[error] 33-33: too many spaces after comma
(commas)
[error] 33-33: too many spaces inside braces
(braces)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: test (17.0.13, :presto-main-base)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (8.0.442, :presto-main)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (8.0.442, :presto-main-base)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-resource-manager)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
- GitHub Check: maven-checks (8.0.442)
- GitHub Check: maven-checks (17.0.13)
- GitHub Check: prestocpp-linux-build-for-test
🔇 Additional comments (21)
presto-native-execution/velox (1)
1-1: No API-breaking changes detected in Velox submodule bumpA diff of headers between commits 52f4e37 → 749d2894 shows only new files under velox/connectors/clp were added—no existing public or vector interface headers were modified. The Velox core API remains unchanged, so ABI/API compatibility is intact. You can proceed with your standard CI native build and integration tests.
presto-native-execution/presto_cpp/presto_protocol/presto_protocol.cpp (1)
19-19:clpprotocol TU inclusion aligns with existing aggregation pattern – looks good.
The new include maintains alphabetical ordering and mirrors the pattern used for other connector protocol units. No issues spotted.presto-native-execution/presto_cpp/presto_protocol/presto_protocol.h (1)
20-20: Header inclusion is correct; no further action required.
Theclpprotocol header is added consistently with other connector headers. Compiles should remain deterministic.presto-native-execution/presto_cpp/main/types/tests/PrestoToVeloxConnectorTest.cpp (1)
36-37: LGTM! CLP connector test integration follows established patterns.The addition of the CLP connector to the test suite is consistent with existing connector implementations and properly tests the registration lifecycle.
presto-native-execution/presto_cpp/main/connectors/Registration.cpp (3)
22-22: LGTM! CLP connector header inclusion is appropriate.The include follows the established pattern for other connector headers.
49-53: LGTM! CLP connector factory registration follows established patterns.The registration includes proper existence checks and uses the correct factory class, consistent with other connector implementations.
84-85: LGTM! CLP PrestoToVeloxConnector registration is correctly implemented.The registration follows the established pattern and uses the appropriate connector name constant.
presto-native-execution/presto_cpp/presto_protocol/java-to-struct-json.py (1)
173-173: New license-header.py path is validThe updated invocation of
../../velox/scripts/checks/license-header.pyresolves to
presto-native-execution/velox/scripts/checks/license-header.py, which exists in the repo.
The old location is gone, so this change correctly follows the reorganization.presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.hpp.inc (1)
15-28: LGTM! ClpTransactionHandle header is well-structured.The struct correctly inherits from
ConnectorTransactionHandle, includes appropriate member initialization, and declares the necessary JSON serialization functions. The comment explaining the special nature (Java enum vs C++ struct) is helpful.presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.cpp.inc (1)
20-29: LGTM! JSON serialization implementation is correct.The
to_jsonandfrom_jsonfunctions properly serialize the transaction handle using array format, which is appropriate for enum-like structures. The implementation correctly handles both the inherited_typefield and theinstancemember.presto-native-execution/presto_cpp/presto_protocol/Makefile (1)
55-61: LGTM! CLP connector build integration follows established pattern.The CLP connector build rules correctly follow the same pattern as existing connectors (hive, iceberg, tpch, arrow_flight) for generating protocol files, JSON schemas, and protobuf definitions.
Also applies to: 68-68, 76-76
presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h (1)
18-29: LGTM! Clean connector protocol type alias definition.The ClpConnectorProtocol type alias correctly specializes the ConnectorProtocolTemplate with CLP-specific types (ClpTableHandle, ClpTableLayoutHandle, ClpColumnHandle, ClpSplit, ClpTransactionHandle) and appropriately uses NotImplemented for unused template parameters. This follows the established pattern for other connector protocols.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc (1)
19-25: LGTM! Well-structured column handle implementation.The ClpColumnHandle struct properly defines all necessary members (columnName, originalColumnName, columnType, nullable) and includes appropriate JSON serialization function declarations.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml (1)
13-40: LGTM! Correct protocol configuration structure.The YAML configuration properly defines the AbstractClasses with their subclass mappings and correctly references the Java source files for the CLP connector protocol types.
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.h (1)
227-252: LGTM! ClpPrestoToVeloxConnector follows established pattern.The ClpPrestoToVeloxConnector class declaration correctly follows the same pattern as other connector implementations (HivePrestoToVeloxConnector, IcebergPrestoToVeloxConnector, etc.). All required virtual methods are properly declared with final specifiers, and the constructor signature is consistent with other connectors.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.json (1)
1-114: LGTM! Well-structured CLP protocol definitions.The JSON schema correctly defines the CLP connector protocol types with appropriate inheritance and field definitions. The custom
operator<implementation forClpColumnHandleprovides proper ordering by column name.presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp (1)
1558-1608: LGTM! CLP connector implementation follows established patterns.The
ClpPrestoToVeloxConnectorimplementation correctly handles the conversion from Presto protocol types to Velox connector types, with proper null checks and type casting consistent with other connectors.presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp (1)
1-158: LGTM! Generated code follows protocol serialization patterns.This generated file correctly implements JSON serialization/deserialization for all CLP protocol types. The use of array format for
ClpTransactionHandleand object format for other types is consistent with the protocol design.presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (3)
1-18: Generated code structure looks good.The file header appropriately indicates this is generated code and includes proper licensing information.
19-24: Includes are appropriate.The necessary dependencies for JSON handling and core protocol types are properly included.
25-81: Overall structure is well-designed.The CLP connector protocol structs properly extend the base Presto protocol classes and follow consistent naming conventions. The separation of concerns with distinct structs for different connector components is appropriate.
| velox_abfs | ||
| velox_aggregates | ||
| velox_caching | ||
| velox_clp_connector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Un-guarded linkage on velox_clp_connector risks build breakage on older Velox snapshots.
Unlike Hive/TPCH, the CLP connector is still emerging and may not exist in all Velox builds distributed in the field. Linking it unconditionally will cause a hard failure (undefined target) when consumers compile against an older or stripped-down Velox build.
Consider protecting the linkage behind a CMake feature/option test, e.g.:
-find_package(VeloxClpConnector REQUIRED) # pseudo
-
target_link_libraries(
presto_server_lib
@@
velox_caching
- velox_clp_connector
+ $<$<TARGET_EXISTS:velox_clp_connector>:velox_clp_connector>or introduce a PRESTO_ENABLE_CLP_CONNECTOR option similar to the existing remote-function flag.
This pattern should be mirrored anywhere the target is linked (tests, tools).
🏁 Script executed:
#!/bin/bash
# Search for references to velox_clp_connector
rg -n "velox_clp_connector" -C 3
# Search for existing PRESTO_ENABLE options to mirror pattern
rg -n "PRESTO_ENABLE" -C 3 .Length of output: 40098
Guard CLP connector linkage to avoid hard failures
Introduce a dedicated CMake option and wrap every unconditional reference to velox_clp_connector behind it (and a TARGET_EXISTS test) so that builds against older or stripped-down Velox snapshots don’t break:
-
In
presto-native-execution/CMakeLists.txt(near otherPRESTO_ENABLE_*options), add:option(PRESTO_ENABLE_CLP_CONNECTOR "Enable CLP connector support" OFF) if(PRESTO_ENABLE_CLP_CONNECTOR) # Forward to Velox set(VELOX_ENABLE_CLP_CONNECTOR ON CACHE BOOL "Build CLP connector support" FORCE) add_compile_definitions(PRESTO_ENABLE_CLP_CONNECTOR) endif()
-
Replace every raw
velox_clp_connectorentry in yourtarget_link_librariescalls with a generator expression, for example:target_link_libraries(presto_server_lib … - velox_clp_connector + $<$<AND:$<BOOL:${PRESTO_ENABLE_CLP_CONNECTOR}>,$<TARGET_EXISTS:velox_clp_connector>>>:velox_clp_connector> velox_common_base velox_core … ) -
Apply the same change in all CMakeLists where
velox_clp_connectoris linked:
•presto-native-execution/presto_cpp/main/CMakeLists.txt(lines 55–60)
•presto-native-execution/presto_cpp/main/tests/CMakeLists.txt(around line 50)
•presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt(lines 30, 68, 101)
• Any other CMakeLists discovered viarg -l velox_clp_connector.
This makes CLP support opt-in and guards against undefined-target errors on legacy Velox builds.
🤖 Prompt for AI Agents
In presto-native-execution/presto_cpp/main/CMakeLists.txt at line 58 and related
CMakeLists files, the direct linkage to velox_clp_connector causes build
failures on older Velox versions. Fix this by adding a new CMake option
PRESTO_ENABLE_CLP_CONNECTOR in presto-native-execution/CMakeLists.txt to control
CLP connector support, forwarding it to Velox and defining a compile flag. Then,
replace all unconditional velox_clp_connector references in
target_link_libraries with a conditional generator expression that links it only
if the option is enabled and the target exists. Apply these changes consistently
across all CMakeLists files that link velox_clp_connector to make CLP support
opt-in and avoid undefined target errors.
| bool operator<(const ColumnHandle& o) const override { | ||
| return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider adding safety check for dynamic_cast.
While the operator< implementation is required by the protocol as mentioned in past reviews, the dynamic_cast could throw std::bad_cast if the cast fails. Consider adding a safety check or documenting the assumption that this method will only be called with compatible types.
bool operator<(const ColumnHandle& o) const override {
+ const auto* clpHandle = dynamic_cast<const ClpColumnHandle*>(&o);
+ if (!clpHandle) {
+ throw std::runtime_error("Incompatible ColumnHandle type in comparison");
+ }
- return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName;
+ return columnName < clpHandle->columnName;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bool operator<(const ColumnHandle& o) const override { | |
| return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName; | |
| } | |
| bool operator<(const ColumnHandle& o) const override { | |
| const auto* clpHandle = dynamic_cast<const ClpColumnHandle*>(&o); | |
| if (!clpHandle) { | |
| throw std::runtime_error("Incompatible ColumnHandle type in comparison"); | |
| } | |
| return columnName < clpHandle->columnName; | |
| } |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc
around lines 27 to 29, the operator< uses dynamic_cast without checking if the
cast succeeds, which can throw std::bad_cast. To fix this, add a safety check to
ensure the cast is valid before using the result, or document clearly that this
operator< assumes the argument is always of type ClpColumnHandle to prevent
misuse.
| super: JsonEncodedSubclass | ||
| comparable: true | ||
| subclasses: | ||
| - { name: ClpColumnHandle, key: clp } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Fix YAML formatting issues.
Static analysis detected formatting issues with extra spaces in braces and after commas. Please fix the spacing to comply with YAML style guidelines.
- - { name: ClpColumnHandle, key: clp }
+ - { name: ClpColumnHandle, key: clp }
- - { name: ClpTableHandle, key: clp }
+ - { name: ClpTableHandle, key: clp }
- - { name: ClpTableLayoutHandle, key: clp }
+ - { name: ClpTableLayoutHandle, key: clp }
- - { name: ClpSplit, key: clp }
+ - { name: ClpSplit, key: clp }Also applies to: 23-23, 28-28, 33-33
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 18-18: too many spaces inside braces
(braces)
[error] 18-18: too many spaces after comma
(commas)
[error] 18-18: too many spaces inside braces
(braces)
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml
at lines 18, 23, 28, and 33, fix the YAML formatting by removing extra spaces
inside braces and after commas to comply with YAML style guidelines. Ensure that
the braces and commas have no unnecessary spaces around them for proper static
analysis compliance.
| namespace facebook::presto::protocol::clp { | ||
| struct ClpTransactionHandle : public ConnectorTransactionHandle { | ||
| String instance = {}; | ||
| }; | ||
| void to_json(json& j, const ClpTransactionHandle& p); | ||
|
|
||
| void from_json(const json& j, ClpTransactionHandle& p); | ||
| } //namespace facebook::presto::protocol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the namespace closing comment.
The namespace comment doesn't match the opened namespace.
-} //namespace facebook::presto::protocol
+} //namespace facebook::presto::protocol::clp📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| namespace facebook::presto::protocol::clp { | |
| struct ClpTransactionHandle : public ConnectorTransactionHandle { | |
| String instance = {}; | |
| }; | |
| void to_json(json& j, const ClpTransactionHandle& p); | |
| void from_json(const json& j, ClpTransactionHandle& p); | |
| } //namespace facebook::presto::protocol | |
| namespace facebook::presto::protocol::clp { | |
| struct ClpTransactionHandle : public ConnectorTransactionHandle { | |
| String instance = {}; | |
| }; | |
| void to_json(json& j, const ClpTransactionHandle& p); | |
| void from_json(const json& j, ClpTransactionHandle& p); | |
| } //namespace facebook::presto::protocol::clp |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-hpp.mustache
around lines 28 to 35, the closing namespace comment does not match the opened
namespace facebook::presto::protocol::clp. Update the closing comment to
correctly reflect facebook::presto::protocol::clp instead of
facebook::presto::protocol.
| namespace facebook::presto::protocol::clp { | ||
| void to_json(json& j, const ClpTransactionHandle& p) { | ||
| j = json::array(); | ||
| j.push_back(p._type); | ||
| j.push_back(p.instance); | ||
| } | ||
|
|
||
| void from_json(const json& j, ClpTransactionHandle& p) { | ||
| j[0].get_to(p._type); | ||
| j[1].get_to(p.instance); | ||
| } | ||
| } // namespace facebook::presto::protocol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the namespace closing comment to match the opened namespace.
The namespace comment should reflect the actual namespace being closed.
-} // namespace facebook::presto::protocol
+} // namespace facebook::presto::protocol::clp📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| namespace facebook::presto::protocol::clp { | |
| void to_json(json& j, const ClpTransactionHandle& p) { | |
| j = json::array(); | |
| j.push_back(p._type); | |
| j.push_back(p.instance); | |
| } | |
| void from_json(const json& j, ClpTransactionHandle& p) { | |
| j[0].get_to(p._type); | |
| j[1].get_to(p.instance); | |
| } | |
| } // namespace facebook::presto::protocol | |
| namespace facebook::presto::protocol::clp { | |
| void to_json(json& j, const ClpTransactionHandle& p) { | |
| j = json::array(); | |
| j.push_back(p._type); | |
| j.push_back(p.instance); | |
| } | |
| void from_json(const json& j, ClpTransactionHandle& p) { | |
| j[0].get_to(p._type); | |
| j[1].get_to(p.instance); | |
| } | |
| } // namespace facebook::presto::protocol::clp |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache
lines 27 to 39, the closing namespace comment does not match the opened
namespace facebook::presto::protocol::clp. Update the closing comment to exactly
reflect facebook::presto::protocol::clp to maintain clarity and correctness.
| bool operator<(const ColumnHandle& o) const override { | ||
| return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential runtime safety issue with dynamic_cast.
The operator< implementation uses dynamic_cast which could throw std::bad_cast if the cast fails. This could occur if the comparison is made between different ColumnHandle types.
Consider adding a type check or using safer casting:
bool operator<(const ColumnHandle& o) const override {
- return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName;
+ const auto* other = dynamic_cast<const ClpColumnHandle*>(&o);
+ if (!other) {
+ throw std::invalid_argument("Cannot compare ClpColumnHandle with different ColumnHandle type");
+ }
+ return columnName < other->columnName;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bool operator<(const ColumnHandle& o) const override { | |
| return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName; | |
| } | |
| bool operator<(const ColumnHandle& o) const override { | |
| const auto* other = dynamic_cast<const ClpColumnHandle*>(&o); | |
| if (!other) { | |
| throw std::invalid_argument( | |
| "Cannot compare ClpColumnHandle with different ColumnHandle type"); | |
| } | |
| return columnName < other->columnName; | |
| } |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h
around lines 45 to 47, the operator< uses dynamic_cast without checking the
type, risking std::bad_cast if the cast fails. To fix this, add a type check
before the cast to ensure the object is of type ClpColumnHandle, or use a safer
cast method like static_cast after verifying the type. This prevents runtime
exceptions during comparisons between different ColumnHandle types.
| std::shared_ptr<String> kqlQuery = {}; | ||
| std::shared_ptr<String> metadataFilterQuery = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider std::optional instead of shared_ptr for optional strings.
Using shared_ptr for optional strings may introduce unnecessary heap allocation overhead. Since these appear to be optional fields, std::optional<String> would be more efficient and semantically clearer.
- std::shared_ptr<String> kqlQuery = {};
- std::shared_ptr<String> metadataFilterQuery = {};
+ std::optional<String> kqlQuery = {};
+ std::optional<String> metadataFilterQuery = {};However, this change would require updates to the JSON serialization/deserialization logic as well.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| std::shared_ptr<String> kqlQuery = {}; | |
| std::shared_ptr<String> metadataFilterQuery = {}; | |
| std::optional<String> kqlQuery = {}; | |
| std::optional<String> metadataFilterQuery = {}; |
🤖 Prompt for AI Agents
In
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h
around lines 74 to 75, replace the use of std::shared_ptr<String> for optional
string fields with std::optional<String> to avoid unnecessary heap allocations
and improve clarity. After changing the member types, update the JSON
serialization and deserialization code accordingly to handle std::optional
instead of shared_ptr, ensuring correct presence checks and value extraction
during (de)serialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
presto-native-execution/CMakeLists.txt(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: wraymo
PR: y-scope/presto#15
File: presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpMetadataProvider.java:22-33
Timestamp: 2025-06-13T12:56:06.325Z
Learning: `ClpMetadataProvider` is instantiated only once and used solely by the Presto coordinator, so concurrency/thread-safety guarantees are unnecessary.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-queries)
- GitHub Check: test (8.0.442, :presto-main-base)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (8.0.442, :presto-main)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: maven-checks (17.0.13)
- GitHub Check: maven-checks (8.0.442)
- GitHub Check: prestocpp-linux-build-for-test
| set(DISABLED_WARNINGS | ||
| "-Wno-nullability-completeness -Wno-deprecated-declarations") | ||
| "-Wno-nullability-completeness \ | ||
| -Wno-deprecated-declarations \ | ||
| -Wno-restrict") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Avoid embedded back-slashes & newline chars in DISABLED_WARNINGS; gate -Wno-restrict on compiler support
The current multiline string contains literal \ characters and newlines, so all three flags are passed to the compiler as one token ("-Wno-nullability-completeness \ -Wno-deprecated-declarations \ -Wno-restrict").
That silently drops the second and third flags and can break warning control.
In addition, -Wno-restrict is recognised by GCC but not by Clang, triggering “unknown warning option” errors on macOS CI.
Proposed fix:
-set(DISABLED_WARNINGS
- "-Wno-nullability-completeness \
- -Wno-deprecated-declarations \
- -Wno-restrict")
+set(DISABLED_WARNINGS
+ "-Wno-nullability-completeness"
+ "-Wno-deprecated-declarations"
+ "-Wno-restrict")
+
+# Only keep -Wno-restrict if the compiler understands it.
+include(CheckCXXCompilerFlag)
+check_cxx_compiler_flag("-Wrestrict" COMPILER_SUPPORTS_WRESTRICT)
+if(NOT COMPILER_SUPPORTS_WRESTRICT)
+ list(REMOVE_ITEM DISABLED_WARNINGS "-Wno-restrict")
+endif()This treats each flag as its own list item (no stray back-slashes) and prevents build breaks on non-GCC tool-chains.
Please verify on both GCC and Clang before merging.
🏁 Script executed:
#!/bin/bash
# Show how DISABLED_WARNINGS is defined and used in CMakeLists
rg -n "DISABLED_WARNINGS" -A5 -B2 presto-native-execution/CMakeLists.txtLength of output: 556
🌐 Web query:
CMake backslash newline line continuation
💡 Result:
In CMake, a backslash at the end of a line inside a quoted argument acts as a line continuation character. This means the backslash (\) and the immediately following newline are both ignored, allowing you to write long strings over multiple lines in your CMake scripts without actually inserting a newline character in the value.
Example:
message("This is the first line. \
This line is continued.")This outputs:
This is the first line. This line is continued.
Key details:
- This feature only works within quoted arguments (i.e., text inside
"..."). - The final
\at the end of a line (if it is an odd number of backslashes) is treated as a line continuation along with the following newline and is ignored.[2] - This makes your CMake code more readable when handling long strings, command arguments, or messages.
Note:
Versions of CMake prior to 3.0 do not support this backslash-newline continuation inside quoted arguments.[2]
Summary table:
| Scenario | Effect |
|---|---|
| Quoted argument | Backslash + newline = line continues, no newline in value[2] |
| Unquoted context | No line continuation; lines are treated as separate commands or arguments[2] |
References to the CMake manual confirm this syntax:
The final
\on any line ending in an odd number of backslashes is treated as a line continuation and ignored along with the immediately following newline character.[2]
If you need to make long argument lists or messages in CMake readable, this is the preferred technique.
Citations:
- 1: https://cmake.org/pipermail/cmake/2005-September/007294.html
- 2: https://cmake.org/cmake/help/latest/manual/cmake-language.7.html
- 3: https://cmake.org/cmake/help/latest/command/file.html
- 4: https://discourse.cmake.org/t/escape-for-use-in-strequal/5362
- 5: Conanfile's package_folder method uses single backslash on windows, not compatible with CMake conan-io/conan#4932
Gate -Wno-restrict on compiler support
File: presto-native-execution/CMakeLists.txt
Lines: 33–36
CMake’s quoted-argument backslash+newline continuation removes the backslashes and newlines, so:
set(DISABLED_WARNINGS
"-Wno-nullability-completeness \
-Wno-deprecated-declarations \
-Wno-restrict")
already produces a single string containing three space-separated flags—nothing is silently dropped.
However, Clang does not recognise -Wno-restrict and will emit unknown-option errors on macOS CI. Please add immediately after line 36:
+# Only keep -Wno-restrict if the compiler recognises it.
+include(CheckCXXCompilerFlag)
+check_cxx_compiler_flag("-Wno-restrict" COMPILER_SUPPORTS_WNO_RESTRICT)
+if(NOT COMPILER_SUPPORTS_WNO_RESTRICT)
+ list(REMOVE_ITEM DISABLED_WARNINGS "-Wno-restrict")
+endif()Then verify the build on both GCC and Clang before merging.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In presto-native-execution/CMakeLists.txt around lines 33 to 36, the
-Wno-restrict flag is included unconditionally in the DISABLED_WARNINGS set, but
Clang does not support this flag and will error on macOS CI. To fix this, add a
conditional check immediately after line 36 to detect if the compiler is GCC and
only then append -Wno-restrict to DISABLED_WARNINGS. This ensures compatibility
with both GCC and Clang. After making this change, verify the build works
correctly on both compilers before merging.
| clang-format -style=file -i connector/arrow_flight/presto_protocol_arrow_flight.h connector/arrow_flight/presto_protocol_arrow_flight.cpp | ||
|
|
||
| presto_protocol-json: | ||
| # build clp connector related structs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to make the comment have consistent indent?
Description
This PR adds Prestissimo support for the CLP connector. The native connector is now the default option and is responsible for deserializing classes from the Java coordinator, which are then used by Velox for query execution.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Chores