Skip to content

chore: Upgrade CLP dependency to v0.8.0.#47

Open
20001020ycx wants to merge 3 commits intoy-scope:presto-0.293-clp-connectorfrom
20001020ycx:2026-01-22-update-clp
Open

chore: Upgrade CLP dependency to v0.8.0.#47
20001020ycx wants to merge 3 commits intoy-scope:presto-0.293-clp-connectorfrom
20001020ycx:2026-01-22-update-clp

Conversation

@20001020ycx
Copy link

@20001020ycx 20001020ycx commented Jan 22, 2026

Description

  • Upgrades CLP dependency from commit 9e991ab to official release v0.8.0
    • This would help us enable S3 support in velox
  • Updates code to adapt to CLP v0.8.0 API changes

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • All unit tests passed.

  • End to end test

    • Query: SELECT CLP_GET_JSON_STRING() from clp.default.default limit 100
    • Set up: official clp package v0.8.0, OSS version of Presto, velox with changes in this PR, and a locally set minio.
    • The result is returned as expected.

Summary by CodeRabbit

Release Notes

  • Chores

    • Updated CLP library dependency to version 0.8.0.
  • Improvements

    • Refined string and timestamp data type handling in archive search operations.

✏️ Tip: You can customize this high-level summary in your review settings.

@20001020ycx 20001020ycx changed the title chore: Upgrade CLP dependency to v0.8.0 chore: Upgrade CLP dependency to v0.8.0. Jan 22, 2026
@coderabbitai
Copy link

coderabbitai bot commented Jan 22, 2026

📝 Walkthrough

Walkthrough

The PR updates the CLP dependency version to v0.8.0 in CMake configuration, refines column literal type constraints by removing EpochDateT support from String and Timestamp columns, and switches string value processing from clp::ir to clp::ffi wrapper implementations while maintaining equivalent control flow.

Changes

Cohort / File(s) Summary
Dependency Version Update
CMake/resolve_dependency_modules/clp.cmake
Updated CLP FetchContent GIT_TAG from commit hash 9e991ab49681faff0aea259a37b489c3f416e06e to release tag v0.8.0.
Column Type Literal Constraints
velox/connectors/clp/search_lib/archive/ClpArchiveCursor.cpp
Removed EpochDateT from String column literal matching (leaving ClpStringT | VarStringT) and replaced EpochDateT with TimestampT in Timestamp column matching (leaving TimestampT | IntegerT | FloatT).
String Value Wrapper Refactoring
velox/connectors/clp/search_lib/ir/ClpIrVectorLoader.cpp
Replaced clp::ir string wrappers with clp::ffi equivalents (EightByteEncodedTextAst and FourByteEncodedTextAst) and converted decode/unparse calls to to_string() conversions in String and Array value handling paths.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: upgrading the CLP dependency from a specific commit to v0.8.0, which is the primary objective of this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
velox/connectors/clp/search_lib/ir/ClpIrVectorLoader.cpp (1)

120-136: API migration is correct, but the else branch assumes FourByteEncodedTextAst.

The namespace and method updates are correct. However, the else branch (lines 128-135) unconditionally calls get_immutable_view<::clp::ffi::FourByteEncodedTextAst>() without first checking value->is<::clp::ffi::FourByteEncodedTextAst>(). If the value is neither type, this could cause undefined behaviour.

Consider adding an explicit type check for safety:

♻️ Suggested improvement
         if (value->is<::clp::ffi::EightByteEncodedTextAst>()) {
           auto decodeResult =
               value->get_immutable_view<::clp::ffi::EightByteEncodedTextAst>()
                   .to_string();
           if (!decodeResult.has_value()) {
             continue;
           }
           jsonString = std::move(decodeResult.value());
-        } else {
+        } else if (value->is<::clp::ffi::FourByteEncodedTextAst>()) {
           auto decodeResult =
               value->get_immutable_view<::clp::ffi::FourByteEncodedTextAst>()
                   .to_string();
           if (!decodeResult.has_value()) {
             continue;
           }
           jsonString = std::move(decodeResult.value());
+        } else {
+          continue;
         }

@20001020ycx 20001020ycx changed the title chore: Upgrade CLP dependency to v0.8.0. chore: Upgrade CLP dependencies (clp, log_surgeon, spdlog) to v0.8.0. Jan 23, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@CMake/resolve_dependency_modules/spdlog.cmake`:
- Around line 17-18: The file fails the cmake-format check; run the cmake-format
tool on CMake/resolve_dependency_modules/spdlog.cmake and commit the resulting
formatting changes so CI passes. Locate the block that defines
VELOX_SPDLOG_BUILD_SHA256_CHECKSUM (the set(...) line) and reformat the entire
file with the project's cmake-format configuration (e.g., run `cmake-format -i`
or your repo's configured formatter) then add and commit the updated file.

@20001020ycx 20001020ycx force-pushed the 2026-01-22-update-clp branch from 032662b to 38a1b08 Compare January 27, 2026 00:45
@20001020ycx 20001020ycx changed the title chore: Upgrade CLP dependencies (clp, log_surgeon, spdlog) to v0.8.0. chore: Upgrade CLP dependency to v0.8.0. Jan 27, 2026
@kirkrodrigues kirkrodrigues changed the title chore: Upgrade CLP dependency to v0.8.0. chore: Upgrade CLP dependency to v0.8.0. Jan 27, 2026
Copy link

@gibber9809 gibber9809 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor change, but besides that LGTM.

@20001020ycx 20001020ycx force-pushed the 2026-01-22-update-clp branch from 4f4349a to 4d8c2dc Compare January 27, 2026 18:53
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
velox/connectors/clp/search_lib/archive/ClpArchiveCursor.cpp (1)

170-193: Harmonize timestamp literal-type mappings across cursor implementations.

The change correctly aligns ClpArchiveCursor.cpp with CLP v0.8.0's canonical LiteralType::TimestampT. However, ClpIrCursor.cpp maps Timestamp columns to only FloatT | IntegerT (omitting TimestampT), which creates inconsistency between archive and IR formats. This could cause the same timestamp query to match differently depending on the split type.

Additionally, if backward-compat with pre-0.8.0 archives is required, consider whether LiteralType::EpochDateT should be handled, as it is still present in CLP v0.8.0. The codebase currently has no EpochDateT references.

Update ClpIrCursor.cpp to include TimestampT in the Timestamp case, or document why the implementations must differ.

Copy link

@gibber9809 gibber9809 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. PR title is fine as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants