-
Notifications
You must be signed in to change notification settings - Fork 83
feat(presto-clp): Add Docker compose setup for Presto cluster that can connect to clp-json. #1132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis update introduces a comprehensive Presto deployment for CLP, including Docker Compose orchestration, configuration templates, environment variable management, and setup automation scripts. It also adds user documentation for using Presto with CLP, expands linting coverage, and provides environment and configuration files for both coordinator and worker nodes. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant SetupScript as set-up-config.sh
participant PyScript as generate-user-env-vars-file.py
participant DockerCompose
participant Coordinator
participant Worker
User->>SetupScript: Run with CLP package dir
SetupScript->>PyScript: Generate .env from CLP configs
PyScript->>SetupScript: Write .env file
User->>DockerCompose: docker-compose up
DockerCompose->>Coordinator: Start with entrypoint (generate-configs.sh)
DockerCompose->>Worker: Start with entrypoint (generate-configs.sh)
Coordinator->>Coordinator: Generate configs from templates
Worker->>Worker: Generate configs, fetch coordinator version, update properties
Coordinator->>Coordinator: Start Presto server
Worker->>Worker: Start Presto server
User->>Coordinator: Query logs via Presto CLI
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 17
🔭 Outside diff range comments (2)
tools/deployment/presto-clp/demo-assets/clp-config.yml (1)
39-40: Remove trailing blank lines.Static analysis flagged excessive blank lines at the end of the file.
- - +tools/deployment/presto-clp/docker-compose.yaml (1)
39-40: Remove trailing blank lines.Static analysis flagged excessive blank lines at the end of the file.
- - +
♻️ Duplicate comments (2)
tools/deployment/presto-clp/coordinator/scripts/$ (2)
8-8: Fix typo in comment.Same spelling error as in the other script.
-# Substitute environemnt variables in config template +# Substitute environment variables in config template
8-11: Security risk: Avoid shell evaluation for template processing.Same security issue as in the other generate-configs.sh script. The shell evaluation approach poses risks if template files contain malicious code.
Use
envsubstfor safer environment variable substitution as suggested in the other script review.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (18)
tools/deployment/presto-clp/.env(1 hunks)tools/deployment/presto-clp/README.md(1 hunks)tools/deployment/presto-clp/coordinator/config-template/clp.properties(1 hunks)tools/deployment/presto-clp/coordinator/config-template/config.properties(1 hunks)tools/deployment/presto-clp/coordinator/config-template/jvm.config(1 hunks)tools/deployment/presto-clp/coordinator/config-template/log.properties(1 hunks)tools/deployment/presto-clp/coordinator/config-template/metadata-filter.json(1 hunks)tools/deployment/presto-clp/coordinator/config-template/node.properties(1 hunks)tools/deployment/presto-clp/coordinator/scripts/$(1 hunks)tools/deployment/presto-clp/coordinator/scripts/generate-configs.sh(1 hunks)tools/deployment/presto-clp/demo-assets/clp-config.yml(1 hunks)tools/deployment/presto-clp/demo-assets/init.sh(1 hunks)tools/deployment/presto-clp/docker-compose.yaml(1 hunks)tools/deployment/presto-clp/worker/config-template/clp.properties(1 hunks)tools/deployment/presto-clp/worker/config-template/config.properties(1 hunks)tools/deployment/presto-clp/worker/config-template/node.properties(1 hunks)tools/deployment/presto-clp/worker/config-template/velox.properties(1 hunks)tools/deployment/presto-clp/worker/scripts/generate-configs.sh(1 hunks)
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: quinntaylormitchell
PR: y-scope/clp#968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
tools/deployment/presto-clp/worker/config-template/clp.properties (2)
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: quinntaylormitchell
PR: #961
File: docs/src/dev-guide/design-clp-structured/single-file-archive-format.md:216-219
Timestamp: 2025-06-18T14:35:20.485Z
Learning: In clp-s documentation, technical abbreviations like "MPT" (Merged Parse Tree) should be defined at first use to improve reader clarity and comprehension.
tools/deployment/presto-clp/.env (2)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
tools/deployment/presto-clp/coordinator/config-template/clp.properties (1)
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
tools/deployment/presto-clp/coordinator/scripts/$ (6)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in components/clp-package-utils/clp_package_utils/scripts/, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate set -eu and set -o pipefail commands rather than combining them into set -euo pipefail, to ensure uniform script structure across all platform build scripts.
Learnt from: kirkrodrigues
PR: #881
File: components/core/tools/scripts/lib_install/macos/install-all.sh:11-12
Timestamp: 2025-05-06T10:07:04.654Z
Learning: In CLP installation scripts, temporary directories with downloaded files should not be automatically cleaned up on failure (e.g., with EXIT traps) to preserve artifacts for debugging purposes.
tools/deployment/presto-clp/coordinator/scripts/generate-configs.sh (5)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in components/clp-package-utils/clp_package_utils/scripts/, note that it's preferred to keep error handling simple without adding extra complexity.
tools/deployment/presto-clp/demo-assets/clp-config.yml (3)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
tools/deployment/presto-clp/worker/scripts/generate-configs.sh (7)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: haiqi96
PR: #594
File: components/package-template/src/sbin/del-archives.sh:6-9
Timestamp: 2024-11-15T16:28:08.644Z
Learning: For wrapper scripts in the components/package-template/src/sbin/ directory, keep them simple and avoid adding additional validation code.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in components/clp-package-utils/clp_package_utils/scripts/, note that it's preferred to keep error handling simple without adding extra complexity.
tools/deployment/presto-clp/demo-assets/init.sh (11)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across different platforms (e.g., using separate update and install commands like apk update && apk add, apt update && apt install, yum update && yum install) rather than platform-specific optimizations, to ensure uniform script structure and readability.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in components/clp-package-utils/clp_package_utils/scripts/, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate set -eu and set -o pipefail commands rather than combining them into set -euo pipefail, to ensure uniform script structure across all platform build scripts.
tools/deployment/presto-clp/README.md (4)
Learnt from: quinntaylormitchell
PR: #968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: quinntaylormitchell
PR: #1069
File: docs/src/user-guide/quick-start/clp-json.md:135-152
Timestamp: 2025-07-05T03:38:16.779Z
Learning: The CLP project team has decided to refrain from using include directives in their documentation at present, preferring to maintain duplicated content rather than using shared includes or partials for de-duplication.
Learnt from: quinntaylormitchell
PR: #961
File: docs/src/dev-guide/design-clp-structured/single-file-archive-format.md:216-219
Timestamp: 2025-06-18T14:35:20.485Z
Learning: In clp-s documentation, technical abbreviations like "MPT" (Merged Parse Tree) should be defined at first use to improve reader clarity and comprehension.
🪛 dotenv-linter (3.3.0)
tools/deployment/presto-clp/.env
[warning] 2-2: [QuoteCharacter] The value has quote characters (', ")
[warning] 3-3: [QuoteCharacter] The value has quote characters (', ")
[warning] 6-6: [QuoteCharacter] The value has quote characters (', ")
[warning] 7-7: [QuoteCharacter] The value has quote characters (', ")
[warning] 7-7: [UnorderedKey] The PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL key should go before the PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_PROVIDERTYPE key
[warning] 8-8: [QuoteCharacter] The value has quote characters (', ")
[warning] 8-8: [UnorderedKey] The PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_NAME key should go before the PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL key
[warning] 9-9: [QuoteCharacter] The value has quote characters (', ")
[warning] 9-9: [UnorderedKey] The PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_USER key should go before the PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_PROVIDERTYPE key
[warning] 10-10: [QuoteCharacter] The value has quote characters (', ")
[warning] 10-10: [UnorderedKey] The PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD key should go before the PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL key
[warning] 11-11: [QuoteCharacter] The value has quote characters (', ")
[warning] 12-12: [QuoteCharacter] The value has quote characters (', ")
[warning] 15-15: [QuoteCharacter] The value has quote characters (', ")
[warning] 16-16: [QuoteCharacter] The value has quote characters (', ")
[warning] 19-19: [QuoteCharacter] The value has quote characters (', ")
[warning] 20-20: [QuoteCharacter] The value has quote characters (', ")
[warning] 20-20: [UnorderedKey] The PRESTO_COORDINATOR_CONFIG_JVMCONFIG_G1HEAPREGIONSIZE key should go before the PRESTO_COORDINATOR_CONFIG_JVMCONFIG_MAXHEAPSIZE key
[warning] 23-23: [QuoteCharacter] The value has quote characters (', ")
[warning] 26-26: [QuoteCharacter] The value has quote characters (', ")
[warning] 29-29: [QuoteCharacter] The value has quote characters (', ")
[warning] 32-32: [QuoteCharacter] The value has quote characters (', ")
🪛 Shellcheck (0.10.0)
tools/deployment/presto-clp/coordinator/scripts/generate-configs.sh
[info] 10-10: Double quote to prevent globbing and word splitting.
(SC2086)
tools/deployment/presto-clp/worker/scripts/generate-configs.sh
[info] 10-10: Double quote to prevent globbing and word splitting.
(SC2086)
[warning] 81-81: In POSIX sh, HOSTNAME is undefined.
(SC3028)
[warning] 82-82: In POSIX sh, HOSTNAME is undefined.
(SC3028)
tools/deployment/presto-clp/demo-assets/init.sh
[warning] 6-6: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 20-20: Declare and assign separately to avoid masking return values.
(SC2155)
[info] 20-20: Double quote to prevent globbing and word splitting.
(SC2086)
[warning] 21-21: Declare and assign separately to avoid masking return values.
(SC2155)
[info] 21-21: Double quote to prevent globbing and word splitting.
(SC2086)
[info] 39-39: Double quote to prevent globbing and word splitting.
(SC2086)
🪛 YAMLlint (1.37.1)
tools/deployment/presto-clp/docker-compose.yaml
[error] 39-39: too many blank lines (1 > 0)
(empty-lines)
tools/deployment/presto-clp/demo-assets/clp-config.yml
[error] 39-39: too many blank lines (1 > 0)
(empty-lines)
🪛 markdownlint-cli2 (0.17.2)
tools/deployment/presto-clp/README.md
7-7: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
22-22: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1
(MD029, ol-prefix)
29-29: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1
(MD029, ol-prefix)
35-35: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
44-44: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
44-44: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
48-48: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
48-48: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
61-61: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
65-65: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
71-71: Multiple consecutive blank lines
Expected: 1; Actual: 2
(MD012, no-multiple-blanks)
72-72: Multiple consecutive blank lines
Expected: 1; Actual: 3
(MD012, no-multiple-blanks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: lint-check (ubuntu-latest)
🔇 Additional comments (16)
tools/deployment/presto-clp/worker/config-template/node.properties (1)
1-4: VerifyREPLACE_MEplaceholders are always substituted at container start-upIf
generate-configs.shfails, leavingnode.internal-addressornode.idasREPLACE_ME, the worker will refuse to start. Consider adding a defensive check in the script (e.g.,grep -q REPLACE_MEwithexit 1) so mis-substitution is caught early in CI rather than at runtime.tools/deployment/presto-clp/worker/config-template/clp.properties (1)
1-1: LGTM – matches Presto catalogue naming conventionstools/deployment/presto-clp/worker/config-template/velox.properties (1)
1-1: Confirm mutable config is desired inside containers
mutable-config=trueallows live reloads but also permits runtime edits that diverge from the committed template. Double-check this aligns with your security/operational requirements.tools/deployment/presto-clp/coordinator/config-template/log.properties (1)
1-1: LGTM – log level wired to env vartools/deployment/presto-clp/coordinator/config-template/clp.properties (1)
9-9: Hard-wired path may break ifetcis relocated
clp.metadata-filter-configis fixed to/opt/presto-server/etc/metadata-filter.json. If the image’slauncher-configoverridesPRESTO_HOMEor mounts configs elsewhere, Presto will fail to resolve the file. Please verify the path or parameterize it like the other fields.tools/deployment/presto-clp/worker/config-template/config.properties (1)
1-6: Ensurepresto.versionplaceholder is replaced at run timeA literal
REPLACE_MEwill cause workers to refuse to join the cluster. Confirm thatscripts/generate-configs.shalways substitutes the value before Presto starts, even on restart scenarios.tools/deployment/presto-clp/.env (1)
35-36:CLP_PACKAGE_ARCHIVESplaceholder left
CLP_PACKAGE_ARCHIVES=REPLACE_MEmust be set to a valid host path or the volume mount in docker-compose.yaml will fail.tools/deployment/presto-clp/coordinator/config-template/config.properties (1)
1-14: Well-structured Presto coordinator configuration.The configuration properly sets up the coordinator with appropriate settings for discovery, memory management, and feature flags. The environment variable placeholders follow a consistent naming convention and will be properly substituted by the generation scripts.
tools/deployment/presto-clp/demo-assets/clp-config.yml (2)
33-38: Clarify the localhost vs placeholder IP usage.The web UI services use "localhost" while other services use "${REPLACE_IP}" placeholders. This might be intentional for security (keeping web UIs local-only), but please verify this is the intended behaviour.
1-39: Comprehensive CLP service configuration.The YAML structure properly defines all necessary CLP services with appropriate connection details and configuration parameters. The placeholder pattern for IP addresses will be correctly processed by the associated scripts.
tools/deployment/presto-clp/docker-compose.yaml (4)
5-5: Verify the use of "dev" tagged images.The services use images with "dev" tags from the GitHub Container Registry. Ensure these are the intended images for the deployment and consider whether versioned tags would be more appropriate for production use.
Also applies to: 17-17
6-6: Well-designed custom entrypoint approach.The entrypoint configuration that runs config generation before starting services is a good pattern for dynamic configuration management in containerized deployments.
Also applies to: 21-21
28-28: Good use of environment variable for flexible volume mounting.Using
${CLP_PACKAGE_ARCHIVES}for the volume mount path allows flexible configuration of the data directory location.
1-39: Well-structured Docker Compose deployment.The configuration properly orchestrates a Presto cluster with appropriate service dependencies, networking, and volume management. The integration with configuration templates and generation scripts provides a flexible deployment solution.
tools/deployment/presto-clp/worker/scripts/generate-configs.sh (2)
67-83: Usehostnamefor node ID –$HOSTNAMEis undefined in POSIX shBusyBox
/bin/shdoesn’t set$HOSTNAME. Use$(hostname).-if grep -q '^node.id=REPLACE_ME' "$NODE_PROPERTIES_FILE"; then - sed -i "s|^node.id=REPLACE_ME|node.id=$HOSTNAME|" "$NODE_PROPERTIES_FILE" +if grep -q '^node.id=REPLACE_ME' "$NODE_PROPERTIES_FILE"; then + sed -i "s|^node.id=REPLACE_ME|node.id=$(hostname)|" "$NODE_PROPERTIES_FILE"
30-37: Unquoted DISCOVERY_URI may break on spaces; quote variable in wgetIf the value in
config.propertiesever contains spaces,wgetreceives split tokens. Wrap it:-if response=$(wget -qO- --timeout=10 "${DISCOVERY_URI}/v1/info" 2>/dev/null); then +if response=$(wget -qO- --timeout=10 "${DISCOVERY_URI}/v1/info" 2>/dev/null); then(no functional diff needed, but keep double quotes around variable)
Likely an incorrect or invalid review comment.
tools/deployment/presto-clp/.env
Outdated
| PRESTO_COORDINATOR_HTTPPORT="8080" | ||
| PRESTO_COORDINATOR_SERVICENAME="presto-coordinator" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quoted values in .env are passed verbatim – will include the quotes
Docker Compose does not strip " in .env, so PRESTO_COORDINATOR_HTTPPORT will become "8080" (quotes included).
That breaks integer expectations in config.properties and JDBC URLs.
Strip the quotes or switch to YAML-style environment: blocks.
-PRESTO_COORDINATOR_HTTPPORT="8080"
-PRESTO_COORDINATOR_SERVICENAME="presto-coordinator"
+PRESTO_COORDINATOR_HTTPPORT=8080
+PRESTO_COORDINATOR_SERVICENAME=presto-coordinator
@@
-PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_PROVIDERTYPE="mysql"
-…
+PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_PROVIDERTYPE=mysql
+…(Apply similarly to the remaining keys.)
Also applies to: 6-33
🧰 Tools
🪛 dotenv-linter (3.3.0)
[warning] 2-2: [QuoteCharacter] The value has quote characters (', ")
[warning] 3-3: [QuoteCharacter] The value has quote characters (', ")
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/.env lines 2 to 33, the environment variable
values are enclosed in quotes, causing the quotes to be included literally in
the values. Remove the surrounding double quotes from all variable assignments
in this range to ensure the values are interpreted correctly by Docker Compose
and downstream configurations.
| -server | ||
| -Xmx${PRESTO_COORDINATOR_CONFIG_JVMCONFIG_MAXHEAPSIZE} | ||
| -XX:+UseG1GC | ||
| -XX:G1HeapRegionSize=${PRESTO_COORDINATOR_CONFIG_JVMCONFIG_G1HEAPREGIONSIZE} | ||
| -XX:+UseGCOverheadLimit | ||
| -XX:+ExplicitGCInvokesConcurrent | ||
| -XX:+HeapDumpOnOutOfMemoryError | ||
| -XX:+ExitOnOutOfMemoryError | ||
| -Djdk.attach.allowAttachSelf=true | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Add container-aware memory flags & optional G1 tuning
-Xmx hard-codes the ceiling and ignores the cgroup limit the container actually gets. Presto’s launcher (and Java ≥ 11) support:
-XX:+UseContainerSupport
-XX:MaxRAMPercentage=80
These keep the heap bounded to ~80 % of the container limit without manual tuning. Consider replacing -Xmx… or combining it with these flags.
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/coordinator/config-template/jvm.config lines 1 to
10, the current JVM options use a fixed -Xmx value which does not respect
container memory limits. To fix this, remove or complement the -Xmx flag with
container-aware flags by adding -XX:+UseContainerSupport and
-XX:MaxRAMPercentage=80. This will allow the JVM heap size to adapt dynamically
to the container's memory limit, improving resource management without manual
tuning.
| { | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Provide a minimal example or document expected schema
An empty JSON object is syntactically valid, but future maintainers may be unsure what keys are supported. A commented exemplar or pointer to docs beside this file would improve clarity without affecting runtime.
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/coordinator/config-template/metadata-filter.json
at lines 1 to 2, the JSON file is currently empty, which may confuse future
maintainers about the expected keys. Add a minimal example JSON object with
typical keys and values or include comments or a reference to documentation
explaining the expected schema to improve clarity without impacting runtime
behavior.
| node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT} | ||
| node.id=${PRESTO_COORDINATOR_SERVICENAME} | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
node.data-dir missing & static node.id risks collisions
Presto will refuse to start without node.data-dir, and using a constant value for node.id means every fresh container (or an additional coordinator) will register with the same ID, leading to “Node already exists” errors.
node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT}
-node.id=${PRESTO_COORDINATOR_SERVICENAME}
+# A unique ID per container – fall back to HOSTNAME if not supplied
+node.id=${PRESTO_COORDINATOR_NODE_ID:-${HOSTNAME}}
+# Mandatory data directory
+node.data-dir=/opt/presto-server/data📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT} | |
| node.id=${PRESTO_COORDINATOR_SERVICENAME} | |
| node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT} | |
| # A unique ID per container – fall back to HOSTNAME if not supplied | |
| node.id=${PRESTO_COORDINATOR_NODE_ID:-${HOSTNAME}} | |
| # Mandatory data directory | |
| node.data-dir=/opt/presto-server/data |
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/coordinator/config-template/node.properties at
lines 1 to 3, add the required property node.data-dir with an appropriate
environment variable or path to ensure Presto starts correctly. Also, replace
the static node.id value with a dynamic or unique identifier, such as an
environment variable or generated UUID, to prevent node ID collisions when
multiple coordinators or containers run simultaneously.
| rm -rf ${PRESTO_CONFIG_DIR}/catalog | ||
| mkdir -p ${PRESTO_CONFIG_DIR}/catalog |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider consistency with the other generate-configs.sh script.
This script uses rm -rf and mkdir -p for catalog directory management, while the other version uses rm -f. Consider standardizing the approach across both scripts.
Also fix the quoting issue:
-rm -rf ${PRESTO_CONFIG_DIR}/catalog
-mkdir -p ${PRESTO_CONFIG_DIR}/catalog
+rm -rf "${PRESTO_CONFIG_DIR}/catalog"
+mkdir -p "${PRESTO_CONFIG_DIR}/catalog"🤖 Prompt for AI Agents
In tools/deployment/presto-clp/coordinator/scripts/$ around lines 14 to 15, the
script uses `rm -rf` to remove the catalog directory, which is inconsistent with
the other generate-configs.sh script that uses `rm -f`. Change the removal
command to use `rm -f` if only files are being removed or adjust both scripts to
use the same method for directory cleanup. Additionally, fix any quoting issues
by properly quoting variables like "${PRESTO_CONFIG_DIR}/catalog" to prevent
word splitting or globbing errors.
| # Setup local docker stack for presto + clp | ||
|
|
||
| ## Install docker | ||
|
|
||
| Follow the guide here: [docker] | ||
|
|
||
| # Launch clp-package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Convert secondary titles to H2 and fix punctuation for Markdown lint-cleanliness
The document currently contains five #-level headings (Setup…, Launch…, Create Docker Cluster, Use cli: and Delete docker Cluster).
Markdown-lint (MD025/MD026) flags this as multiple H1s and trailing punctuation. Keeping a single top-level H1 improves SEO and stops most linters from failing CI.
-# Launch clp-package
+# ## Launch clp-package
…
-# Create Docker Cluster
+# ## Create Docker Cluster
…
-# Use cli:
+# ## Use CLI
…
-# Delete docker Cluster
+# ## Delete Docker ClusterAlso applies to: 35-41, 48-57, 65-69
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
7-7: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/README.md around lines 1 to 7 and also lines
35-41, 48-57, and 65-69, change all secondary titles currently using a single #
(H1) to use ## (H2) to ensure only one top-level H1 heading in the document.
Additionally, remove any trailing punctuation from these headings to comply with
markdown lint rules MD025 and MD026. This will improve SEO and prevent linter
failures in CI.
| ```sql | ||
| SELECT * FROM default LIMIT 1; | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Minor: qualify SQL fenced block for syntax highlighting
Add sql after the back-ticks.
-```sql
+```sql
SELECT * FROM default LIMIT 1;
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.17.2)</summary>
61-61: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
In tools/deployment/presto-clp/README.md around lines 61 to 63, the SQL code
block is missing the language specifier for syntax highlighting. Add "sql"
immediately after the opening triple backticks to enable proper SQL syntax
highlighting for the code block.
</details>
<!-- fingerprinting:phantom:triton:cougar -->
<!-- This is an auto-generated comment by CodeRabbit -->
| # Substitute environemnt variables in config template | ||
| find /configs -type f | while read -r f; do | ||
| ( echo "cat <<EOF"; cat $f; echo "EOF" ) | sh > "${PRESTO_CONFIG_DIR}/$(basename "$f")" | ||
| done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Template expansion executes arbitrary shell – switch to envsubst
Piping the template through sh (cat <<EOF … | sh) will execute every line, not just expand variables. If anyone edits a template and slips in rm -rf /, the worker container obeys. Use envsubst (in gettext-base) or a safe substitution tool instead.
- ( echo "cat <<EOF"; cat "$f"; echo "EOF" ) | sh > "${PRESTO_CONFIG_DIR}/$(basename "$f")"
+ envsubst < "$f" > "${PRESTO_CONFIG_DIR}/$(basename "$f")"Also quote $f to satisfy SC2086.
🧰 Tools
🪛 Shellcheck (0.10.0)
[info] 10-10: Double quote to prevent globbing and word splitting.
(SC2086)
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/worker/scripts/generate-configs.sh around lines 8
to 11, the current template expansion uses a shell execution that runs arbitrary
commands, which is unsafe. Replace the use of `cat <<EOF ... | sh` with
`envsubst` to safely substitute environment variables without executing
commands. Also, ensure to quote the variable `$f` in the `cat` command to
prevent word splitting and globbing issues.
… Python script; Split environment variable file into multiple files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 14
♻️ Duplicate comments (13)
tools/deployment/presto-clp/coordinator/config-template/metadata-filter.json (1)
1-2: Add minimal schema example or link to docsAn entirely empty JSON object gives no clue what keys are supported. Provide a commented exemplar or reference documentation to aid maintainers.
tools/deployment/presto-clp/coordinator/config-template/node.properties (1)
1-2:node.data-dirstill missing andnode.idis static
(See earlier review)Presto fails to boot without
node.data-dir, and a constantnode.idcauses collisions when multiple containers start.node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT} -node.id=${PRESTO_COORDINATOR_SERVICENAME} +# Unique per container; fall back to the host id +node.id=${PRESTO_COORDINATOR_NODE_ID:-${HOSTNAME}} +node.data-dir=/opt/presto-server/datatools/deployment/presto-clp/coordinator/config-template/jvm.config (1)
1-9: Consider container-aware memory flags instead of fixed-XmxJava 11 already honours cgroup limits when given:
-XX:+UseContainerSupport -XX:MaxRAMPercentage=80This avoids hard-coding
${…MAXHEAPSIZE}and keeps the heap proportional to the container limit.tools/deployment/presto-clp/coordinator/scripts/generate-configs.sh (4)
8-8: Typo: “environemnt” → “environment”
9-11: Replace unsafe eval-based template substitution
The pipeline that funnels templates throughshexecutes their contents, creating a command-injection vector and breaking on filenames with spaces. Preferenvsubst(orawk/sed) with proper quoting.
14-14: Quote variable inrmcommand
Unquoted expansion may misbehave on paths containing spaces.
17-17: Quote variables inmvcommand
Protects against word-splitting and globbing issues.tools/deployment/presto-clp/README.md (5)
1-8: Multiple H1 headings (MD025) – change subsequent#headings to##.
20-29: Ordered list numbering jumps (MD029) – restart numbering or use bullets.
41-44: Add language hint & blank lines around fenced block (MD031/MD040)
46-46: Heading ends with “:” and is H1 – use## Use CLIwithout trailing punctuation.
59-61: Surround SQL block with blank lines & keepsqltag (MD031)tools/deployment/presto-clp/worker/scripts/generate-configs.sh (1)
8-11: Template expansion executes arbitrary shell – switch to envsubstThis security vulnerability was previously flagged. Piping the template through
shwill execute every line, not just expand variables. Useenvsubstfor safe variable substitution instead.-# Substitute environemnt variables in config template +# Substitute environment variables in config template find /configs -type f | while read -r f; do - ( echo "cat <<EOF"; cat $f; echo "EOF" ) | sh > "${PRESTO_CONFIG_DIR}/$(basename "$f")" + envsubst < "$f" > "${PRESTO_CONFIG_DIR}/$(basename "$f")" done
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (22)
taskfiles/lint.yaml(2 hunks)tools/deployment/presto-clp/README.md(1 hunks)tools/deployment/presto-clp/coordinator-common.env(1 hunks)tools/deployment/presto-clp/coordinator.env(1 hunks)tools/deployment/presto-clp/coordinator/config-template/clp.properties(1 hunks)tools/deployment/presto-clp/coordinator/config-template/config.properties(1 hunks)tools/deployment/presto-clp/coordinator/config-template/jvm.config(1 hunks)tools/deployment/presto-clp/coordinator/config-template/log.properties(1 hunks)tools/deployment/presto-clp/coordinator/config-template/metadata-filter.json(1 hunks)tools/deployment/presto-clp/coordinator/config-template/node.properties(1 hunks)tools/deployment/presto-clp/coordinator/scripts/generate-configs.sh(1 hunks)tools/deployment/presto-clp/docker-compose.yaml(1 hunks)tools/deployment/presto-clp/scripts/.gitignore(1 hunks)tools/deployment/presto-clp/scripts/generate-user-env-vars-file.py(1 hunks)tools/deployment/presto-clp/scripts/requirements.txt(1 hunks)tools/deployment/presto-clp/scripts/set-up-config.sh(1 hunks)tools/deployment/presto-clp/worker.env(1 hunks)tools/deployment/presto-clp/worker/config-template/clp.properties(1 hunks)tools/deployment/presto-clp/worker/config-template/config.properties(1 hunks)tools/deployment/presto-clp/worker/config-template/node.properties(1 hunks)tools/deployment/presto-clp/worker/config-template/velox.properties(1 hunks)tools/deployment/presto-clp/worker/scripts/generate-configs.sh(1 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: quinntaylormitchell
PR: y-scope/clp#968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
Learnt from: haiqi96
PR: y-scope/clp#651
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:0-0
Timestamp: 2025-01-16T16:58:43.190Z
Learning: In the clp-package compression flow, path validation and error handling is performed at the scheduler level rather than in the compress.py script to maintain simplicity and avoid code duplication.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate `set -eu` and `set -o pipefail` commands rather than combining them into `set -euo pipefail`, to ensure uniform script structure across all platform build scripts.
tools/deployment/presto-clp/scripts/.gitignore (3)
Learnt from: Bill-hbrhbr
PR: #1126
File: .gitignore:5-5
Timestamp: 2025-07-25T21:29:48.947Z
Learning: In the CLP project, the .clang-format file is maintained in the yscope-dev-utils submodule and copied over to the main CLP repository, so it should be ignored in .gitignore to prevent accidental commits of the copied file and maintain the single source of truth in the submodule.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
tools/deployment/presto-clp/worker/config-template/clp.properties (1)
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
tools/deployment/presto-clp/coordinator/config-template/metadata-filter.json (4)
Learnt from: junhaoliao
PR: #596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In components/log-viewer-webui/client/src/api/query.js, the ExtractJsonResp type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/test/utils.cpp:78-84
Timestamp: 2025-05-02T23:49:30.798Z
Learning: The function trivial_new_projected_schema_tree_node_callback in the KV-IR codebase is documented in the header file to explain its purpose for testing, which addresses static analyzer warnings about unused functions.
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/test/utils.cpp:78-84
Timestamp: 2025-05-02T23:49:30.798Z
Learning: The function trivial_new_projected_schema_tree_node_callback in the KV-IR codebase is properly documented in the header file to indicate its purpose for testing, which addresses static analyzer warnings about unused functions.
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:378-402
Timestamp: 2025-05-07T16:56:35.687Z
Learning: In the CLP search component, the evaluate_wildcard_filter function should return AstEvaluationResult::Pruned when node_id_value_pairs is empty, not AstEvaluationResult::False. Empty node sets should be treated as "undetermined" rather than definitive non-matches.
tools/deployment/presto-clp/coordinator/config-template/clp.properties (1)
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
tools/deployment/presto-clp/scripts/set-up-config.sh (12)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate set -eu and set -o pipefail commands rather than combining them into set -euo pipefail, to ensure uniform script structure across all platform build scripts.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: kirkrodrigues
PR: #881
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:35-41
Timestamp: 2025-05-06T09:48:55.408Z
Learning: For installation scripts in the CLP project, prefer explicit error handling over automatic dependency resolution (like apt-get install -f) when installing packages to give users more control over their system.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in components/clp-package-utils/clp_package_utils/scripts/, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across different platforms (e.g., using separate update and install commands like apk update && apk add, apt update && apt install, yum update && yum install) rather than platform-specific optimizations, to ensure uniform script structure and readability.
Learnt from: kirkrodrigues
PR: #881
File: components/core/tools/scripts/lib_install/macos/install-all.sh:11-12
Timestamp: 2025-05-06T10:07:04.654Z
Learning: In CLP installation scripts, temporary directories with downloaded files should not be automatically cleaned up on failure (e.g., with EXIT traps) to preserve artifacts for debugging purposes.
Learnt from: quinntaylormitchell
PR: #968
File: docs/src/user-guide/quick-start/overview.md:53-54
Timestamp: 2025-06-18T20:48:48.990Z
Learning: CLP is designed to run on Linux systems where Python is typically pre-installed, so Python installation links are generally not needed in CLP documentation.
tools/deployment/presto-clp/README.md (14)
Learnt from: quinntaylormitchell
PR: #968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: quinntaylormitchell
PR: #1069
File: docs/src/user-guide/quick-start/clp-json.md:135-152
Timestamp: 2025-07-05T03:38:16.779Z
Learning: The CLP project team has decided to refrain from using include directives in their documentation at present, preferring to maintain duplicated content rather than using shared includes or partials for de-duplication.
Learnt from: quinntaylormitchell
PR: #961
File: docs/src/dev-guide/design-clp-structured/single-file-archive-format.md:216-219
Timestamp: 2025-06-18T14:35:20.485Z
Learning: In clp-s documentation, technical abbreviations like "MPT" (Merged Parse Tree) should be defined at first use to improve reader clarity and comprehension.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate set -eu and set -o pipefail commands rather than combining them into set -euo pipefail, to ensure uniform script structure across all platform build scripts.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across different platforms (e.g., using separate update and install commands like apk update && apk add, apt update && apt install, yum update && yum install) rather than platform-specific optimizations, to ensure uniform script structure and readability.
taskfiles/lint.yaml (1)
Learnt from: anlowee
PR: #925
File: taskfiles/deps/main.yaml:97-106
Timestamp: 2025-05-28T18:33:30.155Z
Learning: In the taskfiles dependency system (taskfiles/deps/main.yaml), echo commands are used to generate .cmake settings files that are consumed by the main CMake build process. These files set variables like ANTLR_RUNTIME_HEADER to point to dependency locations for use during compilation.
tools/deployment/presto-clp/worker/scripts/generate-configs.sh (8)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in components/clp-package-utils/clp_package_utils/scripts/, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate set -eu and set -o pipefail commands rather than combining them into set -euo pipefail, to ensure uniform script structure across all platform build scripts.
Learnt from: haiqi96
PR: #594
File: components/package-template/src/sbin/del-archives.sh:6-9
Timestamp: 2024-11-15T16:28:08.644Z
Learning: For wrapper scripts in the components/package-template/src/sbin/ directory, keep them simple and avoid adding additional validation code.
tools/deployment/presto-clp/scripts/generate-user-env-vars-file.py (2)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
tools/deployment/presto-clp/coordinator/scripts/generate-configs.sh (13)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in components/clp-package-utils/clp_package_utils/scripts/, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: haiqi96
PR: #594
File: components/package-template/src/sbin/del-archives.sh:6-9
Timestamp: 2024-11-15T16:28:08.644Z
Learning: For wrapper scripts in the components/package-template/src/sbin/ directory, keep them simple and avoid adding additional validation code.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate set -eu and set -o pipefail commands rather than combining them into set -euo pipefail, to ensure uniform script structure across all platform build scripts.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.
Learnt from: kirkrodrigues
PR: #881
File: components/core/tools/scripts/lib_install/macos/install-all.sh:11-12
Timestamp: 2025-05-06T10:07:04.654Z
Learning: In CLP installation scripts, temporary directories with downloaded files should not be automatically cleaned up on failure (e.g., with EXIT traps) to preserve artifacts for debugging purposes.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like apt-get update && apt-get install or apk update && apk add), preserve this pattern for uniform script structure.
🪛 Shellcheck (0.10.0)
tools/deployment/presto-clp/scripts/set-up-config.sh
[info] 20-20: Not following: ./.venv/bin/activate was not specified as input (see shellcheck -x).
(SC1091)
tools/deployment/presto-clp/worker/scripts/generate-configs.sh
[info] 10-10: Double quote to prevent globbing and word splitting.
(SC2086)
[warning] 81-81: In POSIX sh, HOSTNAME is undefined.
(SC3028)
[warning] 82-82: In POSIX sh, HOSTNAME is undefined.
(SC3028)
tools/deployment/presto-clp/coordinator/scripts/generate-configs.sh
[info] 10-10: Double quote to prevent globbing and word splitting.
(SC2086)
🪛 markdownlint-cli2 (0.17.2)
tools/deployment/presto-clp/README.md
7-7: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
20-20: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1
(MD029, ol-prefix)
27-27: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1
(MD029, ol-prefix)
33-33: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
42-42: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
42-42: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
46-46: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
46-46: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
59-59: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
63-63: Multiple top-level headings in the same document
(MD025, single-title, single-h1)
69-69: Multiple consecutive blank lines
Expected: 1; Actual: 2
(MD012, no-multiple-blanks)
70-70: Multiple consecutive blank lines
Expected: 1; Actual: 3
(MD012, no-multiple-blanks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-13, true)
🔇 Additional comments (10)
tools/deployment/presto-clp/scripts/.gitignore (1)
1-1: Pattern will not match the intended directoryBecause this
.gitignoreresides two levels below the repo root, the leading “/” anchors the pattern at the repository root, sotools/deployment/presto-clp/scripts/.venv/will still be tracked. Drop the leading slash:-/.venv/ +.venv/to correctly ignore the virtual-env folder created alongside these scripts.
⛔ Skipped due to learnings
Learnt from: Bill-hbrhbr PR: y-scope/clp#1126 File: .gitignore:5-5 Timestamp: 2025-07-25T21:29:48.947Z Learning: In the CLP project, the .clang-format file is maintained in the yscope-dev-utils submodule and copied over to the main CLP repository, so it should be ignored in .gitignore to prevent accidental commits of the copied file and maintain the single source of truth in the submodule.tools/deployment/presto-clp/worker/config-template/clp.properties (1)
1-1: LGTM – property is correct
connector.name=clpmatches the coordinator side and Presto expectations.tools/deployment/presto-clp/worker/config-template/velox.properties (1)
1-1: Confirm thatmutable-config=trueis intended for production clustersTurning on dynamic config reloading can be handy during local experiments, but it also lets any user with the right privilege change Velox settings at runtime. If this worker image is used beyond a demo, double-check that:
- The feature is really needed.
- Access is restricted (e.g., via
web-ui.authentication).Otherwise consider omitting the flag or gating it behind an env var.
taskfiles/lint.yaml (1)
106-108: YAML directory inclusion looks fine – confirm yamllint rules cover Docker-Compose syntax.Adding
tools/deploymentto the yamllint target broadens coverage, which is good. The directory now containsdocker-compose.yaml; some yamllint default rules (e.g.,truthy,reserved-words) frequently complain about Compose keywords. Double-check that your.yamllint.ymleither disables those rules for Compose files or uses ayamllintextends: dockerprofile to avoid false positives.tools/deployment/presto-clp/coordinator/config-template/config.properties (1)
4-5: Environment variables verified in coordinator.envThe placeholders
PRESTO_COORDINATOR_CONFIG_CONFIGPROPERTIES_QUERY_MAXMEMORYandPRESTO_COORDINATOR_CONFIG_CONFIGPROPERTIES_QUERY_MAXMEMORYPERNODEare defined in tools/deployment/presto-clp/coordinator.env (lines 6–7). No further action required.tools/deployment/presto-clp/docker-compose.yaml (1)
32-33: Ensure${CLP_PACKAGE_ARCHIVES}is an absolute host pathThis bind-mount maps the host path to the same path inside the container. If the variable is unset or relative,
docker composewill error at runtime. Validate or normalise it when generating.env, or mount it to a fixed target path inside the container.tools/deployment/presto-clp/worker/scripts/generate-configs.sh (1)
18-66: Well-implemented coordination logicThe retry mechanism for coordinator readiness and version detection is well-structured with appropriate error handling and busybox-compatible commands.
tools/deployment/presto-clp/scripts/generate-user-env-vars-file.py (3)
9-23: Well-structured logging configurationThe logging setup follows Python best practices with proper formatting and console handling.
79-96: Clean environment file generationThe environment variable generation is well-structured with consistent formatting and includes all necessary configuration parameters.
99-117: Well-implemented configuration value extractionThe helper function properly handles nested dictionary traversal with dot notation and includes comprehensive documentation.
| - "tools/deployment" | ||
| - "tools/scripts" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Black/Ruff may recurse through large non-code trees under tools/deployment.
Running black . from tools/deployment will walk every sub-directory (configs, templates, logs). That can add noticeable time and, in worst cases, choke on extremely large data files a user may drop under that tree. Consider narrowing the path to tools/deployment/**/*.py or adding an --extend-exclude for obvious non-code directories (e.g., config-template, scripts/generated).
🤖 Prompt for AI Agents
In taskfiles/lint.yaml around lines 776 to 777, the current lint paths include
"tools/deployment" which causes Black and Ruff to recursively process large
non-code directories, slowing down linting and potentially causing errors.
Modify the lint paths to only include Python files under "tools/deployment" by
changing the path to "tools/deployment/**/*.py" or add an --extend-exclude
option to exclude known non-code directories like "configs" or "templates" to
prevent unnecessary recursion.
| clp.metadata-provider-type=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_PROVIDERTYPE} | ||
| clp.metadata-db-url=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL} | ||
| clp.metadata-db-name=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_NAME} | ||
| clp.metadata-db-user=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_USER} | ||
| clp.metadata-db-password=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD} | ||
| clp.metadata-table-prefix=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_TABLEPREFIX} | ||
| clp.split-provider-type=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_SPLITPROVIDER} | ||
| clp.metadata-filter-config=/opt/presto-server/etc/metadata-filter.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several placeholders have no matching definitions – coordinator fails to start.
coordinator.env currently defines only METADATA_PROVIDERTYPE and SPLITPROVIDER.
The following variables are missing and will survive as literal ${…} after envsubst, causing Presto to error at startup:
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL…_DATABASE_NAME…_DATABASE_USER…_DATABASE_PASSWORD…_METADATA_TABLEPREFIX
Add them to coordinator.env (or remove the placeholders) before merging.
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/coordinator/config-template/clp.properties lines
2 to 9, several environment variable placeholders used for database
configuration do not have corresponding definitions in coordinator.env, causing
the coordinator to fail at startup. To fix this, add the missing environment
variables PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL,
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_NAME,
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_USER,
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD, and
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_TABLEPREFIX to the
coordinator.env file with appropriate values, or alternatively remove these
placeholders from clp.properties if they are not needed.
| @@ -0,0 +1 @@ | |||
| com.facebook.presto=${PRESTO_COORDINATOR_CONFIG_LOGPROPERTIES_LEVEL} | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid quoting the log level value in ${…}
Docker .env files often wrap string values in quotes ("DEBUG"). When the template engine substitutes the value here, the resulting line becomes:
com.facebook.presto="DEBUG"
The extra quotes are not a valid log level and Presto will fall back to the default (INFO). Either:
-com.facebook.presto=${PRESTO_COORDINATOR_CONFIG_LOGPROPERTIES_LEVEL}
+com.facebook.presto=${PRESTO_COORDINATOR_CONFIG_LOGPROPERTIES_LEVEL:-INFO}and remove surrounding quotes in the .env file, or strip them inside the script that performs substitution.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| com.facebook.presto=${PRESTO_COORDINATOR_CONFIG_LOGPROPERTIES_LEVEL} | |
| com.facebook.presto=${PRESTO_COORDINATOR_CONFIG_LOGPROPERTIES_LEVEL:-INFO} |
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/coordinator/config-template/log.properties at
line 1, the log level value substituted from the environment variable may
include quotes, causing invalid log level syntax. To fix this, ensure that the
substitution process strips any surrounding quotes from the environment variable
value before inserting it into the config file, or update the .env file to
provide the log level without quotes.
| clp_package_dir=$1 | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Quote positional parameter when capturing it
Keeps paths with spaces intact.
-clp_package_dir=$1
+clp_package_dir="$1"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| clp_package_dir=$1 | |
| clp_package_dir="$1" |
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/scripts/set-up-config.sh at line 13, the
positional parameter $1 is assigned to clp_package_dir without quotes, which can
break paths containing spaces. Fix this by enclosing $1 in double quotes when
assigning it to clp_package_dir to preserve any spaces in the path.
| PRESTO_WORKER_HTTPPORT="8080" | ||
|
|
||
| # node.properties | ||
| PRESTO_WORKER_CONFIG_NODEPROPERTIES_LOCATION="worker-location" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Rename variable for readability & future-proofing
Using PRESTO_WORKER_HTTPPORT without a separator makes the variable name harder to scan and deviates from the common *_HTTP_PORT pattern used elsewhere in the repo (e.g., PRESTO_COORDINATOR_HTTP_PORT). Consider renaming:
-PRESTO_WORKER_HTTPPORT="8080"
+PRESTO_WORKER_HTTP_PORT="8080"and update every reference in the generation scripts accordingly.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| PRESTO_WORKER_HTTPPORT="8080" | |
| # node.properties | |
| PRESTO_WORKER_CONFIG_NODEPROPERTIES_LOCATION="worker-location" | |
| PRESTO_WORKER_HTTP_PORT="8080" | |
| # node.properties | |
| PRESTO_WORKER_CONFIG_NODEPROPERTIES_LOCATION="worker-location" |
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/worker.env lines 1 to 4, rename the environment
variable PRESTO_WORKER_HTTPPORT to PRESTO_WORKER_HTTP_PORT to improve
readability and maintain consistency with other variables like
PRESTO_COORDINATOR_HTTP_PORT. After renaming, update all references to this
variable in the generation scripts to match the new name.
| @@ -0,0 +1,6 @@ | |||
| discovery.uri=http://${PRESTO_COORDINATOR_SERVICENAME}:${PRESTO_COORDINATOR_HTTPPORT} | |||
| presto.version=REPLACE_ME | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Prefer an env placeholder over hard-coded REPLACE_ME for Presto version.
Replacing REPLACE_ME via sed in the bootstrap script works, but using an env var keeps the template declarative and avoids sed-ordering bugs:
-presto.version=REPLACE_ME
+presto.version=${PRESTO_VERSION}The script can then simply export PRESTO_VERSION=$(curl …) and rely on envsubst.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| presto.version=REPLACE_ME | |
| presto.version=${PRESTO_VERSION} |
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/worker/config-template/config.properties at line
2, replace the hard-coded "REPLACE_ME" for presto.version with an environment
variable placeholder like "${PRESTO_VERSION}". This change allows the bootstrap
script to export PRESTO_VERSION dynamically and use envsubst for substitution,
making the template declarative and avoiding sed ordering issues.
| node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT} | ||
| node.internal-address=REPLACE_ME | ||
| node.location=${PRESTO_WORKER_CONFIG_NODEPROPERTIES_LOCATION} | ||
| node.id=REPLACE_ME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Environment variable name may be misleading for the worker template.
node.environment expands ${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT}, which is defined in the coordinator env file. That couples the worker to the coordinator implementation detail and will break if someone supplies only worker.env.
Rename to a worker-scoped variable or alias it in worker.env to avoid hidden dependency.
-node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT}
+node.environment=${PRESTO_WORKER_CONFIG_NODEPROPERTIES_ENVIRONMENT}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT} | |
| node.internal-address=REPLACE_ME | |
| node.location=${PRESTO_WORKER_CONFIG_NODEPROPERTIES_LOCATION} | |
| node.id=REPLACE_ME | |
| node.environment=${PRESTO_WORKER_CONFIG_NODEPROPERTIES_ENVIRONMENT} | |
| node.internal-address=REPLACE_ME | |
| node.location=${PRESTO_WORKER_CONFIG_NODEPROPERTIES_LOCATION} | |
| node.id=REPLACE_ME |
🤖 Prompt for AI Agents
In tools/deployment/presto-clp/worker/config-template/node.properties lines 1 to
4, the environment variable used for node.environment is referencing a
coordinator-specific variable, creating an unintended dependency. Change the
variable to a worker-specific environment variable by renaming it to something
like PRESTO_WORKER_CONFIG_NODEPROPERTIES_ENVIRONMENT or create an alias in the
worker.env file that maps to the coordinator variable to avoid coupling and
ensure the worker config is self-contained.
…don't exist. Remove placeholder properties from property files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-guide/guides-using-presto.md(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: quinntaylormitchell
PR: y-scope/clp#968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
docs/src/user-guide/guides-using-presto.md (5)
Learnt from: quinntaylormitchell
PR: #968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
Learnt from: quinntaylormitchell
PR: #1069
File: docs/src/user-guide/quick-start/clp-json.md:135-152
Timestamp: 2025-07-05T03:38:16.779Z
Learning: The CLP project team has decided to refrain from using include directives in their documentation at present, preferring to maintain duplicated content rather than using shared includes or partials for de-duplication.
Learnt from: quinntaylormitchell
PR: #961
File: docs/src/dev-guide/design-clp-structured/single-file-archive-format.md:216-219
Timestamp: 2025-06-18T14:35:20.485Z
Learning: In clp-s documentation, technical abbreviations like "MPT" (Merged Parse Tree) should be defined at first use to improve reader clarity and comprehension.
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: SharafMohamed
PR: #1033
File: components/core/config/schemas.txt:96-97
Timestamp: 2025-07-17T16:02:57.688Z
Learning: In log-surgeon, the . character is treated as a literal non-delimiter character rather than a regex wildcard that matches any character. This means patterns like .* in log-surgeon schemas match sequences of non-delimiter characters, not arbitrary characters like in standard regex.
🪛 LanguageTool
docs/src/user-guide/guides-using-presto.md
[style] ~32-~32: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...'s metadata database and archives. ### Setting up CLP Follow the [quick-start](./quic...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[uncategorized] ~77-~77: Although a hyphen is possible, it is not necessary in a compound modifier in which the first word is an adverb that ends in ‘ly’.
Context: ...wing: * A catalog name * A fully-qualified schema name * A fully-qualified t...
(HYPHENATED_LY_ADVERB_ADJECTIVE)
[uncategorized] ~78-~78: Although a hyphen is possible, it is not necessary in a compound modifier in which the first word is an adverb that ends in ‘ly’.
Context: ...A fully-qualified schema name * A fully-qualified table name Filter configs under ...
(HYPHENATED_LY_ADVERB_ADJECTIVE)
🪛 markdownlint-cli2 (0.17.2)
docs/src/user-guide/guides-using-presto.md
54-54: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: lint-check (macos-15)
- GitHub Check: build-macos (macos-15, false)
| * To use more than Presto worker, you can use the `--scale` option as follows: | ||
|
|
||
| ```bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Grammar: insert missing word in scaling instruction
- * To use more than Presto worker, you can use the `--scale` option as follows:
+ * To use more than one Presto worker, you can use the `--scale` option as follows:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| * To use more than Presto worker, you can use the `--scale` option as follows: | |
| ```bash | |
| * To use more than one Presto worker, you can use the `--scale` option as follows: |
🤖 Prompt for AI Agents
In docs/src/user-guide/guides-using-presto.md around lines 111 to 113, the
scaling instruction is missing a word for correct grammar. Insert the missing
word "one" after "more than" in the sentence to read "To use more than one
Presto worker, you can use the `--scale` option as follows:" to fix the grammar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
tools/deployment/presto-clp/scripts/generate-user-env-vars-file.py (4)
101-107: Add early return after storage type validation error.The function logs an error when the storage type is not 'fs' but continues execution instead of returning early with failure status.
Apply this fix:
if "fs" != clp_archive_output_storage_type: logger.error( "Expected CLP's archive_output.storage.type to be fs but found '%s'. Presto" " currently only supports reading archives from the fs storage type.", clp_archive_output_storage_type, ) + return False
124-130: Improve credential validation to handle whitespace.The current validation only checks if values are truthy, which could miss empty strings after whitespace stripping.
Consider more robust validation:
- database_user = _get_config_value(credentials, "database.user") - database_password = _get_config_value(credentials, "database.password") - if not database_user or not database_password: + database_user = _get_config_value(credentials, "database.user", "").strip() + database_password = _get_config_value(credentials, "database.password", "").strip() + if not database_user or not database_password:
152-158: Enhance KeyError handling with clearer context.The current error handling doesn't clearly indicate which specific key is missing from the file.
Make the error message more informative:
except KeyError as e: logger.error( - "Missing required key '%s' in '%s'", - e, + "Missing required key %s in '%s'", + str(e), coordinator_common_env_file_path, )
10-23: Consider simplifying logging configuration.The current manual logging setup could be more maintainable using a structured approach.
Consider using basicConfig for simpler setup:
-# Set up console logging -logging_console_handler = logging.StreamHandler() -logging_formatter = logging.Formatter( - "%(asctime)s.%(msecs)03d %(levelname)s [%(module)s] %(message)s", datefmt="%Y-%m-%dT%H:%M:%S" -) -logging_console_handler.setFormatter(logging_formatter) - -# Set up root logger -root_logger = logging.getLogger() -root_logger.setLevel(logging.INFO) -root_logger.addHandler(logging_console_handler) - -# Create logger -logger = logging.getLogger(__name__) +# Set up logging +logging.basicConfig( + level=logging.INFO, + format="%(asctime)s.%(msecs)03d %(levelname)s [%(module)s] %(message)s", + datefmt="%Y-%m-%dT%H:%M:%S" +) +logger = logging.getLogger(__name__)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
docs/src/user-guide/guides-using-presto.md(1 hunks)tools/deployment/presto-clp/scripts/generate-user-env-vars-file.py(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: quinntaylormitchell
PR: y-scope/clp#968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
tools/deployment/presto-clp/scripts/generate-user-env-vars-file.py (7)
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within components/core/tools/scripts/lib_install/, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like readonly declarations.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/native/del_archives.py:104-110
Timestamp: 2024-11-15T16:21:52.122Z
Learning: In clp_package_utils/scripts/native/del_archives.py, when deleting archives, the archive variable retrieved from the database is controlled and is always a single string without path components. Therefore, it's acceptable to skip additional validation checks for directory traversal in this context.
Learnt from: haiqi96
PR: #634
File: components/job-orchestration/job_orchestration/executor/compress/fs_compression_task.py:0-0
Timestamp: 2024-12-16T21:49:06.483Z
Learning: In fs_compression_task.py, when refactoring the archive processing loop in Python, the src_archive_file.unlink() operation should remain outside of the process_archive function.
Learnt from: haiqi96
PR: #594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in components/clp-package-utils/clp_package_utils/scripts/, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: jackluo923
PR: #1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in install-prebuilt-packages.sh, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like --strip-components=1 to handle potential tarball layout changes.
Learnt from: kirkrodrigues
PR: #881
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:35-41
Timestamp: 2025-05-06T09:48:55.408Z
Learning: For installation scripts in the CLP project, prefer explicit error handling over automatic dependency resolution (like apt-get install -f) when installing packages to give users more control over their system.
docs/src/user-guide/guides-using-presto.md (5)
Learnt from: quinntaylormitchell
PR: #968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
Learnt from: quinntaylormitchell
PR: #1069
File: docs/src/user-guide/quick-start/clp-json.md:135-152
Timestamp: 2025-07-05T03:38:16.779Z
Learning: The CLP project team has decided to refrain from using include directives in their documentation at present, preferring to maintain duplicated content rather than using shared includes or partials for de-duplication.
Learnt from: quinntaylormitchell
PR: #961
File: docs/src/dev-guide/design-clp-structured/single-file-archive-format.md:216-219
Timestamp: 2025-06-18T14:35:20.485Z
Learning: In clp-s documentation, technical abbreviations like "MPT" (Merged Parse Tree) should be defined at first use to improve reader clarity and comprehension.
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: SharafMohamed
PR: #1033
File: components/core/config/schemas.txt:96-97
Timestamp: 2025-07-17T16:02:57.688Z
Learning: In log-surgeon, the . character is treated as a literal non-delimiter character rather than a regex wildcard that matches any character. This means patterns like .* in log-surgeon schemas match sequences of non-delimiter characters, not arbitrary characters like in standard regex.
🪛 LanguageTool
docs/src/user-guide/guides-using-presto.md
[style] ~32-~32: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...'s metadata database and archives. ### Setting up CLP Follow the [quick-start](./quic...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[typographical] ~58-~58: It appears that a comma is missing.
Context: ...e/metadata-filter.json`. * For each dataset you want to query, add a filter config ...
(DURING_THAT_TIME_COMMA)
[style] ~75-~75: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...aset>` with the name of the dataset you want to query. (If you didn't specify a ...
(REP_WANT_TO_VB)
🪛 markdownlint-cli2 (0.17.2)
docs/src/user-guide/guides-using-presto.md
74-74: Trailing spaces
Expected: 0 or 2; Actual: 6
(MD009, no-trailing-spaces)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: lint-check (macos-15)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-15, false)
🔇 Additional comments (5)
tools/deployment/presto-clp/scripts/generate-user-env-vars-file.py (2)
26-56: LGTM! Clean main function structure.The main function follows a clear structure with proper argument parsing, error handling, and file operations. The return values are consistent and the logic flow is straightforward.
163-181: LGTM! Well-implemented config value retrieval.The function correctly handles dot notation key traversal with proper type checking and graceful fallback to default values.
docs/src/user-guide/guides-using-presto.md (3)
1-24: LGTM! Clear introduction and comprehensive requirements.The introduction effectively explains Presto's role with CLP and the requirements section provides specific version information that users need.
105-155: LGTM! Comprehensive querying guide and clear limitations.The querying section provides practical examples with SQL commands, and the limitations section appropriately sets user expectations about current constraints.
58-58: Consider adding comma for clarity.The sentence structure could benefit from a comma for better readability.
- * For each dataset you want to query, add a filter config of the form: + * For each dataset you want to query, add a filter config of the form:Actually, upon review, the current comma placement is correct. The static analysis suggestion appears to be a false positive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
docs/src/user-guide/guides-using-presto.md (2)
87-93: Add the missing word “one” in the scaling instructionThe sentence is still missing the determiner after more than.
-* To use more than Presto worker, you can use the `--scale` option as follows: +* To use more than one Presto worker, you can use the `--scale` option as follows:
34-36: Vary sentence openings to improve flowTwo adjacent sentences begin with Follow / sample, which triggers the style checker. A quick tweak:
-Follow the [quick-start](./quick-start/index.md) guide to set up CLP and compress your logs. A -sample dataset that works well with Presto is [postgresql]. +Consult the [quick-start](./quick-start/index.md) guide to set up CLP and compress your logs. +One sample dataset that works well with Presto is [postgresql].
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-guide/guides-using-presto.md(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: quinntaylormitchell
PR: y-scope/clp#968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
docs/src/user-guide/guides-using-presto.md (4)
Learnt from: quinntaylormitchell
PR: #968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
Learnt from: quinntaylormitchell
PR: #1069
File: docs/src/user-guide/quick-start/clp-json.md:135-152
Timestamp: 2025-07-05T03:38:16.779Z
Learning: The CLP project team has decided to refrain from using include directives in their documentation at present, preferring to maintain duplicated content rather than using shared includes or partials for de-duplication.
Learnt from: quinntaylormitchell
PR: #961
File: docs/src/dev-guide/design-clp-structured/single-file-archive-format.md:216-219
Timestamp: 2025-06-18T14:35:20.485Z
Learning: In clp-s documentation, technical abbreviations like "MPT" (Merged Parse Tree) should be defined at first use to improve reader clarity and comprehension.
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
🪛 LanguageTool
docs/src/user-guide/guides-using-presto.md
[style] ~32-~32: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...'s metadata database and archives. ### Setting up CLP Follow the [quick-start](./quic...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[typographical] ~58-~58: It appears that a comma is missing.
Context: ...e/metadata-filter.json`. * For each dataset you want to query, add a filter config ...
(DURING_THAT_TIME_COMMA)
[style] ~75-~75: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...aset>` with the name of the dataset you want to query. (If you didn't specify a ...
(REP_WANT_TO_VB)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
| All kv-pairs in each log event can be queried directly using dot-notation. For example, if your logs | ||
| contain the field `foo.bar`, you can query it using: | ||
| ```sql | ||
| SELECT foo.bar FROM default LIMIT 1; | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Define the abbreviation “kv-pairs” for clarity
Readers who are new to CLP may not immediately recognise the shorthand.
-All kv-pairs in each log event can be queried directly using dot-notation.
+All key-value (KV) pairs in each log event can be queried directly using dot notation.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| All kv-pairs in each log event can be queried directly using dot-notation. For example, if your logs | |
| contain the field `foo.bar`, you can query it using: | |
| ```sql | |
| SELECT foo.bar FROM default LIMIT 1; | |
| ``` | |
| All key-value (KV) pairs in each log event can be queried directly using dot notation. For example, if your logs | |
| contain the field `foo.bar`, you can query it using: | |
🤖 Prompt for AI Agents
In docs/src/user-guide/guides-using-presto.md around lines 129 to 134, the term
"kv-pairs" is used without defining the abbreviation. Update the text to define
"kv-pairs" as "key-value pairs" the first time it appears to improve clarity for
readers unfamiliar with the shorthand.
junhaoliao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deferring to @kirkrodrigues 's approval
| echo "Generating environment variables file for user-configured properties..." | ||
| python3 "${script_dir}/generate-user-env-vars-file.py" \ | ||
| --clp-package-dir "${clp_package_dir}" \ | ||
| --output-file "${script_dir}/../.env" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the .env file should be added to https://github.com/y-scope/clp/blob/main/tools/deployment/presto-clp/.gitignore
@coderabbitai create an issue to track
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if .gitignore exists at tools/deployment/presto-clp/.gitignore
if [ -f "tools/deployment/presto-clp/.gitignore" ]; then
echo "=== Existing .gitignore file found ==="
cat tools/deployment/presto-clp/.gitignore
else
echo "=== No .gitignore file exists at tools/deployment/presto-clp/.gitignore ==="
fi
echo ""
echo "=== Creating GitHub issue ==="
# Create the issue
gh issue create \
--title "Add .env file to .gitignore in tools/deployment/presto-clp" \
--body "The generated \`.env\` file in \`tools/deployment/presto-clp/\` should be added to a \`.gitignore\` file to prevent it from being committed to the repository.
**Context:**
The \`set-up-config.sh\` script generates a \`.env\` file containing environment variables at \`tools/deployment/presto-clp/.env\`. This file may contain sensitive information (database credentials, etc.) and should not be tracked by Git.
**Requested by:** @junhaoliao
**Related PR:** https://github.com/y-scope/clp/pull/1132
**Comment:** https://github.com/y-scope/clp/pull/1132#discussion_r2433423044" \
--assignee junhaoliao
Length of output: 149
@junhaoliao, I've created issue #1422 to track adding the .env file to .gitignore in the tools/deployment/presto-clp/ directory. The issue has been assigned to you and includes all the relevant context and backlinks to this PR and comment.
🧠 Learnings used
Learnt from: quinntaylormitchell
PR: y-scope/clp#1170
File: tools/deployment/presto-clp/scripts/generate-metadata-filter-file.py:0-0
Timestamp: 2025-09-04T12:26:54.788Z
Learning: The `set-up-config.sh` script in tools/deployment/presto-clp/scripts/ ensures that the output directory for metadata-filter.json already exists before calling generate-metadata-filter-file.py, so directory creation in the Python script is not strictly necessary.
| volumes: | ||
| - "./worker/config-template:/configs:ro" | ||
| - "./worker/scripts:/scripts:ro" | ||
| - "${CLP_ARCHIVES_DIR}:${CLP_ARCHIVES_DIR}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Presto Docker Compose project and the CLP Package Docker Compose project (introduced in #1178) use different conventions for file mounting, which causes Presto queries to return zero rows.
- Presto project: It mounts the host's
/path/to/var/data/archivesdirectory (or/path/to/var/data/staged-archiveswhen using S3 output) into the container at the same absolute path as on the host. - CLP Package project: It always mounts the host's
/path/to/var/data/archivesdirectory to/var/data/archivesin the container, and/path/to/var/data/staged-archivesto/var/data/staged-archives.
Before #1178, when the components were orchestrated via Python subprocess calls to the docker CLI, they followed the same mapping convention as the current Presto project.
With the clp-s storage engine, when archives are compressed through the Package project, the metadata DB's clp-datasets table stores the archive_storage_directory field using container paths (e.g., /var/data/archives) instead of the corresponding host paths (e.g., /path/to/var/data/archives). This mismatch prevents the Presto coordinator from locating the archives, as they are not mounted under /var/data/archives in the Presto container.
@coderabbitai create an issues to track the fix for this inconsistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Create a GitHub issue to track the file mounting inconsistency
gh issue create \
--repo y-scope/clp \
--title "Fix file mounting inconsistency between Presto and CLP Package Docker Compose projects" \
--body "## Problem
The Presto Docker Compose project and the CLP Package Docker Compose project use different conventions for file mounting, which causes Presto queries to return zero rows.
### Current Behavior
- **Presto project (PR #1132):** Mounts the host's \`/path/to/var/data/archives\` directory (or \`/path/to/var/data/staged-archives\` when using S3 output) into the container **at the same absolute path** as on the host.
- **CLP Package project (PR #1178):** Always mounts the host's \`/path/to/var/data/archives\` directory to \`/var/data/archives\` in the container, and \`/path/to/var/data/staged-archives\` to \`/var/data/staged-archives\`.
### Root Cause
With the \`clp-s\` storage engine, when archives are compressed through the Package project, the metadata DB's \`clp-datasets\` table stores the \`archive_storage_directory\` field using container paths (e.g., \`/var/data/archives\`) instead of the corresponding host paths (e.g., \`/path/to/var/data/archives\`). This mismatch prevents the Presto coordinator from locating the archives, as they are not mounted under \`/var/data/archives\` in the Presto container.
### Impact
Presto queries return zero rows because the coordinator cannot locate the archives due to path mismatches.
### Historical Context
Before PR #1178, when the components were orchestrated via Python subprocess calls to the \`docker\` CLI, they followed the same mapping convention as the current Presto project.
## References
- PR #1132 (Presto Docker Compose setup)
- PR #1178 (CLP Package Docker Compose project)
- Original comment: https://github.com/y-scope/clp/pull/1132#discussion_r2462660309" \
--assignee junhaoliao
Length of output: 42
@junhaoliao, I've created issue #1496 to track the file mounting inconsistency between the Presto and CLP Package Docker Compose projects. The issue has been assigned to you and includes all the details you provided about the path mismatch problem and its impact on Presto queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should set system-memory-gb to be about 90% of available machine memory of the deployment according to https://prestodb.io/docs/current/presto_cpp/properties.html#system-memory-gb
@coderabbitai file an issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Create a GitHub issue for the system-memory-gb configuration
gh issue create \
--title "Configure system-memory-gb for Presto worker deployment" \
--body "## Description
The Presto worker configuration should include the \`system-memory-gb\` property set to approximately 90% of the available machine memory according to the [Presto documentation](https://prestodb.io/docs/current/presto_cpp/properties.html#system-memory-gb).
Currently, the worker configuration template at \`tools/deployment/presto-clp/worker/config-template/config.properties\` does not include this setting, which may lead to suboptimal memory usage.
## Suggested Action
Add the \`system-memory-gb\` property to the worker configuration template, either:
- As a hardcoded value with documentation explaining how to adjust it
- As an environment variable placeholder for flexible configuration
## References
- PR: https://github.com/y-scope/clp/pull/1132
- Comment: https://github.com/y-scope/clp/pull/1132#discussion_r2462809174
- Documentation: https://prestodb.io/docs/current/presto_cpp/properties.html#system-memory-gb
Requested by: @junhaoliao" \
--assignee junhaoliao
Length of output: 42
@junhaoliao, I've created issue #1503 to track the system-memory-gb configuration requirement for the Presto worker deployment. The issue includes the documentation reference and suggests adding this property to the worker configuration template with a value set to approximately 90% of available memory.
| interval: "10s" | ||
| retries: 30 | ||
|
|
||
| presto-worker: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should define a healthcheck for the worker as well; otherwise, users who run docker compose up --detach --wait would observe the services are started successfully, but the worker service may silently fail (since we --detach the output from the console), which may cause confusions to users about the status of the cluster
@coderabbitai file an issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Create GitHub issue for adding healthcheck to Presto worker service
gh issue create \
--repo "y-scope/clp" \
--title "Add healthcheck to Presto worker service in docker-compose.yaml" \
--body "## Description
Currently, the Presto coordinator service in the docker-compose.yaml configuration has a healthcheck defined, but the worker service does not. This creates a potential issue where users running \`docker compose up --detach --wait\` may see services reported as started successfully, while the worker service could be silently failing.
Without a healthcheck on the worker, Docker Compose cannot accurately determine if the worker service is healthy and ready, which may cause confusion about the actual status of the Presto cluster.
## Context
Raised by @junhaoliao during review of PR #1132.
- PR: https://github.com/y-scope/clp/pull/1132
- Review comment: https://github.com/y-scope/clp/pull/1132#discussion_r2462811962
## Expected Behaviour
The worker service should have an appropriate healthcheck defined so that Docker Compose can correctly report the service status when using the \`--wait\` flag." \
--assignee "junhaoliao"
Length of output: 42
@junhaoliao, I've filed issue #1504 to track adding a healthcheck to the Presto worker service. The issue has been assigned to you.
Description
This PR creates a docker-compose project that can quickly set up a docker stack with Presto coordinator and Presstimmo worker for CLP connector E2E demo.
Checklist
breaking change.
Validation performed
Did E2E testing with processed MongoDB 256MB dataset.
Summary by CodeRabbit
New Features
Documentation
Chores