
Conversation

@junhaoliao
Member

@junhaoliao junhaoliao commented Aug 8, 2025

Description

  • Docker Compose implementation: replace ad-hoc docker run orchestration with a controller-driven Compose workflow that provisions env/config, writes a .env, and brings services up/down with docker compose.
    • Service dependencies: Compose files now encode inter-service health checks, startup ordering, and graceful stop windows for DB, queue, Redis, Mongo, schedulers, workers, reducer, UI, and garbage collector.
    • Volume management: host paths for logs/data/archives/streams/AWS config are mounted consistently across compose projects, with root-aware chown handling.
    • Health checks: critical services use active health probes (cli client / TCP checks) to surface readiness before dependent jobs run.
  • New controller architecture: introduce BaseController and DockerComposeController to encapsulate provisioning, environment generation, and orchestrator-specific lifecycle commands while persisting shared container config for the UI.
    • Environment-based configuration: .env synthesis covers credentials, storage paths, concurrency, logging levels, AWS auth, and service endpoints.
    • Component provisioning helpers: each CLP component now has a dedicated _set_up_env_for_* method that validates directories/configs, pre-creates paths, and resolves hostnames for Compose.
    • Configuration transformation: CLPConfig.transform_for_container() (and per-subconfig overrides) translate host settings into container-friendly service names and absolute paths; generate_docker_compose_container_config() returns the transformed copy used by Compose.
  • Simplified start/stop scripts: start_clp.py/stop_clp.py now validate config, ensure directories exist, and delegate lifecycle management to DockerComposeController (a minimal sketch of this flow follows this list).
  • Packaging updates: Compose manifests (docker-compose.base.yaml, docker-compose.yaml) ship with the package, taskfile.yaml copies them into releases, and new docs describe the design and prerequisites.
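
For orientation, here is a minimal, hedged sketch of the controller-driven flow described above: synthesize a .env that docker compose reads automatically, then bring the project up or down. The function names, the project name, and the exact environment keys are illustrative assumptions; this is not the actual DockerComposeController implementation.

```python
import subprocess
from pathlib import Path


def write_env_file(env: dict, package_dir: Path) -> Path:
    """Write a .env file that `docker compose` picks up from its working directory."""
    env_path = package_dir / ".env"
    env_path.write_text("".join(f"{key}={value}\n" for key, value in env.items()))
    return env_path


def compose_up(package_dir: Path, project_name: str) -> None:
    """Bring the full Compose project up in detached mode."""
    subprocess.run(
        ["docker", "compose", "--project-name", project_name, "up", "--detach"],
        cwd=package_dir,
        check=True,
    )


def compose_down(package_dir: Path, project_name: str) -> None:
    """Tear the full Compose project down (the new scripts always act on the whole project)."""
    subprocess.run(
        ["docker", "compose", "--project-name", project_name, "down"],
        cwd=package_dir,
        check=True,
    )


if __name__ == "__main__":
    package_dir = Path(".")
    # Example keys only; the real .env also covers storage paths, concurrency,
    # logging levels, AWS auth, and service endpoints as described above.
    write_env_file({"CLP_DB_USER": "clp-user", "CLP_DB_PASS": "changeme"}, package_dir)
    compose_up(package_dir, "clp-package")
```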

Breaking Changes

  • sbin/start-clp.sh and sbin/stop-clp.sh no longer accept component names or partial targets. The new controller always spins up or tears down the full Compose project, so existing automation that started individual services (e.g., start-clp.sh query_worker) will fail.
    • The --force option has also been removed from sbin/stop-clp.sh; previously, omitting it prevented the removal of killed containers.
  • Support for multi-node deployments is temporarily removed. Configuration fields for controller hosts remain, but the package scripts now assume single-host Docker Compose orchestration, and the docs instruct users accordingly.
  • Data and log directories are no longer auto-namespaced by hostname when loading clp-config.yml. Existing installations relying on <base>/<hostname> directories must update paths or pre-create per-host directories themselves.
  • Docker availability is now checked with docker --version instead of command -v docker.
  • Minimum runtime requirements have increased: Docker Compose plugin 2.28.1+ is required to start the project (see the version-check sketch after this list).
  • Python API removals/changes:
    • clp_py_utils.clp_config.get_components_for_target() and related component group/target constants were deleted; external tooling importing these symbols will break.
    • CLPConfig.get_runnable_components() replaced by CLPConfig.get_deployment_type(), altering how callers determine which services to launch.
    • clp_package_utils.general.check_dependencies() replaced by check_docker_dependencies(...) (different signature/semantics); scripts calling the old helper need to migrate.
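
To make the two runtime requirements above concrete, here is a hedged preflight sketch: it checks availability with docker --version and compares the Compose plugin version against 2.28.1. It is only an illustration of the documented checks, not the package's actual check_docker_dependencies() code.

```python
import re
import subprocess

MIN_COMPOSE_VERSION = (2, 28, 1)  # Minimum Docker Compose plugin version stated above.


def docker_available() -> bool:
    """Check Docker availability the way the package now does: by running `docker --version`."""
    try:
        subprocess.run(["docker", "--version"], check=True, capture_output=True)
        return True
    except (OSError, subprocess.CalledProcessError):
        return False


def compose_version_ok() -> bool:
    """Compare the installed Compose plugin version against the documented minimum."""
    result = subprocess.run(
        ["docker", "compose", "version", "--short"],
        check=True,
        capture_output=True,
        text=True,
    )
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", result.stdout.strip())
    if match is None:
        return False
    return tuple(int(part) for part in match.groups()) >= MIN_COMPOSE_VERSION


if __name__ == "__main__":
    if not docker_available():
        raise SystemExit("docker is not available")
    if not compose_version_ok():
        raise SystemExit("Docker Compose plugin >= 2.28.1 is required")
```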

Added

  • components/clp-package-utils/clp_package_utils/controller.py: controller abstraction, Compose-specific implementation, shared config dump, .env writer, worker auto-sizing, instance ID persistence.
  • Compose descriptors under tools/deployment/package/: base stack and full-stack YAML defining services, volumes, health checks, and deployment-type variants.
  • Documentation: docs/src/dev-docs/design-docker-compose.md, index link, build doc note, quick-start Docker version requirements.
  • Config transformations: container path constants, deployment type enum, transform_for_container() overrides for DB, queue, Redis, Mongo, schedulers, reducers, FS/S3 storage, and log input configurations.
  • Dependency validation: check_docker_dependencies() verifies docker availability and Compose project state before start/stop.
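
A hedged sketch of the "Compose project state" half of that check, in the spirit of the _is_docker_compose_project_running() helper mentioned in the review summary further down. The use of docker compose ls and the "clp-package" project name are assumptions, not the package's actual approach.

```python
import subprocess


def is_compose_project_running(project_name: str) -> bool:
    """Return True if a Compose project with the given name is listed by `docker compose ls`."""
    result = subprocess.run(
        ["docker", "compose", "ls", "--all"],
        check=True,
        capture_output=True,
        text=True,
    )
    # Skip the header row; the first column of each remaining row is the project name.
    for line in result.stdout.splitlines()[1:]:
        columns = line.split()
        if columns and columns[0] == project_name:
            return True
    return False


if __name__ == "__main__":
    # "clp-package" is only an illustrative project name.
    if is_compose_project_running("clp-package"):
        raise SystemExit("Refusing to start: a Compose project with this name already exists.")
```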

Other Removed

  • Legacy per-component orchestration logic (start_clp.py helpers, Docker mount builders, manual docker exec/wait loops, get_components_for_target API) in favour of Compose-driven lifecycle management.

Documentation

  • Added Docker Compose design doc detailing controller pattern, env generation, deployment types, and service topology.
  • Quick-start guide updated with explicit Docker package version minimums and guidance that Compose orchestrates the stack.
  • Multi-node guide replaced with a warning about the temporary removal of multi-node support, leaving only the SeaweedFS appendix.
  • Build doc cross-links to Compose design and mentions Compose assets inclusion in packages.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

(To be updated with more details once all items have been revalidated against the latest batch of refactoring changes.) Previously:

  1. All scripts in sbin:
    1. admin-tools/archive-manager.sh
    2. admin-tools/dataset-manager.sh
    3. compress.sh
    4. decompress.sh
    5. search.sh
    6. start-clp.sh
    7. stop-clp.sh
  2. And the webui:
    1. Viewing ingestion stats / jobs
    2. Submitting and viewing queries
    3. Extracting stream files and opening them in the log viewer
  3. With the following config / use-case variations:
    1. storage_engine: clp / clp-s
    2. with clp-s:
      1. s3 input
      2. s3 archive output
      3. s3 stream output
      4. non-default dataset
    3. garbage collector enablement

Summary by CodeRabbit

  • New Features

    • Controller-driven Docker Compose start/stop with instance-ID management and automated per-component environment (.env) generation; packaged docker-compose manifests for single-node deployment.
  • Bug Fixes / Reliability

    • Stronger Docker dependency/state checks, improved ownership handling, hostname→IP resolution, and safer startup/stop flows.
  • Documentation

    • New deployment orchestration design doc, updated quick-start Docker requirements, and multi-node guide replaced with a Docker Compose notice.
  • Chores

    • Container-friendly config transforms, packaging of deployment assets, and relocation of default config constant.

@coderabbitai
Contributor

coderabbitai bot commented Aug 8, 2025

Walkthrough

Adds Docker Compose orchestration and packaging: new BaseController and DockerComposeController to prepare per-service environments, emit a .env, and start/stop compose; container-aware config transforms and defaults; moves a default config constant; adds compose manifests, packaging steps, and documentation.

Changes

  • Controller & orchestration (components/clp-package-utils/clp_package_utils/controller.py):
    New BaseController (ABC) with per-component env setup helpers and utilities; new DockerComposeController implementing start/stop, worker-count logic, env aggregation, .env emission, instance-id and chown helpers, and hostname→IP resolution.
  • Docker dependency & validation (components/clp-package-utils/clp_package_utils/general.py):
    Added Docker-related exception hierarchy and check_docker_dependencies() (with project-running checks); _is_docker_compose_project_running(); centralized data/log directory validators; generate_docker_compose_container_config() added; dump_shared_container_config() signature changed.
  • Config model & container transforms (components/clp-py-utils/clp_py_utils/clp_config.py):
    Added DeploymentType enum, container path constants (CONTAINER_*), CLP default path constants, DEFAULT_PORT classvars and transform_for_container() across components, dump_to_primitive_dict(), and get_deployment_type(); defaults adjusted for data/logs and worker config. (A hedged sketch of the transform idea follows this list.)
  • Start/Stop refactor (components/clp-package-utils/clp_package_utils/scripts/start_clp.py, components/clp-package-utils/clp_package_utils/scripts/stop_clp.py):
    Start/stop scripts refactored to use DockerComposeController and get_or_create_instance_id(); per-target CLI logic removed; start delegates to controller.start() and stop to controller.stop().
  • CLP_DEFAULT_CONFIG relocation, imports updated (components/clp-package-utils/clp_package_utils/scripts/*.py, components/clp-package-utils/clp_package_utils/scripts/native/*.py):
    CLP_DEFAULT_CONFIG_FILE_RELATIVE_PATH moved from clp_package_utils.general to clp_py_utils.clp_config; import sites updated across scripts.
  • Docker Compose manifests (tools/deployment/package/docker-compose.base.yaml, tools/deployment/package/docker-compose.yaml):
    New base and package compose files: centralized service defaults, healthchecks, anchors and full set of CLP services plus package-level query services; package compose adds top-level name and include.
  • Packaging update (taskfile.yaml):
    Include tools/deployment/package/**/* in package sources and add rsync steps to copy deployment package into build OUTPUT_DIR.
  • Documentation (docs/src/dev-docs/design-deployment-orchestration.md, docs/src/dev-docs/building-package.md, docs/src/dev-docs/index.md, docs/src/user-docs/guides-multi-node.md, docs/src/user-docs/quick-start/index.md):
    Added deployment orchestration design doc; building guide references docker-compose and package inclusion; index toctree updated; multi-node guide simplified to note Docker Compose usage; quick-start adds Docker/containerd minimum versions.
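
To make the transform_for_container() idea above concrete, here is a toy sketch. Only the method and helper names echo the summary; the class, fields, and constants are made up for illustration and are not CLP's actual config model.

```python
import copy
from pathlib import Path

# Hypothetical constants standing in for the CONTAINER_* path constants mentioned above.
CONTAINER_DATA_DIR = Path("/var/data")


class DatabaseConfig:
    """Toy stand-in for a per-component config; not CLP's actual class."""

    def __init__(self, host: str, port: int, data_dir: Path):
        self.host = host
        self.port = port
        self.data_dir = data_dir

    def transform_for_container(self) -> None:
        # Inside the Compose network, components address each other by service name
        # rather than by the host's hostname or IP.
        self.host = "database"
        # Host paths are replaced with fixed in-container mount points.
        self.data_dir = CONTAINER_DATA_DIR / "db"


def generate_container_config(config: DatabaseConfig) -> DatabaseConfig:
    # Work on a copy so the host-side config stays untouched, mirroring the
    # described behaviour of generate_docker_compose_container_config().
    container_config = copy.deepcopy(config)
    container_config.transform_for_container()
    return container_config


if __name__ == "__main__":
    host_cfg = DatabaseConfig("my-host.example", 3306, Path("/home/user/clp/var/data/db"))
    container_cfg = generate_container_config(host_cfg)
    print(container_cfg.host, container_cfg.data_dir)  # database /var/data/db
```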

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant User
    participant StartScript as start_clp.py
    participant Controller as DockerComposeController
    participant EnvPrep as BaseController
    participant Compose as Docker Compose

    User->>StartScript: run
    StartScript->>StartScript: load config & instance-id
    StartScript->>Controller: instantiate
    StartScript->>Controller: controller.start()
    Controller->>EnvPrep: _set_up_env()
    EnvPrep->>EnvPrep: build per-component envs, update webui settings, resolve hosts
    EnvPrep-->>Controller: aggregated env (.env)
    Controller->>Compose: docker compose up (with .env, project name)
    Compose->>Compose: start services (depends_on / healthchecks)
    Controller-->>StartScript: return status
sequenceDiagram
    autonumber
    participant User
    participant StopScript as stop_clp.py
    participant Controller as DockerComposeController
    participant Compose as Docker Compose

    User->>StopScript: run
    StopScript->>StopScript: load config & instance-id
    StopScript->>Controller: instantiate
    StopScript->>Controller: controller.stop()
    Controller->>Compose: docker compose down (project)
    Compose->>Compose: stop & remove services
    Controller-->>StopScript: return status

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Description Check (✅ Passed): Check skipped - CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The pull request title "feat(deployment)!: Migrate package orchestration to Docker Compose (resolves #1177); Temporarily remove support for multi-node deployments." clearly identifies the primary change: migration from ad-hoc docker run orchestration to a Docker Compose–based controller workflow. The title is specific and technical, accurately reflecting the main code changes including the new BaseController and DockerComposeController abstractions, environment synthesis, Docker Compose manifests, and refactored start/stop scripts. The breaking-change indicator (!) is appropriately applied, and the secondary point about multi-node removal is directly related to this primary migration rather than a separate concern. The title successfully communicates the key objective without vague terms or excessive noise.

@junhaoliao junhaoliao changed the title feat(deployment): Migrate package orchestration to Docker Compose (resolves #1177). DO NOT MERGE (pending many other dependency PRs) - feat(deployment): Migrate package orchestration to Docker Compose (resolves #1177). Aug 8, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/src/user-docs/guides-multi-node.md (1)

5-8: Add missing comma and date reference for breaking change clarity.

The warning text is missing a comma after "orchestration" (flagged in prior feedback), and should include a "Since" reference to indicate when this breaking change took effect.

Apply this diff:

 :::{warning}
-CLP now uses [Docker Compose][docker-compose] for orchestration and support for multi-node
-deployments has been removed temporarily.
+CLP now uses [Docker Compose][docker-compose] for orchestration, and support for multi-node
+deployments has been removed temporarily. Since: 2025‑08‑08 (PR #1178).
 :::
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed25fc3 and 127cb79.

📒 Files selected for processing (2)
  • docs/src/dev-docs/building-package.md (2 hunks)
  • docs/src/user-docs/guides-multi-node.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-25T19:26:32.436Z
Learnt from: junhaoliao
PR: y-scope/clp#1335
File: taskfiles/docker-images.yaml:15-15
Timestamp: 2025-09-25T19:26:32.436Z
Learning: In the CLP project's Taskfile, the `:package` task creates the G_PACKAGE_BUILD_DIR directory structure, so any task that depends on `:package` (like `docker-images:package`) can safely assume this directory exists without needing additional mkdir commands.

Applied to files:

  • docs/src/dev-docs/building-package.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: package-image
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: build-macos (macos-14, true)
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build-macos (macos-15, true)
🔇 Additional comments (1)
docs/src/dev-docs/building-package.md (1)

91-92: Reference definitions are correct.

The new link references align properly with usage in the note block and point to appropriate destinations. No concerns here.

Comment on lines +64 to +69
:::{note}
The package includes a `docker-compose.yaml` file that can be used to deploy CLP using [Docker
Compose][docker-compose]. If you want to manually deploy with Docker Compose instead of using the
package scripts, see the [Deployment orchestration][design-deployment-orchestration] design doc for
more information.
:::
Contributor

⚠️ Potential issue | 🟠 Major

Correct filename and add manual Compose example.

Two issues in this note:

  1. Filename mismatch (critical): Line 65 references docker-compose.yaml, but the package artifacts ship as docker-compose.yml. This will mislead users trying to locate the file.

  2. Missing user example: Past review cycles suggested adding a quick bash snippet showing the basic Compose commands so users don't need to navigate elsewhere to understand how to manually start/stop services.

Apply this diff to correct the filename and add the example:

 :::{note}
-The package includes a `docker-compose.yaml` file that can be used to deploy CLP using [Docker
+The package includes a `docker-compose.yml` file that can be used to deploy CLP using [Docker
 Compose][docker-compose]. If you want to manually deploy with Docker Compose instead of using the
 package scripts, see the [Deployment orchestration][design-deployment-orchestration] design doc for
 more information.
+
+Example:
+
+```bash
+# From the package root where docker-compose.yml resides
+docker compose up -d
+# Tear down when done (preserves volumes; append -v to also remove volumes)
+docker compose down
+```
 :::
🤖 Prompt for AI Agents
In docs/src/dev-docs/building-package.md around lines 64 to 69, the note
incorrectly references docker-compose.yaml (the package ships
docker-compose.yml) and is missing a quick manual Compose example; update the
filename to docker-compose.yml and insert a short bash snippet showing how to
start and stop the stack (from package root): a one-line docker compose up -d
and a docker compose down comment about preserving volumes (and that -v removes
volumes) so users can run the commands without leaving the doc.

Comment on lines +268 to +283
# NOTE: Only one of `CLP_ARCHIVE_OUTPUT_DIR_HOST` and `CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST` are
# set at a time, but since `./var/data` on the host is mounted into the container and both
# `CLP_ARCHIVE_OUTPUT_DIR_HOST` and `CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST` default to
# directories under `./var/data`, we need to use a hack to avoid having Docker create the
# unset directory on the host (as root).
#
# For example, let's say we use the following as the mount for staged archives:
# "${CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST:-empty}:/var/data/staged-archives". If
# `CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST` is unset, Docker will create
# `/var/data/staged-archives` in the container, but it will also create
# `./var/data/staged-archives` on the host as root; this is because `/var/data` in the
# container is bind mounted to `./var/data` on the host.
#
# The hack to avoid this is if one of `CLP_ARCHIVE_OUTPUT_DIR_HOST` or
# `CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST` is unset, we set the target for the corresponding mount
# to a path that's not under `/var/data` in the container.
Member Author

@coderabbitai please create an issue to track removal of this hack

an idea is to accept a var/tmp directory config from the user, and we generate all temporary runtime data in that directory instead of the current var/data directory. then we don't have to add the /var/data bind mount below

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create GitHub issue to track removal of Docker Compose mount hack

gh issue create \
  --title "Remove Docker Compose mount hack for archive/staged-archive directories" \
  --body "## Context

Currently, the Docker Compose configuration contains a hack to avoid Docker creating directories as root on the host when mounting archive and staged-archive directories.

The issue arises because only one of \`CLP_ARCHIVE_OUTPUT_DIR_HOST\` and \`CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST\` is set at a time, but since \`./var/data\` on the host is mounted into the container and both variables default to directories under \`./var/data\`, Docker would create the unset directory on the host as root if we naively mount with a default path under \`/var/data\`.

## Current Hack

When one of the variables is unset, we set the target for the corresponding mount to a path that's not under \`/var/data\` in the container to prevent Docker from creating it on the host.

See: https://github.com/y-scope/clp/blob/docker-compose/tools/deployment/package/docker-compose.base.yaml#L267-L283

## Proposed Solution

Accept a \`var/tmp\` directory config from the user and generate all temporary runtime data in that directory instead of the current \`var/data\` directory. This would eliminate the need for the \`/var/data\` bind mount and remove the hack.

## References

- PR #1178
- Comment: https://github.com/y-scope/clp/pull/1178#discussion_r2455414543
- Requested by: @junhaoliao" \
  --assignee junhaoliao

Length of output: 42


@junhaoliao I've created issue #1478 to track the removal of this Docker Compose mount hack. The issue describes the current workaround and includes your proposed solution of using a separate var/tmp directory for temporary runtime data.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (10)
tools/deployment/package/docker-compose.base.yaml (10)

88-107: Add explicit user to db-table-creator after removing global default.

Once the global user is removed from x-service-defaults, add an explicit user to this CLP-owned service:

 db-table-creator:
   <<: *service_defaults
   hostname: "db_table_creator"
+  user: "${CLP_FIRST_PARTY_SERVICE_UID_GID:-1000:1000}"
   environment:

Based on learnings


199-214: Add explicit user to results-cache-indices-creator after removing global default.

Once the global user is removed from x-service-defaults, add an explicit user to this CLP-owned service:

 results-cache-indices-creator:
   <<: *service_defaults
   hostname: "results_cache_indices_creator"
+  user: "${CLP_FIRST_PARTY_SERVICE_UID_GID:-1000:1000}"
   environment:

Based on learnings


4-9: 🔴 CRITICAL: Remove global user override; vendor services will fail to start with forced host UID/GID.

The global user setting in x-service-defaults (line 9) forces all services—including vendor images (MySQL, RabbitMQ, Redis, MongoDB)—to run as the host UID/GID. This breaks their packaged entrypoints and causes permission/ownership errors. Based on learnings, only CLP-owned workloads should specify a user; vendor services must run with their image defaults.

Remove line 9 entirely. Then add explicit user: "${CLP_FIRST_PARTY_SERVICE_UID_GID:-1000:1000}" only to CLP-owned services: db-table-creator, results-cache-indices-creator, compression-scheduler, compression-worker, webui, and garbage-collector.

 x-service-defaults: &service_defaults
   image: "${CLP_PACKAGE_CONTAINER_IMAGE_REF:-clp-package}"
   logging:
     driver: "local"
   stop_grace_period: "60s"
-  user: "${CLP_FIRST_PARTY_SERVICE_UID_GID:-1000:1000}"

52-87: 🔴 CRITICAL: Remove user override from database service; vendor image requires packaged user.

Line 56 forces the MySQL/MariaDB container to run as the host UID/GID, which breaks the vendor entrypoint. MySQL requires its packaged user (usually mysql). Remove this line entirely.

 database:
   <<: *service_defaults
   image: "${CLP_DB_IMAGE:-mariadb:10-jammy}"
   hostname: "database"
-  user: "${CLP_THIRD_PARTY_SERVICE_UID_GID:-1000:1000}"
   environment:

Additionally, the stop_grace_period: "60s" inherited from x-service-defaults is appropriate for safe MySQL shutdown.


132-167: 🔴 CRITICAL: Remove user override from redis service; Redis requires its packaged user.

Line 136 forces Redis to run as the host UID/GID, which breaks the vendor entrypoint. Remove this line:

 redis:
   <<: *service_defaults
   image: "redis:7.2.4"
   hostname: "redis"
-  user: "${CLP_THIRD_PARTY_SERVICE_UID_GID:-1000:1000}"
   ports:

350-379: Add explicit user to garbage-collector after removing global default.

Once the global user is removed from x-service-defaults, add an explicit user to this CLP-owned service:

 garbage-collector:
   <<: *service_defaults
   hostname: "garbage_collector"
+  user: "${CLP_FIRST_PARTY_SERVICE_UID_GID:-1000:1000}"
   stop_grace_period: "10s"
   deploy:

Based on learnings


215-247: Add explicit user to compression-scheduler after removing global default.

Once the global user is removed from x-service-defaults, add an explicit user to this CLP-owned service:

 compression-scheduler:
   <<: *service_defaults
   hostname: "compression_scheduler"
+  user: "${CLP_FIRST_PARTY_SERVICE_UID_GID:-1000:1000}"
   stop_grace_period: "300s"
   environment:

Based on learnings


168-198: 🔴 CRITICAL: Remove user override and fix trailing comma in command array.

Two issues:

  1. Line 172: Remove the vendor user override so MongoDB runs with its packaged user:
 results-cache:
   <<: *service_defaults
   image: "mongo:7.0.1"
   hostname: "results_cache"
-  user: "${CLP_THIRD_PARTY_SERVICE_UID_GID:-1000:1000}"
   ports:
  2. Line 191: Fix the trailing comma in the command array (YAML syntax error):
   command: [
     "--config", "/etc/mongo/mongod.conf",
-    "--bind_ip", "0.0.0.0",
+    "--bind_ip", "0.0.0.0"
   ]

304-349: Add explicit user to webui after removing global default.

Once the global user is removed from x-service-defaults, add an explicit user to this CLP-owned service:

 webui:
   <<: *service_defaults
   hostname: "webui"
+  user: "${CLP_FIRST_PARTY_SERVICE_UID_GID:-1000:1000}"
   environment:

Based on learnings


248-303: 🔴 CRITICAL: Add explicit user to compression-worker and fix archive mount paths.

Two issues:

  1. Add explicit user after removing global default:
 compression-worker:
   <<: *service_defaults
   hostname: "compression_worker"
+  user: "${CLP_FIRST_PARTY_SERVICE_UID_GID:-1000:1000}"
   environment:
  2. Fix archive mount path logic (lines 284–287). The conditional expansions ${CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST:+/tmp} and ${CLP_ARCHIVE_OUTPUT_DIR_HOST:+/tmp} incorrectly inject /tmp into the container target paths. When CLP_ARCHIVE_OUTPUT_DIR_HOST is set to /mnt/data, the mount becomes /mnt/data:/tmp/var/data/archives instead of /mnt/data:/var/data/archives. Use fixed container paths:
       - "${CLP_ARCHIVE_OUTPUT_DIR_HOST:-empty}\
-        :${CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST:+/tmp}/var/data/archives"
+        :/var/data/archives"
       - "${CLP_STAGED_ARCHIVE_OUTPUT_DIR_HOST:-empty}\
-        :${CLP_ARCHIVE_OUTPUT_DIR_HOST:+/tmp}/var/data/staged-archives"
+        :/var/data/staged-archives"

Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 127cb79 and 649276f.

📒 Files selected for processing (1)
  • tools/deployment/package/docker-compose.base.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-25T05:13:13.298Z
Learnt from: junhaoliao
PR: y-scope/clp#1178
File: components/clp-package-utils/clp_package_utils/controller.py:217-223
Timestamp: 2025-09-25T05:13:13.298Z
Learning: The compression scheduler service in CLP runs with CLP_UID_GID (current user's UID:GID) rather than CLP_SERVICE_CONTAINER_UID_GID (999:999), unlike infrastructure services such as database, queue, redis, and results cache which run with the service container UID:GID.

Applied to files:

  • tools/deployment/package/docker-compose.base.yaml
📚 Learning: 2025-01-16T16:58:43.190Z
Learnt from: haiqi96
PR: y-scope/clp#651
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:0-0
Timestamp: 2025-01-16T16:58:43.190Z
Learning: In the clp-package compression flow, path validation and error handling is performed at the scheduler level rather than in the compress.py script to maintain simplicity and avoid code duplication.

Applied to files:

  • tools/deployment/package/docker-compose.base.yaml
📚 Learning: 2025-10-13T03:32:19.293Z
Learnt from: junhaoliao
PR: y-scope/clp#1414
File: tools/docker-images/clp-package/Dockerfile:20-24
Timestamp: 2025-10-13T03:32:19.293Z
Learning: In the clp repository's Dockerfiles (e.g., tools/docker-images/clp-package/Dockerfile), ENV directives should be split into separate lines for readability rather than consolidated to reduce layer count. This is especially true for PATH modifications, as agreed upon in PR #1166. Later ENV settings may depend on earlier ones (e.g., referencing CLP_HOME).

Applied to files:

  • tools/deployment/package/docker-compose.base.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)

Comment on lines +108 to +131
  queue:
    <<: *service_defaults
    image: "rabbitmq:3.9.8"
    hostname: "queue"
    user: "${CLP_THIRD_PARTY_SERVICE_UID_GID:-1000:1000}"
    environment:
      RABBITMQ_DEFAULT_PASS: "${CLP_QUEUE_PASS:?Please set a value.}"
      RABBITMQ_DEFAULT_USER: "${CLP_QUEUE_USER:?Please set a value.}"
      RABBITMQ_LOGS: "/var/log/rabbitmq/rabbitmq.log"
    ports:
      - host_ip: "${CLP_QUEUE_HOST:-127.0.0.1}"
        published: "${CLP_QUEUE_PORT:-5672}"
        target: 5672
    volumes:
      - type: "bind"
        source: "${CLP_QUEUE_LOGS_DIR_HOST:-./var/log/queue}"
        target: "/var/log/rabbitmq"
    healthcheck:
      <<: *healthcheck_defaults
      test: [
        "CMD",
        "rabbitmq-diagnostics", "check_running"
      ]

Contributor

⚠️ Potential issue | 🔴 Critical

🔴 CRITICAL: Remove user override from queue service; RabbitMQ requires its packaged user.

Line 112 forces RabbitMQ to run as the host UID/GID, which breaks the vendor entrypoint. RabbitMQ requires its packaged user. Remove this line:

 queue:
   <<: *service_defaults
   image: "rabbitmq:3.9.8"
   hostname: "queue"
-  user: "${CLP_THIRD_PARTY_SERVICE_UID_GID:-1000:1000}"
   ports:

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In tools/deployment/package/docker-compose.base.yaml around lines 108 to 131,
remove the user override line in the queue service (the "user:
${CLP_THIRD_PARTY_SERVICE_UID_GID:-1000:1000}" entry) because RabbitMQ requires
its packaged user and the override breaks the vendor entrypoint; delete that
line and ensure no other uid/gid overrides are applied to this service (leave
environment, ports, volumes and healthcheck as-is).

Member

@kirkrodrigues kirkrodrigues left a comment

For the PR title, how about:

feat(deployment)!: Migrate package orchestration to Docker Compose (resolves #1177); Temporarily remove support for multi-node deployments.

@junhaoliao junhaoliao changed the title feat(deployment)!: Migrate package orchestration to Docker Compose (resolves #1177). feat(deployment)!: Migrate package orchestration to Docker Compose (resolves #1177); Temporarily remove support for multi-node deployments. Oct 23, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
docs/src/dev-docs/design-deployment-orchestration.md (3)

16-16: Fix missing conjunction; verify this was previously addressed.

The sentence still reads "lists the services their functions" and is missing "and". Past review comments indicate this was addressed in commit ed25fc3, but the error remains. Please verify whether this fix was reverted or incompletely applied.

-[Table 1](#table-1) below lists the services their functions, while [Table 2](#table-2) lists the
+[Table 1](#table-1) below lists the services and their functions, while [Table 2](#table-2) lists the

9-9: Add comma after parenthetical for clarity.

The phrase reads more naturally with a comma between the parenthetical and the following phrase.

-[Figure 1](#figure-1) shows the components (*services* in orchestrator terminology) in the CLP
+[Figure 1](#figure-1) shows the components (*services* in orchestrator terminology), in the CLP

153-153: Use "to" instead of "in order to" for conciseness.

  * `<Orchestrator>Controller` that implements (and/or overrides) any of the methods in
    `BaseController` (`<Orchestrator>` is a placeholder for the specific orchestrator for which the
-   class is being implemented in order to
+   class is being implemented to
    orchestrate a deployment).
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 649276f and 7139740.

📒 Files selected for processing (1)
  • docs/src/dev-docs/design-deployment-orchestration.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/dev-docs/design-deployment-orchestration.md

[uncategorized] ~9-~9: Possible missing comma found.
Context: ...in orchestrator terminology) in the CLP package as well as their dependencies. The CLP ...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~153-~153: Consider a more concise word here.
Context: ...specific derived classes must implement in order to orchestrate a deployment. * `<Orche...

(IN_ORDER_TO_PREMIUM)

🪛 markdownlint-cli2 (0.18.1)
docs/src/dev-docs/design-deployment-orchestration.md

9-9: Link fragments should be valid

(MD051, link-fragments)


16-16: Link fragments should be valid

(MD051, link-fragments)


16-16: Link fragments should be valid

(MD051, link-fragments)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: package-image
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: build-macos (macos-14, true)
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: lint-check (macos-15)


## Architecture

[Figure 1](#figure-1) shows the components (*services* in orchestrator terminology) in the CLP
Contributor

🧹 Nitpick | 🔵 Trivial

Verify link fragment syntax for cross-references.

The Markdown link syntax (e.g., [Figure 1](#figure-1)) references anchors defined with MyST syntax (e.g., (figure-1)= at line 19). Consider using MyST cross-reference syntax ({ref}) instead, which integrates more reliably with the documentation build system.

-[Figure 1](#figure-1) shows the components (*services* in orchestrator terminology), in the CLP
+{ref}`figure-1` shows the components (*services* in orchestrator terminology), in the CLP

Similarly, update line 16 to use {ref} instead of Markdown link syntax.

Also applies to: 16-16

🧰 Tools
🪛 LanguageTool

[uncategorized] ~9-~9: Possible missing comma found.
Context: ...in orchestrator terminology) in the CLP package as well as their dependencies. The CLP ...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 markdownlint-cli2 (0.18.1)

9-9: Link fragments should be valid

(MD051, link-fragments)

🤖 Prompt for AI Agents
In docs/src/dev-docs/design-deployment-orchestration.md around line 9 (and also
update line 16), replace the Markdown anchor-style links like [Figure
1](#figure-1) with MyST cross-reference syntax using the defined anchor name
(figure-1) — e.g., change the Markdown link to {ref}`figure-1`; do the same for
the link on line 16 so all cross-references use {ref}`<anchor-name>` matching
the (figure-1)= anchor at line 19.

Comment on lines +70 to +75
subgraph Databases
database
queue
redis
results_cache
end
Contributor

🧹 Nitpick | 🔵 Trivial

Consider renaming "Databases" group to reflect its broader role.

Past review comments discussed whether this grouping should be renamed to "State persistence", "Shared data stores", or similar, since it contains not only a database (MySQL) but also a message broker (RabbitMQ), cache (Redis), and results storage (MongoDB). No final decision was recorded. If there is team consensus, update the label to avoid confusion about the role of these components.

🤖 Prompt for AI Agents
In docs/src/dev-docs/design-deployment-orchestration.md around lines 70 to 75,
the diagram subgraph labeled "Databases" groups heterogeneous components (MySQL,
RabbitMQ, Redis, MongoDB); update the label to reflect their broader role—e.g.,
"State persistence", "Shared data stores", or "Storage & Messaging"—to remove
ambiguity. Pick the team-consensus name (or propose one if none) and replace the
subgraph title text accordingly, then update any nearby explanatory text to use
the new term for consistency.

@junhaoliao junhaoliao merged commit 9ff731b into y-scope:main Oct 23, 2025
28 of 32 checks passed
CLP_DB_PASS: "${CLP_DB_PASS:?Please set a value.}"
CLP_DB_USER: "${CLP_DB_USER:?Please set a value.}"
CLP_LOGGING_LEVEL: "${CLP_QUERY_SCHEDULER_LOGGING_LEVEL:-INFO}"
CLP_LOGS_DIR: "/var/log"
Member Author

The PR wrongly changes the query scheduler's logs directory base to /var/log (it should be /var/log/query_scheduler). A similar issue is found in the compression scheduler.

@coderabbitai create an issue to track the fix

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create a GitHub issue to track the incorrect logs directory configuration

gh issue create \
  --repo y-scope/clp \
  --title "Fix incorrect logs directory base for schedulers in Docker Compose" \
  --body "The PR #1178 incorrectly configures the logs directory base for query scheduler and compression scheduler.

**Issue:**
- Query scheduler's \`CLP_LOGS_DIR\` is set to \`/var/log\` but should be \`/var/log/query_scheduler\`
- Compression scheduler has a similar issue and should use \`/var/log/compression_scheduler\`

**Context:**
Reported by @junhaoliao in https://github.com/y-scope/clp/pull/1178#discussion_r2456746396

**Files affected:**
- \`tools/deployment/package/docker-compose.yaml\`
- \`tools/deployment/package/docker-compose.base.yaml\` (likely)" \
  --assignee junhaoliao

Length of output: 42


@junhaoliao, I've created issue #1483 to track the incorrect logs directory configuration for the query scheduler and compression scheduler.

The issue documents that:

  • Query scheduler's CLP_LOGS_DIR should be /var/log/query_scheduler instead of /var/log
  • Compression scheduler has the same issue and should use /var/log/compression_scheduler

The issue has been assigned to you and includes a link back to this discussion.


🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp#1152
File: components/clp-package-utils/clp_package_utils/scripts/start_clp.py:613-613
Timestamp: 2025-08-08T06:59:42.436Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/start_clp.py, generic_start_scheduler sets CLP_LOGGING_LEVEL using clp_config.query_scheduler.logging_level for both schedulers; compression scheduler should use its own logging level. Tracking via an issue created from PR #1152 discussion.

@sitaowang1998
Contributor

When running the compression workflow, I see the following logs at the very start of some compression workers:

[2025-10-28 19:54:36,496: ERROR/MainProcess] consumer: Cannot connect to amqp://clp-user:**@queue:5672//: [Errno 111] Connection refused.
Trying again in 2.00 seconds... (1/100)

[2025-10-28 19:54:38,499: ERROR/MainProcess] consumer: Cannot connect to amqp://clp-user:**@queue:5672//: [Errno 111] Connection refused.
Trying again in 4.00 seconds... (2/100)

This is caused by the lack of a dependency between the compression worker and RabbitMQ. The compression worker starts before RabbitMQ is fully set up and thus cannot connect to it. Once RabbitMQ is fully operational, everything works as normal.
@CodeRabbit Could you please open an issue about this, which also tracks the other missing dependencies in docker compose services?

@coderabbitai
Contributor

coderabbitai bot commented Oct 28, 2025


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the docker-compose files to understand current compression-worker dependencies
echo "=== docker-compose.base.yaml ==="
cat tools/deployment/package/docker-compose.base.yaml

echo -e "\n\n=== docker-compose.yaml ==="
cat tools/deployment/package/docker-compose.yaml

Length of output: 17710


🏁 Script executed:

#!/bin/bash
# Create GitHub issue tracking missing Docker Compose service dependencies
gh issue create \
  --title "Add missing service dependencies in Docker Compose configuration" \
  --body "## Description

During compression workflow execution, compression workers intermittently fail to connect to RabbitMQ at startup because they begin before RabbitMQ is fully operational. While workers eventually retry and connect successfully, the initial connection failures generate error logs and indicate missing \`depends_on\` health check dependencies in the Docker Compose configuration.

## Error Logs

\`\`\`
[2025-10-28 19:54:36,496: ERROR/MainProcess] consumer: Cannot connect to amqp://clp-user:**@queue:5672//: [Errno 111] Connection refused.
Trying again in 2.00 seconds... (1/100)

[2025-10-28 19:54:38,499: ERROR/MainProcess] consumer: Cannot connect to amqp://clp-user:**@queue:5672//: [Errno 111] Connection refused.
Trying again in 4.00 seconds... (2/100)
\`\`\`

## Root Cause

The compression worker service does not declare an explicit dependency on the RabbitMQ (\`queue\`) service with a health check condition. As a result, Docker Compose starts the worker before verifying that RabbitMQ is healthy and ready to accept connections.

## Scope

This issue tracks:
1. Adding \`depends_on\` with \`condition: service_healthy\` for compression worker → RabbitMQ
2. Reviewing and adding any other missing service dependencies across all Docker Compose services (compression-scheduler, query-scheduler, query-worker, reducer, webui, garbage-collector, etc.)

## Related

- PR: https://github.com/y-scope/clp/pull/1178
- Comment: https://github.com/y-scope/clp/pull/1178#discussion_r<will-be-generated>

Reported by: @sitaowang1998" \
  --assignee sitaowang1998

Length of output: 42


@sitaowang1998: I've opened issue #1516 to track this problem.

After examining the Docker Compose configuration, I can confirm the root cause: the compression-worker service in docker-compose.base.yaml is missing a depends_on block entirely. While the compression-scheduler correctly declares:

depends_on:
  queue:
    condition: "service_healthy"

The compression-worker has no such dependency declaration, allowing it to start before RabbitMQ is ready. The same issue affects query-worker, which also connects to the queue but doesn't wait for it to be healthy.

The issue will track adding the missing dependencies for both workers and reviewing all other services for similar gaps.

directory: SerializablePath = CLP_DEFAULT_DATA_DIRECTORY_PATH / "archives"
directory: SerializablePath = CLP_DEFAULT_ARCHIVES_DIRECTORY_PATH

def transform_for_container(self):
Member Author

The "native" scripts - particularly native/dataset_manager.py - should have been updated to use the new in-container path (/var/data) instead of relying on the previous direct host-to-container path mapping. Otherwise, the dataset manager isn't able to remove archives at the moment.

@coderabbitai create an issue to track the archive deletion bug in the dataset manager

and another one to migrate the sbin tools to Docker Compose - or at least update their Docker mappings to align with the Package Compose project

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Create issue 1: Archive deletion bug in dataset manager
gh issue create \
  --repo y-scope/clp \
  --title "Archive deletion fails in dataset manager due to path mapping mismatch" \
  --body "## Description

The native dataset manager script (\`native/dataset_manager.py\`) is currently unable to remove archives because it hasn't been updated to use the new in-container path (\`/var/data\`) introduced in the Docker Compose migration.

## Root Cause

The script still relies on the previous direct host-to-container path mapping instead of the new container path structure.

## Related

- PR: https://github.com/y-scope/clp/pull/1178
- Comment: https://github.com/y-scope/clp/pull/1178#discussion_r2476634906
- Reported by: @junhaoliao

## Expected Fix

Update \`native/dataset_manager.py\` and related native scripts to use the correct in-container paths that align with the Docker Compose deployment model." \
  --assignee junhaoliao

echo "---"

# Create issue 2: Migrate sbin tools to Docker Compose
gh issue create \
  --repo y-scope/clp \
  --title "Migrate sbin tools to Docker Compose or align their Docker mappings" \
  --body "## Description

The sbin tools need to be migrated to use Docker Compose orchestration, or at minimum, their Docker volume/path mappings should be updated to align with the Package Compose project structure.

## Context

Following the migration to Docker Compose in PR #1178, the sbin tools' container orchestration and path mappings are out of sync with the new deployment model.

## Scope

Either:
- Fully migrate sbin tools to Docker Compose orchestration, or
- Update their Docker run commands and volume mappings to align with the Package Compose project

## Related

- PR: https://github.com/y-scope/clp/pull/1178
- Comment: https://github.com/y-scope/clp/pull/1178#discussion_r2476634906
- Reported by: @junhaoliao" \
  --assignee junhaoliao

Length of output: 89


@junhaoliao, I've created both issues to track these follow-up items:

  1. Archive deletion bug: #1526 - Archive deletion fails in dataset manager due to path mapping mismatch

    • Tracks the immediate bug where native/dataset_manager.py can't remove archives due to the path mapping mismatch with the new /var/data in-container path structure.
  2. Sbin tools migration: #1527 - Migrate sbin tools to Docker Compose or align their Docker mappings

    • Tracks the broader task of either fully migrating sbin tools to Docker Compose or updating their Docker volume/path mappings to align with the Package Compose project.

Both issues are assigned to you and include backlinks to this PR and comment thread.


🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp#1178
File: components/clp-py-utils/clp_py_utils/clp_config.py:47-47
Timestamp: 2025-10-07T07:54:32.427Z
Learning: In components/clp-py-utils/clp_py_utils/clp_config.py, the CONTAINER_AWS_CONFIG_DIRECTORY constant is intentionally set to pathlib.Path("/") / ".aws" (i.e., `/.aws`) rather than a user-specific home directory. This hardcoded path is part of the container orchestration design.

Learnt from: haiqi96
PR: y-scope/clp#1144
File: components/clp-package-utils/clp_package_utils/scripts/native/dataset_manager.py:164-170
Timestamp: 2025-08-13T15:07:37.767Z
Learning: In the CLP codebase, the ArchiveOutput.get_directory() method already returns a resolved/absolute Path object because FsStorage.make_config_paths_absolute() processes the directory path during configuration loading, making additional .resolve() calls redundant.

Learnt from: haiqi96
PR: y-scope/clp#1144
File: components/clp-package-utils/clp_package_utils/scripts/native/dataset_manager.py:164-170
Timestamp: 2025-08-13T15:07:37.767Z
Learning: In the CLP codebase, the ArchiveOutput.get_directory() method already returns a resolved/absolute Path object because FsStorage.make_config_paths_absolute() processes the directory path during configuration loading, making additional .resolve() calls redundant.

Learnt from: haiqi96
PR: y-scope/clp#1144
File: components/clp-package-utils/clp_package_utils/scripts/dataset_manager.py:106-114
Timestamp: 2025-08-13T14:48:49.020Z
Learning: For the dataset manager scripts in components/clp-package-utils/clp_package_utils/scripts/, the native script (native/dataset_manager.py) is designed to only be called through the wrapper script (dataset_manager.py), so dataset validation is only performed at the wrapper level rather than duplicating it in the native script.

Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/native/del_archives.py:104-110
Timestamp: 2024-11-15T16:21:52.122Z
Learning: In `clp_package_utils/scripts/native/del_archives.py`, when deleting archives, the `archive` variable retrieved from the database is controlled and is always a single string without path components. Therefore, it's acceptable to skip additional validation checks for directory traversal in this context.

Learnt from: junhaoliao
PR: y-scope/clp#1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Learnt from: junhaoliao
PR: y-scope/clp#1178
File: components/clp-package-utils/clp_package_utils/controller.py:315-315
Timestamp: 2025-10-17T19:59:25.596Z
Learning: In components/clp-package-utils/clp_package_utils/controller.py, worker log directories (compression_worker, query_worker, reducer) created via `mkdir()` do not need `_chown_paths_if_root()` calls because directories are created with the same owner as the script caller. This differs from infrastructure service directories (database, queue, Redis, results cache) which do require explicit ownership changes.

Learnt from: junhaoliao
PR: y-scope/clp#1414
File: tools/docker-images/clp-package/Dockerfile:20-24
Timestamp: 2025-10-13T03:32:19.293Z
Learning: In the clp repository's Dockerfiles (e.g., tools/docker-images/clp-package/Dockerfile), ENV directives should be split into separate lines for readability rather than consolidated to reduce layer count. This is especially true for PATH modifications, as agreed upon in PR #1166. Later ENV settings may depend on earlier ones (e.g., referencing CLP_HOME).

Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
