Skip to content

Conversation

@hoophalab
Copy link
Contributor

@hoophalab hoophalab commented Nov 8, 2025

Description

  1. Add API server to docker-compose.*.yaml
  2. Tailor how api server fetches credentials and configurations.
  3. Add necessary configurations to controller.py
  4. Add pydantic models for API server fields in clp-config.yml

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. rebuild package
  2. start clp-json
  3. api server is accessible on localhost:3001

Summary by CodeRabbit

  • New Features

    • New API server component with CLI options for host, port and config; added to deployment manifests and packaged images.
  • Improvements

    • Environment-based credentials, configurable logs directory, combined console + rotating-file logging, and graceful shutdown handling.
    • Default API server settings and query-polling defaults added to configuration template (commented).
  • Documentation

    • Deployment diagrams and multi-host startup guide updated to include the API server.
  • Chores

    • Build pipeline updated to build and include Rust artifacts; new build/clean tasks added.

@hoophalab hoophalab requested a review from a team as a code owner November 8, 2025 04:18
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 8, 2025

Walkthrough

Adds a Rust API server binary and runtime changes: new CLI options and env-driven credentials, file-based non-blocking logging, async graceful shutdown; new API server config models in Python and Rust; build tasks and Dockerfile updated to produce and include the binary; Docker Compose services, docs, and CI task adjustments added.

Changes

Cohort / File(s) Summary
Rust Cargo changes
components/api-server/Cargo.toml
Added secrecy = "0.10.3" and expanded tracing-subscriber features from ["json","env-filter"] to ["json","env-filter","fmt","std"].
API server binary
components/api-server/src/bin/api_server.rs
New CLI args --config, --host, --port; Args updated; added read_config_and_credentials, set_up_logging, shutdown_signal; async main with tokio; CLI host/port override config; reads CLP_DB_USER, CLP_DB_PASS, CLP_LOGS_DIR; non-blocking file+stdout logging; integrated graceful shutdown.
Python config models
components/clp-py-utils/clp_py_utils/clp_config.py
Added API_SERVER_COMPONENT_NAME = "api_server"; new Pydantic QueryJobPollingConfig and ApiServer models; added api_server: ApiServer to CLPConfig.
Python controller env setup
components/clp-package-utils/clp_package_utils/controller.py
Imported API_SERVER_COMPONENT_NAME; added BaseController._set_up_env_for_api_server() to create logs dir, resolve container path, and return CLP_API_SERVER_HOST/CLP_API_SERVER_PORT; merged into DockerComposeController.set_up_env().
Rust config defaults
components/clp-rust-utils/src/clp_config/package/config.rs
Added #[serde(default)] to QueryJobPollingConfig so deserialization applies defaults.
Config template
components/package-template/src/etc/clp-config.yml
Added commented api_server block (host, port, query_job_polling, default_max_num_query_results).
Build tasks
taskfile.yaml
Added G_RUST_BUILD_DIR var; added clean-rust and rust tasks; hooked rust into package-build-deps; added clean-rust to clean.
Docker Compose (all)
tools/deployment/package/docker-compose-all.yaml
Added api-server service: uses *service_defaults, hostname api_server, env (CLP_LOGS_DIR, CLP_DB_PASS, CLP_DB_USER, RUST_LOG), port mapping ${CLP_API_SERVER_HOST:-127.0.0.1}:${CLP_API_SERVER_PORT:-3001}:3001, config & logs volumes, depends_on initializers, and command to run the API binary with --host, --port, --config.
Docker Compose (base & primary)
tools/deployment/package/docker-compose-base.yaml, tools/deployment/package/docker-compose.yaml
Added top-level api-server service entries that extend the docker-compose-all.yaml api-server definition.
Docker image
tools/docker-images/clp-package/Dockerfile
Added COPY --link --chown=${UID} ./build/rust/release/api_server bin/ to include the compiled API server binary.
CI / Actions
.github/actions/run-on-image/action.yaml, .github/workflows/clp-artifact-build.yaml
Removed npm_config_cache Docker run arg in action; added HOME=/tmp env assignment in two run_command invocations in clp-artifact-build.yaml.
Docs & guides
docs/src/dev-docs/design-deployment-orchestration.md, docs/src/user-docs/guides-multi-host.md
Added api_server to architecture diagram/service table and inserted API server startup step in multi-host guide before web UI.

Sequence Diagram(s)

sequenceDiagram
    rect `#f7fbff`
    participant Controller as Python Controller
    participant Deployer as Docker Compose
    participant API as API Server (Rust)
    participant CFG as /etc/clp-config.yml
    participant DB as Database
    end

    Controller->>Deployer: provide env (CLP_API_SERVER_HOST, CLP_API_SERVER_PORT, CLP_LOGS_DIR, CLP_DB_*)
    Deployer->>Deployer: start api-server after init services complete
    Deployer->>API: exec /opt/clp/bin/api_server --host --port --config /etc/clp-config.yml
    API->>CFG: load config file
    API->>API: apply CLI overrides (host/port)
    API->>API: read env (CLP_DB_USER, CLP_DB_PASS, CLP_LOGS_DIR)
    API->>DB: connect using env credentials
    API->>API: init logging (non-blocking file + stdout)
    API->>API: serve requests
    API-->>API: graceful shutdown on SIGTERM or Ctrl+C
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Files warranting extra attention:
    • components/api-server/src/bin/api_server.rs — CLI vs config precedence, env-based credentials handling, logging setup (Tee + non-blocking writer), and shutdown/select logic.
    • tools/deployment/package/docker-compose-all.yaml — env interpolation, port mapping defaults, depends_on conditions.
    • taskfile.yaml and tools/docker-images/clp-package/Dockerfile — verify G_RUST_BUILD_DIR and COPY paths match build outputs.
    • components/clp-package-utils/clp_package_utils/controller.py and components/clp-py-utils/clp_py_utils/clp_config.py — ensure env var names and config shapes align with the Rust server expectations.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: adding a Docker Compose service for the API server, which is the primary focus of the changeset.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eddf9dc and 49a15d0.

📒 Files selected for processing (1)
  • taskfile.yaml (5 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1575
File: components/clp-py-utils/clp_py_utils/clp_config.py:602-607
Timestamp: 2025-11-10T05:19:56.600Z
Learning: In the y-scope/clp repository, the `ApiServer` class in `components/clp-py-utils/clp_py_utils/clp_config.py` does not need a `transform_for_container()` method because no other containerized service depends on the API server - it's only accessed from the host, so no docker-network communication is expected.
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:01:31.391Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:01:31.391Z
Learning: In y-scope/clp, for the clp-rust-checks workflow (GitHub Actions) and the Taskfile target deps:lock:check-rust, we should verify Rust Cargo.lock is in sync with Cargo.toml using a non-mutating method (e.g., cargo metadata --locked / cargo check --locked) to keep CI deterministic and avoid updating dependencies during validation.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-05-06T09:46:42.639Z
Learnt from: kirkrodrigues
Repo: y-scope/clp PR: 881
File: components/core/tools/scripts/lib_install/macos/install-all.sh:15-23
Timestamp: 2025-05-06T09:46:42.639Z
Learning: For temporary solutions in installation scripts like those in `components/core/tools/scripts/lib_install/`, checksumming downloaded files is considered optional, particularly when working with trusted sources like GitHub raw content.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-07T17:41:15.655Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-08-25T00:45:05.464Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1242
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:39-41
Timestamp: 2025-08-25T00:45:05.464Z
Learning: Task v3.44.1 causes issues in the clp-ffi-js project (issue #110), so CLP should avoid upgrading to v3.44.1 and stick with v3.44.0 for now.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-27T04:05:01.560Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1507
File: taskfiles/deps/lock.yaml:66-76
Timestamp: 2025-10-27T04:05:01.560Z
Learning: In the CLP project, the `uv` CLI tool is documented as a prerequisite in docs/src/dev-docs/building-package.md and is not installed via a task dependency like the rust toolchain. UV is expected to be available in the environment rather than being provisioned by the taskfile system.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:14:12.225Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1466
File: .github/workflows/clp-rust-checks.yaml:14-15
Timestamp: 2025-10-22T21:14:12.225Z
Learning: Repository y-scope/clp: In GitHub Actions workflows (e.g., .github/workflows/clp-rust-checks.yaml), YAML anchors/aliases are acceptable and preferred to avoid duplication; if actionlint flags an alias node (e.g., on push.paths) as an error, treat it as a tool limitation and do not require inlining unless the team asks to silence the warning.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-09-25T19:26:32.436Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1335
File: taskfiles/docker-images.yaml:15-15
Timestamp: 2025-09-25T19:26:32.436Z
Learning: In the CLP project's Taskfile, the `:package` task creates the G_PACKAGE_BUILD_DIR directory structure, so any task that depends on `:package` (like `docker-images:package`) can safely assume this directory exists without needing additional mkdir commands.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-25T21:29:48.947Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1126
File: .gitignore:5-5
Timestamp: 2025-07-25T21:29:48.947Z
Learning: In the CLP project, the .clang-format file is maintained in the yscope-dev-utils submodule and copied over to the main CLP repository, so it should be ignored in .gitignore to prevent accidental commits of the copied file and maintain the single source of truth in the submodule.

Applied to files:

  • taskfile.yaml
🔇 Additional comments (3)
taskfile.yaml (3)

31-31: Approve Rust build directory variable addition.

The addition of G_RUST_BUILD_DIR variable is properly placed and follows the build paths naming convention.


100-102: Approve clean-rust task placement and implementation.

The new clean-rust task correctly removes Rust artifacts and is properly integrated into the main clean task sequence (line 76).


235-235: Approve Rust task integration into package-build-deps.

The addition of rust as a dependency in package-build-deps is correctly positioned and ensures the Rust API server binary is built before package assembly.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as outdated.

junhaoliao

This comment was marked as resolved.

query_worker["query-worker"]
reducer["reducer"]
api_server["api-server"]
garbage_collector["garbage-collector"]
Copy link
Contributor Author

@hoophalab hoophalab Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is to make the nodes in Management & UI section follow the order api_server, garbage_collector, webui. api_server and garbage_collector are management nodes.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (5)
taskfile.yaml (1)

200-207: Add generates field to rust task for proper Taskfile dependency tracking.

Verification confirms the rust task (lines 200-207) is missing the generates field, which is used throughout taskfile.yaml for dependency caching. The workspace builds a single binary: api_server. The suggested fix with generates: ["{{.G_RUST_BUILD_DIR}}/release/api_server"] is accurate and follows the Rust cargo build output conventions.

Apply the provided diff to declare the generated binary output.

tools/docker-images/clp-package/Dockerfile (1)

44-44: Consider alphabetizing COPY statements for maintainability.

As noted in the past review comment, since all COPY statements use --link, they can be reordered without affecting the build. Alphabetizing the binary COPY statements would improve maintainability and make it easier to locate specific entries.

Consider this alphabetical ordering:

 COPY --link --chown=${UID} ./components/package-template/src/ .
+COPY --link --chown=${UID} ./build/rust/release/api_server bin/
 COPY --link --chown=${UID} ./build/core/clg bin/
 COPY --link --chown=${UID} ./build/core/clo bin/
 COPY --link --chown=${UID} ./build/core/clp bin/
 COPY --link --chown=${UID} ./build/core/clp-s bin/
 COPY --link --chown=${UID} ./build/core/indexer bin/
 COPY --link --chown=${UID} ./build/core/log-converter bin/
 COPY --link --chown=${UID} ./build/core/reducer-server bin/
 COPY --link --chown=${UID} ./build/deps/cpp/mariadb-connector-cpp-install/lib/*/libmariadbcpp.so* \
     lib/
 COPY --link --chown=${UID} ./build/spider/spider-build/src/spider/spider_scheduler bin/
 COPY --link --chown=${UID} ./build/spider/spider-build/src/spider/spider_worker bin/
 COPY --link --chown=${UID} ./build/nodejs-22/bin/node bin/node-22
 COPY --link --chown=${UID} ./build/python-libs/ lib/python3/site-packages/
-COPY --link --chown=${UID} ./build/rust/release/api_server bin/
 COPY --link --chown=${UID} ./build/webui/ var/www/webui/
tools/deployment/package/docker-compose-all.yaml (1)

497-497: Make RUST_LOG configurable and follow Rust conventions.

The RUST_LOG environment variable is hardcoded to "DEBUG" which produces very verbose logs and doesn't follow Rust conventions. Per the env_logger documentation, log level names should be lowercase, and the level should be configurable for different environments.

Apply this diff:

       CLP_LOGS_DIR: "/var/log/api_server"
       CLP_DB_PASS: "${CLP_DB_PASS:?Please set a value.}"
       CLP_DB_USER: "${CLP_DB_USER:?Please set a value.}"
-      RUST_LOG: "DEBUG"
+      RUST_LOG: "${CLP_API_SERVER_RUST_LOG:-info}"

Then update the controller to set the default in .env if needed.

components/clp-package-utils/clp_package_utils/controller.py (1)

772-775: Consider reordering API server setup for consistency.

The API server environment setup is called after webui setup, but the deployment documentation (docs/src/user-docs/guides-multi-host.md) and architecture diagram show the API server starting before webui. While the order may not affect functionality since all environment variables are merged, consider moving line 775 before line 772 for consistency with the documented startup sequence.

Apply this diff to improve consistency:

         env_vars |= self._set_up_env_for_compression_worker(num_workers)
         env_vars |= self._set_up_env_for_query_worker(num_workers)
         env_vars |= self._set_up_env_for_reducer(num_workers)
+        env_vars |= self._set_up_env_for_api_server()
         env_vars |= self._set_up_env_for_webui(container_clp_config)
         env_vars |= self._set_up_env_for_mcp_server()
         env_vars |= self._set_up_env_for_garbage_collector()
-        env_vars |= self._set_up_env_for_api_server()
components/clp-py-utils/clp_py_utils/clp_config.py (1)

596-607: Consider adding validation and a DEFAULT_PORT constant for consistency.

The ApiServer model follows most component patterns but could be improved:

  1. Validation: default_max_num_query_results should use PositiveInt instead of plain int to ensure it's greater than zero.
  2. Consistency: Add a DEFAULT_PORT: ClassVar[int] = 3001 like other components (Database, Queue, Redis, etc.) for clarity and maintainability.

Apply this diff:

 class ApiServer(BaseModel):
+    DEFAULT_PORT: ClassVar[int] = 3001
+
     host: DomainStr = "localhost"
-    port: Port = 3001
+    port: Port = DEFAULT_PORT
     query_job_polling: QueryJobPollingConfig = QueryJobPollingConfig()
-    default_max_num_query_results: int = 1000
+    default_max_num_query_results: PositiveInt = 1000
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa1d134 and 4bf7ad1.

📒 Files selected for processing (11)
  • .github/actions/run-on-image/action.yaml (0 hunks)
  • .github/workflows/clp-artifact-build.yaml (1 hunks)
  • components/api-server/src/bin/api_server.rs (2 hunks)
  • components/clp-package-utils/clp_package_utils/controller.py (3 hunks)
  • components/clp-py-utils/clp_py_utils/clp_config.py (3 hunks)
  • components/package-template/src/etc/clp-config.yml (1 hunks)
  • docs/src/dev-docs/design-deployment-orchestration.md (4 hunks)
  • docs/src/user-docs/guides-multi-host.md (1 hunks)
  • taskfile.yaml (5 hunks)
  • tools/deployment/package/docker-compose-all.yaml (1 hunks)
  • tools/docker-images/clp-package/Dockerfile (1 hunks)
💤 Files with no reviewable changes (1)
  • .github/actions/run-on-image/action.yaml
🧰 Additional context used
🧠 Learnings (25)
📓 Common learnings
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1575
File: components/clp-py-utils/clp_py_utils/clp_config.py:602-607
Timestamp: 2025-11-10T05:19:56.600Z
Learning: In the y-scope/clp repository, the `ApiServer` class in `components/clp-py-utils/clp_py_utils/clp_config.py` does not need a `transform_for_container()` method because no other containerized service depends on the API server - it's only accessed from the host, so no docker-network communication is expected.
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/package-template/src/etc/clp-config.yml
  • components/api-server/src/bin/api_server.rs
  • tools/deployment/package/docker-compose-all.yaml
  • components/clp-package-utils/clp_package_utils/controller.py
  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-11-10T05:19:56.600Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1575
File: components/clp-py-utils/clp_py_utils/clp_config.py:602-607
Timestamp: 2025-11-10T05:19:56.600Z
Learning: In the y-scope/clp repository, the `ApiServer` class in `components/clp-py-utils/clp_py_utils/clp_config.py` does not need a `transform_for_container()` method because no other containerized service depends on the API server - it's only accessed from the host, so no docker-network communication is expected.

Applied to files:

  • components/package-template/src/etc/clp-config.yml
  • tools/deployment/package/docker-compose-all.yaml
  • components/clp-package-utils/clp_package_utils/controller.py
  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
Repo: y-scope/clp PR: 1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.

Applied to files:

  • components/package-template/src/etc/clp-config.yml
  • components/clp-py-utils/clp_py_utils/clp_config.py
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • tools/docker-images/clp-package/Dockerfile
  • taskfile.yaml
  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-10-17T19:59:25.596Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:315-315
Timestamp: 2025-10-17T19:59:25.596Z
Learning: In components/clp-package-utils/clp_package_utils/controller.py, worker log directories (compression_worker, query_worker, reducer) created via `mkdir()` do not need `_chown_paths_if_root()` calls because directories are created with the same owner as the script caller. This differs from infrastructure service directories (database, queue, Redis, results cache) which do require explicit ownership changes.

Applied to files:

  • components/clp-package-utils/clp_package_utils/controller.py
📚 Learning: 2025-10-07T07:54:32.427Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-py-utils/clp_py_utils/clp_config.py:47-47
Timestamp: 2025-10-07T07:54:32.427Z
Learning: In components/clp-py-utils/clp_py_utils/clp_config.py, the CONTAINER_AWS_CONFIG_DIRECTORY constant is intentionally set to pathlib.Path("/") / ".aws" (i.e., `/.aws`) rather than a user-specific home directory. This hardcoded path is part of the container orchestration design.

Applied to files:

  • components/clp-package-utils/clp_package_utils/controller.py
📚 Learning: 2025-10-27T07:07:37.901Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Applied to files:

  • components/clp-package-utils/clp_package_utils/controller.py
📚 Learning: 2025-10-22T21:01:31.391Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:01:31.391Z
Learning: In y-scope/clp, for the clp-rust-checks workflow (GitHub Actions) and the Taskfile target deps:lock:check-rust, we should verify Rust Cargo.lock is in sync with Cargo.toml using a non-mutating method (e.g., cargo metadata --locked / cargo check --locked) to keep CI deterministic and avoid updating dependencies during validation.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-05-06T09:46:42.639Z
Learnt from: kirkrodrigues
Repo: y-scope/clp PR: 881
File: components/core/tools/scripts/lib_install/macos/install-all.sh:15-23
Timestamp: 2025-05-06T09:46:42.639Z
Learning: For temporary solutions in installation scripts like those in `components/core/tools/scripts/lib_install/`, checksumming downloaded files is considered optional, particularly when working with trusted sources like GitHub raw content.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-07T17:41:15.655Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-08-25T00:45:05.464Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1242
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:39-41
Timestamp: 2025-08-25T00:45:05.464Z
Learning: Task v3.44.1 causes issues in the clp-ffi-js project (issue #110), so CLP should avoid upgrading to v3.44.1 and stick with v3.44.0 for now.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • taskfile.yaml
  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-10-27T04:05:01.560Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1507
File: taskfiles/deps/lock.yaml:66-76
Timestamp: 2025-10-27T04:05:01.560Z
Learning: In the CLP project, the `uv` CLI tool is documented as a prerequisite in docs/src/dev-docs/building-package.md and is not installed via a task dependency like the rust toolchain. UV is expected to be available in the environment rather than being provisioned by the taskfile system.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-22T21:14:12.225Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1466
File: .github/workflows/clp-rust-checks.yaml:14-15
Timestamp: 2025-10-22T21:14:12.225Z
Learning: Repository y-scope/clp: In GitHub Actions workflows (e.g., .github/workflows/clp-rust-checks.yaml), YAML anchors/aliases are acceptable and preferred to avoid duplication; if actionlint flags an alias node (e.g., on push.paths) as an error, treat it as a tool limitation and do not require inlining unless the team asks to silence the warning.

Applied to files:

  • taskfile.yaml
  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-09-25T19:26:32.436Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1335
File: taskfiles/docker-images.yaml:15-15
Timestamp: 2025-09-25T19:26:32.436Z
Learning: In the CLP project's Taskfile, the `:package` task creates the G_PACKAGE_BUILD_DIR directory structure, so any task that depends on `:package` (like `docker-images:package`) can safely assume this directory exists without needing additional mkdir commands.

Applied to files:

  • taskfile.yaml
  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-07-25T21:29:48.947Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1126
File: .gitignore:5-5
Timestamp: 2025-07-25T21:29:48.947Z
Learning: In the CLP project, the .clang-format file is maintained in the yscope-dev-utils submodule and copied over to the main CLP repository, so it should be ignored in .gitignore to prevent accidental commits of the copied file and maintain the single source of truth in the submodule.

Applied to files:

  • taskfile.yaml
📚 Learning: 2025-10-13T03:32:19.293Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1414
File: tools/docker-images/clp-package/Dockerfile:20-24
Timestamp: 2025-10-13T03:32:19.293Z
Learning: In the clp repository's Dockerfiles (e.g., tools/docker-images/clp-package/Dockerfile), ENV directives should be split into separate lines for readability rather than consolidated to reduce layer count. This is especially true for PATH modifications, as agreed upon in PR #1166. Later ENV settings may depend on earlier ones (e.g., referencing CLP_HOME).

Applied to files:

  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-05-26T16:03:05.519Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 918
File: .github/workflows/clp-execution-image-build.yaml:77-97
Timestamp: 2025-05-26T16:03:05.519Z
Learning: In .github/workflows/clp-execution-image-build.yaml, the ubuntu-jammy-execution-image and ubuntu-noble-execution-image jobs are intentionally kept separate (rather than using a matrix strategy) to make it easier to remove individual platform versions when they reach end of life, such as when jammy eventually becomes obsolete.

Applied to files:

  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-10-20T21:05:30.417Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1413
File: tools/docker-images/clp-package/Dockerfile:22-24
Timestamp: 2025-10-20T21:05:30.417Z
Learning: In the clp repository's Dockerfiles, ENV directives should be consolidated into multi-line ENV statements when possible to reduce image layers. ENV statements should only be split into separate commands when consolidation is not possible due to dependencies (e.g., when later variables must reference earlier ones that need to be set first, or when PATH must be modified sequentially).

Applied to files:

  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-07-07T17:43:04.349Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate `set -eu` and `set -o pipefail` commands rather than combining them into `set -euo pipefail`, to ensure uniform script structure across all platform build scripts.

Applied to files:

  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-07-01T14:51:19.172Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within `components/core/tools/scripts/lib_install/`, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like `readonly` declarations.

Applied to files:

  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-07-01T14:52:02.418Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.

Applied to files:

  • .github/workflows/clp-artifact-build.yaml
📚 Learning: 2025-05-27T20:04:51.498Z
Learnt from: anlowee
Repo: y-scope/clp PR: 925
File: .github/workflows/clp-s-antlr-generation.yaml:24-27
Timestamp: 2025-05-27T20:04:51.498Z
Learning: The clp codebase uses commit SHAs instead of version tags for GitHub Actions (like actions/checkout) as an established pattern across workflow files.

Applied to files:

  • .github/workflows/clp-artifact-build.yaml
🧬 Code graph analysis (3)
components/api-server/src/bin/api_server.rs (2)
components/clp-rust-utils/src/serde/yaml.rs (1)
  • from_path (18-21)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
  • Database (164-251)
components/clp-package-utils/clp_package_utils/controller.py (1)
components/clp-py-utils/clp_py_utils/core.py (1)
  • resolve_host_path_in_container (83-104)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-rust-utils/src/clp_config/package/config.rs (6)
  • default (21-29)
  • default (48-54)
  • default (67-74)
  • default (88-93)
  • default (110-114)
  • default (142-148)
🪛 Ruff (0.14.4)
components/clp-package-utils/clp_package_utils/controller.py

437-437: Logging statement uses f-string

(G004)

🔇 Additional comments (11)
.github/workflows/clp-artifact-build.yaml (1)

550-553: Verify consistency of HOME environment variable across all run_command invocations.

Line 552 adds HOME=/tmp to the package-image job's run_command, following the pattern established at lines 495–498 in ubuntu-jammy-lint. However, other run_command invocations (lines 233–239, 273–279, 313–319, 356–362) do not set HOME.

Consider whether this environment variable should be uniformly applied to all run_command invocations that execute within containers, or if selective application is intentional based on specific container configurations. If selective application is intentional, document why package-image and ubuntu-jammy-lint require HOME=/tmp while the binaries jobs do not.

taskfile.yaml (4)

31-31: LGTM — Build directory variables correctly defined.

The G_RUST_BUILD_DIR and G_WEBUI_BUILD_DIR variables follow the established pattern and will properly isolate build artefacts.

Also applies to: 33-33


76-77: LGTM — clean-rust integrated correctly.

The clean-rust task is properly placed in the clean task's dependency list, maintaining alphabetical ordering with other cleanup tasks.


100-102: LGTM — clean-rust task definition is correct.

The simple rm -rf command properly cleans the Rust build directory.


235-235: LGTM — rust dependency correctly added to package-build-deps.

The rust task is now a dependency of the package build pipeline, ensuring the API server binary is built before packaging.

components/package-template/src/etc/clp-config.yml (1)

121-129: LGTM! API server configuration structure is well-defined.

The commented-out configuration block follows the established pattern for optional components and provides sensible defaults that align with the Python and Rust configuration models.

docs/src/dev-docs/design-deployment-orchestration.md (1)

47-48: LGTM! Documentation accurately reflects the new API server component.

The architecture diagram, dependency graph, and service table have been correctly updated to include the API server. The ordering (api_server before webui) aligns with the startup sequence and addresses the past review feedback.

Also applies to: 67-68, 99-103, 131-131

components/clp-package-utils/clp_package_utils/controller.py (1)

429-451: LGTM! API server environment setup follows established patterns.

The _set_up_env_for_api_server method correctly creates the component's logs directory and returns the necessary environment variables (host and port) for Docker Compose deployment.

components/api-server/src/bin/api_server.rs (2)

34-54: LGTM! Helper functions improve code organization.

The extraction of read_config_and_credentials, set_up_logging, and shutdown_signal into separate functions addresses the past review feedback and significantly improves readability. The separation of concerns makes the startup flow in main much clearer.

Also applies to: 56-69, 71-80


89-93: LGTM! Address selection logic is clear and flexible.

The CLI arguments correctly override configuration file values for host and port, providing deployment flexibility while maintaining sensible defaults from the config file.

tools/deployment/package/docker-compose-all.yaml (1)

490-515: LGTM! API server service definition follows established patterns.

The service configuration correctly:

  • Sets up dependencies on initialization jobs
  • Mounts necessary volumes (config and logs)
  • Uses environment variables for database credentials
  • Exposes the API server port with configurable host/port via env vars

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants