Merged
19 commits
7d18129
fix(docker): Move USER and ENV directives after image flattening in c…
junhaoliao Oct 13, 2025
b193c2f
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 13, 2025
3195ef0
feat(docker): Update clp-package Dockerfile; add non-root user and se…
junhaoliao Oct 14, 2025
18a1b18
fix(docker): Reorder and restructure ENV directives in clp-package Do…
junhaoliao Oct 14, 2025
3ef8714
add \
junhaoliao Oct 14, 2025
1b2d4cb
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 14, 2025
78a7af3
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 20, 2025
b3e673f
fix(docker): Use `--link` flag in COPY command for clp-package Docker…
junhaoliao Oct 20, 2025
65f2d1b
fix(docker): Reorder COPY command in clp-package Dockerfile
junhaoliao Oct 20, 2025
b522212
fix(docker): Add ARG for UID and set ownership in COPY command for cl…
junhaoliao Oct 20, 2025
dba99b7
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 21, 2025
360cac6
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 21, 2025
913dea3
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 21, 2025
8a991cc
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 21, 2025
89f0a1c
Merge branch 'main' into fix-clp-package-docker
sitaowang1998 Oct 21, 2025
2967a64
merge ENVs
junhaoliao Oct 23, 2025
1139cbf
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 23, 2025
d82b984
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 23, 2025
d39ab50
Merge branch 'main' into fix-clp-package-docker
junhaoliao Oct 23, 2025
20 changes: 12 additions & 8 deletions tools/docker-images/clp-package/Dockerfile
Member Author

To fix #1410, we need to "Move USER and ENV directives after image flattening in clp-package Dockerfile"

@@ -12,15 +12,19 @@ RUN ./setup-scripts/install-prebuilt-packages.sh \
 RUN apt-get clean \
     && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

+# Flatten the image
+FROM scratch
+COPY --link --from=base / /
+
+ARG UID=1000
 ENV CLP_HOME="/opt/clp"
 ENV PATH="${CLP_HOME}/bin:${PATH}"
Contributor

I think this PATH can be moved into the multi-line ENV, with something like:

PATH="${CLP_HOME}/sbin:${CLP_HOME}/bin:${PATH}"

Member Author

For readability, would it be better to keep them on separate lines?

Contributor

I think this is fine from a readability standpoint. Appending two elements to a list is not hard to understand.
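
As a sketch of the suggestion in this thread, the two PATH updates could collapse into one assignment. The base image below is an assumption made only so the fragment builds on its own; the variable names and values come from the diff.

```dockerfile
# Hypothetical base image; the real Dockerfile flattens a `base` stage into `FROM scratch`.
FROM ubuntu:jammy

# Set in its own instruction so the ENV below can reference it.
ENV CLP_HOME="/opt/clp"

# One PATH assignment instead of two sequential ones. The search order matches
# prepending bin first and sbin second: sbin, then bin, then the previous PATH.
ENV PATH="${CLP_HOME}/sbin:${CLP_HOME}/bin:${PATH}" \
    PYTHONPATH="${CLP_HOME}/lib/python3/site-packages" \
    USER="clp-user"
```

Both forms should yield the same PATH; the merged form simply drops one ENV instruction.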

-ENV PATH="${CLP_HOME}/sbin:${PATH}"
-ENV PYTHONPATH="${CLP_HOME}/lib/python3/site-packages"
+ENV PATH="${CLP_HOME}/sbin:${PATH}" \
+    PYTHONPATH="${CLP_HOME}/lib/python3/site-packages" \
+    USER="clp-user"
Contributor

@coderabbitai coderabbitai bot Oct 20, 2025

⚠️ Potential issue | 🟠 Major

ENV consolidation contradicts prior agreement to keep lines split.

Lines 21-23 consolidate PATH and related settings into a multi-line ENV statement. However, based on learnings from PR #1414 and the prior agreement in PR #1166, ENV directives should remain split into separate lines for readability, especially since later ENV settings depend on earlier ones (e.g., PATH and PYTHONPATH reference CLP_HOME). Please revert to separate ENV lines.

Apply this diff to split the ENV statements:

 ENV CLP_HOME="/opt/clp"
 ENV PATH="${CLP_HOME}/bin:${PATH}"
-ENV PATH="${CLP_HOME}/sbin:${PATH}" \
-    PYTHONPATH="${CLP_HOME}/lib/python3/site-packages" \
-    USER="clp-user"
+ENV PATH="${CLP_HOME}/sbin:${PATH}"
+ENV PYTHONPATH="${CLP_HOME}/lib/python3/site-packages"
+ENV USER="clp-user"

Member Author

We now split ENV directives only when consolidating them into a single instruction is not possible.

Contributor

@junhaoliao, thank you for the clarification! I understand now that the policy has been updated: ENV directives should be consolidated into multi-line statements when possible, and only split when dependencies between variables require separate commands.

The current implementation is correct under this updated guidance:

  • Line 19: ENV CLP_HOME="/opt/clp" — separate because it's referenced by subsequent ENVs
  • Line 20: ENV PATH="${CLP_HOME}/bin:${PATH}" — separate because PATH must be modified in sequence
  • Lines 21-23: Consolidated multi-line ENV for the second PATH modification, PYTHONPATH, and USER — correct since these can all be set together

I apologize for the confusion caused by my earlier comment based on the outdated policy from PR #1166. The consolidation here properly addresses issue #1378 (reducing layers) while maintaining correct dependency order.
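
The dependency rule discussed above follows from how Docker expands variables: within a single ENV instruction, every `${VAR}` is substituted with the value it had before that instruction, so one assignment cannot see another made in the same instruction. A minimal sketch, assuming a hypothetical base image and demo variable names (`DEMO_*`) that are not part of this PR:

```dockerfile
# Hypothetical base image, used only to make the sketch buildable.
FROM ubuntu:jammy

ENV DEMO_HOME="/opt/first"

# Both assignments are expanded using the values from before this instruction,
# so DEMO_LIB becomes "/opt/first/lib", not "/opt/second/lib".
ENV DEMO_HOME="/opt/second" \
    DEMO_LIB="${DEMO_HOME}/lib"

# A later instruction sees the updated value: DEMO_BIN becomes "/opt/second/bin".
ENV DEMO_BIN="${DEMO_HOME}/bin"
```

This is why CLP_HOME keeps its own ENV line, while variables that only read it can share one instruction.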


✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp#1413
File: tools/docker-images/clp-package/Dockerfile:22-24
Timestamp: 2025-10-20T21:05:30.383Z
Learning: In the clp repository's Dockerfiles, ENV directives should be consolidated into multi-line ENV statements when possible to reduce image layers. ENV statements should only be split into separate commands when consolidation is not possible due to dependencies (e.g., when later variables must reference earlier ones that need to be set first, or when PATH must be modified sequentially).


🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp#1414
File: tools/docker-images/clp-package/Dockerfile:20-24
Timestamp: 2025-10-13T03:32:19.293Z
Learning: In the clp repository's Dockerfiles (e.g., tools/docker-images/clp-package/Dockerfile), ENV directives should be split into separate lines for readability rather than consolidated to reduce layer count. This is especially true for PATH modifications, as agreed upon in PR #1166. Later ENV settings may depend on earlier ones (e.g., referencing CLP_HOME).


-USER 1000:1000
+RUN useradd --uid ${UID} --shell /bin/bash --home-dir ${CLP_HOME} ${USER}
Member Author

The home directory set here broke the S3 profile auth type: the value of CONTAINER_AWS_CONFIG_DIRECTORY should be relative to CLP_HOME, but it isn't.

@coderabbitai create an issue to track this.

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create a GitHub issue to track the S3 profile auth type breakage

gh issue create \
  --repo y-scope/clp \
  --title "S3 profile auth type broken due to user home path change in Docker image" \
  --body "## Problem

The user home path setting introduced in PR #1413 broke the S3 profile auth type because \`CONTAINER_AWS_CONFIG_DIRECTORY\`'s value should be relative to \`CLP_HOME\` but isn't.

## Context

- PR #1413 sets the home directory for \`clp-user\` to \`CLP_HOME\` (\`/opt/clp\`) in the Docker image
- \`CONTAINER_AWS_CONFIG_DIRECTORY\` is currently hardcoded to \`/.aws\`
- This mismatch breaks S3 profile authentication

## Related

- PR: https://github.com/y-scope/clp/pull/1413
- Comment: https://github.com/y-scope/clp/pull/1413#discussion_r2476592326
- Reported by: @junhaoliao" \
  --assignee junhaoliao

Length of output: 42


@junhaoliao, I've created issue #1524 to track the S3 profile auth type breakage caused by the user home path change. The issue documents how the CONTAINER_AWS_CONFIG_DIRECTORY hardcoded path conflicts with the new home directory setting.

#1524


🧠 Learnings used
Learnt from: junhaoliao
PR: y-scope/clp#1414
File: tools/docker-images/clp-package/Dockerfile:20-24
Timestamp: 2025-10-13T03:32:19.293Z
Learning: In the clp repository's Dockerfiles (e.g., tools/docker-images/clp-package/Dockerfile), ENV directives should be split into separate lines for readability rather than consolidated to reduce layer count. This is especially true for PATH modifications, as agreed upon in PR #1166. Later ENV settings may depend on earlier ones (e.g., referencing CLP_HOME).

Learnt from: junhaoliao
PR: y-scope/clp#1413
File: tools/docker-images/clp-package/Dockerfile:22-24
Timestamp: 2025-10-20T21:05:30.417Z
Learning: In the clp repository's Dockerfiles, ENV directives should be consolidated into multi-line ENV statements when possible to reduce image layers. ENV statements should only be split into separate commands when consolidation is not possible due to dependencies (e.g., when later variables must reference earlier ones that need to be set first, or when PATH must be modified sequentially).

Learnt from: junhaoliao
PR: y-scope/clp#1178
File: components/clp-package-utils/clp_package_utils/controller.py:217-223
Timestamp: 2025-09-25T05:13:13.298Z
Learning: The compression scheduler service in CLP runs with CLP_UID_GID (current user's UID:GID) rather than CLP_SERVICE_CONTAINER_UID_GID (999:999), unlike infrastructure services such as database, queue, redis, and results cache which run with the service container UID:GID.

Learnt from: junhaoliao
PR: y-scope/clp#1335
File: taskfiles/docker-images.yaml:15-15
Timestamp: 2025-09-25T19:26:32.436Z
Learning: In the CLP project's Taskfile, the `:package` task creates the G_PACKAGE_BUILD_DIR directory structure, so any task that depends on `:package` (like `docker-images:package`) can safely assume this directory exists without needing additional mkdir commands.

Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate `set -eu` and `set -o pipefail` commands rather than combining them into `set -euo pipefail`, to ensure uniform script structure across all platform build scripts.

Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86/build.sh:18-24
Timestamp: 2025-07-01T14:52:02.418Z
Learning: In the CLP project, consistency across platform build scripts is prioritized over defensive programming when it comes to git remote handling. All build.sh files in docker-images directories should follow the same pattern for git metadata injection.

Learnt from: junhaoliao
PR: y-scope/clp#1178
File: components/clp-py-utils/clp_py_utils/clp_config.py:47-47
Timestamp: 2025-10-07T07:54:32.427Z
Learning: In components/clp-py-utils/clp_py_utils/clp_config.py, the CONTAINER_AWS_CONFIG_DIRECTORY constant is intentionally set to pathlib.Path("/") / ".aws" (i.e., `/.aws`) rather than a user-specific home directory. This hardcoded path is part of the container orchestration design.

Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

+USER ${USER}
+WORKDIR ${CLP_HOME}

-COPY --link ./build/clp-package /opt/clp
-
-# Flatten the image
-FROM scratch
-COPY --link --from=base / /
+COPY --link --chown=${UID} ./build/clp-package ${CLP_HOME}
Member Author

Although Docker's documentation states that --chown= supports usernames and performs lookups in /etc/passwd and /etc/group within the container to resolve UID/GID, we must use ${UID} instead of ${USER} here because such lookups are not yet supported when --link is used.

This limitation is a known issue in BuildKit (the build backend used by Docker Buildx), tracked at moby/buildkit#2987.
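
To make the workaround concrete, here is a minimal sketch contrasting the numeric and name-based forms. The base image and the `# syntax` line are assumptions added so the fragment stands alone; the build context path, `UID`, `USER`, and `CLP_HOME` come from the diff.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical base image; the real Dockerfile uses a flattened `FROM scratch` stage.
FROM ubuntu:jammy

ARG UID=1000
ENV CLP_HOME="/opt/clp" \
    USER="clp-user"

RUN useradd --uid ${UID} --shell /bin/bash --home-dir ${CLP_HOME} ${USER}

# Numeric ID: no /etc/passwd lookup is needed, so this can be combined with --link.
# Per Docker's documentation, a UID given without a GID is also used as the GID.
COPY --link --chown=${UID} ./build/clp-package ${CLP_HOME}

# Name-based ownership needs a lookup in the image's /etc/passwd, which, per the
# comment above, is not supported together with --link. Dropping --link would be
# the alternative (left commented out):
# COPY --chown=${USER} ./build/clp-package ${CLP_HOME}
```

Keeping `--link` with a numeric ID preserves the layer-independence benefit of `--link`, at the cost of having to pass the UID as a build argument.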
