Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
17 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions docs/src/dev-docs/building-package.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,11 @@ prebuilt version instead, check out the [releases](https://github.com/y-scope/cl
environment.
* It should be possible to build a package for a different environment, it just requires a some
extra configuration.
* [Docker]
* `containerd.io` >= 1.7.18
* `docker-buildx-plugin` >= 0.15.1
* `docker-ce` >= 27.0.3
* `docker-ce-cli` >= 27.0.3
* Python 3.9 or newer
* python3-dev
* python3-venv (for the version of Python installed)
Expand Down Expand Up @@ -80,6 +85,7 @@ task docker-images:package

This will create a Docker image named `clp-package:dev`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Update documentation to reflect the new image tagging scheme.

According to the PR summary, the build process now generates unique identifiers to avoid overwriting older builds on shared machines. The AI summary mentions tagging as clp-package:dev-${USER}-${short_id}, but this line still states the image will be named clp-package:dev.

Please update this documentation to:

  1. Explain that each build generates a unique image identifier
  2. Clarify how users can reference the built image (e.g., via the image.id file mentioned in the PR summary)
  3. Optionally, mention the tagging convention if it's relevant for users
🤖 Prompt for AI Agents
In docs/src/dev-docs/building-package.md around line 86, the text still claims
the Docker image will be named `clp-package:dev` but the build now generates
unique image identifiers; update this line to state that each build produces a
uniquely tagged image to avoid overwrites, note where to find the exact image
reference (point to the generated image.id file created by the build), and
optionally show the tagging convention used (e.g.,
clp-package:dev-${USER}-${short_id}) so users know how to reference the image or
read image.id for the precise tag.


[Docker]: https://docs.docker.com/engine/install/
[Task]: https://taskfile.dev/
[uv]: https://docs.astral.sh/uv/
[y-scope/clp#1352]: https://github.com/y-scope/clp/issues/1352
37 changes: 34 additions & 3 deletions tools/docker-images/clp-package/build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,17 +3,38 @@
set -eu
set -o pipefail

remove_temp_file_and_prev_image() {
rm -f "$temp_iid_file"

[[ -z "$prev_image_id" || "$prev_image_id" == "$new_image_id" ]] && return

if docker image inspect "$prev_image_id" >/dev/null 2>&1; then
echo "Removing previous image $prev_image_id."
docker image remove "$prev_image_id"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The image might fail to be removed because some running container might still be referencing it, which causes a non-zero exit code in the script.

When the image fails to be removed, we should either exit gracefully (with 0) with a warning message printed, or the caller should gracefully handle the non-exit code. Currently, the caller (the docker-images.yaml taskfile) doesn't handle this --- when the script fails, any commands following the script call (e.g., to copy the unique image id file into build/clp-package) would fail to execute.

@coderabbitai create an issue to track this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Create GitHub issue to track the image removal failure problem
gh issue create \
  --title "Handle docker image removal failures gracefully in clp-package build script" \
  --body "## Problem

In \`tools/docker-images/clp-package/build.sh\`, the \`remove_temp_file_and_prev_image\` function (called via EXIT trap) attempts to remove the previous image using \`docker image remove\`. This can fail with a non-zero exit code if a running container is still referencing the image.

Since the script uses \`set -eu\`, this failure causes the script to exit with an error code, which in turn causes subsequent commands in the caller (\`docker-images.yaml\` taskfile) to fail execution (e.g., copying the unique image ID file into \`build/clp-package\`).

## Proposed Solutions

Either:
1. Exit gracefully (with exit code 0) and print a warning message when image removal fails, or
2. Have the caller (taskfile) gracefully handle the non-zero exit code

## Context

Reported by @junhaoliao in PR #1335:
https://github.com/y-scope/clp/pull/1335#discussion_r2400399192

## Affected Files

- \`tools/docker-images/clp-package/build.sh\`
- \`docker-images.yaml\` (taskfile caller)" \
  --assignee junhaoliao

Length of output: 42


@junhaoliao I've created issue #1370 to track this: #1370

The issue documents the problem where docker image remove can fail if a container is still referencing the image, causing the script to exit with an error due to set -eu. It includes both proposed solutions (graceful exit with warning or caller handling the error) and references this PR discussion.

fi
}
trap remove_temp_file_and_prev_image EXIT

script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
repo_root=${script_dir}/../../../
repo_root="${script_dir}/../../../"
iid_file="${repo_root}/build/clp-package-image.id"

prev_image_id=""
if [[ -f "$iid_file" ]]; then
prev_image_id=$(<"$iid_file")
fi

temp_iid_file="$(mktemp)"
new_image_id=""

build_cmd=(
docker build
--tag "clp-package:dev"
--iidfile "$temp_iid_file"
"$repo_root"
--file "${script_dir}/Dockerfile"
)

if command -v git >/dev/null && git -C "$script_dir" rev-parse --is-inside-work-tree >/dev/null ;
if command -v git >/dev/null && git -C "$script_dir" rev-parse --is-inside-work-tree >/dev/null;
then
build_cmd+=(
--label "org.opencontainers.image.revision=$(git -C "$script_dir" rev-parse HEAD)"
Expand All @@ -22,3 +43,13 @@ then
fi

"${build_cmd[@]}"

if [[ -s "$temp_iid_file" ]]; then
new_image_id="$(<"$temp_iid_file")"
echo "$new_image_id" > "$iid_file"

user="${USER:-$(id -un 2>/dev/null || whoami 2>/dev/null || echo unknown)}"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one edge case we missed to handle -

When the username cannot be found on the system (e.g., the username is not specified in /etc/passwd), the user id would still be printed by id -un command. However, since the id -un command exits with non-zero code, the other command would continue to run, which may cause line breaks to show up in the middle of the user string. e.g.,

id -un: Prints `123\n`; Exits with non-0
whoami: Prints nothing; Exits with non-9
echo unknown: Prints "unknown"

then `user` becomes: `123\nunknown`

@coderabbitai create an issue to track this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create GitHub issue to track the username fallback edge case

gh issue create \
  --repo y-scope/clp \
  --title "Edge case in username fallback logic in clp-package build script" \
  --body "## Description

An edge case exists in the username fallback logic in \`tools/docker-images/clp-package/build.sh\` at line 51:

\`\`\`bash
user=\"\${USER:-\$(id -un 2>/dev/null || whoami 2>/dev/null || echo unknown)}\"
\`\`\`

When the username cannot be found on the system (e.g., the username is not specified in \`/etc/passwd\`), \`id -un\` may print the numeric user ID followed by a newline but exit with a non-zero code. Due to the \`||\` chaining, subsequent commands (\`whoami\` and \`echo unknown\`) continue to execute, causing their output to be appended, resulting in a multi-line string with line breaks embedded in the \`user\` variable.

## Example scenario

\`\`\`
id -un: Prints 123\\n; Exits with non-0
whoami: Prints nothing; Exits with non-0
echo unknown: Prints \"unknown\"

Result: user becomes \"123\\nunknown\"
\`\`\`

This could lead to invalid Docker image tag names or unexpected behaviour in the build process.

## References

- PR: https://github.com/y-scope/clp/pull/1335
- Comment: https://github.com/y-scope/clp/pull/1335#discussion_r2400404851
- Reported by: @junhaoliao" \
  --assignee junhaoliao

Length of output: 42


@junhaoliao I've created issue #1371 to track this edge case: #1371

The issue documents the scenario where the || chain in the username fallback logic can result in a multi-line string when id -un prints a numeric ID but exits with a non-zero code, causing subsequent commands to append their output.

short_id="${new_image_id#sha256:}"
short_id="${short_id:0:4}"
docker tag "$new_image_id" "clp-package:dev-${user}-${short_id}"
fi
Loading