Skip to content

Conversation

vbvictor
Copy link
Contributor

@vbvictor vbvictor commented Sep 28, 2025

This PR adds a base container llvm-downloader which later used to create two separate containers for premerge jobs:

  • code-format container with clang-format and black
  • code-lint container with clang-tidy

@llvmbot
Copy link
Member

llvmbot commented Sep 28, 2025

@llvm/pr-subscribers-github-workflow

Author: Baranov Victor (vbvictor)

Changes

WIP, do not review


Full diff: https://github.com/llvm/llvm-project/pull/161083.diff

2 Files Affected:

  • (added) .github/workflows/build-ci-container-code-format.yml (+105)
  • (added) .github/workflows/containers/github-action-ci-code-format/Dockerfile (+77)
diff --git a/.github/workflows/build-ci-container-code-format.yml b/.github/workflows/build-ci-container-code-format.yml
new file mode 100644
index 0000000000000..723ebd8a6e252
--- /dev/null
+++ b/.github/workflows/build-ci-container-code-format.yml
@@ -0,0 +1,105 @@
+name: Build CI Container
+
+permissions:
+  contents: read
+
+on:
+  push:
+    branches:
+      - main
+    paths:
+      - .github/workflows/build-ci-container-code-format.yml
+      - '.github/workflows/containers/github-action-ci-code-format/**'
+      - llvm/utils/git/code-format-helper.py
+      - llvm/utils/git/requirements_formatting.txt
+      - llvm/utils/git/requirements_formatting.txt.in
+  pull_request:
+    paths:
+      - .github/workflows/build-ci-container-code-format.yml
+      - '.github/workflows/containers/github-action-ci-code-format/**'
+      - llvm/utils/git/code-format-helper.py
+      - llvm/utils/git/requirements_formatting.txt
+      - llvm/utils/git/requirements_formatting.txt.in
+
+jobs:
+  build-ci-container-code-format:
+    if: github.repository_owner == 'llvm'
+    runs-on: depot-ubuntu-24.04-16
+    steps:
+      - name: Checkout LLVM
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
+        with:
+          sparse-checkout: .github/workflows/containers/github-action-ci-code-format/
+      - name: Write Variables
+        id: vars
+        run: |
+          tag=$(git rev-parse --short=12 HEAD)
+          container_name="ghcr.io/$GITHUB_REPOSITORY_OWNER/amd64/ci-ubuntu-24.04-code-format"
+          echo "container-name=$container_name" >> $GITHUB_OUTPUT
+          echo "container-name-tag=$container_name:$tag" >> $GITHUB_OUTPUT
+          echo "container-filename=$(echo $container_name:$tag  | sed -e 's/\//-/g' -e 's/:/-/g').tar" >> $GITHUB_OUTPUT
+      - name: Build container
+        run: |
+          podman build --target ci-container-code-format \
+          -f .github/workflows/containers/github-action-ci-code-format/Dockerfile \
+          -t ${{ steps.vars.outputs.container-name-tag }} .
+
+      # Save the container so we have it in case the push fails.  This also
+      # allows us to separate the push step into a different job so we can
+      # maintain minimal permissions while building the container.
+      - name: Save container image
+        run: |
+          podman save ${{ steps.vars.outputs.container-name-tag }}  >  ${{ steps.vars.outputs.container-filename }}
+
+      - name: Upload container image
+        uses: actions/upload-artifact@65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08 # v4.6.0
+        with:
+          name: container-amd64
+          path: "*.tar"
+          retention-days: 14
+
+      - name: Test Container
+        run: |
+          for image in ${{ steps.vars.outputs.container-name-tag }}; do
+            # Use --pull=never to ensure we are testing the just built image.
+            podman run --pull=never --rm -it $image /usr/bin/bash -x -c 'cd $HOME && clang-format --version | grep version'
+          done
+
+  push-ci-container:
+    if: github.event_name == 'push'
+    needs:
+      - build-ci-container-code-format
+    permissions:
+      packages: write
+    runs-on: ubuntu-24.04
+    env:
+      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+    steps:
+      - name: Download container
+        uses: actions/download-artifact@634f93cb2916e3fdff6788551b99b062d0335ce0 # v5.0.0
+
+      - name: Push Container
+        run: |
+          function push_container {
+            image_name=$1
+            latest_name=$(echo $image_name | sed 's/:[a-f0-9]\+$/:latest/g')
+            podman tag $image_name $latest_name
+            echo "Pushing $image_name ..."
+            podman push $image_name
+            echo "Pushing $latest_name ..."
+            podman push $latest_name
+          }
+
+          podman login -u ${{ github.actor }} -p $GITHUB_TOKEN ghcr.io
+          for f in $(find . -iname *.tar); do
+            image_name=$(podman load -q -i $f | sed 's/Loaded image: //g')
+            push_container $image_name
+
+            if echo $image_name | grep '/amd64/'; then
+              # For amd64, create an alias with the arch component removed.
+              # This matches the convention used on dockerhub.
+              default_image_name=$(echo $(dirname $(dirname $image_name))/$(basename $image_name))
+              podman tag $image_name $default_image_name
+              push_container $default_image_name
+            fi
+          done
diff --git a/.github/workflows/containers/github-action-ci-code-format/Dockerfile b/.github/workflows/containers/github-action-ci-code-format/Dockerfile
new file mode 100644
index 0000000000000..88d43618bd22e
--- /dev/null
+++ b/.github/workflows/containers/github-action-ci-code-format/Dockerfile
@@ -0,0 +1,77 @@
+FROM docker.io/library/ubuntu:24.04 AS base
+ENV LLVM_SYSROOT=/opt/llvm
+
+FROM base AS clang-format-toolchain
+ENV LLVM_VERSION=21.1.1
+
+RUN apt-get update && \
+    apt-get install -y \
+    wget \
+    gcc \
+    g++ \
+    cmake \
+    ninja-build \
+    python3 \
+    git \
+    curl \
+    zlib1g-dev && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
+RUN curl -O -L https://github.com/llvm/llvm-project/archive/refs/tags/llvmorg-$LLVM_VERSION.tar.gz && \
+  tar -xf llvmorg-$LLVM_VERSION.tar.gz && \
+  rm -f llvmorg-$LLVM_VERSION.tar.gz
+
+WORKDIR /llvm-project-llvmorg-$LLVM_VERSION
+
+RUN cmake -B ./build -G Ninja ./llvm \
+  -DCMAKE_BUILD_TYPE=Release \
+  -DCMAKE_INSTALL_PREFIX="$LLVM_SYSROOT" \
+  -DLLVM_ENABLE_PROJECTS="clang" \
+  -DLLVM_DISTRIBUTION_COMPONENTS="clang-format"
+
+RUN ninja -C ./build install-distribution
+
+FROM base AS ci-container-code-format
+
+COPY --from=clang-format-toolchain $LLVM_SYSROOT $LLVM_SYSROOT
+
+# Need nodejs for some of the GitHub actions.
+# Need git for git-clang-format.
+RUN apt-get update && \
+    DEBIAN_FRONTEND=noninteractive apt-get install -y \
+    # binutils \
+    git \
+    nodejs \
+    # python3-psutil \
+    sudo \
+    # These are needed by the premerge pipeline. Pip and venv are used to
+    # install dependent python packages.
+    # Having a symlink from python to python3 enables code sharing between
+    # the Linux and Windows pipelines.
+    python3-pip \
+    python3-venv \
+    python-is-python3 && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
+ENV LLVM_SYSROOT=$LLVM_SYSROOT
+ENV PATH=${LLVM_SYSROOT}/bin:${PATH}
+
+# Create a new user to avoid test failures related to a lack of expected
+# permissions issues in some tests. Set the user id to 1001 as that is the
+# user id that Github Actions uses to perform the checkout action.
+RUN useradd gha -u 1001 -m -s /bin/bash
+
+# Also add the user to passwordless sudoers so that we can install software
+# later on without having to rebuild the container.
+RUN adduser gha sudo
+RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
+
+USER gha
+WORKDIR /home/gha
+
+COPY llvm/utils/git/requirements_formatting.txt /home/gha/requirements_formatting.txt
+RUN python -m venv venv && \
+    venv/bin/pip install -r /home/gha/requirements_formatting.txt && \
+    rm /home/gha/requirements_formatting.txt

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right approach.

  1. Building LLVM yet another time for another container doesn't make much sense. We should probably be building this as an offshoot of the PGO optimized container so we can keep all the builds together. That complicates version updates though.
  2. The number of container build workflows that have basically the exact same contents are getting high. It needs to get refactored into a common action that can be invoked with a dockerfile path/image name.

@vbvictor
Copy link
Contributor Author

vbvictor commented Sep 29, 2025

I don't think this is the right approach.

  1. Building LLVM yet another time for another container doesn't make much sense. We should probably be building this as an offshoot of the PGO optimized container so we can keep all the builds together. That complicates version updates though.
  2. The number of container build workflows that have basically the exact same contents are getting high. It needs to get refactored into a common action that can be invoked with a dockerfile path/image name.
  1. Do you mean we should create another FROM base as ci-code-format-container in .github/workflows/containers/github-action-ci/Dockerfile or refactor base in a separate Dockerfile and create separate Dockerfiles for ci-container, ci-code-format, etc.. ? I personally prefer second option with multiple Dockefiles. We could pass LLVM_VERSION and other parameters (like projects to build) via ENV.

  2. So we can have a separate .yml files for



    And then create workflows that will basically consist of these two steps but with different container names.

@boomanaiden154
Copy link
Contributor

Do you mean we should create another FROM base as ci-code-format-container in .github/workflows/containers/github-action-ci/Dockerfile

This. Then add another container push in the existing workflow, similar to what we do for the agent container. If we want to build from source.

What I would probably prefer is to keep it separate (like this) but just pull in the clang-format binary by downloading it from the official release binaries. And I guess a separate workflow is fine for now. It needs to be refactored later, but it's not really hurting anything.

@vbvictor
Copy link
Contributor Author

What I would probably prefer is to keep it separate (like this) but just pull in the clang-format binary by downloading it from the official release binaries.

Prebuild binaries indeed is most straightforward and easy approach. BTW, do we have prebuild linux binaries available? I guess we need to download them from apt.

@boomanaiden154
Copy link
Contributor

Prebuild binaries indeed is most straightforward and easy approach. BTW, do we have prebuild linux binaries available? I guess we need to download them from apt.

The release binaries on Github. Eg https://github.com/llvm/llvm-project/releases/tag/llvmorg-21.1.2 should have it. Apt doesn't allow precise control over minor versions.

@vbvictor
Copy link
Contributor Author

vbvictor commented Oct 2, 2025

Prebuild binaries indeed is most straightforward and easy approach. BTW, do we have prebuild linux binaries available? I guess we need to download them from apt.

The release binaries on Github. Eg https://github.com/llvm/llvm-project/releases/tag/llvmorg-21.1.2 should have it. Apt doesn't allow precise control over minor versions.

I've implemented this solution. I think we better have a base container with prebuild tooling and then create code-lint, code-format on top of it.

As for the build size, here is what I got locally:

localhost/test-code-lint-container     latest      40a6e6950779  13 minutes ago  859 MB
localhost/test-code-format-container   latest      3552a9520deb  14 minutes ago  703 MB

Not great, not terrible :) I'll investigate if there is anything more to strip (probably could use apt-get install --no-install-recommends and apt-get autoremove)

@vbvictor vbvictor force-pushed the add-code-format-container branch from a4a25bd to 8496a1b Compare October 2, 2025 23:30
@vbvictor vbvictor changed the title [WIP][Github][CI] Add separate container for code-format premerge job [Github][CI] Add separate container for code-format premerge job Oct 3, 2025
@vbvictor
Copy link
Contributor Author

vbvictor commented Oct 3, 2025

@boomanaiden154, would you mind giving a look at this and add other reviewers who you think should look at it too?

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more comments. This looks almost ready.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming CI passes.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the most recent changes.

@vbvictor vbvictor merged commit 6beb860 into llvm:main Oct 6, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants