Commit a3ca551

ci: add a preliminary compliance scan to ci (#7289) (#7306)
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Signed-off-by: Anant Sharma <anants@nvidia.com>
Co-authored-by: Anant Sharma <anants@nvidia.com>
1 parent 2e3605e commit a3ca551

13 files changed, 1030 additions and 12 deletions

.dockerignore

Lines changed: 1 addition & 12 deletions
@@ -1,17 +1,5 @@
 # SPDX-FileCopyrightText: Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
 
 **/*.onnx
 **/*.plan
@@ -45,6 +33,7 @@ container/Dockerfile*
 container/**/*.Dockerfile
 container/render.py
 container/context.yaml
+container/compliance/
 .venv
 .venv-docs
 

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+name: 'Compliance Scan'
+description: 'Generate attribution CSVs (dpkg + Python) for a container image and upload as workflow artifacts'
+
+inputs:
+  image:
+    description: 'Full container image URI to scan (must be pullable)'
+    required: true
+  artifact_name:
+    description: 'Name for the uploaded artifact (e.g., compliance-vllm-cuda12-amd64)'
+    required: true
+  framework:
+    description: 'Framework name for base image resolution (vllm, sglang, trtllm, dynamo)'
+    required: false
+    default: ''
+  target:
+    description: 'Build target for base image resolution (runtime or frontend)'
+    required: false
+    default: 'runtime'
+  cuda_version:
+    description: 'CUDA version for base image resolution (e.g., 12.9, 13.0, 13.1)'
+    required: false
+    default: ''
+  base_image:
+    description: 'Explicit base image for diff (overrides framework/cuda-version auto-resolve)'
+    required: false
+    default: ''
+  retention_days:
+    description: 'Artifact retention in days'
+    required: false
+    default: '90'
+
+runs:
+  using: "composite"
+  steps:
+    - name: Set up Docker Buildx
+      uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 #v3.11.1
+      with:
+        driver: docker-container
+        # Enable BuildKit for enhanced metadata
+        buildkitd-flags: --debug
+        version: v0.14.1
+    - name: Cleanup
+      if: always()
+      shell: bash
+      run: |
+        docker system prune -af
+    - name: Set up Python
+      uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
+      with:
+        python-version: '3.12'
+        pip-install: pyyaml
+
+    - name: Pull container image
+      shell: bash
+      run: |
+        source ./.github/scripts/retry_docker.sh
+        retry_pull ${{ inputs.image }}
+
+    - name: Generate attribution CSVs
+      shell: bash
+      run: |
+        ARGS=""
+        if [ -n "${{ inputs.base_image }}" ]; then
+          ARGS+=" --base-image ${{ inputs.base_image }}"
+        elif [ -n "${{ inputs.framework }}" ]; then
+          ARGS+=" --framework ${{ inputs.framework }}"
+          ARGS+=" --target ${{ inputs.target }}"
+          if [ -n "${{ inputs.cuda_version }}" ]; then
+            ARGS+=" --cuda-version ${{ inputs.cuda_version }}"
+          fi
+        fi
+
+        python container/compliance/generate_attributions.py \
+          "${{ inputs.image }}" \
+          --output "${{ inputs.artifact_name }}.csv" \
+          --verbose \
+          ${ARGS}
+
+    - name: Upload attribution artifacts
+      if: always()
+      uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
+      with:
+        name: ${{ inputs.artifact_name }}
+        path: ${{ inputs.artifact_name }}*.csv
+        retention-days: ${{ inputs.retention_days }}
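
Reviewer note: the "Generate attribution CSVs" step picks its base-image flags by precedence: an explicit `base_image` wins, otherwise `framework` (with `target`, and `cuda_version` when set) is used, otherwise no base is passed. The Python sketch below mirrors that branching for illustration only. The function name `build_scan_args` is hypothetical and is not part of the commit; the flags and script path are the ones shown in the diff.

```python
def build_scan_args(image, artifact_name, base_image="", framework="",
                    target="runtime", cuda_version=""):
    """Return the argv list the composite action would effectively run.

    Hypothetical helper: mirrors the shell branching in the action above.
    """
    args = [
        "python", "container/compliance/generate_attributions.py",
        image,
        "--output", f"{artifact_name}.csv",
        "--verbose",
    ]
    if base_image:
        # An explicit base image overrides framework auto-resolution.
        args += ["--base-image", base_image]
    elif framework:
        args += ["--framework", framework, "--target", target]
        if cuda_version:
            args += ["--cuda-version", cuda_version]
    return args


if __name__ == "__main__":
    print(build_scan_args("registry.example/img:tag", "compliance-vllm",
                          framework="vllm", cuda_version="12.9"))
```

Building the arguments as a list rather than a space-joined string is a design choice of this sketch: each value stays a single argument even if it contains whitespace.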

.github/filters.yaml

Lines changed: 2 additions & 0 deletions
@@ -82,6 +82,7 @@ core:
   - 'container/templates/wheel_builder.Dockerfile'
   - '.dockerignore'
   - 'container/deps/*'
+  - 'container/compliance/**'
   - '.cargo/config.toml'
   - 'lib/**'
   - 'tests/**'
@@ -151,6 +152,7 @@ frontend:
   - '*.toml'
   - '*.lock'
   - 'container/deps/*'
+  - 'container/compliance/**'
   - 'components/src/dynamo/router/**'
   - 'components/src/dynamo/mocker/**'
   - 'components/src/dynamo/frontend/**'

.github/scripts/retry_docker.sh

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+# Retry docker operations with exponential backoff.
+# Safe under `set -e`: the `if` conditional context prevents a failed
+# `docker <operation>` from triggering an immediate exit.
+retry_docker_operation() {
+    local operation="$1"
+    local image="$2"
+    local max_attempts=3
+    local wait_seconds=10
+    local attempt=1
+
+    if [[ "$operation" != "push" && "$operation" != "pull" ]]; then
+        echo "Unsupported docker operation: $operation (expected: push|pull)" >&2
+        return 2
+    fi
+
+    while true; do
+        if docker "$operation" "$image"; then
+            return 0
+        fi
+        echo "Docker ${operation} failed for $image (attempt ${attempt}/${max_attempts})." >&2
+
+        if (( attempt >= max_attempts )); then
+            echo "Docker ${operation} failed after ${max_attempts} attempts: $image" >&2
+            return 1
+        fi
+
+        echo "Retrying docker ${operation} in ${wait_seconds}s..."
+        sleep "$wait_seconds"
+        attempt=$((attempt + 1))
+        wait_seconds=$((wait_seconds * 2))
+        if (( wait_seconds > 120 )); then
+            wait_seconds=120
+        fi
+    done
+}
+
+retry_push() {
+    local image="$1"
+    retry_docker_operation push "$image"
+}
+
+retry_pull() {
+    local image="$1"
+    retry_docker_operation pull "$image"
+}
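
Reviewer note: with `max_attempts=3`, an initial wait of 10 seconds, and a 120-second cap, the shell function above sleeps at most twice (10 s, then 20 s) before giving up. The Python sketch below reproduces the same schedule so the timing can be checked without running Docker. The names `backoff_delays` and `retry` are hypothetical and do not appear in the commit; the `sleep` parameter is injectable only so the example can be exercised without waiting.

```python
import time


def backoff_delays(max_attempts=3, initial=10, cap=120):
    """Delays slept between attempts: doubled each time, capped at `cap`."""
    delays = []
    wait = initial
    for _ in range(max_attempts - 1):
        delays.append(wait)
        wait = min(wait * 2, cap)
    return delays


def retry(operation, max_attempts=3, initial=10, cap=120, sleep=time.sleep):
    """Call `operation` until it returns truthy or attempts run out."""
    wait = initial
    for attempt in range(1, max_attempts + 1):
        if operation():
            return True
        if attempt == max_attempts:
            return False  # no sleep after the final failed attempt
        sleep(wait)
        wait = min(wait * 2, cap)
    return False


if __name__ == "__main__":
    print(backoff_delays())                # schedule for the defaults in the script
    print(backoff_delays(max_attempts=6))  # shows where the 120 s cap takes effect
```

The cap only matters if `max_attempts` is raised; with the values hard-coded in the script it is never reached.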

.github/workflows/build-frontend-image.yaml

Lines changed: 38 additions & 0 deletions
@@ -221,6 +221,44 @@ jobs:
           echo "| \`${{ steps.calculate-target-tag.outputs.default_target_image_uri }}\` |" >> $GITHUB_STEP_SUMMARY
           echo "| \`${{ steps.calculate-target-tag.outputs.azure_target_image_uri }}\` |" >> $GITHUB_STEP_SUMMARY
 
+  # ============================================================================
+  # COMPLIANCE — Generate attribution CSVs for dpkg and Python packages
+  # ============================================================================
+  compliance:
+    needs: [build-frontend-image, changed-files]
+    if: needs.build-frontend-image.result == 'success'
+    strategy:
+      fail-fast: false
+      matrix:
+        arch: [amd64, arm64]
+    name: Compliance frontend-${{ matrix.arch }}
+    runs-on: ${{ matrix.arch == 'amd64' && 'prod-builder-amd-v1' || 'prod-tester-arm-v1' }}
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - name: Docker Login
+        uses: ./.github/actions/docker-login
+        with:
+          aws_default_region: ${{ secrets.AWS_DEFAULT_REGION }}
+          aws_account_id: ${{ secrets.AWS_ACCOUNT_ID }}
+          azure_acr_hostname: ${{ secrets.AZURE_ACR_HOSTNAME }}
+          azure_acr_user: ${{ secrets.AZURE_ACR_USER }}
+          azure_acr_password: ${{ secrets.AZURE_ACR_PASSWORD }}
+      - name: Calculate image URI
+        id: images
+        shell: bash
+        run: |
+          TARGET_TAG="${{ github.sha }}-frontend-${{ matrix.arch }}"
+          FRONTEND_IMAGE="${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ secrets.AWS_DEFAULT_REGION }}.amazonaws.com/ai-dynamo/dynamo:${TARGET_TAG}"
+          echo "frontend_image=${FRONTEND_IMAGE}" >> $GITHUB_OUTPUT
+      - name: Compliance scan
+        uses: ./.github/actions/compliance-scan
+        with:
+          image: ${{ steps.images.outputs.frontend_image }}
+          artifact_name: compliance-frontend-${{ matrix.arch }}
+          framework: dynamo
+          target: frontend
+
   frontend-status-check:
     runs-on: ubuntu-latest
     needs: [changed-files, build-frontend-image, build-epp-image]

.github/workflows/build-test-distribute-flavor.yml

Lines changed: 37 additions & 0 deletions
@@ -427,6 +427,43 @@ jobs:
       dind_as_sidecar: 'true'
 
 
+  # ============================================================================
+  # COMPLIANCE — Generate attribution CSVs for dpkg and Python packages
+  # ============================================================================
+  compliance:
+    if: inputs.build_image && inputs.push_image
+    needs: [build]
+    name: Compliance cuda${{ inputs.cuda_version }}-${{ inputs.platform }}
+    runs-on: ${{ inputs.platform == 'amd64' && 'prod-builder-amd-v1' || 'prod-tester-arm-v1' }}
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - name: Docker Login
+        uses: ./.github/actions/docker-login
+        with:
+          aws_default_region: ${{ secrets.AWS_DEFAULT_REGION }}
+          aws_account_id: ${{ secrets.AWS_ACCOUNT_ID }}
+          azure_acr_hostname: ${{ secrets.AZURE_ACR_HOSTNAME }}
+          azure_acr_user: ${{ secrets.AZURE_ACR_USER }}
+          azure_acr_password: ${{ secrets.AZURE_ACR_PASSWORD }}
+      - name: Calculate image URI
+        id: images
+        shell: bash
+        run: |
+          CUDA_VERSION_RAW=${{ inputs.cuda_version }}
+          CUDA_VERSION=${CUDA_VERSION_RAW%%.*}
+          echo "cuda_major=${CUDA_VERSION}" >> $GITHUB_OUTPUT
+          RUNTIME_IMAGE=${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ secrets.AWS_DEFAULT_REGION }}.amazonaws.com/ai-dynamo/dynamo:${{ needs.build.outputs.target_tag_plain }}-cuda${CUDA_VERSION}-${{ inputs.platform }}
+          echo "runtime_image=${RUNTIME_IMAGE}" >> $GITHUB_OUTPUT
+      - name: Compliance scan
+        uses: ./.github/actions/compliance-scan
+        with:
+          image: ${{ steps.images.outputs.runtime_image }}
+          artifact_name: compliance-${{ inputs.framework }}-${{ inputs.target }}${{ inputs.make_efa && '-efa' || '' }}-cuda${{ steps.images.outputs.cuda_major }}-${{ inputs.platform }}
+          framework: ${{ inputs.framework }}
+          cuda_version: ${{ inputs.cuda_version }}
+
+
   # ============================================================================
   # COPY TO ACR
   # ============================================================================

container/compliance/README.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
+# Container Compliance Tooling
+
+Scripts for generating attribution CSVs from built container images, listing all installed dpkg and Python packages with their SPDX license identifiers where known.
+
+## Output format
+
+Each run produces up to two CSV files:
+
+| Column | Description |
+|--------|-------------|
+| `package_name` | Package name as reported by dpkg or pip |
+| `version` | Installed version |
+| `type` | `dpkg` or `python` |
+| `spdx_license` | SPDX identifier (e.g. `MIT`, `Apache-2.0`) or `UNKNOWN` |
+
+Files are sorted by `(type, package_name)` for stable diffs.
+
+When a base image is provided, a second `_diff.csv` file is written containing only packages that are new or version-changed relative to the base — i.e. what Dynamo's build layers added on top of the upstream image.
+
+## Usage
+
+```bash
+# Full scan, output to stdout
+python container/compliance/generate_attributions.py <image:tag>
+
+# Write to file
+python container/compliance/generate_attributions.py <image:tag> -o attribution.csv
+
+# With base image diff — auto-resolved from context.yaml
+python container/compliance/generate_attributions.py <image:tag> \
+  --framework vllm \
+  --cuda-version 12.9 \
+  -o attribution-vllm-cuda12-amd64.csv
+# Produces: attribution-vllm-cuda12-amd64.csv (full)
+#           attribution-vllm-cuda12-amd64_diff.csv (delta from base)
+
+# With explicit base image override
+python container/compliance/generate_attributions.py <image:tag> \
+  --base-image nvcr.io/nvidia/cuda:12.9.1-runtime-ubuntu24.04 \
+  -o attribution.csv
+
+# Frontend image
+python container/compliance/generate_attributions.py <image:tag> \
+  --framework dynamo \
+  --target frontend \
+  -o attribution-frontend-amd64.csv
+
+# dpkg only
+python container/compliance/generate_attributions.py <image:tag> \
+  --types dpkg \
+  -o attribution-dpkg.csv
+```
+
+### All flags
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `image` | *(required)* | Container image to scan |
+| `--output`, `-o` | stdout | Output CSV path |
+| `--framework` || Auto-resolve base image from `context.yaml` (`vllm`, `sglang`, `trtllm`, `dynamo`) |
+| `--target` | `runtime` | Build target for base resolution (`runtime` or `frontend`) |
+| `--cuda-version` || CUDA version for base resolution (e.g. `12.9`, `13.0`, `13.1`) |
+| `--base-image` || Explicit base image URI (overrides `--framework` auto-resolve) |
+| `--context-yaml` | `container/context.yaml` | Path to context.yaml |
+| `--types` | `dpkg,python` | Comma-separated list of types to extract |
+| `--docker-cmd` | `docker` | Docker binary to use |
+| `--verbose`, `-v` || Enable verbose logging to stderr |
+
+## Base image reference
+
+| Framework | CUDA | Base image |
+|-----------|------|------------|
+| `vllm` | 12.9 | `nvcr.io/nvidia/cuda:12.9.1-runtime-ubuntu24.04` |
+| `vllm` | 13.0 | `nvcr.io/nvidia/cuda:13.0.2-runtime-ubuntu24.04` |
+| `sglang` | 12.9 | `lmsysorg/sglang:v0.5.9-runtime` |
+| `sglang` | 13.0 | `lmsysorg/sglang:v0.5.9-cu130-runtime` |
+| `trtllm` | 13.1 | `nvcr.io/nvidia/cuda-dl-base:25.12-cuda13.1-runtime-ubuntu24.04` |
+| `dynamo` frontend || `nvcr.io/nvidia/base/ubuntu:noble-20250619` |
+
+These values are sourced from `container/context.yaml` at runtime; the table above reflects the current defaults.
+
+## How it works
+
+The script runs two lightweight helper scripts **inside the container** via `docker run --rm -v`:
+
+- **dpkg extractor** — runs `dpkg-query` to list packages, then reads `/usr/share/doc/<pkg>/copyright` files for license info. Only DEP-5 machine-readable copyright files are parsed; ambiguous cases return `UNKNOWN`.
+- **Python extractor** — uses `importlib.metadata.distributions()` to iterate installed packages. License is read from `License-Expression` (PEP 639), then `License` metadata, then trove classifiers. Ambiguous cases return `UNKNOWN`.
+
+Both helpers are self-contained and have no external dependencies — they run with whatever Python is in the container.
+
+## License detection
+
+Detection is intentionally conservative: only unambiguous matches are assigned SPDX identifiers. The `UNKNOWN` entries are expected; they can be resolved with additional analysis against the raw copyright files.
+
+## CI integration
+
+Attribution CSVs are generated automatically as part of CI after every successful image build. Artifacts are available in the GitHub Actions workflow run under:
+- `compliance-{framework}-cuda{major}-{platform}` — runtime images
+- `compliance-frontend-{arch}` — frontend image
+
+The scan runs as a separate lightweight job (`prod-default-small-v2`) in parallel with tests, so it does not extend pipeline wall time.
+
+## Requirements
+
+- Python 3.11+
+- `docker` (or compatible CLI) with access to the target registry
+- `pyyaml` — only required on the host when using `--framework`/`--cuda-version` base image auto-resolution (`pip install pyyaml`)
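
Reviewer note: the README describes the `_diff.csv` as "only packages that are new or version-changed relative to the base", sorted by `(type, package_name)`. The script itself is not shown in this excerpt, so the following is only a minimal sketch of one way to compute that delta. The function name `diff_against_base` and the dict-per-row representation are assumptions; the column names come from the README's output-format table.

```python
def diff_against_base(full_rows, base_rows):
    """Rows from `full_rows` that are absent from the base or differ in version.

    Hypothetical helper: each row is a dict with the README's CSV columns.
    """
    # Index the base image's packages by (type, package_name).
    base_versions = {
        (row["type"], row["package_name"]): row["version"] for row in base_rows
    }
    delta = [
        row for row in full_rows
        # A missing key yields None, which never equals a real version string.
        if base_versions.get((row["type"], row["package_name"])) != row["version"]
    ]
    # Same ordering the README specifies for stable diffs.
    return sorted(delta, key=lambda row: (row["type"], row["package_name"]))


if __name__ == "__main__":
    full = [
        {"package_name": "zlib1g", "version": "1.3", "type": "dpkg", "spdx_license": "Zlib"},
        {"package_name": "numpy", "version": "2.0", "type": "python", "spdx_license": "BSD-3-Clause"},
        {"package_name": "bash", "version": "5.2", "type": "dpkg", "spdx_license": "UNKNOWN"},
    ]
    base = [
        {"package_name": "bash", "version": "5.2", "type": "dpkg", "spdx_license": "UNKNOWN"},
        {"package_name": "zlib1g", "version": "1.2", "type": "dpkg", "spdx_license": "Zlib"},
    ]
    for row in diff_against_base(full, base):
        print(row["type"], row["package_name"], row["version"])
```

Keying on `(type, package_name)` keeps a dpkg package and a Python package with the same name from shadowing each other.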
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+"""Attribution extractors for container dependency scanning."""

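Reviewer note: the extractor modules themselves are not shown in this excerpt. Going only by the README's description of the Python extractor (read `License-Expression`, then `License`, then trove classifiers, and return `UNKNOWN` when ambiguous), a conservative resolver could look like the sketch below. The function name `resolve_license` and both lookup tables are illustrative assumptions and are deliberately tiny; the metadata object is expected to behave like the `email.message.Message` returned by `importlib.metadata`.

```python
from email.message import Message

# Illustrative, deliberately small tables (assumptions, not the commit's data).
KNOWN_SPDX_IDS = {"MIT", "Apache-2.0", "BSD-3-Clause"}
CLASSIFIER_TO_SPDX = {
    "License :: OSI Approved :: MIT License": "MIT",
    "License :: OSI Approved :: Apache Software License": "Apache-2.0",
}


def resolve_license(meta):
    """Return an SPDX identifier, or "UNKNOWN" when the metadata is ambiguous."""
    # 1. PEP 639 field: already an SPDX expression.
    expression = (meta.get("License-Expression") or "").strip()
    if expression:
        return expression
    # 2. Free-text License field: accept only an exact, known SPDX id.
    text = (meta.get("License") or "").strip()
    if text in KNOWN_SPDX_IDS:
        return text
    # 3. Trove classifiers: accept only if they agree on a single license.
    matches = {
        CLASSIFIER_TO_SPDX[c]
        for c in (meta.get_all("Classifier") or [])
        if c in CLASSIFIER_TO_SPDX
    }
    if len(matches) == 1:
        return matches.pop()
    return "UNKNOWN"


if __name__ == "__main__":
    meta = Message()
    meta["Classifier"] = "License :: OSI Approved :: MIT License"
    print(resolve_license(meta))
```

Against a live environment the same function could be applied to `dist.metadata` for each `dist` in `importlib.metadata.distributions()`.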