5 changes: 5 additions & 0 deletions docker/Dockerfile.multi
@@ -129,6 +129,9 @@ ARG BUILD_WHEEL_ARGS="--clean --benchmarks"
ARG BUILD_WHEEL_SCRIPT="scripts/build_wheel.py"
RUN --mount=type=cache,target=/root/.cache/pip --mount=type=cache,target=${CCACHE_DIR} \
GITHUB_MIRROR=$GITHUB_MIRROR python3 ${BUILD_WHEEL_SCRIPT} ${BUILD_WHEEL_ARGS}
RUN python3 scripts/copy_third_party_sources.py \
Collaborator:

It may not work or even break in the post-merge pipeline where we generate the NGC image candidates for releasing. In the post-merge pipeline, in order to speed up the release wheel generation, we actually don't build the wheel in the docker build. Instead, we copy the build artifacts from the corresponding normal build stage and install the pre-built wheel directly.

See https://github.com/NVIDIA/TensorRT-LLM/blob/main/jenkins/BuildDockerImage.groovy#L336.

Collaborator Author:

Thank you for the pointer. I will update and test the post-merge pipeline prior to submitting.

Collaborator Author (@cheshirekow, Dec 1, 2025):

@chzblych Can you provide me with a more precise indication of what needs to change here? When I look at line 366 where you linked, I see that ultimately prepareWheelFromBuildStage modifies the make variable BUILD_WHEEL_SCRIPT to call into get_wheel_from_package.py.

I don't see a lot of documentation indicating what is going on here. Is the tarfile provided by --artifact_path a tarball of the image layers? Do I simply need to update get_wheel_from_package.py to replicate:

COPY --from=wheel /third-party-sources /third-party-sources

in python?

Also, what steps do I need to follow to verify the post-merge pipeline on this change?

Collaborator:

@ZhanruiSunCh Could you take a look?

Collaborator:

  1. When using get_wheel_from_package.py, we use the tarfile from the build stage. If what you need is not in that tarfile, you can add it there, as done here: https://github.com/NVIDIA/TensorRT-LLM/blob/main/jenkins/Build.groovy#L468-L474, and then move it to the path you need in https://github.com/NVIDIA/TensorRT-LLM/blob/main/scripts/get_wheel_from_package.py#L97
  2. For a local test, you can cherry-pick this commit and use /bot run --stage-list "Build-Docker-Images" to test it in the PR; please do not merge with that commit included.
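
The two steps above could be sketched in Python roughly as follows. This is a hypothetical illustration only, not the actual pipeline code: the archive name, function names, and the `third-party-sources` layout are all assumptions. It only shows the mechanics of appending a directory to an uncompressed build-stage tarfile and later pulling that subtree back out, mirroring what `COPY --from=wheel /third-party-sources /third-party-sources` does in the Dockerfile.

```python
import pathlib
import tarfile


def add_to_artifact(artifact: pathlib.Path, sources_dir: pathlib.Path) -> None:
    # Append the sources to the build-stage artifact. Note that tarfile's
    # append mode ("a") only works on uncompressed tar archives.
    with tarfile.open(artifact, "a") as tar:
        tar.add(sources_dir, arcname="third-party-sources")


def extract_third_party_sources(artifact: pathlib.Path, dest: pathlib.Path) -> None:
    # Replicate the Dockerfile COPY when installing from the pre-built
    # package: extract only the third-party-sources/ subtree.
    with tarfile.open(artifact, "r:*") as tar:
        members = [
            m for m in tar.getmembers()
            if m.name == "third-party-sources"
            or m.name.startswith("third-party-sources/")
        ]
        tar.extractall(path=dest, members=members)
```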

--deps-dir cpp/build/_deps \
--output-dir /third-party-sources

FROM ${DEVEL_IMAGE} AS release

@@ -169,6 +172,8 @@ RUN chmod -R a+w examples && \
benchmarks/cpp/CMakeLists.txt && \
rm -rf /root/.cache/pip

COPY --from=wheel /third-party-sources /third-party-sources

Comment on lines +175 to +176
Collaborator:

We have got a lot of complaints about the size of the release image. I am hesitant to increase it even further unless we absolutely have to. Could we distribute these sources in a different way?

Collaborator Author:


I understand your concern. This is the strategy that was agreed between the OSRB compliance folks and TensorRT. I understand that it is a compromise motivated primarily by ensuring we can immediately demonstrate compliance without spinning up significant infrastructure or adding dedicated staffing. I suspect proposals for alternate strategies would be very welcome in the future.

@atrifex can provide more information about the decision.

Collaborator:

How much space does the additional layer take?

Collaborator Author:


@MartinMarciniszyn - sizes are in the commit message. For your convenience, I added them up, and it looks like about 600MB.

ARG GIT_COMMIT
ARG TRT_LLM_VER
ENV TRT_LLM_GIT_COMMIT=${GIT_COMMIT} \
52 changes: 52 additions & 0 deletions scripts/copy_third_party_sources.py
@@ -0,0 +1,52 @@
"""Copy third-party sources used in the cmake build to a container directory.

The purpose of this script is to simplify the process of producing third party
sources "as used" in the build. We package up all of the sources we use and
stash them in a location in the container so that they are automatically
distributed alongside the build artifacts ensuring that we comply with the
license requirements in an obvious and transparent manner.
"""

import argparse
import logging
import pathlib
import subprocess

logger = logging.getLogger(__name__)


def main():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "--deps-dir",
        type=pathlib.Path,
        required=True,
        help="Path to the third party dependencies directory, e.g. ${CMAKE_BINARY_DIR}/_deps",
    )
    parser.add_argument(
        "--output-dir",
        type=pathlib.Path,
        required=True,
        help="Path to the output directory where third party sources will be copied",
    )

    args = parser.parse_args()

    src_dirs = list(sorted(args.deps_dir.glob("*-src")))
    if not src_dirs:
        raise ValueError(f"No source directories found in {args.deps_dir}")

    for src_dir in src_dirs:
        tarball_name = src_dir.name[:-4] + ".tar.gz"
        output_path = args.output_dir / tarball_name
        logger.info(f"Creating tarball {output_path} from {src_dir}")
        args.output_dir.mkdir(parents=True, exist_ok=True)
        subprocess.run(
            ["tar", "-czf", str(output_path), "-C", str(src_dir.parent), src_dir.name],
            check=True,
        )


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    main()
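
As a side note on the script above: the shell-out to `tar` could be replaced with the standard-library `tarfile` module, which removes the dependency on an external `tar` binary. A minimal sketch of an equivalent helper, assuming the behavior should match `tar -czf <out> -C <parent> <name>` (this helper is not part of the PR):

```python
import pathlib
import tarfile


def pack_source_dir(src_dir: pathlib.Path, output_dir: pathlib.Path) -> pathlib.Path:
    # Emit <name-without--src>.tar.gz with the source directory itself as
    # the top-level entry, matching `tar -czf ... -C parent name`.
    output_dir.mkdir(parents=True, exist_ok=True)
    output_path = output_dir / (src_dir.name[:-len("-src")] + ".tar.gz")
    with tarfile.open(output_path, "w:gz") as tar:
        tar.add(src_dir, arcname=src_dir.name)
    return output_path
```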