-
Notifications
You must be signed in to change notification settings - Fork 2.2k
[TRTLLM-9227][infra] Add Dockerfile step to copy sources #9368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -129,6 +129,9 @@ ARG BUILD_WHEEL_ARGS="--clean --benchmarks" | |
| ARG BUILD_WHEEL_SCRIPT="scripts/build_wheel.py" | ||
| RUN --mount=type=cache,target=/root/.cache/pip --mount=type=cache,target=${CCACHE_DIR} \ | ||
| GITHUB_MIRROR=$GITHUB_MIRROR python3 ${BUILD_WHEEL_SCRIPT} ${BUILD_WHEEL_ARGS} | ||
| RUN python3 scripts/copy_third_party_sources.py \ | ||
| --deps-dir cpp/build/_deps \ | ||
| --output-dir /third-party-sources | ||
|
|
||
| FROM ${DEVEL_IMAGE} AS release | ||
|
|
||
|
|
@@ -169,6 +172,8 @@ RUN chmod -R a+w examples && \ | |
| benchmarks/cpp/CMakeLists.txt && \ | ||
| rm -rf /root/.cache/pip | ||
|
|
||
| COPY --from=wheel /third-party-sources /third-party-sources | ||
|
|
||
|
Comment on lines
+175
to
+176
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We have got a lot of complaints about the size of the release image. I am hesitant to increase it even further unless we absolutely have to. Could we distribute these sources in a different way?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand your concern. This is the strategy that was agreed between the OSRB compliance folks and TensorRT. I understand that it is a compromise motivated primarily by ensuring we can immediately demonstrate compliance without spinning up significant infrastructure or adding dedicated staffing. I suspect proposals for alternate strategies would very welcome in the future. @atrifex Can provide more information about the decision.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How much space does the additional layer take?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @MartinMarciniszyn - sizes are in the commit message. For your convenience, I added them up, and it looks like about 600MB. |
||
| ARG GIT_COMMIT | ||
| ARG TRT_LLM_VER | ||
| ENV TRT_LLM_GIT_COMMIT=${GIT_COMMIT} \ | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| """Copy third-party sources used in the cmake build to a container directory. | ||
|
|
||
| The purpose of this script is to simplify the process of producing third party | ||
| sources "as used" in the build. We package up all of the sources we use and | ||
| stash them in a location in the container so that they are automatically | ||
| distributed alongside the build artifacts ensuring that we comply with the | ||
| license requirements in an obvious and transparent manner. | ||
| """ | ||
|
|
||
| import argparse | ||
| import logging | ||
| import pathlib | ||
| import subprocess | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| def main(): | ||
| parser = argparse.ArgumentParser(description=__doc__) | ||
| parser.add_argument( | ||
| "--deps-dir", | ||
| type=pathlib.Path, | ||
| required=True, | ||
| help="Path to the third party dependencies directory, e.g. ${CMAKE_BINARY_DIR}/_deps", | ||
| ) | ||
| parser.add_argument( | ||
| "--output-dir", | ||
| type=pathlib.Path, | ||
| required=True, | ||
| help="Path to the output directory where third party sources will be copied", | ||
| ) | ||
|
|
||
| args = parser.parse_args() | ||
|
|
||
| src_dirs = list(sorted(args.deps_dir.glob("*-src"))) | ||
| if not src_dirs: | ||
| raise ValueError(f"No source directories found in {args.deps_dir}") | ||
|
|
||
| for src_dir in src_dirs: | ||
| tarball_name = src_dir.name[:-4] + ".tar.gz" | ||
| output_path = args.output_dir / tarball_name | ||
| logger.info(f"Creating tarball {output_path} from {src_dir}") | ||
| args.output_dir.mkdir(parents=True, exist_ok=True) | ||
| subprocess.run( | ||
| ["tar", "-czf", str(output_path), "-C", str(src_dir.parent), src_dir.name], | ||
| check=True, | ||
| ) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| logging.basicConfig(level=logging.INFO) | ||
| main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not work or even break in the post-merge pipeline where we generate the NGC image candidates for releasing. In the post-merge pipeline, in order to speed up the release wheel generation, we actually don't build the wheel in the docker build. Instead, we copy the build artifacts from the corresponding normal build stage and install the pre-built wheel directly.
See https://github.com/NVIDIA/TensorRT-LLM/blob/main/jenkins/BuildDockerImage.groovy#L336.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the pointer. I will update and test the post-merge pipeline prior to submit.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chzblych Can you provide me with a more precise indication of what needs to change here. When I look at line 366 where you linked, I see that ultimately
prepareWheelFromBuildStagemodifies the make variablesBUILD_WHEEL_SCRIPTto call intoget_wheel_from_package.py.I don't see a lot of documentation to indicate what is going on here. Is the tarfile provided to by --artifact_path a tarball of the image layers? Do I simply need to update
get_wheel_from_package.pyto replicate:in python?
Also, what steps do I need to follow to verify the post-merge pipeline on this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ZhanruiSunCh Could you take a look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_wheel_from_package.py, we will use the tarfile from build stage. If what you need is not in tarfile, you can add them in tarfile like here: https://github.com/NVIDIA/TensorRT-LLM/blob/main/jenkins/Build.groovy#L468-L474, and move it to the path what you need in https://github.com/NVIDIA/TensorRT-LLM/blob/main/scripts/get_wheel_from_package.py#L97/bot run --stage-list "Build-Docker-Images"to test it in PR, pls not merge with this commit.