Commit fa83657

mszhanyi (Yi Zhang) authored and committed
Fix docker image layer caching to avoid redundant docker building and transient connection exceptions. (#21612)
### Description

Improve the docker commands so that docker image layer caching works. This makes docker builds faster and more stable.
For now, the A100 pool's system disk is too small to use the docker cache.
We no longer use the pipeline cache for docker images, and some legacy code is removed.

### Motivation and Context

The following exception occurs often:
```
64.58 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail
286.4 curl: (92) HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)
```
It happens because the Onnxruntime pipelines send too many requests to download Node.js during docker builds. This is currently the major cause of pipeline failures.

In fact, docker image layer caching never worked. The build logs show that the scripts still run every time:
```
#9 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#9 0.234 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 0.235 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 0.235 /tmp/scripts/install_centos.sh: line 1: !/bin/bash: No such file or directory
#9 0.235 ++ '[' '!' -f /etc/yum.repos.d/microsoft-prod.repo ']'
#9 0.236 +++ tr -dc 0-9.
#9 0.236 +++ cut -d . -f1
#9 0.238 ++ os_major_version=8
....
#9 60.41 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail
#9 60.59 + return 0
...
```
This PR improves the docker commands so that image layer caching works. As a result, CI no longer sends so many redundant requests to download Node.js.
```
#9 [2/5] ADD scripts /tmp/scripts
#9 CACHED
#10 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#10 CACHED
#11 [4/5] RUN adduser --uid 1000 onnxruntimedev
#11 CACHED
#12 [5/5] WORKDIR /home/onnxruntimedev
#12 CACHED
```

### Reference

https://docs.docker.com/build/drivers/

---------

Co-authored-by: Yi Zhang <[email protected]>
1 parent 099ba67 commit fa83657
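
The commit message judges whether caching works by looking for `CACHED` markers in the build log. A minimal sketch of automating that check is shown here. It is not part of this commit: the function name `summarize_cache_hits` is hypothetical, and it assumes the plain-progress log format quoted in the commit message (`#N [i/n] instruction` followed by `#N CACHED`). Step headers that carry a stage name inside the brackets would need a looser pattern.

```python
import re

# Matches a step header such as "#9 [2/5] ADD scripts /tmp/scripts".
STEP_RE = re.compile(r"^#(\d+) \[(\d+)/(\d+)\] (.*)$")
# Matches the cache-hit marker for a step, such as "#9 CACHED".
CACHED_RE = re.compile(r"^#(\d+) CACHED$")


def summarize_cache_hits(log_text):
    """Return {instruction: was_cached} for each Dockerfile step found in the log."""
    steps = {}      # step id -> instruction text
    cached = set()  # step ids reported as CACHED
    for raw in log_text.splitlines():
        line = raw.strip()
        match = STEP_RE.match(line)
        if match:
            steps.setdefault(match.group(1), match.group(4))
            continue
        match = CACHED_RE.match(line)
        if match:
            cached.add(match.group(1))
    return {instruction: step_id in cached for step_id, instruction in steps.items()}


# Sample input copied from the cached build log in the commit message.
sample_log = """\
#9 [2/5] ADD scripts /tmp/scripts
#9 CACHED
#10 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#10 CACHED
#11 [4/5] RUN adduser --uid 1000 onnxruntimedev
#11 CACHED
#12 [5/5] WORKDIR /home/onnxruntimedev
#12 CACHED
"""

summary = summarize_cache_hits(sample_log)
for instruction, hit in summary.items():
    print("CACHED " if hit else "REBUILT", instruction)
```

A step whose header is followed by timestamped output lines instead of a `CACHED` marker is reported as rebuilt, which matches the failing log quoted in the commit message.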

6 files changed: 32 additions, 69 deletions


tools/ci_build/get_docker_image.py

Lines changed: 6 additions & 18 deletions
@@ -98,42 +98,30 @@ def main():
     )

     if use_container_registry:
+        run(args.docker_path, "buildx", "create", "--driver=docker-container", "--name=container_builder")
         run(
             args.docker_path,
             "--log-level",
             "error",
             "buildx",
             "build",
-            "--push",
+            "--load",
             "--tag",
             full_image_name,
-            "--cache-from",
-            full_image_name,
+            "--cache-from=type=registry,ref=" + full_image_name,
+            "--builder",
+            "container_builder",
             "--build-arg",
             "BUILDKIT_INLINE_CACHE=1",
             *shlex.split(args.docker_build_args),
             "-f",
             args.dockerfile,
             args.context,
         )
-    elif args.use_imagecache:
-        log.info("Building image with pipeline cache...")
         run(
             args.docker_path,
-            "--log-level",
-            "error",
-            "buildx",
-            "build",
-            "--tag",
-            full_image_name,
-            "--cache-from",
+            "push",
             full_image_name,
-            "--build-arg",
-            "BUILDKIT_INLINE_CACHE=1",
-            *shlex.split(args.docker_build_args),
-            "-f",
-            args.dockerfile,
-            args.context,
         )
     else:
         log.info("Building image...")
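
Because the hunk above interleaves removed and added arguments, the resulting command sequence can be hard to read. The following sketch lists the three docker invocations that the container-registry branch issues after this change: create a `docker-container` builder, build with `--load` and a registry cache source, then push. The function name `cached_build_commands` and the example image name are hypothetical. Unlike the script's `run` helper, this sketch only assembles the argument lists and does not execute anything.

```python
import shlex


def cached_build_commands(docker_path, full_image_name, dockerfile, context, docker_build_args=""):
    """Return the docker invocations, in order, as argument lists (nothing is executed)."""
    builder = "container_builder"  # builder name used in the diff above

    # 1. Create a buildx builder backed by the docker-container driver.
    create_builder = [docker_path, "buildx", "create", "--driver=docker-container", "--name=" + builder]

    # 2. Build with that builder, importing layer cache from the registry image
    #    and loading the result into the local image store.
    build = [
        docker_path, "--log-level", "error", "buildx", "build",
        "--load",
        "--tag", full_image_name,
        "--cache-from=type=registry,ref=" + full_image_name,
        "--builder", builder,
        "--build-arg", "BUILDKIT_INLINE_CACHE=1",
        *shlex.split(docker_build_args),
        "-f", dockerfile,
        context,
    ]

    # 3. Push the loaded image so that later builds can reuse its layers.
    push = [docker_path, "push", full_image_name]

    return [create_builder, build, push]


# Hypothetical image name, for illustration only.
commands = cached_build_commands(
    "docker",
    "example.azurecr.io/onnxruntime-build:latest",
    "Dockerfile",
    ".",
    "--build-arg BUILD_UID=1000",
)
for command in commands:
    print(" ".join(shlex.quote(arg) for arg in command))
```

The push becomes a separate step because the build now uses `--load` instead of `--push`, so the image is also available to later `docker run` steps on the same agent.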

tools/ci_build/github/azure-pipelines/bigmodels-ci-pipeline.yml

Lines changed: 1 addition & 0 deletions
@@ -321,6 +321,7 @@ stages:
           --build-arg TRT_VERSION=${{ variables.linux_trt_version }}
           "
         Repository: onnxruntimeubi8packagestest_torch
+        UseImageCacheContainerRegistry: false
         UpdateDepsTxt: false

     - task: DownloadPackage@1

tools/ci_build/github/azure-pipelines/templates/c-api-linux-cpu.yml

Lines changed: 4 additions & 4 deletions
@@ -51,23 +51,23 @@ jobs:
         Dockerfile: tools/ci_build/github/linux/docker/inference/x86_64/default/cpu/Dockerfile
         Context: tools/ci_build/github/linux/docker/inference/x86_64/default/cpu
         DockerBuildArgs: "--build-arg BUILD_UID=$( id -u ) --build-arg BASEIMAGE=${{parameters.BaseImage}}"
-        Repository: onnxruntimecpubuildcentos8${{parameters.OnnxruntimeArch}}
-
+        Repository: onnxruntimecpubuildcentos8${{parameters.OnnxruntimeArch}}_packaging
+
 - ${{ if eq(parameters.OnnxruntimeArch, 'aarch64') }}:
   - template: get-docker-image-steps.yml
     parameters:
       Dockerfile: tools/ci_build/github/linux/docker/inference/aarch64/default/cpu/Dockerfile
       Context: tools/ci_build/github/linux/docker/inference/aarch64/default/cpu
       DockerBuildArgs: "--build-arg BUILD_UID=$( id -u ) --build-arg BASEIMAGE=${{parameters.BaseImage}}"
-      Repository: onnxruntimecpubuildcentos8${{parameters.OnnxruntimeArch}}
+      Repository: onnxruntimecpubuildcentos8${{parameters.OnnxruntimeArch}}_packaging
       UpdateDepsTxt: false

 - task: CmdLine@2
   inputs:
     script: |
       mkdir -p $HOME/.onnx
       docker run --rm --volume /data/onnx:/data/onnx:ro --volume $(Build.SourcesDirectory):/onnxruntime_src --volume $(Build.BinariesDirectory):/build \
-      --volume $HOME/.onnx:/home/onnxruntimedev/.onnx -e NIGHTLY_BUILD onnxruntimecpubuildcentos8${{parameters.OnnxruntimeArch}} /bin/bash -c "python3.9 \
+      --volume $HOME/.onnx:/home/onnxruntimedev/.onnx -e NIGHTLY_BUILD onnxruntimecpubuildcentos8${{parameters.OnnxruntimeArch}}_packaging /bin/bash -c "python3.9 \
       /onnxruntime_src/tools/ci_build/build.py --enable_lto --build_java --build_nodejs --build_dir /build --config Release \
       --skip_submodule_sync --parallel --use_binskim_compliant_compile_flags --build_shared_lib ${{ parameters.AdditionalBuildFlags }} && cd /build/Release && make install DESTDIR=/build/installed"
     workingDirectory: $(Build.SourcesDirectory)

tools/ci_build/github/azure-pipelines/templates/get-docker-image-steps.yml

Lines changed: 19 additions & 45 deletions
@@ -53,6 +53,7 @@ steps:
     displayName: patch manylinux

 - script: |
+    docker version
     docker image ls
     docker system df
   displayName: Check Docker Images
@@ -71,52 +72,25 @@ steps:
         displayName: "Get ${{ parameters.Repository }} image for ${{ parameters.Dockerfile }}"
       ContainerRegistry: onnxruntimebuildcache
 - ${{ if eq(parameters.UseImageCacheContainerRegistry, false) }}:
-  - task: Cache@2
-    displayName: Cache Docker Image Task
-    inputs:
-      key: ' "${{ parameters.Repository }}" | "$(Build.SourceVersion)" '
-      path: ${{ parameters.IMAGE_CACHE_DIR }}
-      restoreKeys: |
-        "${{ parameters.Repository }}" | "$(Build.SourceVersion)"
-        "${{ parameters.Repository }}"
-      cacheHitVar: CACHE_RESTORED
-    condition: eq('${{ parameters.UsePipelineCache }}', 'true')
-
-  - script: |
-      test -f ${{ parameters.IMAGE_CACHE_DIR }}/cache.tar && docker load -i ${{ parameters.IMAGE_CACHE_DIR }}/cache.tar
-      docker image ls
-    displayName: Docker restore
-    condition: eq('${{ parameters.UsePipelineCache }}', 'true')
-
-  - script: |
-      if [ ${{ parameters.UsePipelineCache}} ]
-      then
-        use_imagecache="--use_imagecache"
-      else
-        use_imagecache=""
-      fi
-      ${{ parameters.ScriptName }} \
-        --dockerfile "${{ parameters.Dockerfile }}" \
-        --context "${{ parameters.Context }}" \
-        --docker-build-args "${{ parameters.DockerBuildArgs }}" \
-        --repository "${{ parameters.Repository }}" \
-        $use_imagecache
-    displayName: "Get ${{ parameters.Repository }} image for ${{ parameters.Dockerfile }}"
-
-  - script: |
-      set -ex
-      mkdir -p "${{ parameters.IMAGE_CACHE_DIR }}"
-      docker save -o "${{ parameters.IMAGE_CACHE_DIR }}/cache.tar" ${{ parameters.Repository }}
-      docker image ls
-      docker system df
-    displayName: Docker save
-    condition: eq('${{ parameters.UsePipelineCache }}', 'true')
+  # the difference is no --container-registry
+  - template: with-container-registry-steps.yml
+    parameters:
+      Steps:
+      - script: |
+          ${{ parameters.ScriptName }} \
+            --dockerfile "${{ parameters.Dockerfile }}" \
+            --context "${{ parameters.Context }}" \
+            --docker-build-args "${{ parameters.DockerBuildArgs }}" \
+            --repository "${{ parameters.Repository }}"
+        displayName: "Get ${{ parameters.Repository }} image for ${{ parameters.Dockerfile }}"
+      ContainerRegistry: onnxruntimebuildcache

-  - script: |
-      echo ${{ parameters.IMAGE_CACHE_DIR }}
-      ls -lah ${{ parameters.IMAGE_CACHE_DIR }}
-    displayName: Display docker dir
-    condition: eq('${{ parameters.UsePipelineCache }}', 'true')
+- script: |
+    docker version
+    docker image ls
+    docker system df
+    df -h
+  displayName: Check Docker Images

 - ${{ if and(eq(parameters.UpdateDepsTxt, true), or(eq(variables['System.CollectionId'], 'f3ad12f2-e480-4533-baf2-635c95467d29'),eq(variables['System.CollectionId'], 'bc038106-a83b-4dab-9dd3-5a41bc58f34c'))) }}:
   - task: PythonScript@0

tools/ci_build/github/linux/docker/inference/aarch64/default/cpu/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 ARG BASEIMAGE=arm64v8/almalinux:8
 FROM $BASEIMAGE

-ENV PATH /opt/rh/gcc-toolset-12/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
+ENV PATH=/opt/rh/gcc-toolset-12/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
 ENV LANG=en_US.UTF-8
 ENV LC_ALL=en_US.UTF-8

tools/ci_build/github/linux/docker/inference/x86_64/default/cpu/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 ARG BASEIMAGE=amd64/almalinux:8
 FROM $BASEIMAGE

-ENV PATH /usr/lib/jvm/msopenjdk-11/bin:/opt/rh/gcc-toolset-12/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
+ENV PATH=/usr/lib/jvm/msopenjdk-11/bin:/opt/rh/gcc-toolset-12/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
 ENV LANG=en_US.UTF-8
 ENV LC_ALL=en_US.UTF-8
 ENV JAVA_HOME=/usr/lib/jvm/msopenjdk-11
