
Commit 81ca3ad

rraminen authored and jithunnair-amd committed
CONSOLIDATED COMMITS: Centos stream9 PyTorch support
====================================================

[SOW MS3] Centos stream9 PyTorch image support (#1090)

* changes to build Centos stream 9 images
* Added scripts for centos and centos stream images
* Added an extra line
* Add ninja installation
* Optimized code
* Fixes
* Add comment
* Optimized code
* Added AMDGPU mapping for ROCm 5.2 and invalid-url for rocm_baseurl

Co-authored-by: Jithun Nair <[email protected]>

Updated to latest conda for CentOS stream 9

[CS9] Updates to CentOS stream 9 build (#1326)

- Add missing common_utils.sh
- Update the install vision part
- Move to amdgpu rhel 9.3 builds
- Update to pick python from conda path
- Add a missing package
- Add ROCM_PATH and magma
- Updated repo radeon path

(cherry picked from commit 51ce1cc)

[rocm6.4_internal_testing] Update missing changes for CentOS9 (#1813)

To fix https://ontrack-internal.amd.com/browse/SWDEV-505385 and https://ontrack-internal.amd.com/browse/SWDEV-507301

(cherry picked from commit 956c145)

delete .ci/docker/common/install_db.sh

(cherry picked from commit 8a7fd64)

CONSOLIDATED COMMITS: Updates to build on Jammy and CentOS7
===========================================================

Updates to build on Jammy

- Fortran package installation moved after gcc
- Update libtinfo search code in cmake1
- Install libstdc++.so

[UB22.04] Updates to support latest scipy

Build required version of libpng for CentOS7

Updated condition for libstdc++ for Jammy

Set ROCM_PATH in env for CentOS docker container

Changes to support docker v23

Reversed the condition as required

Temporarily ignore certificate check for Miniconda

(cherry picked from commit 9848db1)

[release/2.1] Skip certificate check for CentOS7 since certificate expired (#1399)

* Skip certificate check only for CentOS7 since certificate expired
* Naming

Remove the installation of rocm-llvm-dev package - Causing regression - SWDEV-463083

fix install_centos() function

[rocm6.3_internal_testing] skip pytorch-nightly installation (#1557)

This PR skips pytorch-nightly installation in docker images.

Installation of pytorch-nightly is needed to prefetch mobilenet_v2 and v3 models for some tests. Came from 85bd6bc.
Models are downloaded on first use to the folder /root/.cache/torch/hub.
But pytorch-nightly installation also overrides .ci/docker/requirements-ci.txt settings and upgrades some python packages (sympy from 1.12.0 to 1.13.0), which causes several 'dynamic_shapes' tests to fail.

Skipping the model prefetch affects these tests without any errors (but **internet access required**):

- python test/mobile/model_test/gen_test_model.py mobilenet_v2
- python test/quantization/eager/test_numeric_suite_eager.py -k test_mobilenet_v3

Issue ROCm/frameworks-internal#8772

Also, in case of issues, these models can be prefetched after building pytorch and before testing.

(cherry picked from commit b92b34d)

Fixes #ISSUE_NUMBER

(cherry picked from commit ec70f7e)

[rocm6.4_internal_testing] Changes to support UB 24.04 build (#1817)

Changes applied from #1816

Test PyTorch build: http://rocm-ci.amd.com/job/mainline-framework-pytorch-ub24.04-py312-internal/5/

(cherry picked from commit 74e1e9e)
(cherry picked from commit e7cb7cc)

Update Centos 9 build

(cherry picked from commit 3d6ba22)

[rocm6.5_internal_testing] remove centos.stream dockerfile and move contents into dockerfile (#2044)

rocm6.5_internal_testing move contents of centos stream dockerfile into dockerfile

Validation: http://rocm-ci.amd.com/job/mainline-framework-pytorch-ci/2448/

---------

Co-authored-by: Jithun Nair <[email protected]>
(cherry picked from commit 7886773)
1 parent d945b9d commit 81ca3ad

8 files changed: +154 -38 lines changed


.ci/docker/build.sh

Lines changed: 4 additions & 0 deletions
@@ -326,6 +326,10 @@ if [[ -n "${CI:-}" ]]; then
   progress_flag="--progress=plain"
 fi

+if [[ "${DOCKER_BUILDKIT}" == 0 ]]; then
+  progress_flag=""
+fi
+
 # Build image
 docker build \
        ${no_cache_flag} \
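
The added guard clears the progress flag when BuildKit is disabled, presumably because `--progress` is documented as a BuildKit option. A minimal sketch of the combined decision follows; `pick_progress_flag` is a hypothetical helper (build.sh sets `progress_flag` inline), and it takes the values of `CI` and `DOCKER_BUILDKIT` as arguments so it can be exercised without touching the environment.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of build.sh: mirrors the two inline checks.
pick_progress_flag() {
  local ci="${1:-}" buildkit="${2:-}"
  local flag=""
  # In CI, ask for plain (non-interactive) progress output.
  if [[ -n "$ci" ]]; then
    flag="--progress=plain"
  fi
  # With BuildKit explicitly disabled, drop the flag again.
  if [[ "$buildkit" == 0 ]]; then
    flag=""
  fi
  printf '%s' "$flag"
}
```

For example, `pick_progress_flag "$CI" "$DOCKER_BUILDKIT"` could feed the `docker build` invocation; an unset `DOCKER_BUILDKIT` leaves the CI choice untouched.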

.ci/docker/centos-rocm/Dockerfile

Lines changed: 10 additions & 5 deletions
@@ -1,8 +1,7 @@
 ARG CENTOS_VERSION

-FROM centos:${CENTOS_VERSION}
+FROM quay.io/centos/centos:stream${CENTOS_VERSION}

-ARG CENTOS_VERSION

 # Set AMD gpu targets to build for
 ARG PYTORCH_ROCM_ARCH
@@ -14,6 +13,9 @@ ENV PYTORCH_ROCM_ARCH ${PYTORCH_ROCM_ARCH}
 COPY ./common/install_base.sh install_base.sh
 RUN bash ./install_base.sh && rm install_base.sh

+#Install langpack
+RUN yum install -y glibc-langpack-en
+
 # Update CentOS git version
 RUN yum -y remove git
 RUN yum -y remove git-*
@@ -22,11 +24,12 @@ RUN yum -y install https://packages.endpointdev.com/rhel/7/os/x86_64/endpoint-re
 RUN yum install -y git

 # Install devtoolset
-ARG DEVTOOLSET_VERSION
-COPY ./common/install_devtoolset.sh install_devtoolset.sh
-RUN bash ./install_devtoolset.sh && rm install_devtoolset.sh
+RUN dnf install -y rpmdevtools
 ENV BASH_ENV "/etc/profile"

+# Install ninja
+RUN dnf --enablerepo=crb install -y ninja-build
+
 # (optional) Install non-default glibc version
 ARG GLIBC_VERSION
 COPY ./common/install_glibc.sh install_glibc.sh
@@ -69,6 +72,8 @@ RUN rm install_rocm_magma.sh
 COPY ./common/install_amdsmi.sh install_amdsmi.sh
 RUN bash ./install_amdsmi.sh
 RUN rm install_amdsmi.sh
+
+ENV ROCM_PATH /opt/rocm
 ENV PATH /opt/rocm/bin:$PATH
 ENV PATH /opt/rocm/hcc/bin:$PATH
 ENV PATH /opt/rocm/hip/bin:$PATH

.ci/docker/common/cache_vision_models.sh

Lines changed: 14 additions & 0 deletions
@@ -2,6 +2,20 @@

 set -ex

+# Skip pytorch-nightly installation in docker images
+# Installation of pytorch-nightly is needed to prefetch mobilenet_v2 avd v3 models for some tests.
+# Came from https://github.com/ROCm/pytorch/commit/85bd6bc0105162293fa0bbfb7b661f85ec67f85a
+# Models are downloaded on first use to the folder /root/.cache/torch/hub
+# But pytorch-nightly installation also overrides .ci/docker/requirements-ci.txt settings
+# and upgrades some of python packages (sympy from 1.12.0 to 1.13.0)
+# which causes several 'dynamic_shapes' tests to fail
+# Skip prefetching models affects these tests without any errors:
+# python test/mobile/model_test/gen_test_model.py mobilenet_v2
+# python test/quantization/eager/test_numeric_suite_eager.py -k test_mobilenet_v3
+# Issue https://github.com/ROCm/frameworks-internal/issues/8772
+echo "Skip torch-nightly installation"
+exit 0
+
 source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh"

 # Cache the test models at ~/.cache/torch/hub/
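
The hunk above disables the script with an unconditional `exit 0`, which leaves everything after it unreachable. One possible alternative, sketched here under assumptions, is an opt-out toggle. Both function names and the `SKIP_VISION_MODEL_CACHE` variable are hypothetical and do not appear in the commit; the sketch only prints what it would do.

```bash
#!/usr/bin/env bash
# Hypothetical toggle (not in the commit): defaults to skipping, as the commit does.
should_skip_model_cache() {
  [[ "${SKIP_VISION_MODEL_CACHE:-1}" == 1 ]]
}

cache_models_or_skip() {
  if should_skip_model_cache; then
    echo "Skip torch-nightly installation"
    return 0
  fi
  # Placeholder for the original prefetch logic that follows the early exit.
  echo "Would install pytorch-nightly and prefetch models here"
}
```

With the default left in place the behavior matches the commit; setting `SKIP_VISION_MODEL_CACHE=0` would re-enable the prefetch path without editing the script.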

.ci/docker/common/install_base.sh

Lines changed: 54 additions & 4 deletions
@@ -77,15 +77,50 @@ install_ubuntu() {
   # see: https://github.com/pytorch/pytorch/issues/65931
   apt-get install -y libgnutls30

+  if [[ "$UBUNTU_VERSION" == "22.04"* ]]; then
+    apt-get install -y libopenblas-dev
+  fi
+
   # Cleanup package manager
   apt-get autoclean && apt-get clean
   rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
 }

+build_libpng() {
+  # install few packages
+  yum install -y zlib zlib-devel
+
+  LIBPNG_VERSION=1.6.37
+
+  mkdir -p libpng
+  pushd libpng
+
+  wget http://download.sourceforge.net/libpng/libpng-$LIBPNG_VERSION.tar.gz
+  tar -xvzf libpng-$LIBPNG_VERSION.tar.gz
+
+  pushd libpng-$LIBPNG_VERSION
+
+  ./configure
+  make
+  make install
+
+  popd
+
+  popd
+  rm -rf libpng
+}
+
 install_centos() {
   # Need EPEL for many packages we depend on.
   # See http://fedoraproject.org/wiki/EPEL
-  yum --enablerepo=extras install -y epel-release
+  # extras repo is not there for CentOS 9 and epel-release is already part of repo list
+  if [[ $OS_VERSION == 9 ]]; then
+      yum install -y epel-release
+      ALLOW_ERASE="--allowerasing"
+  else
+      yum --enablerepo=extras install -y epel-release
+      ALLOW_ERASE=""
+  fi

   ccache_deps="asciidoc docbook-dtds docbook-style-xsl libxslt"
   numpy_deps="gcc-gfortran"
@@ -106,24 +141,39 @@ install_centos() {
     glibc-headers \
     glog-devel \
     libstdc++-devel \
-    libsndfile-devel \
     make \
-    opencv-devel \
     sudo \
     wget \
     vim \
     unzip \
     gdb

+  if [[ $OS_VERSION == 9 ]]
+  then
+      dnf --enablerepo=crb -y install libsndfile-devel
+      yum install -y procps
+  else
+      yum install -y \
+      opencv-devel \
+      libsndfile-devel
+  fi
+
+  # CentOS7 doesnt have support for higher version of libpng,
+  # so it is built from source.
+  # Libpng is required for torchvision build.
+  build_libpng
+
   # Cleanup
   yum clean all
   rm -rf /var/cache/yum
   rm -rf /var/lib/yum/yumdb
   rm -rf /var/lib/yum/history
 }

-# Install base packages depending on the base OS
 ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
+OS_VERSION=$(grep -oP '(?<=^VERSION_ID=).+' /etc/os-release | tr -d '"')
+
+# Install base packages depending on the base OS
 case "$ID" in
   ubuntu)
     install_ubuntu
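
The scripts in this commit read `ID` and `VERSION_ID` from /etc/os-release with `grep -oP` and then compare `OS_VERSION` against the literal `9`. The sketch below shows a grep-free way to do the same lookup, plus a major-version helper for images whose `VERSION_ID` carries a minor part. Both function names are hypothetical, and the file path is a parameter so the logic can be tried against a sample file.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of install_base.sh.

# Print the value of KEY from an os-release style file, without double quotes.
os_release_field() {
  local key="$1" file="${2:-/etc/os-release}" line value
  while IFS= read -r line || [[ -n "$line" ]]; do
    # Anchor on "KEY=" at line start so ID does not match ID_LIKE or VERSION_ID.
    if [[ "$line" == "$key="* ]]; then
      value="${line#*=}"
      value="${value//\"/}"
      printf '%s\n' "$value"
      return 0
    fi
  done < "$file"
  return 1
}

# Major component of a dotted version: "9" -> 9, "7.9" -> 7.
major_version() {
  printf '%s\n' "${1%%.*}"
}
```

A caller could then write `OS_VERSION=$(major_version "$(os_release_field VERSION_ID)")`, so that a value such as `9.3` would still satisfy the `== 9` checks used throughout the commit.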

.ci/docker/common/install_conda.sh

Lines changed: 20 additions & 3 deletions
@@ -24,7 +24,10 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
   source "${SCRIPT_FOLDER}/common_utils.sh"

   pushd /tmp
-  wget -q "${BASE_URL}/${CONDA_FILE}"
+  if [ -n $CENTOS_VERSION ] && [[ $CENTOS_VERSION == 7.* ]]; then
+    NO_CHECK_CERTIFICATE_FLAG="--no-check-certificate"
+  fi
+  wget -q "${BASE_URL}/${CONDA_FILE}" ${NO_CHECK_CERTIFICATE_FLAG}
   # NB: Manually invoke bash per https://github.com/conda/conda/issues/10431
   as_jenkins bash "${CONDA_FILE}" -b -f -p "/opt/conda"
   popd
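
In the hunk above, `[ -n $CENTOS_VERSION ]` is unquoted, so with an empty variable it collapses to `[ -n ]`, which is true. The result is still correct here because the following `[[ ... == 7.* ]]` test fails for an empty string. A quoted variant can be sketched as below; `wget_cert_args` is a hypothetical helper that prints the extra wget option instead of setting `NO_CHECK_CERTIFICATE_FLAG`.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of install_conda.sh.
# Prints --no-check-certificate only for CentOS 7.x version strings.
wget_cert_args() {
  local centos_version="${1:-}"
  if [[ -n "$centos_version" && "$centos_version" == 7.* ]]; then
    echo "--no-check-certificate"
  fi
}
```

Note that the `7.*` pattern, in the commit and in this sketch, matches values such as `7.9` but not a bare `7`.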
@@ -40,8 +43,13 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then

   # Prevent conda from updating to 4.14.0, which causes docker build failures
   # See https://hud.pytorch.org/pytorch/pytorch/commit/754d7f05b6841e555cea5a4b2c505dd9e0baec1d
-  # Uncomment the below when resolved to track the latest conda update
-  # as_jenkins conda update -y -n base conda
+  # Uncomment the below when resolved to track the latest conda update,
+  # but this is required for CentOS stream 9 builds
+  ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
+  OS_VERSION=$(grep -oP '(?<=^VERSION_ID=).+' /etc/os-release | tr -d '"')
+  if [[ $ID == centos && $OS_VERSION == 9 ]]; then
+    as_jenkins conda update -y -n base conda
+  fi

   if [[ $(uname -m) == "aarch64" ]]; then
     export SYSROOT_DEP="sysroot_linux-aarch64=2.17"
@@ -86,6 +94,15 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
     conda_install_through_forge libstdcxx-ng=14
   fi

+  # Install required libstdc++.so.6 version
+  if [ "$ANACONDA_PYTHON_VERSION" = "3.10" ] || [ "$ANACONDA_PYTHON_VERSION" = "3.9" ] ; then
+    conda_install_through_forge libstdcxx-ng=12
+  fi
+
+  if [ "$ANACONDA_PYTHON_VERSION" = "3.12" ] || [ "$UBUNTU_VERSION" == "24.04"* ] ; then
+    conda_install_through_forge libstdcxx-ng=14
+  fi
+
   # Install some other packages, including those needed for Python test reporting
   pip_install -r /opt/conda/requirements-ci.txt
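
The second added condition uses `[ "$UBUNTU_VERSION" == "24.04"* ]`. Inside single brackets the trailing `*` is subject to filename expansion rather than pattern matching, so the comparison appears to test against the literal string `24.04*` and would not match `24.04`. The sketch below contrasts the two forms; both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of install_conda.sh.

# Prefix match using [[ ]], where an unquoted * on the right side is a pattern.
version_has_prefix() {
  [[ "$1" == "$2"* ]]
}

# Same comparison with [ ], shown for contrast: the * is not a pattern here,
# so this is a plain string comparison against "<prefix>*".
single_bracket_compare() {
  [ "$1" == "$2"* ]
}
```

`version_has_prefix "$UBUNTU_VERSION" 24.04` would accept both `24.04` and `24.04.1`, which seems to be the intent of the added check.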

.ci/docker/common/install_rocm.sh

Lines changed: 40 additions & 22 deletions
@@ -60,10 +60,6 @@ EOF
                    roctracer-dev \
                    amd-smi-lib

-    if [[ $(ver $ROCM_VERSION) -ge $(ver 6.1) ]]; then
-        DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-unauthenticated rocm-llvm-dev
-    fi
-
     # precompiled miopen kernels added in ROCm 3.5, renamed in ROCm 5.5
     # search for all unversioned packages
     # if search fails it will abort this script; use true to avoid case where search fails
@@ -124,44 +120,64 @@ install_centos() {
     yum update -y
     yum install -y kmod
     yum install -y wget
-    yum install -y openblas-devel
+
+    if [[ $OS_VERSION == 9 ]]; then
+        dnf install -y openblas-serial
+        dnf install -y dkms kernel-headers kernel-devel
+    else
+        yum install -y openblas-devel
+        yum install -y dkms kernel-headers-`uname -r` kernel-devel-`uname -r`
+    fi

     yum install -y epel-release
-    yum install -y dkms kernel-headers-`uname -r` kernel-devel-`uname -r`

-    # Add amdgpu repository
-    local amdgpu_baseurl
+    if [[ $(ver $ROCM_VERSION) -ge $(ver 4.5) ]]; then
+        # Add amdgpu repository
+        local amdgpu_baseurl
+        if [[ $OS_VERSION == 9 ]]; then
+            amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/rhel/9.1/main/x86_64"
+        else
+            if [[ $(ver $ROCM_VERSION) -ge $(ver 5.3) ]]; then
+                amdgpu_baseurl="https://repo.radeon.com/amdgpu/${ROCM_VERSION}/rhel/7.9/main/x86_64"
+            else
+                amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/rhel/7.9/main/x86_64"
+            fi
+        fi
+        echo "[AMDGPU]" > /etc/yum.repos.d/amdgpu.repo
+        echo "name=AMDGPU" >> /etc/yum.repos.d/amdgpu.repo
+        echo "baseurl=${amdgpu_baseurl}" >> /etc/yum.repos.d/amdgpu.repo
+        echo "enabled=1" >> /etc/yum.repos.d/amdgpu.repo
+        echo "gpgcheck=1" >> /etc/yum.repos.d/amdgpu.repo
+        echo "gpgkey=http://repo.radeon.com/rocm/rocm.gpg.key" >> /etc/yum.repos.d/amdgpu.repo
+    fi
+
     if [[ $OS_VERSION == 9 ]]; then
-        amdgpu_baseurl="https://repo.radeon.com/amdgpu/${ROCM_VERSION}/rhel/9.0/main/x86_64"
+        local rocm_baseurl="invalid-url"
     else
-        amdgpu_baseurl="https://repo.radeon.com/amdgpu/${ROCM_VERSION}/rhel/7.9/main/x86_64"
+        local rocm_baseurl="http://repo.radeon.com/rocm/yum/${ROCM_VERSION}/main"
     fi
-    echo "[AMDGPU]" > /etc/yum.repos.d/amdgpu.repo
-    echo "name=AMDGPU" >> /etc/yum.repos.d/amdgpu.repo
-    echo "baseurl=${amdgpu_baseurl}" >> /etc/yum.repos.d/amdgpu.repo
-    echo "enabled=1" >> /etc/yum.repos.d/amdgpu.repo
-    echo "gpgcheck=1" >> /etc/yum.repos.d/amdgpu.repo
-    echo "gpgkey=http://repo.radeon.com/rocm/rocm.gpg.key" >> /etc/yum.repos.d/amdgpu.repo
-
-    local rocm_baseurl="http://repo.radeon.com/rocm/yum/${ROCM_VERSION}"
     echo "[ROCm]" > /etc/yum.repos.d/rocm.repo
     echo "name=ROCm" >> /etc/yum.repos.d/rocm.repo
     echo "baseurl=${rocm_baseurl}" >> /etc/yum.repos.d/rocm.repo
     echo "enabled=1" >> /etc/yum.repos.d/rocm.repo
     echo "gpgcheck=1" >> /etc/yum.repos.d/rocm.repo
     echo "gpgkey=http://repo.radeon.com/rocm/rocm.gpg.key" >> /etc/yum.repos.d/rocm.repo

-    yum update -y
-
-    yum install -y \
+    if [[ $OS_VERSION == 9 ]]; then
+        yum update -y --nogpgcheck
+        dnf --enablerepo=crb install -y perl-File-BaseDir python3-wheel
+        yum install -y --nogpgcheck rocm-ml-sdk rocm-developer-tools
+    else
+        yum update -y
+        yum install -y \
                    rocm-dev \
                    rocm-utils \
                    rocm-libs \
                    rccl \
                    rocprofiler-dev \
                    roctracer-dev \
                    amd-smi-lib
-
+    fi
     # precompiled miopen kernels; search for all unversioned packages
     # if search fails it will abort this script; use true to avoid case where search fails
     MIOPENHIPGFX=$(yum -q search miopen-hip-gfx | grep miopen-hip-gfx | awk '{print $1}'| grep -F kdb. || true)
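
The amdgpu repository URL in this hunk depends on three inputs: the OS major version, the ROCm version, and an `AMDGPU_VERSIONS` lookup defined outside the diff. The sketch below restates that branching as a pure function. Every name in it is an assumption for illustration: `ver` is a stand-in for the script's own helper, which is not shown here, and `amdgpu_version_for` replaces the `AMDGPU_VERSIONS` table with a single example entry.

```bash
#!/usr/bin/env bash
# Stand-in for the script's ver helper (assumed behavior): encode "5.3" as 5003000
# so that dotted versions compare numerically.
ver() {
  local IFS=.
  local -a parts=($1)
  printf '%d%03d%03d' "${parts[0]:-0}" "${parts[1]:-0}" "${parts[2]:-0}"
}

# Illustrative lookup only; the real AMDGPU_VERSIONS table is outside this hunk.
amdgpu_version_for() {
  case "$1" in
    5.2) echo "22.20" ;;  # example entry, assumed
    *)   echo "" ;;
  esac
}

# Hypothetical helper mirroring the if/else ladder in install_centos().
amdgpu_baseurl() {
  local rocm_version="$1" os_version="$2"
  local root="https://repo.radeon.com/amdgpu"
  if [[ "$os_version" == 9 ]]; then
    echo "${root}/$(amdgpu_version_for "$rocm_version")/rhel/9.1/main/x86_64"
  elif [[ $(ver "$rocm_version") -ge $(ver 5.3) ]]; then
    echo "${root}/${rocm_version}/rhel/7.9/main/x86_64"
  else
    echo "${root}/$(amdgpu_version_for "$rocm_version")/rhel/7.9/main/x86_64"
  fi
}
```

One consequence visible in the sketch: on OS version 9 the URL always goes through the lookup, so a ROCm version missing from the table yields an empty path segment.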
@@ -186,6 +202,8 @@ install_centos() {
     rm -rf /var/lib/yum/history
 }

+OS_VERSION=$(grep -oP '(?<=^VERSION_ID=).+' /etc/os-release | tr -d '"')
+
 # Install Python packages depending on the base OS
 ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
 case "$ID" in
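
`install_centos()` in this file builds each `.repo` file with a run of `echo` lines. As a sketch of an equivalent approach, the same stanza can be written with a single here-document. `write_yum_repo` is a hypothetical helper; the key names and the gpgkey URL are the ones used in the diff, and the output path is a parameter so nothing under /etc is touched when trying it out.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of install_rocm.sh.
# Writes one yum repository stanza to the given path, replacing any existing file.
write_yum_repo() {
  local path="$1" id="$2" baseurl="$3"
  cat > "$path" <<EOF
[${id}]
name=${id}
baseurl=${baseurl}
enabled=1
gpgcheck=1
gpgkey=http://repo.radeon.com/rocm/rocm.gpg.key
EOF
}
```

For instance, `write_yum_repo /etc/yum.repos.d/rocm.repo ROCm "$rocm_baseurl"` would produce the same six lines as the ROCm block in the diff.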

.ci/docker/common/install_vision.sh

Lines changed: 10 additions & 4 deletions
@@ -15,10 +15,14 @@ install_ubuntu() {
 install_centos() {
   # Need EPEL for many packages we depend on.
   # See http://fedoraproject.org/wiki/EPEL
-  yum --enablerepo=extras install -y epel-release
-
-  yum install -y \
-      opencv-devel
+  if [[ $OS_VERSION == 9 ]]; then
+      yum install -y epel-release
+  else
+      yum --enablerepo=extras install -y epel-release
+      yum install -y \
+          opencv-devel \
+          ffmpeg-devel
+  fi

   # Cleanup
   yum clean all
@@ -27,6 +31,8 @@ install_centos() {
   rm -rf /var/lib/yum/history
 }

+OS_VERSION=$(grep -oP '(?<=^VERSION_ID=).+' /etc/os-release | tr -d '"')
+
 # Install base packages depending on the base OS
 ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
 case "$ID" in
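
In this file `OS_VERSION` is assigned after `install_centos()` is defined but before the `case` statement calls it. That ordering works because bash resolves a global variable when the function runs, not when it is defined. The sketch below demonstrates this with a hypothetical function that prints the package list instead of installing anything.

```bash
#!/usr/bin/env bash
# Hypothetical function, not part of install_vision.sh: prints the packages
# each branch of the diff would install.
install_centos_packages() {
  # OS_VERSION is looked up at call time.
  if [[ "${OS_VERSION:-}" == 9 ]]; then
    echo "epel-release"
  else
    echo "epel-release opencv-devel ffmpeg-devel"
  fi
}

# Assigned after the definition, before the call, as in the script.
OS_VERSION=9
install_centos_packages
```

Running the block prints the CentOS Stream 9 list; changing the assignment to `7` before the call would select the other branch.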

.ci/docker/requirements-ci.txt

Lines changed: 2 additions & 0 deletions
@@ -117,6 +117,8 @@ ninja==1.11.1.4
 #Pinned versions: 1.11.1.4
 #test that import: run_test.py, test_cpp_extensions_aot.py,test_determination.py

+numba==0.49.0 ; python_version < "3.9"
+numba==0.55.2 ; python_version == "3.9"
 numba==0.55.2 ; python_version == "3.10" and platform_machine != "s390x"
 numba==0.60.0 ; python_version == "3.12" and platform_machine != "s390x"
 #Description: Just-In-Time Compiler for Numerical Functions
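
The four numba lines rely on environment markers, so pip installs at most one of them for a given interpreter. The sketch below mimics that selection for the lines visible in this hunk only; the rest of requirements-ci.txt is not shown and may carry further pins. `numba_pin_for` is a hypothetical helper that takes a `major.minor` Python version and an optional machine name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: echoes the numba pin that the visible markers would select,
# or nothing when no visible line applies.
numba_pin_for() {
  local py="$1" machine="${2:-x86_64}"
  local major="${py%%.*}" minor="${py#*.}"
  if (( major < 3 || (major == 3 && minor < 9) )); then
    echo "numba==0.49.0"
  elif [[ "$py" == "3.9" ]]; then
    echo "numba==0.55.2"
  elif [[ "$py" == "3.10" && "$machine" != "s390x" ]]; then
    echo "numba==0.55.2"
  elif [[ "$py" == "3.12" && "$machine" != "s390x" ]]; then
    echo "numba==0.60.0"
  fi
}
```

Under these four lines, Python 3.11 and any s390x build on 3.10 or 3.12 would get no numba pin.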
