Commit 33f4261

Add dockerfile and update README with instructions.
1 parent 72b5c0a commit 33f4261

2 files changed

+76
-3
lines changed

models/turbine_models/custom_models/torchbench/README.md

Lines changed: 23 additions & 3 deletions
@@ -6,7 +6,27 @@ This directory serves as a place for scripts and utilities to run a suite of ben

 Eventually, we want this process to be a plug-in to the upstream torchbench process, and this will be accomplished by exposing the IREE methodology shown here as a compile/runtime backend for the torch benchmark classes. For now, it is set up for developers as a way to get preliminary results and achieve blanket functionality for the models listed in export.py.

-### Setup
+The setup instructions provided here, in a few cases, use "gfx942" as the IREE/LLVM hip target. This is for MI300x accelerators -- you can find a mapping of AMD targets to their LLVM target architecture [here](https://llvm.org/docs/AMDGPUUsage.html#amdgpu-architecture-table), and replace "gfx942" in the following documentation with your desired target.
+
+## Setup (docker)
+
+Use the dockerfile provided with the following build/run commands to execute in docker.
+These commands assume a few things about your machine/distro, so please read them and make sure they do what you want.
+
+```shell
+docker build --platform linux/amd64 --tag shark_torchbench --file shark_torchbench.dockerfile .
+```
+```shell
+docker run -it --network=host --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v shark_torchbench:/SHARK-Turbine/models/turbine_models/custom_models/torchbench/outputs -w /SHARK-Turbine/models/turbine_models/custom_models/torchbench shark_torchbench:latest
+```
+```shell
+python3 ./export.py --target=gfx942 --device=rocm --compile_to=vmfb --performance --inference --precision=fp16 --float16 --external_weights=safetensors --external_weights_dir=./torchbench_weights/ --output_csv=./outputs/torchbench_results_SHARK.csv
+```
+
+
+## Setup (source)
+
+### Setup source code and prerequisites

 - pip install torch+rocm packages:
 ```shell
@@ -41,10 +61,10 @@ cd ..
 python ./export.py --target=gfx942 --device=rocm --compile_to=vmfb --performance --inference --precision=fp16 --float16 --external_weights=safetensors --external_weights_dir=./torchbench_weights/
 ```

-### Example (hf_Albert)
+### Example of manual benchmark using export and IREE runtime CLI (hf_Albert)

 ```shell
 python ./export.py --target=gfx942 --device=rocm --compile_to=vmfb --performance --inference --precision=fp16 --float16 --external_weights=safetensors --external_weights_dir=./torchbench_weights/ --model_id=hf_Albert

-iree-benchmark-module --module=hf_Albert_32_fp16_gfx942.vmfb --input=@input0.npy --parameters=model=./torchbench_weights/hf_Albert_fp16.irpa --device=hip://0 --device_allocator=caching --function=main --benchmark_repetitions=10
+iree-benchmark-module --module=generated/hf_Albert_32_fp16_gfx942.vmfb --input=@generated/hf_Albert_input0.npy --parameters=model=./torchbench_weights/hf_Albert_fp16.irpa --device=hip://0 --device_allocator=caching --function=main --benchmark_repetitions=10
 ```
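
The README changes above ask readers to replace "gfx942" with their own target in each command. A minimal sketch of one way to do that in a single place is shown here. It only prints the export and benchmark commands rather than running them. The `TARGET` and `MODEL_ID` variables and the two helper functions are hypothetical names introduced for illustration, and the artifact naming (`<model>_32_fp16_<target>.vmfb`) is assumed from the hf_Albert example, not confirmed for other models.

```shell
#!/usr/bin/env bash
# Sketch: print the export and benchmark commands for a chosen target.
# TARGET and MODEL_ID are illustrative names, not options read by export.py.
set -euo pipefail

export_cmd() {
  # $1 = LLVM target architecture (e.g. gfx942), $2 = torchbench model id
  local target="$1" model_id="$2"
  printf 'python ./export.py --target=%s --device=rocm --compile_to=vmfb --performance --inference --precision=fp16 --float16 --external_weights=safetensors --external_weights_dir=./torchbench_weights/ --model_id=%s\n' "$target" "$model_id"
}

benchmark_cmd() {
  # Assumption: output files follow the naming seen in the hf_Albert example.
  local target="$1" model_id="$2"
  printf 'iree-benchmark-module --module=generated/%s_32_fp16_%s.vmfb --input=@generated/%s_input0.npy --parameters=model=./torchbench_weights/%s_fp16.irpa --device=hip://0 --device_allocator=caching --function=main --benchmark_repetitions=10\n' "$model_id" "$target" "$model_id" "$model_id"
}

# Defaults match the values used in the README.
TARGET="${TARGET:-gfx942}"
MODEL_ID="${MODEL_ID:-hf_Albert}"

export_cmd "$TARGET" "$MODEL_ID"
benchmark_cmd "$TARGET" "$MODEL_ID"
```

Setting `TARGET` in the environment before running the script swaps the target in both commands, and the printed lines can be reviewed before being pasted into a shell.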
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+FROM rocm/dev-ubuntu-22.04:6.1.2
+
+# ######################################################
+# # Install MLPerf+Shark reference implementation
+# ######################################################
+ENV DEBIAN_FRONTEND=noninteractive
+
+# apt dependencies
+RUN apt-get update && apt-get install -y \
+    ffmpeg libsm6 libxext6 git wget unzip \
+    software-properties-common git \
+    build-essential curl cmake ninja-build clang lld vim nano python3.10-dev python3.10-venv && \
+    apt-get clean && rm -rf /var/lib/apt/lists/*
+RUN pip install --upgrade pip setuptools wheel && \
+    pip install pybind11 'nanobind<2' numpy==1.* pandas && \
+    pip install hip-python hip-python-as-cuda -i https://test.pypi.org/simple
+
+# Rust requirements
+RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
+ENV PATH="/root/.cargo/bin:${PATH}"
+
+SHELL ["/bin/bash", "-c"]
+
+# Disable apt-key parse warning
+ARG APT_KEY_DONT_WARN_ON_DANGEROUS_USAGE=1
+
+######################################################
+# Install SHARK-Turbine
+######################################################
+RUN pip3 install torch==2.4.0+rocm6.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.1
+RUN pip3 install --pre iree-compiler==20240920.1022 iree-runtime==20240920.1022 -f https://iree.dev/pip-release-links.html
+
+RUN apt install amd-smi-lib && sudo chown -R $USER:$USER /opt/rocm/share/amd_smi && python3 -m pip install /opt/rocm/share/amd_smi
+# Install turbine-models, where the export is implemented.
+
+ENV TB_SHARK_DIR=/SHARK-Turbine/models/turbine_models/custom_models/torchbench
+
+RUN git clone https://github.com/nod-ai/SHARK-Turbine -b torchbench \
+    && cd SHARK-Turbine \
+    && pip install --pre --upgrade -e models -r models/requirements.txt \
+    && cd $TB_SHARK_DIR \
+    && git clone https://github.com/pytorch/pytorch \
+    && cd pytorch/benchmarks \
+    && touch __init__.py && cd ../.. \
+    && git clone https://github.com/pytorch/benchmark && cd benchmark \
+    && python3 install.py --models BERT_pytorch Background_Matting LearningToPaint alexnet dcgan densenet121 hf_Albert hf_Bart hf_Bert hf_GPT2 hf_T5 mnasnet1_0 mobilenet_v2 mobilenet_v3_large nvidia_deeprecommender pytorch_unet resnet18 resnet50 resnet50_32x4d shufflenet_v2_x1_0 squeezenet1_1 timm_nfnet timm_efficientnet timm_regnet timm_resnest timm_vision_transformer timm_vovnet vgg16 \
+    && pip install -e .
+
+ENV HF_HOME=/models/huggingface/
+
+# initialization settings for CPX mode
+ENV HSA_USE_SVM=0
+ENV HSA_XNACK=0
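
The README notes that the docker commands assume a few things about the host machine. One of those assumptions is that the device nodes passed with `--device` (`/dev/kfd` and `/dev/dri`) exist. The following is a rough preflight sketch that could be run on the host before `docker run`. The `missing_paths` helper is a hypothetical name introduced here, and the check covers only the existence of the paths, not permissions or driver state.

```shell
#!/usr/bin/env bash
# Sketch: report which device nodes expected by `docker run` are absent on the host.
set -euo pipefail

missing_paths() {
  # Prints each argument that does not exist on this machine, one per line.
  local p
  for p in "$@"; do
    [ -e "$p" ] || printf '%s\n' "$p"
  done
}

# The two paths below are the ones passed with --device in the README's docker run command.
missing="$(missing_paths /dev/kfd /dev/dri)"
if [ -n "$missing" ]; then
  printf 'Missing device nodes:\n%s\n' "$missing" >&2
else
  echo "Device nodes present."
fi
```

If any path is reported missing, the `docker run` command from the README is likely to fail when it tries to map that device into the container.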
