Commit e570d07

Add pytorch/training/gpu/2.3.1/transformers/4.48.0/py311/Dockerfile (#134)
* Add latest PyTorch DLC with bumped dependencies
* Fix `Dockerfile` due to extra `&&`
* Lower `flash-attn` dependency version
* Add `uv` to install `pip` dependencies faster

  This commit also contains some formatting improvements to better debug the `Dockerfile` such as indentation when a command is divided in multiple lines to know that it refers to the unindented command above; also set bash as the default shell, and fix `gcloud` CLI installation
* Bump `transformers` to 4.48.0 and fix `Dockerfile` formatting

  Bump the `transformers` dependency to 4.48.0 to support the ModernBERT architecture, as well as bumping `diffusers` including new video and image generation pipelines, as well as a bunch of other features, improvements and bug fixes. Additionally, the `Dockerfile` formatting has been fixed.
* Update `containers/pytorch/training/README.md`
* Fix `containers/pytorch/training/README.md`
* Set `transformers` version to 4.47.1 instead
* Remove `--upgrade` flag from `torch` and `transformers` install
* Bump `torch` to 2.3.1 and move `Dockerfile`
* Remove `uv` from `Dockerfile`
* Upgrade `transformers` to 4.48.0
* Remove strict version pinning on `protobuf`
1 parent 847a657 commit e570d07

2 files changed, +103 -2 lines changed
containers/pytorch/training/README.md

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ The PyTorch Training containers will start a training job that will start on `do
 docker run --gpus all -ti \
 -v $(pwd)/artifact:/artifact \
 -e HF_TOKEN=$(cat ~/.cache/huggingface/token) \
-us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-pytorch-training-cu121.2-3.transformers.4-42.ubuntu2204.py310 \
+us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-pytorch-training-cu121.2-3.transformers.4-47.ubuntu2204.py311 \
 trl sft \
 --model_name_or_path google/gemma-2b \
 --attn_implementation "flash_attention_2" \
@@ -76,7 +76,7 @@ The PyTorch Training containers come with two different containers depending on
 - **GPU**: To build the PyTorch Training container for GPU, an instance with at least one NVIDIA GPU available is required to install `flash-attn` (used to speed up the attention layers during training and inference).

 ```bash
-docker build -t us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-pytorch-training-cu121.2-3.transformers.4-42.ubuntu2204.py310 -f containers/pytorch/training/gpu/2.3.0/transformers/4.42.3/py310/Dockerfile .
+docker build -t us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-pytorch-training-cu121.2-3.transformers.4-47.ubuntu2204.py311 -f containers/pytorch/training/gpu/2.3.0/transformers/4.47.1/py311/Dockerfile .
 ```

 - **TPU**: You can build PyTorch Training container for Google Cloud TPUs on any machine with docker build, you do not need to build it on a TPU VM
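
The two image names in this README diff suggest a naming pattern: the CUDA tag, then the major and minor version of `torch` and of `transformers` with the dot replaced by a dash, then the Ubuntu release and the Python tag. The sketch below is only an inference from those two names, not a documented convention. The `image_name` helper and its argument order are hypothetical, and `ubuntu2204` is hard-coded as an assumption.

```bash
# Hypothetical helper: rebuild the image name from its version components.
# The pattern is inferred from the two names in the README diff above.
# Usage: image_name <cuda-tag> <torch-version> <transformers-version> <python-tag>
image_name() {
    local cuda="$1" torch="$2" transformers="$3" python="$4"
    local torch_mm transformers_mm

    # Keep only major.minor and swap the dot for a dash, e.g. 2.3.1 -> 2-3
    torch_mm="$(echo "$torch" | cut -d. -f1,2 | tr '.' '-')"
    transformers_mm="$(echo "$transformers" | cut -d. -f1,2 | tr '.' '-')"

    # Assumption: the Ubuntu release is always ubuntu2204 for these images
    echo "huggingface-pytorch-training-${cuda}.${torch_mm}.transformers.${transformers_mm}.ubuntu2204.${python}"
}

image_name cu121 2.3.0 4.47.1 py311
```

Called with `cu121 2.3.0 4.47.1 py311`, the helper reproduces the image name used in the updated README lines, without the registry prefix.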
pytorch/training/gpu/2.3.1/transformers/4.48.0/py311/Dockerfile

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
+FROM nvidia/cuda:12.1.1-devel-ubuntu22.04
+SHELL ["/bin/bash", "-c"]
+
+LABEL maintainer="Hugging Face"
+ARG DEBIAN_FRONTEND=noninteractive
+
+# Versions
+ARG CUDA="cu121"
+ARG PYTORCH="2.3.1"
+ARG FLASH_ATTN="2.6.3"
+ARG TRANSFORMERS="4.48.0"
+ARG HUGGINGFACE_HUB="0.27.0"
+ARG DIFFUSERS="0.32.1"
+ARG PEFT="0.14.0"
+ARG TRL="0.13.0"
+ARG BITSANDBYTES="0.45.0"
+ARG DATASETS="3.2.0"
+ARG ACCELERATE="1.2.1"
+ARG EVALUATE="0.4.3"
+ARG SENTENCE_TRANSFORMERS="3.3.1"
+ARG DEEPSPEED="0.16.1"
+ARG MAX_JOBS=4
+
+RUN apt-get update -y && \
+    apt-get install software-properties-common -y && \
+    add-apt-repository ppa:deadsnakes/ppa && \
+    apt-get -y upgrade --only-upgrade systemd openssl cryptsetup && \
+    apt-get install -y \
+        build-essential \
+        bzip2 \
+        curl \
+        git \
+        git-lfs \
+        tar \
+        gcc \
+        g++ \
+        cmake \
+        gnupg \
+        libprotobuf-dev \
+        libaio-dev \
+        protobuf-compiler \
+        python3.11 \
+        python3.11-dev \
+        libsndfile1-dev \
+        ffmpeg && \
+    apt-get clean autoremove --yes && \
+    rm -rf /var/lib/apt/lists/*
+
+# Set Python 3.11 as the default python version
+RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 1 && \
+    ln -sf /usr/bin/python3.11 /usr/bin/python
+
+# Install pip from source and upgrade it
+RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
+    python get-pip.py && \
+    rm get-pip.py && \
+    pip install --upgrade pip
+
+# Install latest release PyTorch (PyTorch must be installed before any DeepSpeed C++/CUDA ops.)
+RUN pip install --no-cache-dir --index-url https://download.pytorch.org/whl/${CUDA} "torch==${PYTORCH}" torchvision torchaudio
+
+# Install and upgrade Flash Attention 2
+RUN pip install --no-cache-dir packaging ninja
+RUN MAX_JOBS=${MAX_JOBS} pip install --no-build-isolation flash-attn==${FLASH_ATTN}
+
+# Install Hugging Face Libraries
+RUN pip install --no-cache-dir \
+    "transformers[sklearn,sentencepiece,vision]==${TRANSFORMERS}" \
+    "huggingface_hub[hf_transfer]==${HUGGINGFACE_HUB}" \
+    "diffusers==${DIFFUSERS}" \
+    "datasets==${DATASETS}" \
+    "accelerate==${ACCELERATE}" \
+    "evaluate==${EVALUATE}" \
+    "peft==${PEFT}" \
+    "trl==${TRL}" \
+    "sentence-transformers==${SENTENCE_TRANSFORMERS}" \
+    "deepspeed==${DEEPSPEED}" \
+    "bitsandbytes==${BITSANDBYTES}" \
+    tensorboard \
+    jupyter notebook
+
+ENV HF_HUB_ENABLE_HF_TRANSFER="1"
+
+# Install Google Cloud Dependencies
+RUN pip install --upgrade --no-cache-dir \
+    google-cloud-storage \
+    google-cloud-bigquery \
+    google-cloud-aiplatform \
+    google-cloud-pubsub \
+    google-cloud-logging
+
+# Install Google CLI single command
+RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" \
+        | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && \
+    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg \
+        | apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - && \
+    touch /var/lib/dpkg/status && \
+    apt-get update -y && \
+    apt-get install google-cloud-sdk -y && \
+    apt-get clean autoremove --yes && \
+    rm -rf /var/lib/{apt,dpkg,cache,log}
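
Because every pinned version in the Dockerfile is declared as an `ARG`, a build can override individual pins with `docker build --build-arg` rather than editing the file. The following is a minimal sketch of that idea, assuming Docker is installed. The `build_cmd` helper, the `my-training-image` tag, and the override values are hypothetical, and the Dockerfile path is the one given in the commit title. The helper only prints the command, so it can be reviewed before it is run. It does not check whether a given combination of versions builds or works together.

```bash
# Hypothetical helper: assemble a `docker build` command that overrides
# some of the version ARGs declared at the top of the Dockerfile.
# Usage: build_cmd <image-tag> <dockerfile> [NAME=VALUE ...]
build_cmd() {
    local tag="$1" dockerfile="$2"
    shift 2

    local cmd=(docker build -t "$tag" -f "$dockerfile")
    local override
    for override in "$@"; do
        # Each NAME=VALUE pair replaces the default of the matching ARG
        cmd+=(--build-arg "$override")
    done
    cmd+=(.)

    # Print the command instead of executing it
    echo "${cmd[*]}"
}

build_cmd my-training-image \
    pytorch/training/gpu/2.3.1/transformers/4.48.0/py311/Dockerfile \
    TRANSFORMERS=4.47.1 MAX_JOBS=8
```

Overriding `MAX_JOBS` this way changes the value passed to the `flash-attn` install step, since that `RUN` line reads it from the `ARG`.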
