* feat: add optimum-tpu TGI v0.2.3
The main feature is the addition of Llama 3.1, 3.2 and 3.3 (text-only)
models.
* fix: remove * when copying entrypoint
* review(TGI TPU): add a comment on why we install two python versions
# If we are building for GCP, we need to clone the optimum-tpu repo as this is built from the huggingface/Google-Cloud-Containers repository and not the huggingface/optimum-tpu repository
        cd /opt/optimum-tpu && git checkout v${VERSION}; \
    fi && \
    # Check if the optimum-tpu repo is cloned properly
    cp -a /tmp/src /opt/optimum-tpu && \
    if [ ! -d "/opt/optimum-tpu/optimum" ]; then \
        echo "Error: Building from incorrect repository. This build must be run from optimum-tpu repo. If building from google-cloud-containers repo, set ENABLE_GOOGLE_FEATURE=1 to automatically clone optimum-tpu" && \
        exit 1; \
    fi


# Python server build image
FROM base AS pyserver

RUN apt-get update -y \
    && apt-get install -y --no-install-recommends \
    make \
    python3-venv \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean

RUN install -d /pyserver
WORKDIR /pyserver
COPY --from=optimum-tpu-installer /opt/optimum-tpu/text-generation-inference/server server
COPY --from=tgi /tgi/proto proto
RUN pip3 install -r server/build-requirements.txt
RUN VERBOSE=1 BUILDDIR=/pyserver/build PROTODIR=/pyserver/proto VERSION=${VERSION} make -C server gen-server

# TPU base image (used for deployment)
FROM base AS tpu_base

ARG VERSION=${VERSION}

# Install system prerequisites
# NOTE: we need both python3.10 and python3.11 to be installed, as the TGI router uses python 3.11 and optimum-tpu uses
# python 3.10. This has been fixed on newest version of optimum-tpu and will be removed in the next version (see
# https://github.com/huggingface/optimum-tpu/pull/135 for details).
RUN apt-get update -y \
    && apt-get install -y --no-install-recommends \
    libpython3.10 \
    libpython3.11 \
    python3.11 \
    git \
    gnupg2 \
    wget \
    curl \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean

# Update pip
RUN pip install --upgrade pip

# Install HuggingFace packages
ARG TRANSFORMERS_VERSION='4.46.3'
ARG ACCELERATE_VERSION='1.1.1'
ARG SAFETENSORS_VERSION='0.4.5'

ARG ENABLE_GOOGLE_FEATURE

ENV HF_HUB_ENABLE_HF_TRANSFER=1
ENV VERSION=${VERSION}

ENV PORT=${ENABLE_GOOGLE_FEATURE:+8080}
ENV PORT=${PORT:-80}

ENV HF_HOME=${ENABLE_GOOGLE_FEATURE:+/tmp}
ENV HF_HOME=${HF_HOME:-/data}

# Install requirements for TGI, that uses python3.11
RUN python3.11 -m pip install transformers==${TRANSFORMERS_VERSION}

# Install requirements for optimum-tpu, then for TGI then optimum-tpu
RUN python3 -m pip install hf_transfer safetensors==${SAFETENSORS_VERSION} typer
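
Note on the paired `ENV PORT` / `ENV HF_HOME` lines in the hunk above: each pair appears to select a value depending on whether `ENABLE_GOOGLE_FEATURE` is set, using the `:+` and `:-` substitution modifiers. The sketch below reproduces that two-step pattern in plain POSIX shell, on the assumption that Dockerfile `ENV` substitution treats these two modifiers the same way the shell does. The function name `resolve_defaults` is hypothetical and exists only for illustration; it is not part of the Dockerfile.

```shell
#!/bin/sh
# Hypothetical helper mirroring the two-step ENV pattern from the Dockerfile.
# $1 stands in for the ENABLE_GOOGLE_FEATURE build argument (may be empty).
resolve_defaults() {
    enable_google_feature="$1"

    # Step 1: ${var:+word} expands to word only when var is set and non-empty,
    # otherwise it expands to the empty string.
    port="${enable_google_feature:+8080}"
    hf_home="${enable_google_feature:+/tmp}"

    # Step 2: ${var:-word} falls back to word when var is unset or empty,
    # so the non-Google defaults apply only if step 1 produced nothing.
    port="${port:-80}"
    hf_home="${hf_home:-/data}"

    echo "PORT=${port} HF_HOME=${hf_home}"
}

resolve_defaults ""     # prints: PORT=80 HF_HOME=/data
resolve_defaults "1"    # prints: PORT=8080 HF_HOME=/tmp
```

Under that assumption, a build with `ENABLE_GOOGLE_FEATURE=1` would end up with port 8080 and a cache directory under `/tmp`, while a default build would keep port 80 and `/data`.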