
Commit 6fbf742

tjohnson31415 authored and njhill committed
feat: add support for models from ibm-fms
Initially this is focused on supporting Calico-architecture models, including loading from `gpt_megatron`-format checkpoints.

Note: even just importing / registering the FMS model classes causes the process to be forked (for a PyTorch Compile warmup). To avoid that, we can inspect the model type listed in config.json before importing FMS.

Signed-off-by: Travis Johnson <[email protected]>
1 parent 34c44ff commit 6fbf742
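
The lazy-import approach described in the note above can be sketched roughly as follows. This is a minimal illustration rather than the code from this commit; the model-type strings, the `fms` import name, and the helper names (`uses_fms`, `load_model`) are assumptions:

import json
from pathlib import Path

# Hypothetical set of model types that should be routed to FMS; the exact
# strings depend on the checkpoints being supported (e.g. gpt_megatron).
FMS_MODEL_TYPES = {"gpt_megatron"}

def uses_fms(model_path: str) -> bool:
    """Check the model_type in config.json without importing FMS."""
    config_file = Path(model_path) / "config.json"
    if not config_file.is_file():
        return False
    with config_file.open() as f:
        config = json.load(f)
    return config.get("model_type") in FMS_MODEL_TYPES

def load_model(model_path: str):
    if uses_fms(model_path):
        # Import (and thereby register) the FMS model classes only now, so the
        # PyTorch Compile warmup fork is avoided for non-FMS models.
        import fms  # noqa: F401  (assumed import name for the ibm-fms package)
        ...  # build the FMS model here
    else:
        ...  # fall back to the regular transformers loading path

The point is that reading config.json is cheap and side-effect free, so the cost of importing and registering FMS is deferred until a model actually needs it.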

5 files changed, +181 -14 lines changed

Dockerfile

Lines changed: 3 additions & 3 deletions
@@ -134,7 +134,7 @@ FROM base as test-base
 
 ARG PYTHON_VERSION
 
-RUN dnf install -y make unzip python${PYTHON_VERSION} python${PYTHON_VERSION}-pip gcc openssl-devel gcc-c++ && \
+RUN dnf install -y make unzip python${PYTHON_VERSION} python${PYTHON_VERSION}-pip gcc openssl-devel gcc-c++ git && \
     dnf clean all && \
     ln -fs /usr/bin/python${PYTHON_VERSION} /usr/bin/python3 && \
     ln -s /usr/bin/python${PYTHON_VERSION} /usr/local/bin/python && ln -s /usr/bin/pip${PYTHON_VERSION} /usr/local/bin/pip
@@ -279,7 +279,7 @@ ARG PYTHON_VERSION
 ARG SITE_PACKAGES=/opt/tgis/lib/python${PYTHON_VERSION}/site-packages
 
 # Install C++ compiler (required at runtime when PT2_COMPILE is enabled)
-RUN dnf install -y gcc-c++ && dnf clean all \
+RUN dnf install -y gcc-c++ git && dnf clean all \
     && useradd -u 2000 tgis -m -g 0
 
 SHELL ["/bin/bash", "-c"]
@@ -312,7 +312,7 @@ RUN --mount=type=bind,from=auto-gptq-cache,src=/usr/src/auto-gptq-wheel,target=/
 # Install server
 COPY proto proto
 COPY server server
-RUN cd server && make gen-server && pip install ".[accelerate, onnx-gpu, quantize]" --no-cache-dir
+RUN cd server && make gen-server && pip install ".[accelerate, ibm-fms, onnx-gpu, quantize]" --no-cache-dir
 
 # Patch codegen model changes into transformers 4.35
 RUN cp server/transformers_patch/modeling_codegen.py ${SITE_PACKAGES}/transformers/models/codegen/modeling_codegen.py

server/poetry.lock

Lines changed: 69 additions & 8 deletions
Some generated files are not rendered by default.

server/pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -27,6 +27,8 @@ onnxruntime = { version = "^1.16.3", optional = true }
 onnxruntime-gpu = { version = "^1.16.3", optional = true }
 onnx = { version = "^1.15.0", optional = true }
 einops = "^0.7.0"
+ibm-fms = { version = "^0.0", optional = true }
+fms-extras = {git = "https://github.com/foundation-model-stack/fms-extras", rev = "fdb1636de4261fd4102da659ab45d3fcc33fe8ef", optional = true}
 
 # Explicitly install some transitive dependencies to avoid CVEs
 jinja2 = ">=3.1.3"
@@ -39,6 +41,7 @@ cryptography = ">=42.0.2"
 [tool.poetry.extras]
 accelerate = ["accelerate"]
 bnb = ["bitsandbytes"]
+ibm-fms = ["ibm-fms", "fms-extras"]
 onnx = ["optimum", "onnxruntime", "onnx"]
 onnx-gpu = ["optimum", "onnxruntime-gpu", "onnx"]
 # These are only required if using the quantize cli command
