
Commit fd1b987

rmdg88 and Rui-Dias-Gomes authored
feat: add rocm image build support and fix cuda (#292)
Signed-off-by: rmdg88 <rmdg88@gmail.com>
Signed-off-by: Rui-Dias-Gomes <rui.dias.gomes@ibm.com>
Co-authored-by: Rui-Dias-Gomes <rui.dias.gomes@ibm.com>
1 parent ce15e03 commit fd1b987

16 files changed: +1209 -927 lines changed

.github/styles/config/vocabularies/Docling/accept.txt

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ Kubeflow
 (?i)PyTorch
 (?i)CUDA
 (?i)NVIDIA
+(?i)ROCm
 (?i)env
 Gradio
 bool

.github/workflows/cd.yml

Lines changed: 2 additions & 2 deletions
@@ -15,7 +15,7 @@ jobs:
         with:
           fetch-depth: 0 # for fetching tags, required for semantic-release
       - name: Install uv and set the python version
-        uses: astral-sh/setup-uv@v5
+        uses: astral-sh/setup-uv@v6
         with:
           enable-cache: true
       - name: Install dependencies
@@ -45,7 +45,7 @@ jobs:
           token: ${{ steps.app-token.outputs.token }}
           fetch-depth: 0 # for fetching tags, required for semantic-release
       - name: Install uv and set the python version
-        uses: astral-sh/setup-uv@v5
+        uses: astral-sh/setup-uv@v6
         with:
           enable-cache: true
       - name: Install dependencies

.github/workflows/ci-images-dryrun.yml

Lines changed: 8 additions & 4 deletions
@@ -21,10 +21,10 @@ jobs:
             build_args: |
               UV_SYNC_EXTRA_ARGS=--no-group pypi --group cpu --no-extra flash-attn
             platforms: linux/amd64, linux/arm64
-          - name: docling-project/docling-serve-cu124
-            build_args: |
-              UV_SYNC_EXTRA_ARGS=--no-group pypi --group cu124
-            platforms: linux/amd64
+          # - name: docling-project/docling-serve-cu124
+          #   build_args: |
+          #     UV_SYNC_EXTRA_ARGS=--no-group pypi --group cu124
+          #   platforms: linux/amd64
           - name: docling-project/docling-serve-cu126
             build_args: |
               UV_SYNC_EXTRA_ARGS=--no-group pypi --group cu126
@@ -33,6 +33,10 @@ jobs:
             build_args: |
               UV_SYNC_EXTRA_ARGS=--no-group pypi --group cu128
             platforms: linux/amd64
+          # - name: docling-project/docling-serve-rocm
+          #   build_args: |
+          #     UV_SYNC_EXTRA_ARGS=--no-group pypi --group rocm --no-extra flash-attn
+          #   platforms: linux/amd64
 
     permissions:
       packages: write

.github/workflows/images.yml

Lines changed: 8 additions & 5 deletions
@@ -25,10 +25,10 @@ jobs:
             build_args: |
               UV_SYNC_EXTRA_ARGS=--no-group pypi --group cpu --no-extra flash-attn
             platforms: linux/amd64, linux/arm64
-          - name: docling-project/docling-serve-cu124
-            build_args: |
-              UV_SYNC_EXTRA_ARGS=--no-group pypi --group cu124
-            platforms: linux/amd64
+          # - name: docling-project/docling-serve-cu124
+          #   build_args: |
+          #     UV_SYNC_EXTRA_ARGS=--no-group pypi --group cu124
+          #   platforms: linux/amd64
           - name: docling-project/docling-serve-cu126
             build_args: |
               UV_SYNC_EXTRA_ARGS=--no-group pypi --group cu126
@@ -37,7 +37,10 @@ jobs:
             build_args: |
               UV_SYNC_EXTRA_ARGS=--no-group pypi --group cu128
             platforms: linux/amd64
-
+          # - name: docling-project/docling-serve-rocm
+          #   build_args: |
+          #     UV_SYNC_EXTRA_ARGS=--no-group pypi --group rocm --no-extra flash-attn
+          #   platforms: linux/amd64
     permissions:
       packages: write
       contents: read
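
Each image variant in the two workflow matrices comes down to one `UV_SYNC_EXTRA_ARGS` string passed as a build argument. As a rough sketch of that mapping after this commit, the flag strings below are copied from the diff, while the function name and the short variant keys are illustrative and not part of the repository. The `rocm` entry is commented out in CI but uses the same string as the `docling-serve-rocm-image` Makefile target.

```shell
# Print the UV_SYNC_EXTRA_ARGS value for an image variant.
# Flag strings are taken from the workflow matrix in this commit;
# the function name and variant keys are hypothetical.
uv_sync_extra_args() {
  case "$1" in
    cpu)   printf '%s\n' '--no-group pypi --group cpu --no-extra flash-attn' ;;
    cu126) printf '%s\n' '--no-group pypi --group cu126' ;;
    cu128) printf '%s\n' '--no-group pypi --group cu128' ;;
    rocm)  printf '%s\n' '--no-group pypi --group rocm --no-extra flash-attn' ;;
    *)     printf 'unknown variant: %s\n' "$1" >&2; return 1 ;;
  esac
}
```

Calling `uv_sync_extra_args rocm` prints the value that would follow `UV_SYNC_EXTRA_ARGS=` in a `--build-arg`.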

.github/workflows/job-build.yml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ jobs:
     steps:
       - uses: actions/checkout@v4
       - name: Install uv and set the python version
-        uses: astral-sh/setup-uv@v5
+        uses: astral-sh/setup-uv@v6
         with:
           python-version: ${{ matrix.python-version }}
           enable-cache: true

.github/workflows/job-checks.yml

Lines changed: 6 additions & 5 deletions
@@ -12,7 +12,7 @@ jobs:
     steps:
       - uses: actions/checkout@v4
       - name: Install uv and set the python version
-        uses: astral-sh/setup-uv@v5
+        uses: astral-sh/setup-uv@v6
         with:
           python-version: ${{ matrix.python-version }}
           enable-cache: true
@@ -28,7 +28,7 @@ jobs:
         run: uv sync --frozen --all-extras --no-extra flash-attn
 
       - name: Run styling check
-        run: pre-commit run --all-files
+        run: uv run pre-commit run --all-files
 
   build-package:
     uses: ./.github/workflows/job-build.yml
@@ -47,14 +47,16 @@ jobs:
           name: python-package-distributions
           path: dist/
       - name: Install uv and set the python version
-        uses: astral-sh/setup-uv@v5
+        uses: astral-sh/setup-uv@v6
         with:
           python-version: ${{ matrix.python-version }}
           enable-cache: true
+      - name: Create virtual environment
+        run: uv venv
       - name: Install package
         run: uv pip install dist/*.whl
       - name: Create the server
-        run: python -c 'from docling_serve.app import create_app; create_app()'
+        run: .venv/bin/python -c 'from docling_serve.app import create_app; create_app()'
 
   markdown-lint:
     runs-on: ubuntu-latest
@@ -64,4 +66,3 @@ jobs:
         uses: DavidAnson/markdownlint-cli2-action@v16
         with:
           globs: "**/*.md"
-
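
The package check now creates an explicit virtual environment and calls its interpreter by path instead of relying on whichever `python` is on `PATH`. By default `uv pip install` expects a virtual environment to install into, and `uv venv` creates one at `.venv`. The same three steps can be sketched as a local script. The `run` and `smoke_test` helpers and the `DRY_RUN` switch are assumptions made for illustration, and the sketch defaults to printing the commands rather than running them.

```shell
# Echo each command when DRY_RUN=1 (the default in this sketch); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Mirrors the three workflow steps: create .venv, install the wheel, import the app.
smoke_test() {
  run uv venv
  run uv pip install dist/*.whl
  run .venv/bin/python -c 'from docling_serve.app import create_app; create_app()'
}
```

Running `DRY_RUN=0 smoke_test` from a checkout with a built wheel in `dist/` would execute the steps for real, assuming `uv` is installed.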

.pre-commit-config.yaml

Lines changed: 2 additions & 2 deletions
@@ -33,7 +33,7 @@ repos:
         args: ["--config=.github/vale.ini"]
         files: \.md$
   - repo: https://github.com/astral-sh/uv-pre-commit
-    # uv version.
-    rev: 0.7.13
+    # uv version, https://github.com/astral-sh/uv-pre-commit/releases
+    rev: 0.8.3
     hooks:
       - id: uv-lock

Containerfile

Lines changed: 14 additions & 6 deletions
@@ -1,13 +1,17 @@
 ARG BASE_IMAGE=quay.io/sclorg/python-312-c9s:c9s
 
-FROM ${BASE_IMAGE}
+ARG UV_VERSION=0.8.3
 
-USER 0
+ARG UV_SYNC_EXTRA_ARGS=""
+
+FROM ${BASE_IMAGE} AS docling-base
 
 ###################################################################################################
 # OS Layer #
 ###################################################################################################
 
+USER 0
+
 RUN --mount=type=bind,source=os-packages.txt,target=/tmp/os-packages.txt \
     dnf -y install --best --nodocs --setopt=install_weak_deps=False dnf-plugins-core && \
     dnf config-manager --best --nodocs --setopt=install_weak_deps=False --save && \
@@ -21,16 +25,19 @@ RUN /usr/bin/fix-permissions /opt/app-root/src/.cache
 
 ENV TESSDATA_PREFIX=/usr/share/tesseract/tessdata/
 
+FROM ghcr.io/astral-sh/uv:${UV_VERSION} AS uv_stage
+
 ###################################################################################################
 # Docling layer #
 ###################################################################################################
 
+FROM docling-base
+
 USER 1001
 
 WORKDIR /opt/app-root/src
 
 ENV \
-    # On container environments, always set a thread budget to avoid undesired thread congestion.
     OMP_NUM_THREADS=4 \
     LANG=en_US.UTF-8 \
     LC_ALL=en_US.UTF-8 \
@@ -40,9 +47,9 @@ ENV \
     UV_PROJECT_ENVIRONMENT=/opt/app-root \
     DOCLING_SERVE_ARTIFACTS_PATH=/opt/app-root/src/.cache/docling/models
 
-ARG UV_SYNC_EXTRA_ARGS=""
+ARG UV_SYNC_EXTRA_ARGS
 
-RUN --mount=from=ghcr.io/astral-sh/uv:0.7.19,source=/uv,target=/bin/uv \
+RUN --mount=from=uv_stage,source=/uv,target=/bin/uv \
     --mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
     --mount=type=bind,source=uv.lock,target=uv.lock \
     --mount=type=bind,source=pyproject.toml,target=pyproject.toml \
@@ -61,7 +68,8 @@ RUN echo "Downloading models..." && \
     chmod -R g=u ${DOCLING_SERVE_ARTIFACTS_PATH}
 
 COPY --chown=1001:0 ./docling_serve ./docling_serve
-RUN --mount=from=ghcr.io/astral-sh/uv:0.7.19,source=/uv,target=/bin/uv \
+
+RUN --mount=from=uv_stage,source=/uv,target=/bin/uv \
     --mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
     --mount=type=bind,source=uv.lock,target=uv.lock \
     --mount=type=bind,source=pyproject.toml,target=pyproject.toml \
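
The Containerfile now declares `UV_VERSION` and `UV_SYNC_EXTRA_ARGS` before the first `FROM`. That makes them global build arguments: `UV_VERSION` can be used in the `FROM ghcr.io/astral-sh/uv:${UV_VERSION}` line, and the final stage picks up `UV_SYNC_EXTRA_ARGS` by re-declaring it without a value. Both can be overridden at build time. The sketch below prints a matching `docker build` command. The function name and the `docling-serve:local` tag are hypothetical, and the defaults mirror the ones in the diff.

```shell
# Print a `docker build` command line for the Containerfile in this commit.
# UV_VERSION and UV_SYNC_EXTRA_ARGS are the build args declared in the Containerfile;
# the function name and the image tag are hypothetical.
build_command() {
  local uv_version="${1:-0.8.3}"   # default matches ARG UV_VERSION=0.8.3
  local sync_args="${2:-}"         # default matches ARG UV_SYNC_EXTRA_ARGS=""
  printf 'docker build -f Containerfile --build-arg "UV_VERSION=%s" --build-arg "UV_SYNC_EXTRA_ARGS=%s" -t docling-serve:local .\n' \
    "$uv_version" "$sync_args"
}
```

For example, `build_command 0.8.3 '--no-group pypi --group cu128'` prints a command that would build the CUDA 12.8 variant with the pinned uv version.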

Makefile

Lines changed: 28 additions & 0 deletions
@@ -60,6 +60,13 @@ docling-serve-cu128-image: Containerfile ## Build docling-serve container image
 	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cu128:$(TAG) ghcr.io/docling-project/docling-serve-cu128:$(BRANCH_TAG)
 	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cu128:$(TAG) quay.io/docling-project/docling-serve-cu128:$(BRANCH_TAG)
 
+.PHONY: docling-serve-rocm-image
+docling-serve-rocm-image: Containerfile ## Build docling-serve container image with ROCm support
+	$(ECHO_PREFIX) printf " %-12s Containerfile\n" "[docling-serve with ROCm 6.3]"
+	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-group pypi --group rocm --no-extra flash-attn" -f Containerfile --platform linux/amd64 -t ghcr.io/docling-project/docling-serve-rocm:$(TAG) .
+	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-rocm:$(TAG) ghcr.io/docling-project/docling-serve-rocm:$(BRANCH_TAG)
+	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-rocm:$(TAG) quay.io/docling-project/docling-serve-rocm:$(BRANCH_TAG)
+
 .PHONY: action-lint
 action-lint: .action-lint ## Lint GitHub Action workflows
 .action-lint: $(shell find .github -type f) | action-lint-file
@@ -107,3 +114,24 @@ run-docling-cu124: ## Run the docling-serve container with GPU support and assig
 	$(CMD_PREFIX) docker rm -f docling-serve-cu124 2>/dev/null || true
 	$(ECHO_PREFIX) printf " %-12s Running docling-serve container with GPU support on port 5001...\n" "[RUN CUDA 12.4]"
 	$(CMD_PREFIX) docker run -it --name docling-serve-cu124 -p 5001:5001 ghcr.io/docling-project/docling-serve-cu124:main
+
+.PHONY: run-docling-cu126
+run-docling-cu126: ## Run the docling-serve container with GPU support and assign a container name
+	$(ECHO_PREFIX) printf " %-12s Removing existing container if it exists...\n" "[CLEANUP]"
+	$(CMD_PREFIX) docker rm -f docling-serve-cu126 2>/dev/null || true
+	$(ECHO_PREFIX) printf " %-12s Running docling-serve container with GPU support on port 5001...\n" "[RUN CUDA 12.6]"
+	$(CMD_PREFIX) docker run -it --name docling-serve-cu126 -p 5001:5001 ghcr.io/docling-project/docling-serve-cu126:main
+
+.PHONY: run-docling-cu128
+run-docling-cu128: ## Run the docling-serve container with GPU support and assign a container name
+	$(ECHO_PREFIX) printf " %-12s Removing existing container if it exists...\n" "[CLEANUP]"
+	$(CMD_PREFIX) docker rm -f docling-serve-cu128 2>/dev/null || true
+	$(ECHO_PREFIX) printf " %-12s Running docling-serve container with GPU support on port 5001...\n" "[RUN CUDA 12.8]"
+	$(CMD_PREFIX) docker run -it --name docling-serve-cu128 -p 5001:5001 ghcr.io/docling-project/docling-serve-cu128:main
+
+.PHONY: run-docling-rocm
+run-docling-rocm: ## Run the docling-serve container with GPU support and assign a container name
+	$(ECHO_PREFIX) printf " %-12s Removing existing container if it exists...\n" "[CLEANUP]"
+	$(CMD_PREFIX) docker rm -f docling-serve-rocm 2>/dev/null || true
+	$(ECHO_PREFIX) printf " %-12s Running docling-serve container with GPU support on port 5001...\n" "[RUN ROCm 6.3]"
+	$(CMD_PREFIX) docker run -it --name docling-serve-rocm -p 5001:5001 ghcr.io/docling-project/docling-serve-rocm:main
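
The new `run-docling-*` targets all follow the same two-step pattern: force-remove any container with the target's name, then start a fresh one on port 5001 from the `:main` tag. A minimal sketch of that pattern, parameterised by variant, might look like the following. The function name is hypothetical, and the commands are printed rather than executed. As written in the Makefile, the `docker run` lines pass no GPU device flags, so the sketch passes none either.

```shell
# Print the cleanup-then-run command pair used by the run-docling-* targets.
# The function name is hypothetical; the commands are copied from the Makefile.
run_commands() {
  local variant="$1"                       # e.g. cu126, cu128, rocm
  local name="docling-serve-${variant}"
  printf 'docker rm -f %s 2>/dev/null || true\n' "$name"
  printf 'docker run -it --name %s -p 5001:5001 ghcr.io/docling-project/%s:main\n' "$name" "$name"
}
```

For the `rocm` variant the image would have to be built locally first, since the README in this commit states that it is not published.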

README.md

Lines changed: 24 additions & 9 deletions
@@ -50,17 +50,32 @@ curl -X 'POST' \
 }'
 ```
 
-### Container images
+### Container Images
 
-Available container images:
+The following container images are available for running **Docling Serve** with different hardware and PyTorch configurations:
 
-| Name | Description | Arch | Size |
-| -----|-------------|------|------|
-| [`ghcr.io/docling-project/docling-serve`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve) <br /> [`quay.io/docling-project/docling-serve`](https://quay.io/repository/docling-project/docling-serve) | Simple image for Docling Serve, installing all packages from the official pypi.org index. | `linux/amd64`, `linux/arm64` | 3.6 GB (arm64) <br /> 8.7 GB (amd64) |
-| [`ghcr.io/docling-project/docling-serve-cpu`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cpu) <br /> [`quay.io/docling-project/docling-serve-cpu`](https://quay.io/repository/docling-project/docling-serve-cpu) | Cpu-only image which installs `torch` from the pytorch cpu index. | `linux/amd64`, `linux/arm64` | 3.6 GB |
-| [`ghcr.io/docling-project/docling-serve-cu124`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cu124) <br /> [`quay.io/docling-project/docling-serve-cu124`](https://quay.io/repository/docling-project/docling-serve-cu124) | Cuda 12.4 image which installs `torch` from the pytorch cu124 index. | `linux/amd64` | 8.7 GB |
-| [`ghcr.io/docling-project/docling-serve-cu126`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cu126) <br /> [`quay.io/docling-project/docling-serve-cu126`](https://quay.io/repository/docling-project/docling-serve-cu126) | Cuda 12.6 image which installs `torch` from the pytorch cu126 index. | `linux/amd64` | 8.7 GB |
-| [`ghcr.io/docling-project/docling-serve-cu128`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cu128) <br /> [`quay.io/docling-project/docling-serve-cu128`](https://quay.io/repository/docling-project/docling-serve-cu128) | Cuda 12.8 image which installs `torch` from the pytorch cu128 index. | `linux/amd64` | 8.7 GB |
+#### 📦 Distributed Images
+
+| Image | Description | Architectures | Size |
+|-------|-------------|----------------|------|
+| [`ghcr.io/docling-project/docling-serve`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve) <br> [`quay.io/docling-project/docling-serve`](https://quay.io/repository/docling-project/docling-serve) | Base image with all packages installed from the official PyPI index. | `linux/amd64`, `linux/arm64` | 4.4 GB (arm64) <br> 8.7 GB (amd64) |
+| [`ghcr.io/docling-project/docling-serve-cpu`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cpu) <br> [`quay.io/docling-project/docling-serve-cpu`](https://quay.io/repository/docling-project/docling-serve-cpu) | CPU-only variant, using `torch` from the PyTorch CPU index. | `linux/amd64`, `linux/arm64` | 4.4 GB |
+| [`ghcr.io/docling-project/docling-serve-cu126`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cu126) <br> [`quay.io/docling-project/docling-serve-cu126`](https://quay.io/repository/docling-project/docling-serve-cu126) | CUDA 12.6 build with `torch` from the cu126 index. | `linux/amd64` | 10.0 GB |
+| [`ghcr.io/docling-project/docling-serve-cu128`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cu128) <br> [`quay.io/docling-project/docling-serve-cu128`](https://quay.io/repository/docling-project/docling-serve-cu128) | CUDA 12.8 build with `torch` from the cu128 index. | `linux/amd64` | 11.4 GB |
+
+#### 🚫 Not Distributed
+
+An image for AMD ROCm 6.3 (`docling-serve-rocm`) is supported but **not published** due to its large size.
+
+To build it locally:
+
+```bash
+git clone --branch main git@github.com:docling-project/docling-serve.git
+cd docling-serve/
+make docling-serve-rocm-image
+```
+
+For deployment using Docker Compose, see [docs/deployment.md](docs/deployment.md).
 
 Coming soon: `docling-serve-slim` images will reduce the size by skipping the model weights download.
 