3 files changed: +13 -6 lines changed
@@ -9,11 +9,14 @@ WORKDIR /app

 RUN DEBIAN_FRONTEND=noninteractive apt-get update \
     && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
-        build-essential make ffmpeg poppler-utils tesseract-ocr tesseract-ocr-deu tesseract-ocr-eng \
+        build-essential make ffmpeg poppler-utils \
+        tesseract-ocr tesseract-ocr-deu tesseract-ocr-eng \
+        libleptonica-dev pkg-config \
     && python3 -m venv "${POETRY_VIRTUALENVS_PATH}" \
     && ${POETRY_VIRTUALENVS_PATH}/bin/pip install "poetry==${POETRY_VERSION}" \
     && rm -rf /var/lib/apt/lists/*
 ENV PATH="${POETRY_VIRTUALENVS_PATH}/bin:$PATH"
+ENV TESSDATA_PREFIX=/usr/share/tesseract-ocr/5/tessdata

 # Copy lockfiles first
 COPY services/document-extractor/pyproject.toml services/document-extractor/poetry.lock /app/services/document-extractor/
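
Review note: the `TESSDATA_PREFIX` line added above only helps if that directory actually holds the language models the image installs. The following is a minimal sketch of a startup check, assuming Tesseract's usual `<lang>.traineddata` file naming. The helper name `missing_traineddata` is hypothetical and not part of this PR.

```python
from pathlib import Path


def missing_traineddata(tessdata_dir, langs=("deu", "eng")):
    """Return the language codes whose <lang>.traineddata file is absent.

    `tessdata_dir` is expected to be the value of TESSDATA_PREFIX; the
    default languages mirror the tesseract-ocr-deu / tesseract-ocr-eng
    packages installed in the Dockerfile (an assumption about intent).
    """
    base = Path(tessdata_dir)
    return [
        lang for lang in langs
        if not (base / f"{lang}.traineddata").is_file()
    ]
```

A service could call `missing_traineddata(os.environ["TESSDATA_PREFIX"])` at startup and fail fast when the returned list is non-empty, rather than failing on the first OCR request.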
@@ -7,16 +7,21 @@ The following endpoints are provided by the *documents_extractor*:
 # Requirements
 All required python libraries can be found in the [pyproject.toml](pyproject.toml) file.
 In addition to python libraries the following system packages are required:
-```
+
+```shell
 build-essential
 make
 ffmpeg
 poppler-utils
 tesseract-ocr
 tesseract-ocr-deu
 tesseract-ocr-eng
+libleptonica-dev
+pkg-config
 ```

+The Tesseract data path is set via `TESSDATA_PREFIX=/usr/share/tesseract-ocr/5/tessdata` in both prod and dev images.
+
 # Endpoints

 ## `/extract`
@@ -31,4 +36,3 @@ The following types of information will be extracted:
 A detailed explanation of the deployment can be found in the [project README](../../README.md).
 The *helm-chart* used for the deployment can be found in the [infrastructure directory](../../infrastructure/).

-
-FROM --platform=linux/amd64 python:3.11.7-bookworm
+FROM --platform=linux/amd64 python:3.13.7-bookworm

 # Dev image for mcp-server (no local libs)
 ENV POETRY_VIRTUALENVS_PATH=/app/services/mcp-server/.venv
@@ -23,8 +23,8 @@ RUN poetry config virtualenvs.create false \
     && cd /app/services/mcp-server \
     && poetry install --no-interaction --no-ansi --no-root --with dev

-# Create non-root user
-RUN adduser --disabled-password --gecos "" --uid 65532 nonroot
+# Create non-root user (align with prod UID for consistent file perms)
+RUN adduser --disabled-password --gecos "" --uid 10001 nonroot

 WORKDIR /app/services/mcp-server
 RUN mkdir -p log && chmod 700 log \
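
Review note: the UID change above is meant to keep file ownership consistent between the dev and prod images. The following is a minimal sketch of a runtime check, assuming the prod image also runs as UID 10001, as the diff comment implies. The helper `running_as_expected_uid` is hypothetical and not part of this PR.

```python
import os

# UID passed to adduser in the dev Dockerfile above; assumed to match prod.
EXPECTED_UID = 10001


def running_as_expected_uid(expected=EXPECTED_UID, actual=None):
    """Return True when the process UID equals the expected non-root UID.

    `actual` can be injected for testing; by default the real UID of the
    current process is used (os.getuid is available on POSIX systems only).
    """
    if actual is None:
        actual = os.getuid()
    return actual == expected
```

Logging a warning when this returns `False` would surface a mismatch early, for example when a bind-mounted `log` directory is owned by a different UID.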