Feat: Add neuron backend to TEI #742
base: main
Changes from 21 commits
710b8c1
cc84f29
139b179
dd0c08d
1e4f3c9
adfa2e9
a25cf98
56c15d8
142520a
7ada877
976b71c
dc3edc2
3676b94
b803566
81c57d3
7f517b9
9752998
c517aa2
d1708a3
08301f0
37519d9
533d853
0829b6f
aa47549
1464cc3
9961846
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,6 +18,7 @@ on: | |
| - "Cargo.lock" | ||
| - "rust-toolchain.toml" | ||
| - "Dockerfile" | ||
| - "Dockerfile-neuron" | ||
| branches: | ||
| - "main" | ||
|
|
||
|
|
||
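The hunk above adds `Dockerfile-neuron` to a list of watched files. Assuming that list is a `paths:` trigger filter (the key itself is outside the hunk, so this is an assumption), the effect can be sketched in Python. `fnmatchcase` only approximates GitHub's glob matching, and it is used here purely because the visible entries are literal file names; `triggers_build` is a hypothetical helper, not part of the PR.

```python
from fnmatch import fnmatchcase

# Entries visible in the hunk above; the full list in the workflow is longer
# and is not shown in this diff.
PATHS_BEFORE = ["Cargo.lock", "rust-toolchain.toml", "Dockerfile"]
PATHS_AFTER = PATHS_BEFORE + ["Dockerfile-neuron"]


def triggers_build(changed_files, patterns):
    """Return True if any changed file matches any watched pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


# A change that touches only the new Dockerfile is ignored by the old list,
# because "Dockerfile" is a whole-name match and not a prefix match.
print(triggers_build(["Dockerfile-neuron"], PATHS_BEFORE))
print(triggers_build(["Dockerfile-neuron"], PATHS_AFTER))
```

Under that assumption, the added entry is what makes edits to `Dockerfile-neuron` alone start the workflow.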
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| name: Run Neuron integration tests | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| - cron: '0 0 * * *' # Run the workflow nightly to check Neuron integration is working | ||
|
|
||
| jobs: | ||
| tests: | ||
| concurrency: | ||
| group: ${{ github.workflow }}-${{ github.job }}-${{ github.head_ref || github.run_id }} | ||
| cancel-in-progress: true | ||
| runs-on: | ||
| group: aws-inf2-8xlarge | ||
| steps: | ||
| - name: Checkout repository | ||
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Install uv | ||
| uses: astral-sh/setup-uv@v5 | ||
|
|
||
| - name: Build Docker image for Neuron | ||
| run: | | ||
| docker build . -f Dockerfile-neuron -t tei-neuron | ||
|
|
||
| - name: Run integration tests | ||
| working-directory: integration_tests | ||
| env: | ||
| HF_TOKEN: ${{ secrets.HF_TOKEN }} | ||
| DOCKER_IMAGE: tei-neuron | ||
| run: | | ||
| uv sync --locked --all-extras --dev | ||
| uv run pytest --durations=0 -sv neuron/ |
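A note on the `concurrency.group` expression in the workflow above: `github.head_ref || github.run_id` falls back to the run ID whenever `head_ref` is empty. `github.head_ref` is only set for pull-request events, and this workflow only declares `workflow_dispatch` and `schedule`, so the group name should always end in the run ID. The sketch below mimics that expression in Python; `concurrency_group` is a hypothetical helper and the run IDs are made up.

```python
def concurrency_group(workflow, job, head_ref, run_id):
    """Mimic `${{ github.workflow }}-${{ github.job }}-${{ github.head_ref || github.run_id }}`."""
    # `||` yields the first truthy operand; an empty head_ref falls through to run_id.
    suffix = head_ref or str(run_id)
    return f"{workflow}-{job}-{suffix}"


# Two nightly runs (no head_ref): each gets its own group, so neither cancels the other.
first = concurrency_group("Run Neuron integration tests", "tests", "", 1001)
second = concurrency_group("Run Neuron integration tests", "tests", "", 1002)
print(first)
print(second)
print(first == second)
```

If that reading is right, `cancel-in-progress: true` has no practical effect for the triggers declared here; it would only start cancelling superseded runs if a pull-request trigger were added later.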
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,190 @@ | ||
| FROM lukemathwalker/cargo-chef:latest-rust-1.85-bookworm AS chef | ||
| WORKDIR /usr/src | ||
|
|
||
| ENV SCCACHE=0.10.0 | ||
| ENV RUSTC_WRAPPER=/usr/local/bin/sccache | ||
|
|
||
| # Download, configure sccache | ||
| RUN curl -fsSL https://github.com/mozilla/sccache/releases/download/v$SCCACHE/sccache-v$SCCACHE-x86_64-unknown-linux-musl.tar.gz | tar -xzv --strip-components=1 -C /usr/local/bin sccache-v$SCCACHE-x86_64-unknown-linux-musl/sccache && \ | ||
| chmod +x /usr/local/bin/sccache | ||
|
|
||
| FROM chef AS planner | ||
|
|
||
| COPY backends backends | ||
| COPY core core | ||
| COPY router router | ||
| COPY Cargo.toml ./ | ||
| COPY Cargo.lock ./ | ||
|
|
||
| RUN cargo chef prepare --recipe-path recipe.json | ||
|
|
||
| FROM chef AS builder | ||
|
|
||
| ARG GIT_SHA | ||
| ARG DOCKER_LABEL | ||
|
|
||
| # sccache specific variables | ||
| ARG SCCACHE_GHA_ENABLED | ||
|
|
||
| COPY --from=planner /usr/src/recipe.json recipe.json | ||
|
|
||
| RUN --mount=type=secret,id=actions_results_url,env=ACTIONS_RESULTS_URL \ | ||
| --mount=type=secret,id=actions_runtime_token,env=ACTIONS_RUNTIME_TOKEN \ | ||
| cargo chef cook --release --features python --no-default-features --recipe-path recipe.json && sccache -s | ||
|
|
||
| COPY backends backends | ||
| COPY core core | ||
| COPY router router | ||
| COPY Cargo.toml ./ | ||
| COPY Cargo.lock ./ | ||
|
|
||
| RUN PROTOC_ZIP=protoc-21.12-linux-x86_64.zip && \ | ||
| curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v21.12/$PROTOC_ZIP && \ | ||
| unzip -o $PROTOC_ZIP -d /usr/local bin/protoc && \ | ||
| unzip -o $PROTOC_ZIP -d /usr/local 'include/*' && \ | ||
| rm -f $PROTOC_ZIP | ||
|
|
||
| FROM builder AS http-builder | ||
|
|
||
| RUN --mount=type=secret,id=actions_results_url,env=ACTIONS_RESULTS_URL \ | ||
| --mount=type=secret,id=actions_runtime_token,env=ACTIONS_RUNTIME_TOKEN \ | ||
| cargo build --release --bin text-embeddings-router -F python -F http --no-default-features && sccache -s | ||
|
|
||
| FROM builder AS grpc-builder | ||
|
|
||
| COPY proto proto | ||
|
|
||
| RUN --mount=type=secret,id=actions_results_url,env=ACTIONS_RESULTS_URL \ | ||
| --mount=type=secret,id=actions_runtime_token,env=ACTIONS_RUNTIME_TOKEN \ | ||
| cargo build --release --bin text-embeddings-router -F grpc -F python --no-default-features && sccache -s | ||
|
|
||
| FROM public.ecr.aws/docker/library/ubuntu:22.04 AS neuron | ||
|
|
||
| ENV HUGGINGFACE_HUB_CACHE=/data \ | ||
| PORT=80 | ||
|
|
||
| ENV PATH="/usr/local/bin:/root/.local/bin:${PATH}" | ||
|
|
||
| RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \ | ||
| python3 \ | ||
| python3-pip \ | ||
| python3-dev \ | ||
| build-essential \ | ||
| git \ | ||
| curl \ | ||
| cmake \ | ||
| pkg-config \ | ||
| protobuf-compiler \ | ||
| ninja-build \ | ||
| && rm -rf /var/lib/apt/lists/* | ||
|
|
||
| RUN ln -s /usr/bin/python3 /usr/local/bin/python || true | ||
| RUN ln -s /usr/bin/pip3 /usr/local/bin/pip || true | ||
|
|
||
| WORKDIR /usr/src | ||
| COPY backends backends | ||
| COPY backends/python/server/text_embeddings_server/models/__init__.py backends/python/server/text_embeddings_server/models/__init__.py | ||
| COPY backends/python/server/pyproject.toml backends/python/server/pyproject.toml | ||
| RUN cd backends/python/server && \ | ||
| make install | ||
|
|
||
| ARG NEURONX_COLLECTIVES_LIB_VERSION=2.28.27.0-bc30ece58 | ||
| ARG NEURONX_RUNTIME_LIB_VERSION=2.28.23.0-dd5879008 | ||
| ARG NEURONX_TOOLS_VERSION=2.26.14.0 | ||
|
|
||
| ARG NEURONX_CC_VERSION=2.21.33363.0+82129205 | ||
| ARG NEURONX_FRAMEWORK_VERSION=2.8.0.2.10.16998+e9bf8a50 | ||
| ARG NEURONX_DISTRIBUTED_VERSION=0.15.22404+1f27bddf | ||
|
|
||
| RUN apt-get update \ | ||
| && apt-get upgrade -y \ | ||
| && apt-get install -y --no-install-recommends \ | ||
| apt-transport-https \ | ||
| build-essential \ | ||
| ca-certificates \ | ||
| cmake \ | ||
| curl \ | ||
| emacs \ | ||
| git \ | ||
| gnupg2 \ | ||
| gpg-agent \ | ||
| jq \ | ||
| libgl1-mesa-glx \ | ||
| libglib2.0-0 \ | ||
| libsm6 \ | ||
| libxext6 \ | ||
| libxrender-dev \ | ||
| libcap-dev \ | ||
| libhwloc-dev \ | ||
| openjdk-11-jdk \ | ||
| unzip \ | ||
| vim \ | ||
| wget \ | ||
| zlib1g-dev \ | ||
| && rm -rf /var/lib/apt/lists/* \ | ||
| && rm -rf /tmp/tmp* \ | ||
| && apt-get clean | ||
|
|
||
| # Ubuntu 22.04 = jammy; use signed-by (apt-key is deprecated) | ||
| RUN wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | gpg --dearmor -o /usr/share/keyrings/neuron-archive-keyring.gpg && \ | ||
| echo "deb [signed-by=/usr/share/keyrings/neuron-archive-keyring.gpg] https://apt.repos.neuron.amazonaws.com jammy main" > /etc/apt/sources.list.d/neuron.list | ||
|
|
||
| RUN apt-get update \ | ||
| && apt-get install -y \ | ||
| aws-neuronx-tools=$NEURONX_TOOLS_VERSION \ | ||
| aws-neuronx-collectives=$NEURONX_COLLECTIVES_LIB_VERSION \ | ||
| aws-neuronx-runtime-lib=$NEURONX_RUNTIME_LIB_VERSION \ | ||
| && rm -rf /var/lib/apt/lists/* \ | ||
| && rm -rf /tmp/tmp* \ | ||
| && apt-get clean | ||
|
|
||
| ENV PATH="/opt/aws/neuron/bin:${PATH}" | ||
|
|
||
| RUN pip install --index-url https://pip.repos.neuron.amazonaws.com \ | ||
| --extra-index-url https://pypi.org/simple \ | ||
| --trusted-host pip.repos.neuron.amazonaws.com \ | ||
| neuronx-cc==$NEURONX_CC_VERSION \ | ||
| torch-neuronx==$NEURONX_FRAMEWORK_VERSION \ | ||
| torchvision \ | ||
| neuronx_distributed==$NEURONX_DISTRIBUTED_VERSION \ | ||
| && rm -rf ~/.cache/pip/* | ||
|
|
||
| # HF ARGS | ||
| # Note: optimum-neuron 0.4.4 requires transformers~=4.57.1 | ||
| ARG TRANSFORMERS_VERSION=4.57.1 | ||
| ARG DIFFUSERS_VERSION=0.35.2 | ||
| ARG HUGGINGFACE_HUB_VERSION=0.36.0 | ||
| ARG OPTIMUM_NEURON_VERSION=0.4.4 | ||
| ARG SENTENCE_TRANSFORMERS=5.1.2 | ||
| ARG PEFT_VERSION=0.17.0 | ||
| ARG DATASETS_VERSION=4.1.1 | ||
|
|
||
| # Install Hugging Face libraries and dependencies for TEI on Neuron | ||
| RUN pip install --no-cache-dir -U \ | ||
| networkx==2.8.8 \ | ||
| transformers[sentencepiece,audio,vision]==${TRANSFORMERS_VERSION} \ | ||
| diffusers==${DIFFUSERS_VERSION} \ | ||
| compel \ | ||
| controlnet-aux \ | ||
| huggingface_hub==${HUGGINGFACE_HUB_VERSION} \ | ||
| hf_transfer \ | ||
| datasets==${DATASETS_VERSION} \ | ||
| optimum-neuron==${OPTIMUM_NEURON_VERSION} \ | ||
| sentence_transformers==${SENTENCE_TRANSFORMERS} \ | ||
| peft==${PEFT_VERSION} \ | ||
| && rm -rf ~/.cache/pip/* | ||
|
Comment on lines +163 to +175
Member
Not something to tackle in this PR maybe, but I'd rather rely on a cc @regisss and @kaixuanliu as this was something mentioned in the past, but apparently it was failing on Intel HPUs (?)
Collaborator
I think it should work on HPU, not sure why it failed at that time. So don't hesitate to go that way, and if you have a lock file you would like me to test on HPU, happy to do it :)
Member
Thanks @regisss, I'll restart Nico's PR to add |
||
|
|
||
|
|
||
JingyaHuang marked this conversation as resolved.
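For context on the pins discussed in this thread: the Dockerfile comment "optimum-neuron 0.4.4 requires transformers~=4.57.1" uses the PEP 440 compatible-release operator, which means `>= 4.57.1` and `== 4.57.*`. The sketch below illustrates that rule for plain dotted numeric versions only. It is not how pip resolves versions and it does not handle local suffixes such as the `+82129205` in the Neuron pins; the `packaging` library is the usual tool for real checks. Both helpers are hypothetical.

```python
def parse(version):
    """Split a plain dotted version such as '4.57.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible_release(candidate, spec):
    """Approximate PEP 440 `~=spec` for plain numeric versions.

    The candidate must be at least `spec` and must share every segment
    of `spec` except the last one.
    """
    cand, base = parse(candidate), parse(spec)
    prefix = base[:-1]
    return cand >= base and cand[: len(prefix)] == prefix


# TRANSFORMERS_VERSION=4.57.1 from the Dockerfile satisfies ~=4.57.1.
print(is_compatible_release("4.57.1", "4.57.1"))
# A later patch release still fits; the next minor release does not.
print(is_compatible_release("4.57.3", "4.57.1"))
print(is_compatible_release("4.58.0", "4.57.1"))
```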
|
||
| FROM neuron AS grpc | ||
|
|
||
| COPY --from=grpc-builder /usr/src/target/release/text-embeddings-router /usr/local/bin/text-embeddings-router | ||
|
|
||
| ENTRYPOINT ["text-embeddings-router"] | ||
| CMD ["--json-output"] | ||
|
|
||
| FROM neuron | ||
JingyaHuang marked this conversation as resolved.
|
||
|
|
||
| COPY --from=http-builder /usr/src/target/release/text-embeddings-router /usr/local/bin/text-embeddings-router | ||
|
|
||
| ENTRYPOINT ["text-embeddings-router"] | ||
| CMD ["--json-output"] | ||
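The Dockerfile above ends with two runtime stages: a named `grpc` stage that copies the binary from `grpc-builder`, and a final unnamed stage that copies it from `http-builder`. Docker builds the last stage when no `--target` is given, so the `docker build . -f Dockerfile-neuron -t tei-neuron` command in the workflow should produce the HTTP image. A minimal sketch of assembling both build commands follows; `docker_build_command` and the `tei-neuron-grpc` tag are hypothetical names chosen for illustration.

```python
import shlex


def docker_build_command(tag, target=None, dockerfile="Dockerfile-neuron", context="."):
    """Assemble a `docker build` argument list, optionally selecting a stage."""
    cmd = ["docker", "build", context, "-f", dockerfile, "-t", tag]
    if target is not None:
        # --target stops the build at the named stage instead of the last one.
        cmd += ["--target", target]
    return cmd


# Default (last) stage: the HTTP router image, as built in the workflow.
print(shlex.join(docker_build_command("tei-neuron")))
# Named stage: the gRPC router image.
print(shlex.join(docker_build_command("tei-neuron-grpc", target="grpc")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine with Docker installed; the sketch only prints the commands so it runs anywhere.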