xla_extension fails to build when trying to use EXLA in a Docker container #90

@jeryldev

I get an "xla_extension failed to build" error when EXLA is compiled while building a Docker container. Here are the relevant snippets from my Dockerfile:

ARG BUILDER_IMAGE="hexpm/elixir:1.14.0-erlang-24.0.1-debian-bullseye-20210902-slim"
ARG RUNNER_IMAGE="debian:bullseye-20210902-slim"

FROM ${BUILDER_IMAGE}

...

# install build dependencies
# https://github.com/elixir-nx/xla?tab=readme-ov-file#building-from-source
RUN apt-get update -y && apt-get install -y build-essential git apt-transport-https curl gnupg python3-pip gcc-9 g++-9 \
    && apt-get clean && rm -f /var/lib/apt/lists/*_*

RUN export CC=/usr/bin/gcc-9

# https://bazel.build/install/ubuntu#install-on-ubuntu
RUN curl -fsSL https://bazel.build/bazel-release.pub.gpg | gpg --dearmor >bazel-archive-keyring.gpg
RUN mv bazel-archive-keyring.gpg /usr/share/keyrings
RUN echo "deb [arch=amd64 signed-by=/usr/share/keyrings/bazel-archive-keyring.gpg] https://storage.googleapis.com/bazel-apt stable jdk1.8" | tee /etc/apt/sources.list.d/bazel.list
RUN apt-get update -y && apt-get install -y bazel-6.5.0
RUN ln -s /usr/bin/bazel-6.5.0 /usr/bin/bazel

RUN pip install numpy

...
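
One note on the snippet above, offered as a sketch rather than a confirmed fix: each RUN instruction executes in its own shell, so a variable exported in one RUN is not visible to later instructions. That would be consistent with the log below invoking /usr/bin/gcc instead of gcc-9. A persistent setting would look like this:

```dockerfile
# ENV persists for all later instructions in this build stage,
# unlike "RUN export", which only affects the shell of that single RUN.
ENV CC=/usr/bin/gcc-9
```

Whether switching compilers changes the outcome of this particular build is untested here.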

I get this error when I build the image from this Dockerfile:

[4,467 / 5,843] Compiling mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp; 134s local ... (16 actions, 15 running)
[4,468 / 5,843] Compiling mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp; 136s local ... (16 actions running)
[4,469 / 5,843] Compiling mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp; 137s local ... (16 actions running)
[4,470 / 5,843] Compiling mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp; 139s local ... (16 actions running)
[4,470 / 5,843] Compiling mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp; 210s local ... (16 actions running)
ERROR: /home/user/.cache/bazel/_bazel_user/ee4c0f1833dfaa435cb867c88f5a190e/external/llvm-project/mlir/BUILD.bazel:4925:11: Compiling mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp failed: (Exit 1): gcc failed: error executing command (from target @llvm-project//mlir:LLVMDialect) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 85 arguments skipped)
gcc: fatal error: Killed signal terminated program cc1plus
compilation terminated.
Target //xla/extension:xla_extension failed to build
Use --verbose_failures to see the command lines of failed build steps.
[4,487 / 5,843] checking cached actions
INFO: Elapsed time: 1131.980s, Critical Path: 278.37s
INFO: 4487 processes: 343 internal, 4144 local.
FAILED: Build did NOT complete successfully
make: *** [Makefile:26: /home/user/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-cpu.tar.gz] Error 1
could not compile dependency :xla, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile xla", update it with "mix deps.update xla" or clean it with "mix deps.clean xla"
==> lai
** (Mix) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".
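
The "Killed signal terminated program cc1plus" line usually means the compiler process was killed from outside. In container builds a common cause is running out of memory, here with 16 compile actions running in parallel. That is an assumption; the log does not confirm it. If it holds, one option is to cap Bazel's parallelism through a user-level bazelrc. The value 4 below is an arbitrary example, not a recommendation.

```dockerfile
# Assumption: the build ran out of memory with 16 parallel compile actions.
# Bazel reads $HOME/.bazelrc, so this must run as the same user that later
# compiles the xla dependency (the log shows /home/user as that user's home).
RUN echo "build --jobs=4" >> "$HOME/.bazelrc"
```

Raising the memory limit available to the Docker build would be the alternative if the parallelism should stay as it is.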

I only hit this issue when building the Docker container. Running mix phx.server outside Docker works without any problems.
Is there an official sample Dockerfile for projects that need to run EXLA in a Docker container?
