Hi, I'm trying to build the XLA tarball for use on my machine. The chipset is an AMD Ryzen AI Max+ 395, so I believe I have to use the latest ROCm, which is 7.0.1 at the moment. I'm running Ubuntu 24.04 LTS, Elixir 1.18.4 and Erlang 28.
To get as far as I have, I've made these changes to XLA 0.9.1:
~/xla/builds$ git diff
diff --git a/builds/Dockerfile b/builds/Dockerfile
index ef618aa..47f19b6 100644
--- a/builds/Dockerfile
+++ b/builds/Dockerfile
@@ -1,6 +1,5 @@
ARG VARIANT
-# By default we build on Ubuntu 20 to compile against an older version of glibc.
-ARG BASE_IMAGE="hexpm/elixir:1.15.8-erlang-24.3.4.17-ubuntu-focal-20240427"
+ARG BASE_IMAGE="hexpm/elixir:1.18.4-erlang-28.1-ubuntu-noble-20250910"
# Pre-stages for base image variants
@@ -41,7 +40,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates
apt-get install -y rocm-dev rocm-libs && \
apt-get clean -y && rm -rf /var/lib/apt/lists/*
-ENV ROCM_PATH "/opt/rocm-${ROCM_VERSION}.0"
+ENV ROCM_PATH "/opt/rocm-${ROCM_VERSION}"
FROM base-${VARIANT}
@@ -73,7 +72,7 @@ ENV USE_BAZEL_VERSION=7.4.1
# Install Python and the necessary global dependencies
RUN apt-get update && apt-get install -y python3 python3-pip && \
ln -s /usr/bin/python3 /usr/bin/python && \
- python -m pip install --upgrade pip numpy && \
+ python -m pip install --break-system-packages --upgrade numpy && \
apt-get clean -y && rm -rf /var/lib/apt/lists/*
# Setup project files
diff --git a/builds/build.sh b/builds/build.sh
index 0ef386a..b1eafde 100755
--- a/builds/build.sh
+++ b/builds/build.sh
@@ -44,7 +44,7 @@ case "$target" in
"rocm")
docker build -t xla-rocm -f builds/Dockerfile \
--build-arg VARIANT=rocm \
- --build-arg ROCM_VERSION=6.0 \
+ --build-arg ROCM_VERSION=7.0.1 \
--build-arg XLA_TARGET=rocm \
.
;;
I've been struggling to get past this error and others like it. It is always the same problem, sometimes in a different package. It seems that in some nested Bazel context, gcc is being used instead of clang, and the build fails:
ERROR: /root/.cache/bazel/_bazel_root/77031b6b54d069fa14d9031c964d5f8f/external/com_google_absl/absl/base/BUILD.bazel:53:11: Compiling absl/base/log_severity.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing CppCompile command (from target @@com_google_absl//absl/base:log_severity) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 50 arguments skipped)
gcc: error: unrecognized command-line option ‘-Qunused-arguments’
This is the full build output:
~/xla/builds$ ./build.sh rocm
[3/4] STEP 1/5: FROM hexpm/elixir:1.18.4-erlang-28.1-ubuntu-noble-20250910 AS base-rocm
[3/4] STEP 2/5: ARG ROCM_VERSION
--> Using cache 21e24e0eae92a1421ff0c9e675f893f42d44d4abd1d4ae1efec39ee6cfbfaf6f
--> 21e24e0eae92
[3/4] STEP 3/5: ARG DEBIAN_FRONTEND=noninteractive
--> Using cache fa2602493146f93990bcedfa8e49b15e2d6caaf2399cda5c66b077583f09e048
--> fa2602493146
[3/4] STEP 4/5: RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates curl gnupg && distro="$(. /etc/lsb-release && echo "$DISTRIB_CODENAME")" && curl -sL https://repo.radeon.com/rocm/rocm.gpg.key | apt-key add - && echo "deb [arch=amd64] https://repo.radeon.com/rocm/apt/${ROCM_VERSION}/ $distro main" | tee /etc/apt/sources.list.d/rocm.list && printf 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600\n' | tee /etc/apt/preferences.d/rocm-pin-600 && apt-get update && apt-get install -y rocm-dev rocm-libs && apt-get clean -y && rm -rf /var/lib/apt/lists/*
--> Using cache eb1ecf97b94fbc780cafec7b776e9718d4fea8e50aa80152e582a3127910975b
--> eb1ecf97b94f
[3/4] STEP 5/5: ENV ROCM_PATH "/opt/rocm-${ROCM_VERSION}"
--> Using cache 6faa8af1d456dcfe49899ef7c7a8f78d5fe4942d7d7dae2e8982f3160898675f
--> 6faa8af1d456
[4/4] STEP 1/18: FROM 6faa8af1d456dcfe49899ef7c7a8f78d5fe4942d7d7dae2e8982f3160898675f
[4/4] STEP 2/18: ENV LC_ALL=C.UTF-8
--> Using cache 2d6cc23ec5ab67be303f3c48c1370ffe2e0e6f2898218060e6a73a7a27dded34
--> 2d6cc23ec5ab
[4/4] STEP 3/18: ARG DEBIAN_FRONTEND=noninteractive
--> Using cache c3372a81144a819d69d47ac1b757d856850adef77bc14e933b4386fe8c5c4ae4
--> c3372a81144a
[4/4] STEP 4/18: RUN apt-get update && apt-get update && apt-get install -y ca-certificates curl git unzip wget && clang_version="18" && apt-get install -y wget gnupg software-properties-common lsb-release && wget -qO- https://apt.llvm.org/llvm.sh | bash -s -- $clang_version && update-alternatives --install /usr/bin/clang clang /usr/bin/clang-$clang_version 100 && update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-$clang_version 100 && apt-get clean -y && rm -rf /var/lib/apt/lists/*
--> Using cache df0ba7255559c0435c250c204fe4c061234e41486643a973906e2d8777f9c447
--> df0ba7255559
[4/4] STEP 5/18: RUN wget -O bazel "https://github.com/bazelbuild/bazelisk/releases/download/v1.26.0/bazelisk-linux-$(dpkg --print-architecture)" && chmod +x bazel && mv bazel /usr/local/bin/bazel
--> Using cache 66f60f0af52a7fa16f26edca3d4bd54a1e1671dcda95ac7842d5400ccf2604a6
--> 66f60f0af52a
[4/4] STEP 6/18: ENV USE_BAZEL_VERSION=7.4.1
--> Using cache f709ad0de0c1ded69241a10bf49950a9898f5d40dec005493cadcb6a312c810b
--> f709ad0de0c1
[4/4] STEP 7/18: RUN apt-get update && apt-get install -y python3 python3-pip && ln -s /usr/bin/python3 /usr/bin/python && python -m pip install --break-system-packages --upgrade numpy && apt-get clean -y && rm -rf /var/lib/apt/lists/*
--> Using cache e809dbe4f3f99e5afc1ab6715b3de4cd0ace3811f34ad41b89095b7aeda6b72b
--> e809dbe4f3f9
[4/4] STEP 8/18: WORKDIR /xla
--> Using cache 657b6f64c18e6b2e941b5faec14c71f6ff495bf62db029377d03a38bf216dede
--> 657b6f64c18e
[4/4] STEP 9/18: ARG XLA_TARGET
--> Using cache 6fdf59978609e2d14ddd281fd714fbdb4d4688101f38248112a91d2a75d157dc
--> 6fdf59978609
[4/4] STEP 10/18: ENV XLA_TARGET=${XLA_TARGET}
--> Using cache f3f030b163727a76554f7cc2ff4dde35f3eb829e2cba20135f2f7f53fd1b65fe
--> f3f030b16372
[4/4] STEP 11/18: ENV XLA_CACHE_DIR=/cache
--> Using cache 6a2cd45f69c00a9d837da0932869fdfd6261dfff99e6f06d473048e261e57796
--> 6a2cd45f69c0
[4/4] STEP 12/18: ENV XLA_BUILD=true
--> Using cache 7058122124cc9fdbbe67e362895434bd4d3e602b63d75c8671e59bf35b7a8688
--> 7058122124cc
[4/4] STEP 13/18: COPY mix.exs mix.lock ./
--> Using cache cd64a721a1680f45a1fee53b75bc6ff8e9385310530919aa5ec196bcd570c619
--> cd64a721a168
[4/4] STEP 14/18: RUN mix deps.get
--> Using cache a70a744706d337a60c7e89e5006ef70bb60b81c916ed268535d9224cc8a75720
--> a70a744706d3
[4/4] STEP 15/18: COPY lib lib
--> Using cache ff64515b45c41d46cf1ed3af06c752eb54aac51b1468c857369b7317e68fa2a5
--> ff64515b45c4
[4/4] STEP 16/18: COPY README.md Makefile ./
--> Using cache e57a9d9800f12677857e2f588567a5047153c069e7a3d68396d0e10fce5ab34d
--> e57a9d9800f1
[4/4] STEP 17/18: COPY extension extension
--> Using cache 43f422e50bdfaf34fd3acb8264650f5584872389294dd5853daee8da8d8b1e89
--> 43f422e50bdf
[4/4] STEP 18/18: CMD [ "mix", "compile" ]
--> Using cache 4f13af52135bd862d28d2d0cb0e318634d2078114eebcea775f924db26903ef9
[4/4] COMMIT xla-rocm
--> 4f13af52135b
Successfully tagged localhost/xla-rocm:latest
4f13af52135bd862d28d2d0cb0e318634d2078114eebcea775f924db26903ef9
==> earmark_parser
Compiling 2 files (.xrl)
Compiling 1 file (.yrl)
Compiling 3 files (.erl)
Compiling 46 files (.ex)
warning: Tuple.append/2 is deprecated. Use insert_at instead
│
65 │ tag_tpl |> Tuple.append(Enum.reverse(lines)) |> Tuple.append(@verbatim)
│ ~
│
└─ lib/earmark_parser/helpers/html_parser.ex:65:22: EarmarkParser.Helpers.HtmlParser._parse_rest/3
└─ lib/earmark_parser/helpers/html_parser.ex:65:59: EarmarkParser.Helpers.HtmlParser._parse_rest/3
└─ lib/earmark_parser/helpers/html_parser.ex:69:39: EarmarkParser.Helpers.HtmlParser._parse_rest/3
└─ lib/earmark_parser/helpers/html_parser.ex:69:88: EarmarkParser.Helpers.HtmlParser._parse_rest/3
└─ lib/earmark_parser/helpers/html_parser.ex:70:39: EarmarkParser.Helpers.HtmlParser._parse_rest/3
└─ lib/earmark_parser/helpers/html_parser.ex:70:76: EarmarkParser.Helpers.HtmlParser._parse_rest/3
└─ lib/earmark_parser/helpers/html_parser.ex:71:40: EarmarkParser.Helpers.HtmlParser._parse_rest/3
└─ lib/earmark_parser/helpers/html_parser.ex:71:77: EarmarkParser.Helpers.HtmlParser._parse_rest/3
Generated earmark_parser app
==> elixir_make
Compiling 1 file (.ex)
Generated elixir_make app
==> nimble_parsec
Compiling 4 files (.ex)
Generated nimble_parsec app
==> makeup
Compiling 15 files (.ex)
Generated makeup app
==> makeup_elixir
Compiling 6 files (.ex)
Generated makeup_elixir app
==> makeup_erlang
Compiling 4 files (.ex)
Generated makeup_erlang app
==> ex_doc
Compiling 26 files (.ex)
Generated ex_doc app
==> xla
Compiling 5 files (.ex)
Generated xla app
rm -f /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/xla/extension && \
ln -s "/xla/extension" /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/xla/extension && \
cd /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20 && \
bazel build --define "framework_shared_object=false" -c opt --config=rocm --action_env=HIP_PLATFORM=hcc --action_env=TF_ROCM_AMDGPU_TARGETS="gfx900,gfx906,gfx908,gfx90a,gfx940,gfx941,gfx942,gfx1030,gfx1100,gfx1200,gfx1201" --repo_env=CC=clang --repo_env=CXX=clang++ --copt=-Wno-error=unused-command-line-argument --copt=-Wno-gnu-offsetof-extensions --copt=-Qunused-arguments --copt=-Wno-error=c23-extensions //xla/extension:xla_extension && \
mkdir -p /cache/0.9.1/build/ && \
cp -f /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/bazel-bin/xla/extension/xla_extension.tar.gz /cache/0.9.1/build/xla_extension-0.9.1-x86_64-linux-gnu-rocm.tar.gz
Starting local Bazel server and connecting to it...
INFO: Reading 'startup' options from /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/tensorflow.bazelrc: --windows_enable_symlinks
INFO: Options provided by the client:
Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/.bazelrc:
Inherited 'common' options: --noenable_bzlmod --noincompatible_enable_cc_toolchain_resolution
INFO: Reading rc options for 'build' from /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/tensorflow.bazelrc:
Inherited 'common' options: --announce_rc --experimental_cc_shared_library --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/tensorflow.bazelrc:
'build' options: --define framework_shared_object=true --define tsl_protobuf_header_only=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --features=-force_no_whole_archive --enable_platform_specific_config --config=short_logs --@rules_python//python/config_settings:precompile=force_disabled
INFO: Found applicable config definition build:short_logs in file /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/tensorflow.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:rocm in file /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/tensorflow.bazelrc: --config=rocm_base
INFO: Found applicable config definition build:rocm_base in file /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/tensorflow.bazelrc: --copt=-Wno-gnu-offsetof-extensions --crosstool_top=@local_config_rocm//crosstool:toolchain --define=using_rocm_hipcc=true --define=tensorflow_mkldnn_contraction_kernel=0 --define=xnn_enable_avxvnniint8=false --define=xnn_enable_avx512fp16=false --repo_env TF_NEED_ROCM=1
INFO: Found applicable config definition build:linux in file /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/tensorflow.bazelrc: --host_copt=-w --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --experimental_guard_against_concurrent_changes
Computing main repo mapping:
DEBUG: /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/third_party/py/python_repo.bzl:156:14:
HERMETIC_PYTHON_VERSION variable was not set correctly, using default version.
Python 3.11 will be used.
To select Python version, either set HERMETIC_PYTHON_VERSION env variable in
your shell:
export HERMETIC_PYTHON_VERSION=3.12
OR pass it as an argument to bazel command directly or inside your .bazelrc
file:
--repo_env=HERMETIC_PYTHON_VERSION=3.12
DEBUG: /root/.cache/xla_build/xla-870d90fd098c480fb8a426126bd02047adb2bc20/third_party/py/python_repo.bzl:87:10:
=============================
Hermetic Python configuration:
Version: "3.11"
Kind: ""
Interpreter: "default" (provided by rules_python)
Requirements_lock label: "@//:requirements_lock_3_11.txt"
=====================================
Loading:
Loading: 1 packages loaded
Analyzing: target //xla/extension:xla_extension (2 packages loaded, 0 targets configured)
Analyzing: target //xla/extension:xla_extension (2 packages loaded, 0 targets configured)
Analyzing: target //xla/extension:xla_extension (272 packages loaded, 21646 targets configured)
INFO: Analyzed target //xla/extension:xla_extension (283 packages loaded, 37505 targets configured).
[1 / 1] no actions running
ERROR: /root/.cache/bazel/_bazel_root/77031b6b54d069fa14d9031c964d5f8f/external/com_google_absl/absl/base/BUILD.bazel:53:11: Compiling absl/base/log_severity.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing CppCompile command (from target @@com_google_absl//absl/base:log_severity) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 50 arguments skipped)
gcc: error: unrecognized command-line option ‘-Qunused-arguments’
Target //xla/extension:xla_extension failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 3.900s, Critical Path: 0.22s
INFO: 52 processes: 51 internal, 1 local.
ERROR: Build did NOT complete successfully
make: *** [Makefile:24: /cache/0.9.1/build/xla_extension-0.9.1-x86_64-linux-gnu-rocm.tar.gz] Error 1
** (Mix) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".
Perhaps there's some obvious fix for this. If not, perhaps I should try going in the Torchx direction?
Cheers
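One way to test the gcc-versus-clang hypothesis is to check, inside the build container, which installed compilers accept the flag named in the error. The snippet below is a minimal sketch and not part of the XLA build scripts. `accepts_flag` is a hypothetical helper name, and the check assumes the compilers are reachable on `PATH` as `gcc` and `clang`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given compiler accepts the given flag.
accepts_flag() {
  compiler="$1"
  flag="$2"
  # Check an empty C translation unit; only the exit status matters.
  "$compiler" "$flag" -x c -fsyntax-only /dev/null >/dev/null 2>&1
}

# Report how each compiler on PATH reacts to the flag from the error message.
for cc in gcc clang; do
  if ! command -v "$cc" >/dev/null 2>&1; then
    echo "$cc: not installed"
  elif accepts_flag "$cc" -Qunused-arguments; then
    echo "$cc: accepts -Qunused-arguments"
  else
    echo "$cc: rejects -Qunused-arguments"
  fi
done
```

If gcc rejects the flag and clang accepts it, that matches the error in the log. The open question would then be why the ROCm crosstool wrapper resolves to gcc even though `--repo_env=CC=clang` is passed. Re-running the Bazel command with `--verbose_failures`, as the log suggests, should show the full command line of the failing step.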