
Compile CUDA libraries for aarch64-linux #10223

@imciner2

Description


That's correct: we can currently only generate CUDA binaries for targets that match the host system (i.e., x86_64). In fact, I recently investigated exactly this for libxc, but didn't manage to get it working. I'll copy my conclusions here before Slack swallows them:

I took a brief look at libxc for aarch64, and there's a bunch of issues preventing us from moving forward:

  • CUDA_SDK_jll is currently installed as a BuildDependency, so its binaries can't be executed on the host arch. We could switch this to a HostBuildDependency; however, the host environment is musl, while the CUDA SDK is glibc. Often that works out OK-ish, but Pkg refuses to download the glibc artifact when instantiating the musl environment.
  • I tried switching the compiler to Clang, which is easy enough with -DCMAKE_CUDA_COMPILER=clang; however, that exposes a couple of issues. First, --target= somehow leaks into the command-line flags while CMake identifies the compiler, breaking all sorts of things. After fixing that to pass --target=aarch64-..., a header (__config_site) isn't found. This seems to be caused by a bug in LLVM, which looks in the wrong locations, as noted here: https://github.com/JuliaPackaging/BinaryBuilderBase.jl/blob/ac6831078a4241d85ff891e6067a06a9e6dc1052/src/Runner.jl#L431-L442. Apparently that workaround needs to be generalized to all Clang-based platforms, which I verified works by jerry-rigging the invocation to include -nostdinc++. The header location added in the linked change doesn't seem to exist on aarch64, which may be problematic later down the line, but I didn't get that far, because:
  • Using Clang as the CUDA compiler still requires executing ptxas, which brings us back to the initial issue of CUDA_SDK_jll not being executable. So we would probably need to fix that anyway, i.e., either support overriding the platform to allow using glibc binaries on musl so that HostBuildDependency works, or make sure foreign binaries are executable.
  • I decided to try the latter using qemu-user-static; however, our Qemu_static_jll isn't built for musl either, meaning it can't be installed as a HostBuildDependency. I started fixing that by attempting a rebuild of Qemu for musl, but we're using musl 1.2.2 in the musl rootfs, which doesn't yet provide MAP_FIXED_NOREPLACE as used by qemu.
  • It should also be said that even with qemu-user-static as a HostBuildDependency in the container, not everything is fixed: the current version of the sandbox doesn't grant access to /proc/sys/fs/binfmt_misc, meaning you can't register qemu-user-static as an interpreter for foreign binaries, but would instead need to replace tools like nvcc and ptxas with wrappers that invoke them under qemu-user-static. But I didn't get to that part, because I didn't manage to upgrade qemu.
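The wrapper approach from the last point could look roughly like the sketch below: a shell script shadowing ptxas that forwards every invocation through qemu-user-static. All paths here are illustrative assumptions, not the actual layout of the BinaryBuilder sandbox or CUDA_SDK_jll:

```shell
# Hypothetical sketch; SDK_PTXAS and QEMU paths are assumptions.
SDK_PTXAS=/opt/cuda-sdk/bin/ptxas      # assumed path to the aarch64 ptxas
QEMU=/usr/bin/qemu-aarch64-static      # assumed qemu-user-static binary

# Generate a wrapper script named like the real tool, placed on PATH
# ahead of the SDK, forwarding all arguments through qemu emulation.
mkdir -p wrappers
cat > wrappers/ptxas <<EOF
#!/bin/sh
# Run the foreign-arch ptxas under qemu-user-static.
exec "$QEMU" "$SDK_PTXAS" "\$@"
EOF
chmod +x wrappers/ptxas
cat wrappers/ptxas
```

A similar wrapper would be needed for nvcc and any other SDK tool the build executes; registering qemu via binfmt_misc would make this unnecessary, but the sandbox currently doesn't allow that.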

With BinaryBuilder2.jl, the qemu/binfmt solution will be integrated, and we should be able to automatically execute foreign binaries and depend on the target-specific CUDA SDK. Given the amount of work it would require to get it working right now, I decided to wait for BinaryBuilder2.jl.

Originally posted by @maleadt in #10217 (comment)

Metadata

Labels

cuda 🕹️ Builders related to Nvidia CUDA