@hppritcha
Member

that prevents correct linkage of libcuda.so if it is in a non-standard location.

Related to spack/spack#40913

Signed-off-by: Howard Pritchard [email protected]
(cherry picked from commit be28fa6)

@hppritcha hppritcha requested a review from wenduwan November 13, 2023 19:00
@github-actions github-actions bot added this to the v5.0.1 milestone Nov 13, 2023
dalcinl added a commit to regro-cf-autotick-bot/openmpi-feedstock that referenced this pull request Nov 16, 2023
@dalcinl
Contributor

dalcinl commented Nov 16, 2023

The current set of changes in this PR may not be enough.

make[2]: Entering directory '/home/conda/feedstock_root/build_artifacts/openmpi-mpi_1700137892102/work/opal/mca/btl/smcuda'
  CC       mca_btl_smcuda_la-btl_smcuda.lo
  CC       mca_btl_smcuda_la-btl_smcuda_component.lo
  CC       mca_btl_smcuda_la-btl_smcuda_frag.lo
  CC       mca_btl_smcuda_la-btl_smcuda_accelerator.lo
  CCLD     mca_btl_smcuda.la
/home/conda/feedstock_root/build_artifacts/openmpi-mpi_1700137892102/_build_env/bin/../lib/gcc/x86_64-conda-linux-gnu/12.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find -lcuda: No such file or directory
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:1541: mca_btl_smcuda.la] Error 1
make[2]: Leaving directory '/home/conda/feedstock_root/build_artifacts/openmpi-mpi_1700137892102/work/opal/mca/btl/smcuda'
make[1]: *** [Makefile:1964: all-recursive] Error 1
make[1]: Leaving directory '/home/conda/feedstock_root/build_artifacts/openmpi-mpi_1700137892102/work/opal'
make: *** [Makefile:1532: all-recursive] Error 1

https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=824922&view=logs&jobId=314306c5-8124-5c6f-db81-0da6d2989ddc&j=314306c5-8124-5c6f-db81-0da6d2989ddc&t=fe0f7f24-8e8c-50f2-29b2-ac390c9705f2
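
For triaging logs like the one above, a minimal Python sketch (not part of this PR) can pull out which library the linker could not resolve and which make directory it was building at the time. The function name `find_missing_libs` and the shortened sample paths are hypothetical. The patterns assume GNU ld and GNU make message formats as they appear in the log, and the sketch only tracks the most recent "Entering directory" line, so it does not unwind on "Leaving directory".

```python
import re

# Matches GNU ld's "cannot find -l<name>" diagnostic and captures the library name.
MISSING_LIB = re.compile(r"ld: cannot find -l([\w.+-]+)")
# Matches GNU make's "Entering directory '<path>'" lines to track the build location.
ENTER_DIR = re.compile(r"make(?:\[\d+\])?: Entering directory '([^']+)'")


def find_missing_libs(log_text):
    """Return (directory, library) pairs for each unresolved -l flag in the log.

    Hypothetical helper for illustration; directory is the most recently
    entered make directory, or None if no such line preceded the error.
    """
    current_dir = None
    hits = []
    for line in log_text.splitlines():
        entered = ENTER_DIR.search(line)
        if entered:
            current_dir = entered.group(1)
            continue
        missing = MISSING_LIB.search(line)
        if missing:
            hits.append((current_dir, missing.group(1)))
    return hits


# Shortened, hypothetical paths modelled on the log above.
sample = """\
make[2]: Entering directory '/work/opal/mca/btl/smcuda'
  CCLD     mca_btl_smcuda.la
/usr/bin/ld: cannot find -lcuda: No such file or directory
collect2: error: ld returned 1 exit status
make[2]: Leaving directory '/work/opal/mca/btl/smcuda'
"""

for directory, lib in find_missing_libs(sample):
    print(f"{directory}: lib{lib} not found on the linker search path")
```

Run against the full log, this would point at opal/mca/btl/smcuda as the component whose link line lacks a search path for libcuda.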

@hppritcha
Member Author

@dalcinl is right: the smcuda BTL Makefile.am has the same error. Does this conda-forge build use --enable-static and/or --enable-mca-dso? Just asking because I'm curious why the Spack build with CUDA support didn't hit this but did hit the problem in the accelerator CUDA component.

I'll open a PR on main first with this fix.

@dalcinl
Contributor

dalcinl commented Nov 16, 2023

@hppritcha conda-forge is using --enable-mca-dso. I would have liked to just enable DSOs for CUDA and UCX, but I could not manage to find the right list of components to make these libraries optional 😞, so I gave up and passed --enable-mca-dso for everything.

@janjust
Contributor

janjust commented Nov 16, 2023

@hppritcha let's wait to merge this until we get the cherry-pick of #12083 as well.

when libcuda.so is in a non-standard location.

also fix rcache/gpusm and rcache/rgpusm

Similar fix to that in open-mpi#12065

Signed-off-by: Howard Pritchard <[email protected]>
(cherry picked from commit 2767278)
@hppritcha
Member Author

@janjust pushed the additional commit to this PR. So 5.0.x will get two for the price of one!

@janjust janjust merged commit d6c9251 into open-mpi:v5.0.x Nov 20, 2023