-
Notifications
You must be signed in to change notification settings - Fork 694
RPATH Fix for portable_lib Python Extension #14422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14422
Note: Links to docs will display an error until the docs builds have been completed. ❌ 2 New Failures, 1 Unrelated FailureAs of commit 446d855 with merge base 0f22062 ( NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
BUILD_RPATH "@loader_path/../../../torch/lib" | ||
INSTALL_RPATH "@loader_path/../../../torch/lib" | ||
) | ||
else() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GregoryComer Do you think this will work for windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. I was planning to test the windows wheel on a local machine. I'll see if we run into any issues.
Fixing lint. Other than that, CI looks good. I took a look at the previous cherry-pick attempt to main (#13290) and it looks like the build failures were due to CMake export changes. That's resolved now. |
RPATH Fix for portable_lib Python Extension Problem: The _portable_lib.so Python extension built on CI couldn't find PyTorch libraries when installed locally because it had hardcoded absolute paths from the CI build environment. Error: ImportError: dlopen(.../_portable_lib.cpython-311-darwin.so, 0x0002): Library not loaded: @rpath/libtorch_python.dylib Referenced from: .../executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so Reason: tried: '/Users/runner/work/_temp/.../torch/lib/libtorch_python.dylib' (no such file) Root Cause: The CMake build was linking to PyTorch libraries using absolute paths from the build environment, without setting proper relative RPATHs for runtime library resolution. Solution: Added platform-specific relative RPATH settings to the portable_lib target in /Users/mnachin/executorch/CMakeLists.txt (lines 657-669): - macOS: Uses @loader_path/../../../torch/lib to find PyTorch libraries relative to the .so file location - Linux: Uses $ORIGIN/../../../torch/lib for the same purpose - Sets both BUILD_RPATH and INSTALL_RPATH to ensure consistency Impact: This allows the wheel-packaged _portable_lib.so to find PyTorch libraries regardless of the installation location, fixing the runtime linking issue when using ExecutorTorch wheels built on CI. Note: The same fix may be needed for _training_lib if it experiences similar issues. Test Plan: ``` # Build the wheel locally python setup.py bdist_wheel # create fresh conda env conda create -yn executorch_test_11 python=3.11.0 && conda activate executorch_test_11 # install pip install ./dist/executorch-*.whl # Verify python -c "from executorch.extension.pybindings._portable_lib import _load_for_executorch; print('Success!')" ```
2c4f7ba
to
446d855
Compare
Thank you! |
I verified that the wheel built from this job, installed into a clean env, can load pybindings on macos. I also tested on Windows on a clean env, and I can't load the _portable_lib dll, though it's also broken on the release branch. I'll file as release blocking task. |
@pytorchbot cherry-pick --onto release/1.0 -c critical |
**Note: This is an attempt to cherry-pick Mergen's RPATH fix from #13254 onto main. Fixes #14421. Original description below.** Problem: The _portable_lib.so Python extension built on CI couldn't find PyTorch libraries when installed locally because it had hardcoded absolute paths from the CI build environment. Error: ImportError: dlopen(.../_portable_lib.cpython-311-darwin.so, 0x0002): Library not loaded: @rpath/libtorch_python.dylib Referenced from: .../executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so Reason: tried: '/Users/runner/work/_temp/.../torch/lib/libtorch_python.dylib' (no such file) Root Cause: The CMake build was linking to PyTorch libraries using absolute paths from the build environment, without setting proper relative RPATHs for runtime library resolution. Solution: Added platform-specific relative RPATH settings to the portable_lib target in /Users/mnachin/executorch/CMakeLists.txt (lines 657-669): - macOS: Uses @loader_path/../../../torch/lib to find PyTorch libraries relative to the .so file location - Linux: Uses $ORIGIN/../../../torch/lib for the same purpose - Sets both BUILD_RPATH and INSTALL_RPATH to ensure consistency Impact: This allows the wheel-packaged _portable_lib.so to find PyTorch libraries regardless of the installation location, fixing the runtime linking issue when using ExecutorTorch wheels built on CI. Note: The same fix may be needed for _training_lib if it experiences similar issues. Test Plan: ``` # Build the wheel locally python setup.py bdist_wheel # create fresh conda env conda create -yn executorch_test_11 python=3.11.0 && conda activate executorch_test_11 # install pip install ./dist/executorch-*.whl # Verify python -c "from executorch.extension.pybindings._portable_lib import _load_for_executorch; print('Success!')" ``` Co-authored-by: Mergen Nachin <[email protected]> (cherry picked from commit 641e737)
Cherry picking #14422The cherry pick PR is at #14442 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated: Details for Dev Infra teamRaised by workflow job |
**Note: This is an attempt to cherry-pick Mergen's RPATH fix from #13254 onto main. Fixes #14421. Original description below.** Problem: The _portable_lib.so Python extension built on CI couldn't find PyTorch libraries when installed locally because it had hardcoded absolute paths from the CI build environment. Error: ImportError: dlopen(.../_portable_lib.cpython-311-darwin.so, 0x0002): Library not loaded: @rpath/libtorch_python.dylib Referenced from: .../executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so Reason: tried: '/Users/runner/work/_temp/.../torch/lib/libtorch_python.dylib' (no such file) Root Cause: The CMake build was linking to PyTorch libraries using absolute paths from the build environment, without setting proper relative RPATHs for runtime library resolution. Solution: Added platform-specific relative RPATH settings to the portable_lib target in /Users/mnachin/executorch/CMakeLists.txt (lines 657-669): - macOS: Uses @loader_path/../../../torch/lib to find PyTorch libraries relative to the .so file location - Linux: Uses $ORIGIN/../../../torch/lib for the same purpose - Sets both BUILD_RPATH and INSTALL_RPATH to ensure consistency Impact: This allows the wheel-packaged _portable_lib.so to find PyTorch libraries regardless of the installation location, fixing the runtime linking issue when using ExecutorTorch wheels built on CI. Note: The same fix may be needed for _training_lib if it experiences similar issues. Test Plan: ``` # Build the wheel locally python setup.py bdist_wheel # create fresh conda env conda create -yn executorch_test_11 python=3.11.0 && conda activate executorch_test_11 # install pip install ./dist/executorch-*.whl # Verify python -c "from executorch.extension.pybindings._portable_lib import _load_for_executorch; print('Success!')" ``` Co-authored-by: Mergen Nachin <[email protected]> (cherry picked from commit 641e737)
**Note: This is an attempt to cherry-pick Mergen's RPATH fix from pytorch#13254 onto main. Fixes pytorch#14421. Original description below.** Problem: The _portable_lib.so Python extension built on CI couldn't find PyTorch libraries when installed locally because it had hardcoded absolute paths from the CI build environment. Error: ImportError: dlopen(.../_portable_lib.cpython-311-darwin.so, 0x0002): Library not loaded: @rpath/libtorch_python.dylib Referenced from: .../executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so Reason: tried: '/Users/runner/work/_temp/.../torch/lib/libtorch_python.dylib' (no such file) Root Cause: The CMake build was linking to PyTorch libraries using absolute paths from the build environment, without setting proper relative RPATHs for runtime library resolution. Solution: Added platform-specific relative RPATH settings to the portable_lib target in /Users/mnachin/executorch/CMakeLists.txt (lines 657-669): - macOS: Uses @loader_path/../../../torch/lib to find PyTorch libraries relative to the .so file location - Linux: Uses $ORIGIN/../../../torch/lib for the same purpose - Sets both BUILD_RPATH and INSTALL_RPATH to ensure consistency Impact: This allows the wheel-packaged _portable_lib.so to find PyTorch libraries regardless of the installation location, fixing the runtime linking issue when using ExecutorTorch wheels built on CI. Note: The same fix may be needed for _training_lib if it experiences similar issues. Test Plan: ``` # Build the wheel locally python setup.py bdist_wheel # create fresh conda env conda create -yn executorch_test_11 python=3.11.0 && conda activate executorch_test_11 # install pip install ./dist/executorch-*.whl # Verify python -c "from executorch.extension.pybindings._portable_lib import _load_for_executorch; print('Success!')" ``` Co-authored-by: Mergen Nachin <[email protected]>
Note: This is an attempt to cherry-pick Mergen's RPATH fix from #13254 onto main. Fixes #14421. Original description below.
Problem: The _portable_lib.so Python extension built on CI couldn't find PyTorch libraries when installed locally because it had hardcoded absolute paths from
the CI build environment.
Error:
ImportError: dlopen(.../_portable_lib.cpython-311-darwin.so, 0x0002): Library not loaded: @rpath/libtorch_python.dylib
Referenced from:
.../executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so Reason: tried:
'/Users/runner/work/_temp/.../torch/lib/libtorch_python.dylib' (no such file)
Root Cause: The CMake build was linking to PyTorch libraries using absolute paths from the build environment, without setting proper relative RPATHs for
runtime library resolution.
Solution: Added platform-specific relative RPATH settings to the portable_lib target in /Users/mnachin/executorch/CMakeLists.txt (lines 657-669):
Impact: This allows the wheel-packaged _portable_lib.so to find PyTorch libraries regardless of the installation location, fixing the runtime linking issue
when using ExecutorTorch wheels built on CI.
Note: The same fix may be needed for _training_lib if it experiences similar issues.
Test Plan: