Description
I have a PyTorch build recipe in https://github.com/zboszor/meta-python-ai which is very Intel-oriented at the moment, using MKL and oneDNN.
I also found that at least a few projects that depend on PyTorch (like yolov5 and ultralytics) use a very simple check for GPU acceleration, deciding only between CUDA and CPU (a.k.a. "our way, or the highway"). It doesn't matter to them whether PyTorch itself was built with Vulkan, OpenCL, or ROCm.
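For illustration, the device-selection logic in such projects boils down to roughly this pattern; the exact code in yolov5/ultralytics differs, but the CUDA-or-CPU dichotomy is the point:

    import torch

    # Only CUDA or CPU is considered; a PyTorch built with Vulkan,
    # OpenCL, or ROCm backends is treated as CPU-only.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")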
So, I wanted to try to use chipStar to fake CUDA support in PyTorch within the Yocto cross-compilation framework. To ease the process, I created, among other things, a symlink called nvcc that points to cucc. The WIP changes are at zboszor/meta-python-ai#3
PyTorch tries to deduce the details of CUDA using nvcc --version, which stalls and eventually makes the system go OOM.
I looked at what it was doing and found that the number of hipcc and python subprocesses grows ad infinitum, eventually depleting the machine's RAM.
The process tree is cucc -> hipcc -> nvcc (which is actually cucc) -> hipcc -> ...
I think the chipStar fork of hipcc should check whether the nvcc it is about to invoke is actually cucc, to avoid this infinite recursion.
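A minimal sketch of the check I have in mind, assuming the symlink setup described above (nvcc -> cucc); hipcc itself is not written in Python, so this only illustrates the logic, and the helper name is hypothetical:

    import os
    import shutil

    def nvcc_is_cucc(nvcc_name="nvcc"):
        """Return True if the nvcc found on PATH is really cucc
        (e.g. a symlink created to fake CUDA support)."""
        nvcc_path = shutil.which(nvcc_name)
        if nvcc_path is None:
            return False
        # Resolve symlinks: nvcc -> cucc ends up with basename "cucc".
        return os.path.basename(os.path.realpath(nvcc_path)) == "cucc"

    if nvcc_is_cucc():
        # Bail out instead of re-entering the cucc -> hipcc -> nvcc loop.
        raise SystemExit("refusing to invoke nvcc: it resolves to cucc")

Comparing resolved real paths (rather than just names) would also catch a copied rather than symlinked binary, but the basename check covers the symlink case reported here.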