Multi-backend support (ROCm and MUSA)#21

Open
yeahdongcn wants to merge 1 commit into Dao-AILab:master from yeahdongcn:xd/multi-platform

yeahdongcn commented Oct 10, 2025

This PR was inspired by #11 and extends it to support multiple backends, including ROCm and MUSA.

The build, install, and unit-test steps all passed on both ROCm and MUSA. Please see the logs below for details.
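
For readers unfamiliar with multi-backend builds, here is a minimal sketch of how a build script might decide which backend to target. It is not the code from this PR: `select_backend` and `probe_torch` are hypothetical names, and checking MUSA before ROCm is an assumption. The only facts it relies on are that ROCm builds of PyTorch set `torch.version.hip` and that the MUSA environment in the logs below ships a `torch_musa` package.

```python
import importlib.util
from typing import Optional, Tuple


def select_backend(hip_version: Optional[str], musa_available: bool) -> str:
    """Pick a backend name from two probe results (hypothetical helper)."""
    # Assumption: a MUSA environment takes priority when torch_musa is present.
    if musa_available:
        return "musa"
    # ROCm builds of PyTorch report a non-empty torch.version.hip string.
    if hip_version:
        return "rocm"
    # Otherwise fall back to the default CUDA path.
    return "cuda"


def probe_torch() -> Tuple[Optional[str], bool]:
    """Collect the two probe inputs from the installed packages."""
    try:
        import torch
    except ImportError:
        # No PyTorch at all: report "nothing detected".
        return None, False
    hip_version = getattr(torch.version, "hip", None)
    # find_spec returns None when the top-level package is not installed.
    musa_available = importlib.util.find_spec("torch_musa") is not None
    return hip_version, musa_available
```

A build script could then call `select_backend(*probe_torch())` and branch on the result, for example to add a define such as the `-DUSE_MUSA=1` flag visible in the MUSA compile commands below.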

Testing Done

ROCm 7.0.0 + Torch 2.9.0a0+git7bcbafe

root@4ba3a2f2f1b5:/sgl-workspace/fast-hadamard-transform# python setup.py install


torch.__version__  = 2.9.0a0+git7bcbafe


/sgl-workspace/fast-hadamard-transform/csrc/vendor.h -> /sgl-workspace/fast-hadamard-transform/csrc/vendor_hip.h [skipped, already hipified]
/sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform.h -> /sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform.h [skipped, no changes]
/sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform.cpp -> /sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform_hip.cpp [skipped, already hipified]
/sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform_common.h -> /sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform_common_hip.h [skipped, already hipified]
/sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform_special.h -> /sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform_special.h [skipped, no changes]
/sgl-workspace/fast-hadamard-transform/csrc/static_switch.h -> /sgl-workspace/fast-hadamard-transform/csrc/static_switch.h [skipped, no changes]
/sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform_gpu.cu -> /sgl-workspace/fast-hadamard-transform/csrc/fast_hadamard_transform_gpu.hip [skipped, already hipified]
Successfully preprocessed all matching files.
Total number of unsupported CUDA function calls: 0


Total number of replaced kernel launches: 5
/opt/venv/lib/python3.10/site-packages/setuptools/dist.py:759: SetuptoolsDeprecationWarning: License classifiers are deprecated.
!!

        ********************************************************************************
        Please consider removing the following classifiers in favor of a SPDX license expression:

        License :: OSI Approved :: BSD License

        See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
        ********************************************************************************

!!
  self._finalize_license_expression()
running install
/opt/venv/lib/python3.10/site-packages/setuptools/_distutils/cmd.py:90: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
/opt/venv/lib/python3.10/site-packages/setuptools/_distutils/cmd.py:90: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` and ``easy_install``.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://github.com/pypa/setuptools/issues/917 for details.
        ********************************************************************************

!!
  self.initialize_options()
running bdist_egg
running egg_info
writing fast_hadamard_transform.egg-info/PKG-INFO
writing dependency_links to fast_hadamard_transform.egg-info/dependency_links.txt
writing requirements to fast_hadamard_transform.egg-info/requires.txt
writing top-level names to fast_hadamard_transform.egg-info/top_level.txt
adding license file 'LICENSE'
adding license file 'AUTHORS'
writing manifest file 'fast_hadamard_transform.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying fast_hadamard_transform/__init__.py -> build/lib.linux-x86_64-cpython-310/fast_hadamard_transform
copying fast_hadamard_transform/fast_hadamard_transform_interface.py -> build/lib.linux-x86_64-cpython-310/fast_hadamard_transform
running build_ext
building 'fast_hadamard_transform_cuda' extension
ninja: no work to do.
x86_64-linux-gnu-g++ -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -shared -Wl,-O1 -Wl,-Bsymbolic-functions /sgl-workspace/fast-hadamard-transform/build/temp.linux-x86_64-cpython-310/csrc/fast_hadamard_transform_gpu.o /sgl-workspace/fast-hadamard-transform/build/temp.linux-x86_64-cpython-310/csrc/fast_hadamard_transform_hip.o -L/opt/venv/lib/python3.10/site-packages/torch/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -L/usr/lib/x86_64-linux-gnu -lc10 -ltorch -ltorch_cpu -ltorch_python -lamdhip64 -lc10_hip -ltorch_hip -o build/lib.linux-x86_64-cpython-310/fast_hadamard_transform_cuda.cpython-310-x86_64-linux-gnu.so
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-cpython-310/fast_hadamard_transform_cuda.cpython-310-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/fast_hadamard_transform
copying build/lib.linux-x86_64-cpython-310/fast_hadamard_transform/__init__.py -> build/bdist.linux-x86_64/egg/fast_hadamard_transform
copying build/lib.linux-x86_64-cpython-310/fast_hadamard_transform/fast_hadamard_transform_interface.py -> build/bdist.linux-x86_64/egg/fast_hadamard_transform
byte-compiling build/bdist.linux-x86_64/egg/fast_hadamard_transform/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/fast_hadamard_transform/fast_hadamard_transform_interface.py to fast_hadamard_transform_interface.cpython-310.pyc
creating stub loader for fast_hadamard_transform_cuda.cpython-310-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/fast_hadamard_transform_cuda.py to fast_hadamard_transform_cuda.cpython-310.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
__pycache__.fast_hadamard_transform_cuda.cpython-310: module references __file__
creating 'dist/fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg
removing '/opt/venv/lib/python3.10/site-packages/fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg' (and everything under it)
creating /opt/venv/lib/python3.10/site-packages/fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg
Extracting fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg to /opt/venv/lib/python3.10/site-packages
Adding fast-hadamard-transform 1.0.4.post1 to easy-install.pth file

Installed /opt/venv/lib/python3.10/site-packages/fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg
Processing dependencies for fast-hadamard-transform==1.0.4.post1
Searching for ninja==1.13.0
Best match: ninja 1.13.0
Adding ninja 1.13.0 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Searching for packaging==25.0
Best match: packaging 25.0
Adding packaging 25.0 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Searching for torch==2.9.0a0+git7bcbafe
Best match: torch 2.9.0a0+git7bcbafe
Adding torch 2.9.0a0+git7bcbafe to easy-install.pth file
Installing torchfrtrace script to /opt/venv/bin
Installing torchrun script to /opt/venv/bin

Using /opt/venv/lib/python3.10/site-packages
Searching for fsspec==2025.3.0
Best match: fsspec 2025.3.0
Adding fsspec 2025.3.0 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Searching for jinja2==3.1.6
Best match: jinja2 3.1.6
Adding jinja2 3.1.6 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Searching for networkx==2.8.8
Best match: networkx 2.8.8
Adding networkx 2.8.8 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Searching for sympy==1.13.3
Best match: sympy 1.13.3
Adding sympy 1.13.3 to easy-install.pth file
Installing isympy script to /opt/venv/bin

Using /opt/venv/lib/python3.10/site-packages
Searching for typing-extensions==4.14.1
Best match: typing-extensions 4.14.1
Adding typing-extensions 4.14.1 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Searching for filelock==3.19.1
Best match: filelock 3.19.1
Adding filelock 3.19.1 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Searching for MarkupSafe==3.0.2
Best match: MarkupSafe 3.0.2
Adding MarkupSafe 3.0.2 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Searching for mpmath==1.3.0
Best match: mpmath 1.3.0
Adding mpmath 1.3.0 to easy-install.pth file

Using /opt/venv/lib/python3.10/site-packages
Finished processing dependencies for fast-hadamard-transform==1.0.4.post1
root@4ba3a2f2f1b5:/sgl-workspace/fast-hadamard-transform# cd tests
root@4ba3a2f2f1b5:/sgl-workspace/fast-hadamard-transform/tests# pytest test_fast_hadamard_transform.py
=================================================================== test session starts ====================================================================
platform linux -- Python 3.10.12, pytest-8.4.1, pluggy-1.6.0
rootdir: /sgl-workspace/fast-hadamard-transform
plugins: langsmith-0.4.31, anyio-4.10.0, asyncio-1.1.0, subtests-0.13.1, xdoctest-1.1.0, flakefinder-1.1.0, hypothesis-5.35.1, shard-0.1.2, xdist-3.3.1, cpp-2.3.0, rerunfailures-14.0
asyncio: mode=strict, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 51 items
Running 51 items in this shard

test_fast_hadamard_transform.py ...................................................                                                                  [100%]

============================================================== 51 passed in 71.87s (0:01:11) ===============================================================
root@4ba3a2f2f1b5:/sgl-workspace/fast-hadamard-transform/tests#

MUSA 4.3.0 + Torch 2.5.0

root@worker3218:/ws# python setup.py install
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:61: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
  import pynvml  # type: ignore[import]


torch.__version__  = 2.5.0


running install
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` and ``easy_install``.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://github.com/pypa/setuptools/issues/917 for details.
        ********************************************************************************

!!
  self.initialize_options()
running bdist_egg
running egg_info
writing fast_hadamard_transform.egg-info/PKG-INFO
writing dependency_links to fast_hadamard_transform.egg-info/dependency_links.txt
writing requirements to fast_hadamard_transform.egg-info/requires.txt
writing top-level names to fast_hadamard_transform.egg-info/top_level.txt
adding license file 'LICENSE'
adding license file 'AUTHORS'
writing manifest file 'fast_hadamard_transform.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying fast_hadamard_transform/fast_hadamard_transform_interface.py -> build/lib.linux-x86_64-cpython-310/fast_hadamard_transform
copying fast_hadamard_transform/__init__.py -> build/lib.linux-x86_64-cpython-310/fast_hadamard_transform
running build_ext
building 'fast_hadamard_transform_cuda' extension
Emitting ninja build file /ws/build/temp.linux-x86_64-cpython-310/build.ninja...
Compiling objects...
Using envvar MAX_JOBS (128) as the number of workers...
[1/2] /usr/local/musa/bin/mcc -x musa -MMD -MF /ws/build/temp.linux-x86_64-cpython-310/csrc/fast_hadamard_transform.o.d -I/ws -I/usr/local/musa/include -I/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/aten/src -I/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include -I/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch_musa/share/torch_musa_codegen -I/usr/local/lib/python3.10/dist-packages -I/usr/local/musa/include -I/usr/include/python3.10 -c -c /ws/csrc/fast_hadamard_transform.cpp -o /ws/build/temp.linux-x86_64-cpython-310/csrc/fast_hadamard_transform.o -fPIC -O3 -fPIC -std=c++17 -x musa -mtgpu --cuda-gpu-arch=mp_31 -fno-strict-aliasing -ffast-math -Od3 -fmusa-flush-denormals-to-zero -DUSE_MUSA=1 --offload-arch=mp_31 -march=native -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=fast_hadamard_transform_cuda -D_GLIBCXX_USE_CXX11_ABI=1
[2/2] /usr/local/musa/bin/mcc -x musa -MMD -MF /ws/build/temp.linux-x86_64-cpython-310/csrc/fast_hadamard_transform_gpu.o.d -I/ws -I/usr/local/musa/include -I/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/aten/src -I/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include -I/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch_musa/share/torch_musa_codegen -I/usr/local/lib/python3.10/dist-packages -I/usr/local/musa/include -I/usr/include/python3.10 -c -c /ws/csrc/fast_hadamard_transform_gpu.cu -o /ws/build/temp.linux-x86_64-cpython-310/csrc/fast_hadamard_transform_gpu.o -fPIC -O3 -fPIC -std=c++17 -x musa -mtgpu --cuda-gpu-arch=mp_31 -fno-strict-aliasing -ffast-math -Od3 -fmusa-flush-denormals-to-zero -DUSE_MUSA=1 --offload-arch=mp_31 -march=native -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=fast_hadamard_transform_cuda -D_GLIBCXX_USE_CXX11_ABI=1
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/InlineDeviceGuard.h:8:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:114:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:123:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:133:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:182:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:197:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:231:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/GPUTrace.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyInterpreter.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Layout.h:3:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Backend.h:143:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Backend.h:282:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/GPUTrace.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyInterpreter.h:5:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Layout.h:76:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/GPUTrace.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyInterpreter.h:6:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/MemoryFormat.h:61:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/GPUTrace.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyInterpreter.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymIntArrayRef.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymInt.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymBool.h:3:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:35:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:38:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:41:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:47:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:50:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:53:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:57:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:67:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:77:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:83:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:86:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:89:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:92:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:95:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:98:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:101:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:104:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:107:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:110:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:113:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:116:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:119:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:122:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:125:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:128:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:134:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:139:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:144:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:149:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:154:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:159:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:162:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:165:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:168:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:171:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:174:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:177:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:180:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:183:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:201:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:204:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:207:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:210:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/aten/utils/Utils.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/Dispatch.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/DeprecatedTypeProperties.h:4:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/ScalarType.h:343:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/ScalarType.h:486:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/ScalarType.h:526:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/aten/utils/Utils.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/Dispatch.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/DeprecatedTypeProperties.h:6:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/TensorOptions.h:724:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/TensorOptions.h:769:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/aten/utils/Utils.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/Dispatch.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/DeprecatedTypeProperties.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Storage.h:6:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/StorageImpl.h:9:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyObjectSlot.h:125:3: warning: non-void function does not return a value in all control paths [-Wreturn-type]
  }
  ^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/aten/utils/Utils.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/Dispatch.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/DeprecatedTypeProperties.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/Generator.h:18:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/GeneratorImpl.h:8:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/TensorImpl.h:1057:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:10:
/usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSACachingAllocator.h:214:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSACachingAllocator.h:223:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
62 warnings generated when compiling for mp_31.
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/InlineDeviceGuard.h:8:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:114:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:123:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:133:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:182:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:197:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/DeviceGuardImplInterface.h:231:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/GPUTrace.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyInterpreter.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Layout.h:3:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Backend.h:143:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Backend.h:282:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/GPUTrace.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyInterpreter.h:5:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Layout.h:76:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/GPUTrace.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyInterpreter.h:6:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/MemoryFormat.h:61:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:5:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/GPUTrace.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyInterpreter.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymIntArrayRef.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymInt.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymBool.h:3:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:35:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:38:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:41:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:47:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:50:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:53:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:57:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:67:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:77:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:83:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:86:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:89:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:92:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:95:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:98:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:101:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:104:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:107:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:110:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:113:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:116:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:119:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:122:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:125:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:128:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:134:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:139:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:144:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:149:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:154:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:159:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:162:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:165:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:168:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:171:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:174:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:177:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:180:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:183:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:201:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:204:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:207:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/SymNodeImpl.h:210:3: warning: non-void function does not return a value [-Wreturn-type]
  };
  ^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/aten/utils/Utils.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/Dispatch.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/DeprecatedTypeProperties.h:4:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/ScalarType.h:343:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/ScalarType.h:486:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/ScalarType.h:526:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/aten/utils/Utils.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/Dispatch.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/DeprecatedTypeProperties.h:6:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/TensorOptions.h:724:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/TensorOptions.h:769:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/aten/utils/Utils.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/Dispatch.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/DeprecatedTypeProperties.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/Storage.h:6:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/StorageImpl.h:9:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/impl/PyObjectSlot.h:125:3: warning: non-void function does not return a value in all control paths [-Wreturn-type]
  }
  ^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/aten/utils/Utils.h:4:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/Dispatch.h:3:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/DeprecatedTypeProperties.h:9:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/ATen/core/Generator.h:18:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/GeneratorImpl.h:8:
/usr/local/lib/python3.10/dist-packages/torch_musa/share/generated_cuda_compatible/include/c10/core/TensorImpl.h:1057:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
In file included from /ws/csrc/fast_hadamard_transform_gpu.cu:15:
In file included from /ws/csrc/vendor.h:24:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSAGuard.h:7:
In file included from /usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/GuardImpl.h:10:
/usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSACachingAllocator.h:214:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
/usr/local/lib/python3.10/dist-packages/torch_musa/csrc/core/MUSACachingAllocator.h:223:3: warning: non-void function does not return a value [-Wreturn-type]
  }
  ^
62 warnings generated when compiling for host.
x86_64-linux-gnu-g++ -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 /ws/build/temp.linux-x86_64-cpython-310/csrc/fast_hadamard_transform.o /ws/build/temp.linux-x86_64-cpython-310/csrc/fast_hadamard_transform_gpu.o -L/usr/local/lib/python3.10/dist-packages/torch/lib -L/usr/local/lib/python3.10/dist-packages/torch_musa/lib -L/usr/local/musa/lib -L/usr/lib/x86_64-linux-gnu -lc10 -ltorch -ltorch_cpu -ltorch_python -lmusa_python -o build/lib.linux-x86_64-cpython-310/fast_hadamard_transform_cuda.cpython-310-x86_64-linux-gnu.so -Wl,-rpath,$ORIGIN/lib -Wl,-rpath,/usr/local/lib/python3.10/dist-packages/torch/lib -Wl,-rpath,/usr/local/lib/python3.10/dist-packages/torch_musa/lib
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-cpython-310/fast_hadamard_transform_cuda.cpython-310-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/fast_hadamard_transform
copying build/lib.linux-x86_64-cpython-310/fast_hadamard_transform/fast_hadamard_transform_interface.py -> build/bdist.linux-x86_64/egg/fast_hadamard_transform
copying build/lib.linux-x86_64-cpython-310/fast_hadamard_transform/__init__.py -> build/bdist.linux-x86_64/egg/fast_hadamard_transform
byte-compiling build/bdist.linux-x86_64/egg/fast_hadamard_transform/fast_hadamard_transform_interface.py to fast_hadamard_transform_interface.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/fast_hadamard_transform/__init__.py to __init__.cpython-310.pyc
creating stub loader for fast_hadamard_transform_cuda.cpython-310-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/fast_hadamard_transform_cuda.py to fast_hadamard_transform_cuda.cpython-310.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying fast_hadamard_transform.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
__pycache__.fast_hadamard_transform_cuda.cpython-310: module references __file__
creating 'dist/fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg
removing '/usr/local/lib/python3.10/dist-packages/fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg' (and everything under it)
creating /usr/local/lib/python3.10/dist-packages/fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg
Extracting fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg to /usr/local/lib/python3.10/dist-packages
Adding fast-hadamard-transform 1.0.4.post1 to easy-install.pth file

Installed /usr/local/lib/python3.10/dist-packages/fast_hadamard_transform-1.0.4.post1-py3.10-linux-x86_64.egg
Processing dependencies for fast-hadamard-transform==1.0.4.post1
Searching for ninja==1.11.1
Best match: ninja 1.11.1
Adding ninja 1.11.1 to easy-install.pth file
Installing ninja script to /usr/local/bin

Using /usr/local/lib/python3.10/dist-packages
Searching for packaging==24.2
Best match: packaging 24.2
Adding packaging 24.2 to easy-install.pth file

Using /usr/local/lib/python3.10/dist-packages
Searching for torch==2.5.0
Best match: torch 2.5.0
Adding torch 2.5.0 to easy-install.pth file
Installing convert-caffe2-to-onnx script to /usr/local/bin
Installing convert-onnx-to-caffe2 script to /usr/local/bin
Installing torchfrtrace script to /usr/local/bin
Installing torchrun script to /usr/local/bin

Using /usr/local/lib/python3.10/dist-packages
Searching for sympy==1.13.1
Best match: sympy 1.13.1
Adding sympy 1.13.1 to easy-install.pth file
Installing isympy script to /usr/local/bin

Using /usr/local/lib/python3.10/dist-packages
Searching for fsspec==2025.9.0
Best match: fsspec 2025.9.0
Adding fsspec 2025.9.0 to easy-install.pth file

Using /usr/local/lib/python3.10/dist-packages
Searching for jinja2==3.1.6
Best match: jinja2 3.1.6
Adding jinja2 3.1.6 to easy-install.pth file

Using /usr/local/lib/python3.10/dist-packages
Searching for networkx==3.4.2
Best match: networkx 3.4.2
Adding networkx 3.4.2 to easy-install.pth file

Using /usr/local/lib/python3.10/dist-packages
Searching for typing-extensions==4.15.0
Best match: typing-extensions 4.15.0
Adding typing-extensions 4.15.0 to easy-install.pth file

Using /usr/local/lib/python3.10/dist-packages
Searching for filelock==3.19.1
Best match: filelock 3.19.1
Adding filelock 3.19.1 to easy-install.pth file

Using /usr/local/lib/python3.10/dist-packages
Searching for mpmath==1.3.0
Best match: mpmath 1.3.0
Adding mpmath 1.3.0 to easy-install.pth file

Using /usr/local/lib/python3.10/dist-packages
Searching for MarkupSafe==3.0.2
Best match: MarkupSafe 3.0.2
Adding MarkupSafe 3.0.2 to easy-install.pth file

Using /usr/local/lib/python3.10/dist-packages
Finished processing dependencies for fast-hadamard-transform==1.0.4.post1
root@worker3218:/ws# cd tests/
root@worker3218:/ws/tests# pytest test_fast_hadamard_transform.py 
================================================================= test session starts ==================================================================
platform linux -- Python 3.10.12, pytest-7.2.2, pluggy-1.6.0
rootdir: /ws
plugins: hypothesis-6.140.2, anyio-4.10.0
collected 51 items                                                                                                                                     

test_fast_hadamard_transform.py ...................................................                                                              [100%]

=================================================================== warnings summary ===================================================================
../../usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:61
  /usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:61: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
    import pynvml  # type: ignore[import]

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================================================= 51 passed, 1 warning in 82.26s (0:01:22) =======================================================
root@worker3218:/ws/tests# 

@yeahdongcn yeahdongcn force-pushed the xd/multi-platform branch 4 times, most recently from 1546fb1 to 52e8ff1 on October 11, 2025 07:44
@yeahdongcn yeahdongcn changed the title from "Multi-backend support" to "Multi-backend Support (ROCm and MUSA)" on Oct 11, 2025
@yeahdongcn yeahdongcn changed the title from "Multi-backend Support (ROCm and MUSA)" to "Multi-backend support (ROCm and MUSA)" on Oct 11, 2025
@fanshao123456

How long does the compilation usually take?

@yeahdongcn
Author

> How long does the compilation usually take?

Tested on a GPU Droplet from AMD Developer Cloud.

root@7615ee52f135:/sgl-workspace/fast-hadamard-transform# time python setup.py install
...
real    2m0.123s
user    2m19.199s
sys     0m3.304s

@fanshao123456

> > How long does the compilation usually take?
>
> Tested on a GPU Droplet from AMD Developer Cloud.
>
> root@7615ee52f135:/sgl-workspace/fast-hadamard-transform# time python setup.py install
> ...
> real    2m0.123s
> user    2m19.199s
> sys     0m3.304s

Why does it get stuck during compilation? I'm running it on a Hygon GPU, like this:
python3 setup.py install


torch.__version__  = 2.5.1


/workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/vendor.h -> /workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/vendor_hip.h [skipped, already hipified]
/workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform.h -> /workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform.h [skipped, no changes]
/workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform.cpp -> /workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform_hip.cpp [skipped, already hipified]
/workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform_common.h -> /workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform_common_hip.h [skipped, already hipified]
/workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform_special.h -> /workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform_special.h [skipped, no changes]
/workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/static_switch.h -> /workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/static_switch.h [skipped, no changes]
/workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform_gpu.cu -> /workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/csrc/fast_hadamard_transform_gpu.hip [skipped, already hipified]
Successfully preprocessed all matching files.
Total number of unsupported CUDA function calls: 0
Total number of replaced kernel launches: 5
each_nvcc_Input_output: -O3 -O3
each_nvcc_Input_output: -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_OPERATORS__
each_nvcc_Input_output: -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF_CONVERSIONS__
each_nvcc_Input_output: -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_OPERATORS__
each_nvcc_Input_output: -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__
each_nvcc_Input_output: -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_OPERATORS__
each_nvcc_Input_output: -U__CUDA_NO_BFLOAT162_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__
each_nvcc_Input_output: -DUSE_ROCM=1 -DUSE_ROCM=1
input_nvcc_args: ['-O3', '-U__CUDA_NO_HALF_OPERATORS__', '-U__CUDA_NO_HALF_CONVERSIONS__', '-U__CUDA_NO_BFLOAT16_OPERATORS__', '-U__CUDA_NO_BFLOAT16_CONVERSIONS__', '-U__CUDA_NO_BFLOAT162_OPERATORS__', '-U__CUDA_NO_BFLOAT162_CONVERSIONS__', '-DUSE_ROCM=1']
output_nvcc_args: ['-O3', '-U__CUDA_NO_HALF_OPERATORS__', '-U__CUDA_NO_HALF_CONVERSIONS__', '-U__CUDA_NO_BFLOAT16_OPERATORS__', '-U__CUDA_NO_BFLOAT16_CONVERSIONS__', '-U__CUDA_NO_BFLOAT162_OPERATORS__', '-U__CUDA_NO_BFLOAT162_CONVERSIONS__', '-DUSE_ROCM=1']
/usr/local/lib/python3.10/dist-packages/setuptools/dist.py:759: SetuptoolsDeprecationWarning: License classifiers are deprecated.
!!
********************************************************************************
Please consider removing the following classifiers in favor of a SPDX license expression:
License :: OSI Approved :: BSD License
See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
********************************************************************************
!!
  self._finalize_license_expression()
running install
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py:90: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
By 2025-Oct-31, you need to update your project and remove deprecated calls
or your builds will no longer be supported.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
  self.initialize_options()
running build
running build_py
running build_ext
building 'fast_hadamard_transform_cuda' extension
Emitting ninja build file /workspace/whl/20251013/fast-hadamard-transform-xd-multi-platform/build/temp.linux-x86_64-cpython-310/build.ninja...
Compiling objects...
Using envvar MAX_JOBS (1) as the number of workers

@yeahdongcn
Author

@fanshao123456 This is most likely related to your dev environment setup.

@fanshao123456

fast-hadamard-transform

So do I need to change something in the environment, such as the Torch version? Also, I noticed that fast-hadamard-transform takes a 3D input; can it be replaced with the 2D hadamard-transform?
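
On the 3D-versus-2D question, a minimal sketch may help, under the assumption that the transform acts only along the last dimension (as far as I can tell the project README describes the input as `(..., dim)`, but please confirm against the installed version). In that case the leading dimensions are just batch dimensions, so a 2D input and the same data viewed as 3D give the same result. The code below is a pure-Python, unnormalized reference, not the library's kernel; `fwht` and `transform_last_axis` are hypothetical helper names used only for illustration.

```python
def fwht(vec):
    """Unnormalized fast Walsh-Hadamard transform (Sylvester ordering).

    The length of `vec` must be a power of two. Illustrative reference only.
    """
    out = list(vec)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def transform_last_axis(x):
    """Apply fwht along the innermost axis of an arbitrarily nested list."""
    if x and isinstance(x[0], list):
        return [transform_last_axis(row) for row in x]
    return fwht(x)


# A 2D input of shape (rows, dim) ...
x2d = [[1, 0, 1, 0], [1, 2, 3, 4]]
# ... and the same data viewed as 3D, shape (1, rows, dim).
x3d = [x2d]

y2d = transform_last_axis(x2d)
y3d = transform_last_axis(x3d)

# The extra leading dimension does not change the per-row result.
assert y3d[0] == y2d
```

If that assumption holds for the real library, adding or removing a leading batch dimension with a reshape should be enough to move between the 2D and 3D forms.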

@hadipourh

I created a new library supporting multiple backends:
https://github.com/hadipourh/fwht

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
@yiakwy-xpu-ml-framework-team

@yeahdongcn Have you verified this on any AMD devices?

@yeahdongcn
Author

@yeahdongcn Have you verified this on any AMD devices?

Yes, see above 👆. Verified on MI300X.
