Torch 2.7 and CUDA 12.8: compile failing, how to fix? X-Pose\models\UniPose\ops>python setup.py build_ext bdist_wheel #45

@FurkanGozukara

(venv) E:\LP_V17\X-Pose\models\UniPose\ops>python setup.py build_ext bdist_wheel
running build_ext
building 'MultiScaleDeformableAttention' extension
E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py:2330: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Emitting ninja build file E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -IE:\LP_V17\X-Pose\models\UniPose\ops\src -IE:\LP_V17\LivePortrait\venv\lib\site-packages\torch\include -IE:\LP_V17\LivePortrait\venv\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IE:\LP_V17\LivePortrait\venv\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu -o 
E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
FAILED: E:/LP_V17/X-Pose/models/UniPose/ops/build/temp.win-amd64-cpython-310/Release/LP_V17/X-Pose/models/UniPose/ops/src/cuda/ms_deform_attn_cuda.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -IE:\LP_V17\X-Pose\models\UniPose\ops\src -IE:\LP_V17\LivePortrait\venv\lib\site-packages\torch\include -IE:\LP_V17\LivePortrait\venv\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IE:\LP_V17\LivePortrait\venv\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu -o 
E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/detail/libcxx/include/cmath(1032): warning #221-D: floating-point value does not fit in required floating-point type
if (__r >= ::nextafter(static_cast<_RealT>(_MaxVal), ((float)(1e+300))))
^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu(64): error: no suitable conversion function from "const at::DeprecatedTypeProperties" to "c10::ScalarType" exists
[&] { const auto& the_type = value.type(); constexpr const char* at_dispatch_name = "ms_deform_attn_forward_cuda"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu", static_cast<uint32_t>(74), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)","dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name))); }; } } while (0); using scalar_t [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Double>; return ([&] { ms_deformable_im2col_cuda(at::cuda::getCurrentCUDAStream(), value.data<scalar_t>() + n * im2col_step_ * per_value_size, spatial_shapes.data<int64_t>(), level_start_index.data<int64_t>(), sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size, batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point, columns.data<scalar_t>()); })(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu", static_cast<uint32_t>(74), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)","dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name))); }; } } while (0); using scalar_t [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Float>; return ([&] { ms_deformable_im2col_cuda(at::cuda::getCurrentCUDAStream(), value.data<scalar_t>() + n * im2col_step_ * per_value_size, spatial_shapes.data<int64_t>(), level_start_index.data<int64_t>(), sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size, batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point, columns.data<scalar_t>()); })(); } default: if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu", static_cast<uint32_t>(74), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)",'"', at_dispatch_name, "\" not implemented for '", toString(_st), "'"))); }; } }();

                                            ^

E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu(134): error: no suitable conversion function from "const at::DeprecatedTypeProperties" to "c10::ScalarType" exists
[&] { const auto& the_type = value.type(); constexpr const char* at_dispatch_name = "ms_deform_attn_backward_cuda"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu", static_cast<uint32_t>(147), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)","dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name))); }; } } while (0); using scalar_t [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Double>; return ([&] { ms_deformable_col2im_cuda(at::cuda::getCurrentCUDAStream(), grad_output_g.data<scalar_t>(), value.data<scalar_t>() + n * im2col_step_ * per_value_size, spatial_shapes.data<int64_t>(), level_start_index.data<int64_t>(), sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size, batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point, grad_value.data<scalar_t>() + n * im2col_step_ * per_value_size, grad_sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, grad_attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size); })(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu", static_cast<uint32_t>(147), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)","dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name))); }; } } while (0); using scalar_t [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Float>; return ([&] { ms_deformable_col2im_cuda(at::cuda::getCurrentCUDAStream(), grad_output_g.data<scalar_t>(), value.data<scalar_t>() + n * im2col_step_ * per_value_size, spatial_shapes.data<int64_t>(), level_start_index.data<int64_t>(), sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size, batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point, grad_value.data<scalar_t>() + n * im2col_step_ * per_value_size, grad_sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, grad_attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size); })(); } default: if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu", static_cast<uint32_t>(147), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)",'"', at_dispatch_name, "\" not implemented for '", toString(_st), "'"))); }; } }();

                                             ^

2 errors detected in the compilation of "E:/LP_V17/X-Pose/models/UniPose/ops/src/cuda/ms_deform_attn_cuda.cu".
ms_deform_attn_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2480, in _run_ninja_build
subprocess.run(
File "C:\Python310\lib\subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "E:\LP_V17\X-Pose\models\UniPose\ops\setup.py", line 64, in <module>
setup(
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\__init__.py", line 87, in setup
return distutils.core.setup(**attrs)
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
return run_commands(dist)
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
dist.run_commands()
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\dist.py", line 968, in run_commands
self.run_command(cmd)
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\dist.py", line 1217, in run_command
super().run_command(command)
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
cmd_obj.run()
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
_build_ext.run(self)
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 346, in run
self.build_extensions()
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 1007, in build_extensions
build_ext.build_extensions(self)
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 466, in build_extensions
self._build_extensions_serial()
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 492, in _build_extensions_serial
self.build_extension(ext)
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\command\build_ext.py", line 246, in build_extension
_build_ext.build_extension(self, ext)
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 547, in build_extension
objects = self.compiler.compile(
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 975, in win_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2133, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2496, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

(venv) E:\LP_V17\X-Pose\models\UniPose\ops>
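
Both errors (lines 64 and 134 of ms_deform_attn_cuda.cu) point at the dispatch macro receiving `value.type()`. That call returns `at::DeprecatedTypeProperties`, and in the expansion printed above the result is handed to `::detail::scalar_type`, which has no conversion to `c10::ScalarType`. A likely fix is to pass `value.scalar_type()` instead. The sketch below assumes the source spells the call as `AT_DISPATCH_FLOATING_TYPES(value.type(), ...)`; the macro name is inferred from the Double/Float cases in the expansion and is not shown in the log. The helper names `patch_dispatch_type` and `patch_file` are hypothetical and not part of X-Pose.

```python
import re
from pathlib import Path

# Assumption: the .cu file calls AT_DISPATCH_FLOATING_TYPES(<tensor>.type(), ...).
# The pattern captures the macro name plus the tensor identifier and matches
# the trailing ".type()" so it can be swapped for ".scalar_type()".
_DISPATCH_TYPE = re.compile(r"(AT_DISPATCH_FLOATING_TYPES\(\s*\w+)\.type\(\)")


def patch_dispatch_type(source: str) -> str:
    """Return source text with <tensor>.type() replaced by <tensor>.scalar_type()
    in the first argument of AT_DISPATCH_FLOATING_TYPES."""
    return _DISPATCH_TYPE.sub(r"\1.scalar_type()", source)


def patch_file(path: str) -> bool:
    """Patch one file in place, keeping a .bak copy of the original.

    Returns True if the file was changed, False if nothing matched.
    """
    target = Path(path)
    original = target.read_text(encoding="utf-8")
    patched = patch_dispatch_type(original)
    if patched == original:
        return False
    # Keep the untouched source next to the patched one.
    target.with_suffix(target.suffix + ".bak").write_text(original, encoding="utf-8")
    target.write_text(patched, encoding="utf-8")
    return True
```

Calling `patch_file(r"E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu")` and then rerunning `python setup.py build_ext bdist_wheel` would test whether this resolves the two errors. The same two-line change can also be made by hand in an editor.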