Torch 2.7 and CUDA 12.8: compile fails on X-Pose\models\UniPose\ops>python setup.py build_ext bdist_wheel. How to fix? #45

(venv) E:\LP_V17\X-Pose\models\UniPose\ops>python setup.py build_ext bdist_wheel
running build_ext
building 'MultiScaleDeformableAttention' extension
E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py:2330: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
  warnings.warn(
Emitting ninja build file E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -IE:\LP_V17\X-Pose\models\UniPose\ops\src -IE:\LP_V17\LivePortrait\venv\lib\site-packages\torch\include -IE:\LP_V17\LivePortrait\venv\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IE:\LP_V17\LivePortrait\venv\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu -o 
E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
FAILED: E:/LP_V17/X-Pose/models/UniPose/ops/build/temp.win-amd64-cpython-310/Release/LP_V17/X-Pose/models/UniPose/ops/src/cuda/ms_deform_attn_cuda.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -IE:\LP_V17\X-Pose\models\UniPose\ops\src -IE:\LP_V17\LivePortrait\venv\lib\site-packages\torch\include -IE:\LP_V17\LivePortrait\venv\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IE:\LP_V17\LivePortrait\venv\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu -o 
E:\LP_V17\X-Pose\models\UniPose\ops\build\temp.win-amd64-cpython-310\Release\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/detail/libcxx/include/cmath(1032): warning #221-D: floating-point value does not fit in required floating-point type
    if (__r >= ::nextafter(static_cast<_RealT>(_MaxVal), ((float)(1e+300))))
                                                          ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu(64): error: no suitable conversion function from "const at::DeprecatedTypeProperties" to "c10::ScalarType" exists
          [&] { const auto& the_type = value.type(); constexpr const char* at_dispatch_name = "ms_deform_attn_forward_cuda"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\\LP_V17\\X-Pose\\models\\UniPose\\ops\\src\\cuda\\ms_deform_attn_cuda.cu", static_cast<uint32_t>(74), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false.  " "(Could this error message be improved?  If so, " "please report an enhancement request to PyTorch.)","dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name))); }; } } while (0); using scalar_t [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Double>; return ([&] { ms_deformable_im2col_cuda(at::cuda::getCurrentCUDAStream(), value.data<scalar_t>() + n * im2col_step_ * per_value_size, spatial_shapes.data<int64_t>(), level_start_index.data<int64_t>(), sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size, batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point, columns.data<scalar_t>()); })(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\\LP_V17\\X-Pose\\models\\UniPose\\ops\\src\\cuda\\ms_deform_attn_cuda.cu", static_cast<uint32_t>(74), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false.  " "(Could this error message be improved?  
If so, " "please report an enhancement request to PyTorch.)","dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name))); }; } } while (0); using scalar_t [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Float>; return ([&] { ms_deformable_im2col_cuda(at::cuda::getCurrentCUDAStream(), value.data<scalar_t>() + n * im2col_step_ * per_value_size, spatial_shapes.data<int64_t>(), level_start_index.data<int64_t>(), sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size, batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point, columns.data<scalar_t>()); })(); } default: if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\\LP_V17\\X-Pose\\models\\UniPose\\ops\\src\\cuda\\ms_deform_attn_cuda.cu", static_cast<uint32_t>(74), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false.  " "(Could this error message be improved?  If so, " "please report an enhancement request to PyTorch.)",'"', at_dispatch_name, "\" not implemented for '", toString(_st), "'"))); }; } }();

                                                ^

E:\LP_V17\X-Pose\models\UniPose\ops\src\cuda\ms_deform_attn_cuda.cu(134): error: no suitable conversion function from "const at::DeprecatedTypeProperties" to "c10::ScalarType" exists
          [&] { const auto& the_type = value.type(); constexpr const char* at_dispatch_name = "ms_deform_attn_backward_cuda"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\\LP_V17\\X-Pose\\models\\UniPose\\ops\\src\\cuda\\ms_deform_attn_cuda.cu", static_cast<uint32_t>(147), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false.  " "(Could this error message be improved?  If so, " "please report an enhancement request to PyTorch.)","dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name))); }; } } while (0); using scalar_t [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Double>; return ([&] { ms_deformable_col2im_cuda(at::cuda::getCurrentCUDAStream(), grad_output_g.data<scalar_t>(), value.data<scalar_t>() + n * im2col_step_ * per_value_size, spatial_shapes.data<int64_t>(), level_start_index.data<int64_t>(), sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size, batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point, grad_value.data<scalar_t>() + n * im2col_step_ * per_value_size, grad_sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, grad_attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size); })(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\\LP_V17\\X-Pose\\models\\UniPose\\ops\\src\\cuda\\ms_deform_attn_cuda.cu", static_cast<uint32_t>(147), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false.  " "(Could this error message be improved?  
If so, " "please report an enhancement request to PyTorch.)","dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name))); }; } } while (0); using scalar_t [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Float>; return ([&] { ms_deformable_col2im_cuda(at::cuda::getCurrentCUDAStream(), grad_output_g.data<scalar_t>(), value.data<scalar_t>() + n * im2col_step_ * per_value_size, spatial_shapes.data<int64_t>(), level_start_index.data<int64_t>(), sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size, batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point, grad_value.data<scalar_t>() + n * im2col_step_ * per_value_size, grad_sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size, grad_attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size); })(); } default: if (!(false)) { ::c10::detail::torchCheckFail( __func__, "E:\\LP_V17\\X-Pose\\models\\UniPose\\ops\\src\\cuda\\ms_deform_attn_cuda.cu", static_cast<uint32_t>(147), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false.  " "(Could this error message be improved?  If so, " "please report an enhancement request to PyTorch.)",'"', at_dispatch_name, "\" not implemented for '", toString(_st), "'"))); }; } }();

                                                 ^

2 errors detected in the compilation of "E:/LP_V17/X-Pose/models/UniPose/ops/src/cuda/ms_deform_attn_cuda.cu".
ms_deform_attn_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2480, in _run_ninja_build
    subprocess.run(
  File "C:\Python310\lib\subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "E:\LP_V17\X-Pose\models\UniPose\ops\setup.py", line 64, in <module>
    setup(
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
    return run_commands(dist)
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
    dist.run_commands()
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\dist.py", line 968, in run_commands
    self.run_command(cmd)
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\dist.py", line 1217, in run_command
    super().run_command(command)
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
    cmd_obj.run()
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
    _build_ext.run(self)
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 346, in run
    self.build_extensions()
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 1007, in build_extensions
    build_ext.build_extensions(self)
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 466, in build_extensions
    self._build_extensions_serial()
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 492, in _build_extensions_serial
    self.build_extension(ext)
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\command\build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 547, in build_extension
    objects = self.compiler.compile(
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 975, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2133, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "E:\LP_V17\LivePortrait\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2496, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

(venv) E:\LP_V17\X-Pose\models\UniPose\ops>
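
Both errors in the log point at the `AT_DISPATCH_FLOATING_TYPES` calls at lines 64 and 134 of `ms_deform_attn_cuda.cu`. The macro expansion shows `value.type()` being passed where a `c10::ScalarType` is expected. A likely fix is to pass `value.scalar_type()` instead. The sketch below is one possible way to apply that change with a small Python helper. It assumes the source writes the macro argument as `<tensor>.type()`, which is inferred from the expanded macro in the log rather than from the file itself. The helper names `modernize_dispatch` and `patch_file` are made up for illustration. The `.data<T>()` to `.data_ptr<T>()` rewrite is optional: `.data<T>()` is deprecated, but the log only reports errors for `.type()`.

```python
import re
from pathlib import Path

# First argument of an AT_DISPATCH_* macro written as `<tensor>.type()`,
# the form the compiler rejects in the log above (assumed source spelling).
_DISPATCH_TYPE = re.compile(r"(AT_DISPATCH_\w+\(\s*\w+)\.type\(\)")

# Deprecated templated accessor `.data<T>()`; does not match `.data_ptr<T>()`.
_DATA_ACCESSOR = re.compile(r"\.data<")


def modernize_dispatch(source: str, fix_data_accessor: bool = True) -> str:
    """Return `source` with `.type()` dispatch arguments replaced by `.scalar_type()`.

    If `fix_data_accessor` is true, `.data<T>()` is also rewritten to `.data_ptr<T>()`.
    """
    source = _DISPATCH_TYPE.sub(r"\1.scalar_type()", source)
    if fix_data_accessor:
        source = _DATA_ACCESSOR.sub(".data_ptr<", source)
    return source


def patch_file(path: Path) -> bool:
    """Patch a .cu file in place, keeping a .bak copy. Returns True if it changed."""
    original = path.read_text(encoding="utf-8")
    patched = modernize_dispatch(original)
    if patched == original:
        return False
    # Keep the untouched source next to the patched one.
    path.with_name(path.name + ".bak").write_text(original, encoding="utf-8")
    path.write_text(patched, encoding="utf-8")
    return True
```

To try it, call `patch_file(Path(r"src\cuda\ms_deform_attn_cuda.cu"))` from the `ops` directory (path taken from the log), delete the `build` folder, and rerun `python setup.py build_ext bdist_wheel`. The same two-line change can of course be made by hand in an editor.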
