Conversation

@liuzongyue6 (Collaborator)

CUDA 13 runtime issue reported by Prof. Frank

uv run ./run --dataset_dir tests/data/set1_lund_door \
/home/dellaert/git/gtsfm/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:235: UserWarning: 
NVIDIA GeForce RTX 5060 Ti with CUDA capability sm_120 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90.
If you want to use the NVIDIA GeForce RTX 5060 Ti GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(
/home/dellaert/git/gtsfm/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:235: UserWarning: 
NVIDIA GeForce RTX 5060 Ti with CUDA capability sm_120 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90.
If you want to use the NVIDIA GeForce RTX 5060 Ti GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(
2025-12-12 17:11:32,162 - distributed.worker - ERROR - Compute Failed
Key:       apply_global_descriptor_batch-1ccaa20b3e49f96e8d9890bb75dcfeb3
State:     executing
Task:  <Task 'apply_global_descriptor_batch-1ccaa20b3e49f96e8d9890bb75dcfeb3' apply_global_descriptor_batch(...)>
Exception:
  RuntimeError: CUDA error: no kernel image is available for execution on the device
  CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
  For debugging consider passing CUDA_LAUNCH_BLOCKING=1
  Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Traceback:
  File "/home/dellaert/git/gtsfm/gtsfm/retriever/image_pairs_generator.py", line 77, in apply_global_descriptor_batch
    return global_descriptor.describe_batch(images=image_batch)
  File "/home/dellaert/git/gtsfm/gtsfm/frontend/cacher/global_descriptor_cacher.py", line 99, in describe_batch
    global_descriptors = self._global_descriptor.describe_batch(images)
  File "/home/dellaert/git/gtsfm/gtsfm/frontend/global_descriptor/netvlad_global_descriptor.py", line 61, in describe_batch
    batch_descriptors = self._model({"image": images})
  File "/home/dellaert/git/gtsfm/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dellaert/git/gtsfm/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dellaert/git/gtsfm/thirdparty/hloc/netvlad.py", line 172, in forward
    assert image.min() >= -EPS and image.max() <= 1 + EPS
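
The warning and the runtime error describe the same mismatch: the GPU's compute capability (sm_120) is absent from the architecture list the installed PyTorch wheels were compiled for. In a live session the two sides come from `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`; the standalone sketch below (the helper name is ours, not part of the codebase) just compares the strings:

```python
def is_supported(device_cap, arch_list):
    """Return True if a GPU's compute capability appears in the list of CUDA
    architectures the installed PyTorch build was compiled for.

    In a real program, device_cap would come from
    torch.cuda.get_device_capability() and arch_list from
    torch.cuda.get_arch_list() (entries look like 'sm_90')."""
    major, minor = device_cap
    return f"sm_{major}{minor}" in arch_list

# Architectures shipped with the old torch 2.5.1 cu121 wheels (from the warning above).
cu121_arches = ["sm_50", "sm_60", "sm_70", "sm_75", "sm_80", "sm_86", "sm_90"]

# The RTX 5060 Ti (Blackwell) reports compute capability (12, 0), i.e. sm_120.
print(is_supported((12, 0), cu121_arches))  # False: "no kernel image" at runtime
print(is_supported((8, 6), cu121_arches))   # True: an Ampere card would be fine
```

When the check fails, every CUDA kernel launch raises the "no kernel image is available" error seen in the Dask worker log above.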

…y README for UV installation instructions and system package requirements.

Copilot AI left a comment


Pull request overview

This PR updates PyTorch, torchvision, and xformers dependencies to support NVIDIA Blackwell GPUs (RTX 5060 Ti with CUDA capability sm_120). The main driver is resolving a CUDA compatibility error where the previous PyTorch 2.5.1 with CUDA 12.1 support did not include the sm_120 architecture needed for Blackwell GPUs.

Key changes:

  • Upgraded PyTorch from 2.5.1 to >=2.7.0,<2.8.0 and torchvision from 0.20.1 to >=0.22.0,<0.23.0
  • Updated CUDA support from 12.1 to 12.8 for Blackwell GPU compatibility
  • Upgraded xformers from 0.0.29 to 0.0.30 for compatibility with newer PyTorch
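
In a uv-managed project, the constraints and the extra wheel index described above might look roughly like this in pyproject.toml (a sketch, not the exact file from the PR; the `[[tool.uv.index]]`/`[tool.uv.sources]` layout follows uv's documented mechanism, and the index name `pytorch-cu128` is taken from the review summary below):

```toml
[project]
dependencies = [
    "torch>=2.7.0,<2.8.0",          # first release line with sm_120 (Blackwell) kernels
    "torchvision>=0.22.0,<0.23.0",
    "xformers==0.0.30",             # built against the torch 2.7 ABI
]

# Pull torch/torchvision from the CUDA 12.8 wheel index instead of PyPI.
[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true

[tool.uv.sources]
torch = { index = "pytorch-cu128" }
torchvision = { index = "pytorch-cu128" }
```

With `explicit = true`, uv only consults the PyTorch index for the packages pinned to it, so the rest of the dependency tree still resolves against PyPI.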

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

  • pyproject.toml: Updated PyTorch and torchvision version constraints, added the pytorch-cu128 index, updated the xformers version, and changed the CUDA version comment to reflect Blackwell support
  • README.md: Updated installation instructions to reflect CUDA 12.8 and torch 2.7.0, reorganized UV setup section headers, and moved system package installation instructions


@liuzongyue6 liuzongyue6 changed the title [skip benchmark] Update dependencies in pyproject.toml for PyTorch and xformers; modif… [skip benchmarks] Update dependencies for PyTorch, for cuda13 Dec 23, 2025
@liuzongyue6 liuzongyue6 changed the title [skip benchmarks] Update dependencies for PyTorch, for cuda13 Update dependencies for PyTorch, for cuda13 Dec 23, 2025
@liuzongyue6 liuzongyue6 requested a review from dellaert December 23, 2025 14:33

@dellaert dellaert left a comment


Nice, thank you!!
Let me try it on my machine, and if it works I'll approve and merge.

@liuzongyue6 liuzongyue6 changed the title Update dependencies for PyTorch, for cuda13 Update dependencies for PyTorch, for system cuda13 Jan 7, 2026
@liuzongyue6 liuzongyue6 requested a review from dellaert January 7, 2026 20:08

@dellaert dellaert left a comment


Two comments, and one question:
We now seem to have two sets of requirements, one in the yml file and one in the toml file? And we have to keep them in sync?

@liuzongyue6 (Collaborator, Author)

@dellaert All the *.md files are in the root, which looks unorganized.
Shall I create a docs/ folder?

@liuzongyue6 liuzongyue6 requested a review from dellaert January 8, 2026 02:15
- Created docs/ folder with setup/ and deployment/ subdirectories
- Moved conda-setup.md to docs/setup/
- Moved uv-setup.md to docs/setup/
- Moved CLUSTER.md to docs/deployment/
- Moved CONTRIBUTING.md to docs/
- Updated all references in README.md to point to new locations

This improves project organization by keeping the root directory cleaner
and grouping related documentation together.
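
The moves listed in the commit message can be replayed in a scratch directory with a few lines of Python (an illustration only; the actual commit presumably used `git mv` so history is preserved):

```python
# Replay the docs/ reorganization from the commit message in a temp directory.
import pathlib, shutil, tempfile

root = pathlib.Path(tempfile.mkdtemp())
for name in ["conda-setup.md", "uv-setup.md", "CLUSTER.md", "CONTRIBUTING.md"]:
    (root / name).touch()  # stand-ins for the real files

(root / "docs/setup").mkdir(parents=True)
(root / "docs/deployment").mkdir(parents=True)

# Destination per file, as described in the commit message.
moves = {
    "conda-setup.md": "docs/setup",
    "uv-setup.md": "docs/setup",
    "CLUSTER.md": "docs/deployment",
    "CONTRIBUTING.md": "docs",
}
for src, dst in moves.items():
    shutil.move(str(root / src), str(root / dst))

print(sorted(p.relative_to(root).as_posix() for p in root.rglob("*.md")))
```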

@dellaert dellaert left a comment


Awesome!

@liuzongyue6 liuzongyue6 merged commit 46e0afa into master Jan 9, 2026