CUDA v11 Compatibility Issue #56

@Sizhang92190

Really nice work, definitely worth trying out!

It took me a bit of time to install the package properly, so I wanted to share a potential fix that might help others facing the same issue.

I initially followed the installation instructions and ran these commands:

1. pip install numpy==1.21.2 pandas==1.5.3
2. pip install torch==1.12.1+cu113 -f https://download.pytorch.org/whl/torch_stable.html
3. pip install biopython==1.79 dm-tree==0.1.6 modelcif==0.7 ml-collections==0.1.0 scipy==1.7.1 absl-py einops
4. pip install pytorch_lightning==2.0.4 fair-esm mdtraj==1.9.9 wandb
5. pip install 'openfold @ git+https://github.com/aqlaboratory/openfold.git@103d037'

However, after running command 4, I noticed the following message:

...
Attempting uninstall: torch
  Found existing installation: torch 1.12.1+cu113
  Uninstalling torch-1.12.1+cu113:
    Successfully uninstalled torch-1.12.1+cu113
...

Instead, torch 2.8.0 was installed. This caused a CUDA version mismatch error during step 5:
...RuntimeError: ('The detected CUDA version (%s) mismatches the version that was used to compile PyTorch (%s). Please make sure to use the same CUDA versions.', '11.8', '12.8')...
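
To make the mismatch concrete, here is a minimal sketch of the comparison, assuming only the CUDA major version has to agree between the toolkit and the PyTorch build. That assumption matches what I saw (toolkit 11.8 works with the cu113 build, but not with the 12.8 build); the helper names `cuda_major` and `same_cuda_major` are made up for illustration and are not part of PyTorch.

```python
def cuda_major(version: str) -> int:
    """Return the major component of a CUDA version string such as '11.8'."""
    return int(version.strip().split(".")[0])


def same_cuda_major(toolkit: str, torch_build: str) -> bool:
    """True when the toolkit and the torch build share a CUDA major version.

    Assumption: only the major version needs to match for the build to proceed.
    """
    return cuda_major(toolkit) == cuda_major(torch_build)


if __name__ == "__main__":
    # Toolkit 11.8 against the torch 2.8.0 build from the error above (12.8).
    print(same_cuda_major("11.8", "12.8"))  # False
    # Toolkit 11.8 against the pinned torch 1.12.1+cu113 build (11.3).
    print(same_cuda_major("11.8", "11.3"))  # True
```

In a live environment, the build version can be read from `torch.version.cuda` and the toolkit version from `nvcc --version`.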

The issue seems to have been caused by pytorch_lightning==2.0.4; downgrading to pytorch_lightning==1.9.0 resolved the problem.

However, simply replacing step 4 with the following command won't work:
pip install pytorch_lightning==1.9.0 fair-esm mdtraj==1.9.9 wandb

Instead, this command works, because pinning torch in the same command stops pip from replacing it:
pip install torch==1.12.1+cu113 pytorch_lightning==1.9.0 fair-esm mdtraj==1.9.9 wandb -f https://download.pytorch.org/whl/torch_stable.html

Here are my working installation steps:

module load cuda/11.8 gcc-11
git clone https://github.com/bjing2016/alphaflow
conda create -n alphaflow python=3.9
conda activate alphaflow
pip install numpy==1.21.2 pandas==1.5.3
pip install torch==1.12.1+cu113 -f https://download.pytorch.org/whl/torch_stable.html 
pip install biopython==1.79 dm-tree==0.1.6 modelcif==0.7 ml-collections==0.1.0 scipy==1.7.1 absl-py einops
pip install pytorch_lightning==2.0.4 fair-esm mdtraj==1.9.9 wandb 
pip install torch==1.12.1+cu113 pytorch_lightning==1.9.0 fair-esm mdtraj==1.9.9 wandb -f https://download.pytorch.org/whl/torch_stable.html
pip install 'openfold @ git+https://github.com/aqlaboratory/openfold.git@103d037'
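
After the steps above, it may be worth confirming that nothing was silently swapped out again. The following is a rough sketch using the standard-library `importlib.metadata`; the helper names `installed_version` and `find_drift` are hypothetical, and the assumption is that the reported version string includes the `+cu113` suffix exactly as pip shows it.

```python
from importlib import metadata
from typing import Callable, Dict, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_drift(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Return packages whose installed version differs from the pinned one."""
    drift: Dict[str, Optional[str]] = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            drift[name] = found
    return drift


if __name__ == "__main__":
    # Pins taken from the working installation steps in this issue.
    pins = {"torch": "1.12.1+cu113", "pytorch-lightning": "1.9.0"}
    for name, found in find_drift(pins).items():
        print(f"{name}: expected {pins[name]}, found {found}")
```

If the script prints nothing, the pinned versions are still in place and the openfold build should see the expected torch.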

Note on CUDA 12.6:
I also tested the CUDA 12.6-based installation following [Issue #40](#40).
The environment installed successfully, but running prediction jobs raised this error:
TypeError: __init__() missing 2 required positional arguments: 'opm_first' and 'fuse_projection_weights'
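
A TypeError like this means the constructor being called requires arguments that the caller does not pass. I have not traced which class is involved, so the sketch below uses a hypothetical stand-in class (`NewerModule`) and a hypothetical helper (`missing_required_args`) just to show how `inspect.signature` can list the required parameters that a given set of keyword arguments leaves out.

```python
import inspect
from typing import Any, Callable, Dict, List


def missing_required_args(func: Callable[..., Any], kwargs: Dict[str, Any]) -> List[str]:
    """List required parameters of func that are not supplied in kwargs."""
    missing = []
    for name, param in inspect.signature(func).parameters.items():
        # Only named parameters without a default count as required here.
        required = param.default is inspect.Parameter.empty and param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
        if required and name not in kwargs:
            missing.append(name)
    return missing


class NewerModule:
    """Hypothetical stand-in for a class whose constructor gained two arguments."""

    def __init__(self, c_in, opm_first, fuse_projection_weights):
        self.c_in = c_in


if __name__ == "__main__":
    # The caller only knows about c_in, as older calling code would.
    print(missing_required_args(NewerModule, {"c_in": 64}))
    # ['opm_first', 'fuse_projection_weights']
```

Running the same helper against the real class named in the traceback would show which arguments the calling code needs to supply.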
