* Add and restructure documentation for RFdiffusion2
Introduces new documentation files including installation instructions, overview, and links to README and license. Updates Sphinx configuration to include sphinx-copybutton and reorganizes index.rst for improved navigation. The GitHub workflow now installs sphinx-copybutton. README updated with a link to the documentation site.
* Add sphinx-new-tab-link and update documentation overview
Added the sphinx-new-tab-link extension to the Sphinx build process and configuration. Updated overview.md with a detailed explanation of RFdiffusion2's capabilities and differences from the original RFdiffusion, and excluded overview.md from Sphinx build patterns.
* Update and expand documentation for RFdiffusion2
Improved the README with clearer setup, installation, and inference instructions, and added troubleshooting tips. Expanded the documentation by adding new usage examples, configuration options, and ORI token explanations. Updated Sphinx configuration and index to include new documentation files, and adjusted workflow and dependencies to comment out unused Sphinx extensions. Enhanced installation guide with troubleshooting for Apptainer image issues and clarified instructions for both Apptainer and source installations.
* Add CUDA 12.1/12.4 environment setup files
Introduced conda environment YAMLs and pip requirements for CUDA 12.1 and 12.4 support in the 'envs' directory. Updated installation documentation to guide users on using these files and troubleshooting environment setup. Minor README correction for pipeline argument.
* Update envs/requirements_cuda121.txt
Co-authored-by: Copilot <[email protected]>
* Update envs/cuda124_env.yml
Co-authored-by: Copilot <[email protected]>
* Update envs/cuda121_env.yml
Co-authored-by: Copilot <[email protected]>
* Update doc/source/installation.md
Co-authored-by: Copilot <[email protected]>
* Update doc/source/installation.md
Co-authored-by: Copilot <[email protected]>
* Update doc/source/conf.py
Co-authored-by: Copilot <[email protected]>
* Fix formatting and spelling in README.md
* Update README.md
Co-authored-by: Copilot <[email protected]>
* Update README.md
Co-authored-by: Copilot <[email protected]>
---------
Co-authored-by: Copilot <[email protected]>
README.md: 68 additions & 25 deletions
@@ -1,7 +1,7 @@
1
1
> 🚧 **Under Construction:**
2
2
> This is an initial release of the functionality — further documentation and cleanup is in progress. Currently only inference is supported. But everything currently documented in this README should be runnable. If you are experiencing a bug with inference, feel free to file an issue and attach the output .pdb, .trb, **and** a picture of the design visualized according to the pymol visualization section of this README.
3
3
4
-
# RFdiffusion 2
4
+
# RFdiffusion2
5
5
6
6
Open source code for RFdiffusion2 as described in the following pre-print.
7
7
@@ -18,7 +18,24 @@ Open source code for RFdiffusion2 as described in the following pre-print.
18
18
19
19
More detailed information about how to run, install, and use RFdiffusion2 can be found [here](https://rosettacommons.github.io/RFdiffusion2/).
If these setup instructions do not work for your system, see the [Installation Guide](installation.html) for troubleshooting issues with the
38
+
provided image and alternative instructions for how to install RFdiffusion2 from source.
22
39
23
40
1. **Clone the repo.**
24
41
@@ -28,6 +45,7 @@ More detailed information about how to run, install, and use RFdiffusion2 can be
28
45
```bash
29
46
export PYTHONPATH="/my/path/to/RFdiffusion2"
30
47
```
48
+
You will need to export this variable every time you use RFdiffusion2.
31
49
32
50
3. **Download the model weights and containers:**
33
51
```bash
@@ -36,12 +54,12 @@ More detailed information about how to run, install, and use RFdiffusion2 can be
36
54
```
37
55
These files are quite large, so the download process can take over half an hour. If the download process gets terminated before finishing, run `python setup.py overwrite` so that any partially downloaded files can be overwritten.
38
56
39
-
4. **Install apptainer**
57
+
4. **Install Apptainer**
40
58
41
-
*Note: you can also run RFdiffusion2 with singularity.*
59
+
*Note: You can also run RFdiffusion2 with [Singularity](https://sylabs.io/singularity/).*
42
60
43
-
RFdiffusion2 (RFD2) uses [apptainer](https://apptainer.org) to simplify the environment set-up.
44
-
If you do not already have apptainer on your system, please follow the apptainer [installation instructions](https://apptainer.org/docs/admin/main/installation.html) for your operating system.
61
+
RFdiffusion2 (RFD2) uses [Apptainer](https://apptainer.org) to simplify the environment setup.
62
+
If you do not already have Apptainer on your system, please follow the [Apptainer installation instructions](https://apptainer.org/docs/admin/main/installation.html) for your operating system.
45
63
46
64
If you manage your packages on Linux with `apt`, you can simply run:
47
65
```bash
@@ -60,19 +78,26 @@ More detailed information about how to run, install, and use RFdiffusion2 can be
For other usage examples, see the [Usage page](https://rosettacommons.github.io/RFdiffusion2/usage/usage.html) in the external documentation.
65
85
66
-
To run a demo of some of the inference capabilities, including enzyme design from an atomic motif + small molecule, enzyme design from an atomic motif of unknown sequence positions + small molecule, and small-molecule binder design (with RASA conditioning to enforce burial of the small molecule).
67
-
(See `rf_diffusion/benchmark/demo.json` for how these tasks are declared.) Note that this will be extremely slow if not run on a GPU.
86
+
The below commands run several demos that show off some of the inference capabilities of RFdiffusion2 including:
87
+
- enzyme design from an atomic motif and small molecule
88
+
- enzyme design from an atomic motif of unknown sequence positions and a small molecule
89
+
- small-molecule binder design with RASA conditioning
90
+
The possible demo options can be found in `rf_diffusion/benchmark/open_source_demo.json` which also shows the different inference configurations
91
+
used in each demo. Note that this will be extremely slow if not run on a GPU.
68
92
69
-
The default argument `in_proc=True` in `open_source_demo.yaml` makes the script run locally. With `in_proc=False` the pipeline will automatically distribute the tasks using SLURM, but this is not yet supported.
93
+
<!--The default argument `in_proc=True` in `open_source_demo.yaml` makes the script run locally. With `in_proc=False` the pipeline will automatically distribute the tasks using SLURM, but this is not yet supported.-->
70
94
The demo generates 150-residue proteins with many of the residues atomized, so it can take upwards of 30 minutes to run all cases. Each case may take up to 10 minutes on an RTX 2060.
in the directory that you are running the image file from.
86
112
87
-
This runs only the design stage of the pipeline. In order to continue through sequence-fitting with [LigandMPNN](https://github.com/dauparas/LigandMPNN) and folding with [Chai1](https://github.com/chaidiscovery/chai-lab), pass the command line argument: `stop_step=''`. Note that Chai1 cannot run on all GPU architectures.
113
+
This runs only the design stage of the pipeline. In order to continue through sequence-fitting with [LigandMPNN](https://github.com/dauparas/LigandMPNN) and folding with [Chai1](https://github.com/chaidiscovery/chai-lab), pass the command line argument: `stop_step='end'`. Note that Chai1 cannot run on all GPU architectures.
88
114
89
115
Pipeline runs can be resumed by passing `outdir=/path/to/your/output/directory`.
90
116
91
117
92
118
## Viewing Designs
119
+
<a id="viewing-designs"></a>
93
120
94
121
Visualizing the design outputs can be confusing when looking at the raw .pdb files, especially for unindexed motif scaffolding, in which the input motif is incorporated into the protein at indices of the network's choice.
95
122
To simplify this, we provide scripts for visualizing the outputs of the network that interact with a local PyMOL instance over XMLRPC.
96
123
97
-
Download [PyMOL](https://www.pymol.org/) and run it as an XMLRPCServer with:
124
+
Download [PyMOL](https://www.pymol.org/) and run it as an XML-RPC server with:
98
125
```bash
99
126
pymol -R
100
127
```
101
128
102
129
### PyMOL and designs on the same machine
130
+
<a id="same_machine_pymol"></a>
103
131
104
132
Run:
105
133
```bash
@@ -119,27 +147,39 @@ You should see something like:
119
147
- Any small molecules will have their carbon atoms colored purple.
120
148
121
149
### PyMOL running locally, designs on remote GPU
150
+
<a id="remote_machine_pymol"></a>
122
151
123
152
It is common for users to be connected over SSH to a GPU cluster for running designs.
124
153
It is still possible to view designs on a remote computer from your local PyMOL, as long as your remote computer has a route to your local computer (via VPN or ssh proxy).
125
154
126
-
Simply find your hostname on your cluster with:
127
-
```bash
128
-
hostname -I
129
-
192.168.0.113 100.64.128.68
155
+
You will need to know an IP address that points back to your computer; typically you can use 127.0.0.1.
156
+
157
+
You will also need to add the -R option when you are signing into your cluster and provide the path back to your PyMOL server:
158
+
```
159
+
ssh username@hostname -R 9123:localhost:9123
130
160
```
161
+
You do **not** need to replace `localhost` with an IP address. The 9123 is the port that should have been printed in the PyMOL terminal
162
+
when you first set up the XML-RPC server.
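Before launching a visualization script on the cluster, it can help to confirm that the forwarded port is reachable. The following is a minimal sketch, not part of RFdiffusion2: `port_is_open` is a hypothetical helper, and port 9123 is taken from the text above (substitute whatever port PyMOL printed). Note that a successful connection only shows that the remote end of the tunnel is listening; it does not prove that PyMOL is still running on your local machine.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Run this on the cluster after logging in with
    # `ssh username@hostname -R 9123:localhost:9123`.
    if port_is_open("localhost", 9123):
        print("Forwarded port is reachable")
    else:
        print("Cannot reach the forwarded port; check the -R option of your ssh command")
```

If the check fails, the most likely causes are a missing `-R` option on one of the ssh hops or a port number that differs from the one PyMOL reported.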
131
163
132
-
The second number is the route to your computer.
164
+
If you need to sign into multiple servers before you can run Apptainer, you will need to sign into your cluster like this:
Make sure to replace 100.64.128.68 with your computer's route. 9123 is the port that PyMol uses, after running `pymol -R` you should see a message containing this route number.
174
+
If 127.0.0.1 does not point to your local machine, replace it in all the above steps in this section with an IP address that does point to your machine.
175
+
176
+
For more information about using PyMOL remotely, see this blog post on [Controlling PyMOL from afar](https://www.blopig.com/blog/2024/11/controlling-pymol-from-afar/).
139
177
140
-
## Additional Info
178
+
## Additional Information
179
+
<a id="additional_info"></a>
141
180
142
-
### Running the AME Benchmark
181
+
### Running the AME benchmark
182
+
<a id="ame_benchmark"></a>
143
183
144
184
We crawled M-CSA for 41 enzymes where all reactants and products are present to create this benchmark.
145
185
Only position-agnostic tip atoms are provided to the network. 100 designs for each case are created. Run it with:
Running this entire benchmark will perform [41 active sites * 100 designs per active site * 8 sequences per design] Chai1 folding runs. This will take a prohibitively long time on a single machine, but the benchmark is included for reproducibility.
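To make the scale concrete, the arithmetic can be worked out directly. This sketch uses only the three numbers stated above; the variable names are illustrative and do not correspond to any RFdiffusion2 configuration keys.

```python
# Numbers taken from the benchmark description above.
ACTIVE_SITES = 41
DESIGNS_PER_ACTIVE_SITE = 100
SEQUENCES_PER_DESIGN = 8

total_designs = ACTIVE_SITES * DESIGNS_PER_ACTIVE_SITE
total_folding_runs = total_designs * SEQUENCES_PER_DESIGN

print(f"designs: {total_designs}")            # designs: 4100
print(f"folding runs: {total_folding_runs}")  # folding runs: 32800
```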
150
190
151
-
### Pipeline Metrics
191
+
### Pipeline metrics
192
+
<a id="pipeline_metrics"></a>
152
193
153
194
We also include the code that was used to benchmark the network.
154
195
The outline of the benchmarking process is:
@@ -178,7 +219,8 @@ Where:
178
219
- `full_atom`: All heavy atoms
179
220
- `motif_atom`: Only motif heavy atoms
180
221
181
-
#### RFdiffusion2 Outputs
222
+
#### RFdiffusion2 outputs
223
+
<a id="rfdiffusion2_outputs"></a>
182
224
183
225
The network outputs the protein with both the indexed backbone region and the unindexed atomized region.
184
226
After that, several idealization steps are conducted: the backbone is idealized, the protein is deatomized, and each unindexed residue is assigned its corresponding indexed residue (using a greedy algorithm that searches for the closest C-alpha in the indexed backbone).
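The greedy assignment step can be illustrated with a small sketch. This is not the RFdiffusion2 implementation: the function name, the input format (plain C-alpha coordinate tuples), the processing order, and the rule that each backbone position is used at most once are all assumptions made for illustration.

```python
import math

Coord = tuple[float, float, float]


def assign_unindexed_to_indexed(
    unindexed_ca: list[Coord], indexed_ca: list[Coord]
) -> list[int]:
    """Greedily map each unindexed residue to its closest unused indexed C-alpha.

    Returns, for each unindexed residue (in input order), the index of the
    backbone residue it was assigned to.
    """
    if len(unindexed_ca) > len(indexed_ca):
        raise ValueError("more unindexed residues than backbone positions")
    taken: set[int] = set()  # assumption: a backbone position is claimed only once
    assignment: list[int] = []
    for ca in unindexed_ca:
        # Pick the nearest backbone C-alpha that has not been claimed yet.
        best = min(
            (i for i in range(len(indexed_ca)) if i not in taken),
            key=lambda i: math.dist(ca, indexed_ca[i]),
        )
        taken.add(best)
        assignment.append(best)
    return assignment


backbone = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
motif = [(7.0, 0.5, 0.0), (0.2, 0.0, 0.0)]
print(assign_unindexed_to_indexed(motif, backbone))  # [2, 0]
```

Because the choice is greedy, the result depends on the order in which the unindexed residues are processed and is not guaranteed to minimize the total distance.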
@@ -189,7 +231,8 @@ There are two further idealization steps that are optional to users:
189
231
190
232
The protein at this point has sequence and structure for the motif regions but only backbone (N,Ca,C,O,C-Beta) coordinates for diffused residues (as well as any non-protein components e.g. small molecules).
191
233
192
-
#### LigandMPNN Outputs
234
+
#### LigandMPNN outputs
235
+
<a id="ligandmpnn_outputs"></a>
193
236
194
237
Sequence is fit using LigandMPNN in a ligand-aware, motif-rotamer-aware mode. LigandMPNN also performs packing. LigandMPNN attempts to keep the motif rotamers unchanged; however, the pack uses a more conservative set of torsions than RF All-Atom (i.e. fewer DoF) to pack the rotamers, so there is often some deviation between the RF All-Atom-idealized and LigandMPNN-idealized motif rotamers. The idealization gap between the diffusion-output rotamer set and the RF All-Atom-idealized rotamer set can be found with metrics key: `metrics.IdealizedResidueRMSD.rmsd_constellation`. The corresponding gap between the RF All-Atom-idealized (or not idealized if `inference.idealize_sidechain_outputs == False`) rotamer set and the LigandMPNN-idealized rotamer set can be found with metrics key: `motif_ideality_diff`.