Commit 1ba7992

Merge remote-tracking branch 'origin/main' into fix_scaffoldguided_design
2 parents 7c30fee + ff20fba commit 1ba7992

9 files changed: +337 -110 lines changed
.github/workflows/main.yml

Lines changed: 29 additions & 1 deletion
@@ -69,6 +69,13 @@ jobs:
           uv pip install --no-cache-dir -e . --no-deps
           rm -rf ~/.cache # /app/RFdiffusion/tests
 
+      - name: Preseed DGL backend
+        shell: bash
+        run: |
+          mkdir -p "$HOME/.dgl"
+          printf '{"backend": "pytorch"}' > "$HOME/.dgl/config.conf"
+          echo "DGLBACKEND=pytorch" >> "$GITHUB_ENV"
+
       - name: Download weights
         run: |
           mkdir models
@@ -87,8 +94,29 @@ jobs:
       - name: Setup and Run ppi_scaffolds tests
         run: |
           tar -xvf examples/ppi_scaffolds_subset.tar.gz -C examples
-          cd tests && uv run python test_diffusion.py
+          total_chunks=$(nproc)
+          cd tests
+
+          # Launch all chunks in the background and record their PIDs
+          pids=""
+          for chunk_index in $(seq 1 $total_chunks); do
+            echo "Running chunk $chunk_index of $total_chunks"
+            uv run python test_diffusion.py --total_chunks $total_chunks --chunk_index $chunk_index &
+            pids="$pids $!"
+          done
+
+          # Wait for each chunk and track failures
+          fail=0
+          for pid in $pids; do
+            if ! wait "$pid"; then
+              echo "A chunk (PID $pid) failed"
+              fail=1
+            else
+              echo "A chunk (PID $pid) passed"
+            fi
+          done
 
+          exit "$fail"
 
       # - name: Test with pytest
       #   run: |
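
The workflow step above passes `--total_chunks` and a 1-based `--chunk_index` to `test_diffusion.py`, one process per CPU core. How the test script partitions its cases is not shown in this commit. A minimal sketch of one possible scheme, assuming a round-robin split over a sorted list of example script names (the `select_chunk` helper and the file names are hypothetical, not taken from the repository):

```python
def select_chunk(items, total_chunks, chunk_index):
    """Return the items assigned to a 1-based chunk_index using a round-robin split."""
    if total_chunks < 1:
        raise ValueError("total_chunks must be >= 1")
    if not 1 <= chunk_index <= total_chunks:
        raise ValueError("chunk_index must be in [1, total_chunks]")
    # Sort first so that every worker process sees the same ordering
    return sorted(items)[chunk_index - 1::total_chunks]


# Hypothetical example names, for illustration only
examples = ["design_a.sh", "design_b.sh", "design_c.sh", "design_d.sh", "design_e.sh"]
chunks = [select_chunk(examples, 3, i) for i in (1, 2, 3)]
```

With a split like this, every case lands in exactly one chunk, and a chunk may be empty when there are more cores than cases.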

README.md

Lines changed: 6 additions & 1 deletion
@@ -21,6 +21,9 @@ RFdiffusion is an open source method for structure generation, with or without c
 - Binder design
 - Design diversification ("partial diffusion", sampling around a design)
 
+## [Documentation](https://sites.google.com/omsf.io/rfdiffusion/overview)
+View the RFdiffusion documentation resource maintained by Rosetta Commons [here](https://sites.google.com/omsf.io/rfdiffusion/overview).
+
 ----
 
 # Table of contents
@@ -54,7 +57,9 @@ RFdiffusion is an open source method for structure generation, with or without c
 
 # Getting started / installation
 
-Thanks to Sergey Ovchinnikov, RFdiffusion is available as a [Google Colab Notebook](https://colab.research.google.com/github/sokrypton/ColabDesign/blob/v1.1.1/rf/examples/diffusion.ipynb) if you would like to run it there!
+Thanks to Sergey Ovchinnikov, many of the features of RFdiffusion are available as a [Google Colab Notebook](https://colab.research.google.com/github/sokrypton/ColabDesign/blob/v1.1.1/rf/examples/diffusion.ipynb) if you would like to run it there!
+
+There is also an official, Rosetta Commons-maintained Docker image that you can find [here](https://hub.docker.com/r/rosettacommons/rfdiffusion). Thank you to Sergey Lyskov, Ajasja Ljubetic, and Hope Woods for creating this resource.
 
 We strongly recommend reading this README carefully before getting started with RFdiffusion, and working through some of the examples in the Colab Notebook.
 

examples/design_macrocyclic_binder.sh

Lines changed: 8 additions & 11 deletions
@@ -1,18 +1,15 @@
 #!/bin/bash
 
-prefix=./outputs/diffused_binder_cyclic2
+# Note that in the example below the indices in the
+# input_pdbs/7zkr_GABARAP.pdb file have been shifted
+# by +2 in chain A relative to pdbID 7zkr.
 
-# Note that the indices in this pdb file have been
-# shifted by +2 in chain A relative to pdbID 7zkr.
-pdb='./input_pdbs/7zkr_GABARAP.pdb'
-
-num_designs=10
-script="../scripts/run_inference.py"
-$script --config-name base \
-    inference.output_prefix=$prefix \
-    inference.num_designs=$num_designs \
+../scripts/run_inference.py \
+    --config-name base \
+    inference.output_prefix=example_outputs/diffused_binder_cyclic2 \
+    inference.num_designs=10 \
     'contigmap.contigs=[12-18 A3-117/0]' \
-    inference.input_pdb=$pdb \
+    inference.input_pdb=./input_pdbs/7zkr_GABARAP.pdb \
     inference.cyclic=True \
     diffuser.T=50 \
     inference.cyc_chains='a' \
Lines changed: 8 additions & 10 deletions
@@ -1,17 +1,15 @@
 #!/bin/bash
 
-prefix=./outputs/uncond_cycpep
-# Note that the indices in this pdb file have been
-# shifted by +2 in chain A relative to pdbID 7zkr.
-pdb='./input_pdbs/7zkr_GABARAP.pdb'
+# Note that in the example below the indices in the
+# input_pdbs/7zkr_GABARAP.pdb file have been shifted
+# by +2 in chain A relative to pdbID 7zkr.
 
-num_designs=10
-script="../scripts/run_inference.py"
-$script --config-name base \
-    inference.output_prefix=$prefix \
-    inference.num_designs=$num_designs \
+../scripts/run_inference.py \
+    --config-name base \
+    inference.output_prefix=example_outputs/uncond_cycpep \
+    inference.num_designs=10 \
     'contigmap.contigs=[12-18]' \
-    inference.input_pdb=$pdb \
+    inference.input_pdb=input_pdbs/7zkr_GABARAP.pdb \
     inference.cyclic=True \
     diffuser.T=50 \
     inference.cyc_chains='a'

examples/design_tetrahedral_oligos.sh

Lines changed: 2 additions & 2 deletions
@@ -5,6 +5,6 @@
 # This external potential promotes contacts both within (with a relative weight of 1) and between chains (relative weight 0.1)
 # We specify that we want to apply these potentials to all chains, with a guide scale of 2.0 (a sensible starting point)
 # We decay this potential with quadratic form, so that it is applied more strongly initially
-# We specify a total length of 1200aa, so each chain is 100 residues long
+# We specify a total length of 600aa, so each chain is 50 residues long (reduced from 1200aa, i.e. 100 residues per chain, so that tests run faster)
 
-python ../scripts/run_inference.py --config-name=symmetry inference.symmetry="tetrahedral" inference.num_designs=10 inference.output_prefix="example_outputs/tetrahedral_oligo" 'potentials.guiding_potentials=["type:olig_contacts,weight_intra:1,weight_inter:0.1"]' potentials.olig_intra_all=True potentials.olig_inter_all=True potentials.guide_scale=2.0 potentials.guide_decay="quadratic" 'contigmap.contigs=[1200-1200]'
+python ../scripts/run_inference.py --config-name=symmetry inference.symmetry="tetrahedral" inference.num_designs=10 inference.output_prefix="example_outputs/tetrahedral_oligo" 'potentials.guiding_potentials=["type:olig_contacts,weight_intra:1,weight_inter:0.1"]' potentials.olig_intra_all=True potentials.olig_inter_all=True potentials.guide_scale=2.0 potentials.guide_decay="quadratic" 'contigmap.contigs=[600-600]'
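
The per-chain lengths quoted in the comment follow from dividing the total contig length by the number of subunits. A small sketch of that arithmetic, assuming 12 subunits for tetrahedral symmetry (the order of the tetrahedral rotation group, consistent with 1200aa giving 100 residues per chain); the `per_chain_length` helper is hypothetical and not part of RFdiffusion:

```python
TETRAHEDRAL_SUBUNITS = 12  # order of the tetrahedral rotation group


def per_chain_length(total_length, n_subunits=TETRAHEDRAL_SUBUNITS):
    """Split a total symmetric length evenly across the subunits."""
    if total_length % n_subunits != 0:
        raise ValueError("total length must be divisible by the number of subunits")
    return total_length // n_subunits


old_len = per_chain_length(1200)  # original example: 100 residues per chain
new_len = per_chain_length(600)   # shortened for testing: 50 residues per chain
```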

rfdiffusion/contigs.py

Lines changed: 13 additions & 1 deletion
@@ -80,6 +80,7 @@ def __init__(
             self.inpaint,
             self.inpaint_hal,
             self.inpaint_rf,
+            self.sampled_mask_length_bound,
         ) = self.expand_sampled_mask()
         self.ref = self.inpaint + self.receptor
         self.hal = self.inpaint_hal + self.receptor_hal
@@ -241,6 +242,8 @@ def expand_sampled_mask(self):
         inpaint_chain_idx = -1
         receptor_chain_break = []
         inpaint_chain_break = []
+        _receptor_mask_length_bound = []
+        _inpaint_mask_length_bound = []
         for con in self.sampled_mask:
             if (
                 all([i[0].isalpha() for i in con.split("/")[:-1]])
@@ -286,6 +289,7 @@ def expand_sampled_mask(self):
                 receptor_chain_break.append(
                     (receptor_idx - 1, 200)
                 )  # 200 aa chain break
+                _receptor_mask_length_bound.append(len(receptor))
             else:
                 inpaint_chain_idx += 1
                 for subcon in con.split("/"):
@@ -320,6 +324,7 @@ def expand_sampled_mask(self):
                             )
                             inpaint_idx += int(subcon.split("-")[0])
                 inpaint_chain_break.append((inpaint_idx - 1, 200))
+                _inpaint_mask_length_bound.append(len(inpaint))
 
         if self.topo is True or inpaint_hal == []:
             receptor_hal = [(i[0], i[1]) for i in receptor_hal]
@@ -335,14 +340,21 @@ def expand_sampled_mask(self):
             inpaint_rf[ch_break[0] :] += ch_break[1]
         for ch_break in receptor_chain_break[:-1]:
             receptor_rf[ch_break[0] :] += ch_break[1]
-
+        sampled_mask_length_bound = []
+        sampled_mask_length_bound.extend(_inpaint_mask_length_bound)
+        if _inpaint_mask_length_bound:
+            inpaint_last_bound = _inpaint_mask_length_bound[-1]
+        else:
+            inpaint_last_bound = 0
+        sampled_mask_length_bound.extend(map(lambda x: x + inpaint_last_bound, _receptor_mask_length_bound))
         return (
             receptor,
             receptor_hal,
             receptor_rf.tolist(),
             inpaint,
             inpaint_hal,
             inpaint_rf.tolist(),
+            sampled_mask_length_bound
         )
 
     def get_inpaint_seq_str(self, inpaint_s, ss=False):
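
The last hunk above merges two lists of cumulative end positions. The inpaint (designed) bounds come first, and the receptor bounds are shifted by the last inpaint bound because `self.ref` is built as `self.inpaint + self.receptor`. A standalone sketch of that merge, using a hypothetical helper name (`combine_length_bounds`) and made-up lengths:

```python
def combine_length_bounds(inpaint_bounds, receptor_bounds):
    """Merge cumulative chain end positions, with inpaint chains placed before receptor chains."""
    # Receptor positions start after the final inpaint residue (or at 0 if there are none)
    offset = inpaint_bounds[-1] if inpaint_bounds else 0
    return list(inpaint_bounds) + [bound + offset for bound in receptor_bounds]


# One 15-residue designed chain followed by one 115-residue receptor chain
bounds = combine_length_bounds([15], [115])
```

Each entry is the exclusive end index of one chain in the combined residue list, so consecutive entries delimit consecutive chains.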

rfdiffusion/inference/model_runners.py

Lines changed: 39 additions & 3 deletions
@@ -14,6 +14,7 @@
 from rfdiffusion import util
 from hydra.core.hydra_config import HydraConfig
 import os
+import string
 
 from rfdiffusion.model_input_logger import pickle_function_call
 import sys
@@ -144,13 +145,14 @@ def initialize(self, conf: DictConfig) -> None:
             self.symmetry = None
 
         self.allatom = ComputeAllAtomCoords().to(self.device)
-
+
         if self.inf_conf.input_pdb is None:
             # set default pdb
             script_dir=os.path.dirname(os.path.realpath(__file__))
             self.inf_conf.input_pdb=os.path.join(script_dir, '../../examples/input_pdbs/1qys.pdb')
         self.target_feats = iu.process_target(self.inf_conf.input_pdb, parse_hetatom=True, center=False)
         self.chain_idx = None
+        self.idx_pdb = None
 
         ##############################
         ### Handle Partial Noising ###
@@ -330,8 +332,42 @@ def sample_init(self, return_forward_trajectory=False):
         contig_map=self.contig_map
 
         self.diffusion_mask = self.mask_str
-        self.chain_idx=['A' if i < self.binderlen else 'B' for i in range(L_mapped)]
-
+        length_bound = self.contig_map.sampled_mask_length_bound.copy()
+
+        first_res = 0
+        self.chain_idx = []
+        self.idx_pdb = []
+        all_chains = {contig_ref[0] for contig_ref in self.contig_map.ref}
+        available_chains = sorted(list(set(string.ascii_letters) - all_chains))
+
+        # Iterate over each chain
+        for last_res in length_bound:
+            chain_ids = {contig_ref[0] for contig_ref in self.contig_map.ref[first_res: last_res]}
+            # If we are designing this chain, it will have a '_' entry in the contig map reference
+            # Renumber this chain from 1
+            if "_" in chain_ids:
+                self.idx_pdb += [idx + 1 for idx in range(last_res - first_res)]
+                chain_ids = chain_ids - {"_"}
+                # If there are no fixed residues that have a chain id, pick the first available letter
+                if not chain_ids:
+                    if not available_chains:
+                        raise ValueError(f"No available chains! You are trying to design a new chain, and you have "
+                                         f"already used all upper- and lower-case chain ids (up to 52 chains): "
+                                         f"{','.join(all_chains)}.")
+                    chain_id = available_chains[0]
+                    available_chains.remove(chain_id)
+                # Otherwise, use the chain of the fixed (motif) residues
+                else:
+                    assert len(chain_ids) == 1, f"Error: Multiple chain IDs in chain: {chain_ids}"
+                    chain_id = list(chain_ids)[0]
+                self.chain_idx += [chain_id] * (last_res - first_res)
+            # If this is a fixed chain, maintain the chain and residue numbering
+            else:
+                self.idx_pdb += [contig_ref[1] for contig_ref in self.contig_map.ref[first_res: last_res]]
+                assert len(chain_ids) == 1, f"Error: Multiple chain IDs in chain: {chain_ids}"
+                self.chain_idx += [list(chain_ids)[0]] * (last_res - first_res)
+            first_res = last_res
+
         ####################################
         ### Generate initial coordinates ###
         ####################################
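
The chain-labelling loop added to `sample_init` can be exercised in isolation. The sketch below is a simplified restatement of that logic under stated assumptions: `ref` is a list of `(chain, residue_number)` tuples in which designed residues appear as `("_", "_")`, and `length_bound` holds cumulative chain end positions. The function name `assign_chains` and the sample data are hypothetical.

```python
import string


def assign_chains(ref, length_bound):
    """Return per-residue chain IDs and residue numbers for each chain segment."""
    used = {chain for chain, _ in ref if chain != "_"}
    # Uppercase letters sort before lowercase, so 'A'..'Z' are handed out first
    available = sorted(set(string.ascii_letters) - used)
    chain_idx, idx_pdb = [], []
    first = 0
    for last in length_bound:
        segment = ref[first:last]
        chain_ids = {chain for chain, _ in segment}
        if "_" in chain_ids:
            # Chain contains designed residues: renumber it from 1
            idx_pdb += list(range(1, last - first + 1))
            chain_ids -= {"_"}
            if chain_ids:
                # Reuse the chain ID of the fixed (motif) residues
                if len(chain_ids) != 1:
                    raise ValueError(f"Multiple chain IDs in chain: {chain_ids}")
                chain_id = next(iter(chain_ids))
            else:
                if not available:
                    raise ValueError("No unused chain IDs left")
                chain_id = available.pop(0)
        else:
            # Fully fixed chain: keep its original chain ID and numbering
            if len(chain_ids) != 1:
                raise ValueError(f"Multiple chain IDs in chain: {chain_ids}")
            idx_pdb += [resnum for _, resnum in segment]
            chain_id = next(iter(chain_ids))
        chain_idx += [chain_id] * (last - first)
        first = last
    return chain_idx, idx_pdb


# Three designed residues followed by two fixed residues from chain A
ref = [("_", "_")] * 3 + [("A", 3), ("A", 4)]
chain_idx, idx_pdb = assign_chains(ref, [3, 5])
```

In this example the designed chain receives the first unused letter (`B`, since `A` is taken by the fixed chain) and is numbered from 1, while the fixed chain keeps chain `A` and residue numbers 3 and 4.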

scripts/run_inference.py

Lines changed: 1 addition & 0 deletions
@@ -141,6 +141,7 @@ def main(conf: HydraConfig) -> None:
             sampler.binderlen,
             chain_idx=sampler.chain_idx,
             bfacts=bfacts,
+            idx_pdb=sampler.idx_pdb
         )
 
         # run metadata
