Hello there, thank you for the amazing work.
I have a question: the number of atoms in the RF3 model does not match the number of atoms in the ProteinMPNN output. Here is my workflow.
First, I ran rfd3 to generate backbones:
rfd3 design out_dir=/media/user/ALL_USERS/hjd/rfd3/results ckpt_path=~/.foundry/checkpoints/rfd3_latest.ckpt inputs=/media/user/ALL_USERS/hjd/rfd3/JOB.json diffusion_batch_size=500 n_batches=4
The JSON file looks like this:
{
  "E1": {
    "dialect": 2,
    "infer_ori_strategy": "hotspots",
    "input": "/media/user/ALL_USERS/hjd/rfd3/cleaned_A.pdb",
    "contig": "20-100,/0,A150-420",
    "select_hotspots": {
      "A181": "CG2,CG1",
      "A407": "NE1,CZ2",
      "A214": "NH2,NH1",
      "A404": "CD1,CZ",
      "A222": "CB,CG"
    }
  }
}
Then I ran ProteinMPNN to refine the sequence.
mpnn \
--structure_path "${cif_file}" \
--out_directory "${OUT_DIR}" \
--checkpoint_path "${CHECKPOINT_PATH}" \
--model_type protein_mpnn \
--is_legacy_weights True
After that, I collected the sequences from all the .fa files produced in the previous step into a file called "sequences_output.json" and ran RF3:
rf3 fold inputs="./sequences_output.json" ckpt_path="/home/jedi/.foundry/checkpoints/rf3_foundry_01_24_latest_remapped.ckpt" out_dir="./rf3_ouput"
The sequences_output.json file looks like this:
{
  "name": "E1_3_model_7_b0_d0(1)",
  "components": [
    { "seq": "GWIEGVVLEFVDDDTVLVDDGERVYRVLRSSVENPENARVGSRVRVSTLTAEEVPVVCPGGTCFSVPTL", "chain_id": "A" }
  ]
},
{
  "name": "E1_3_model_421_b0_d0(2)",
  "components": [
    { "seq": "MEELVEKIKKKLEAEGYKVLKVKVNEDGTVSVVVEKDGKYYELTFDSKGNLLSKEPVRVVVKVPVNGKATYYKCDCGDSEAGVIFEDYTIPGC", "chain_id": "A" }
  ]
}
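For reference, the conversion from .fa files to this JSON could be sketched roughly as below. This is only an illustration using the Python standard library, not a Foundry utility: the function names, the choice to use each FASTA header verbatim as "name", the fixed "A" chain ID, and the top-level JSON list are all assumptions made for the example.

```python
import json
from pathlib import Path


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; store the previous one, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fasta_to_entries(text, chain_id="A"):
    """Turn FASTA records into entries shaped like the excerpt above.

    Assumption: one single-chain component per record, always labelled chain_id.
    """
    return [
        {"name": header, "components": [{"seq": seq, "chain_id": chain_id}]}
        for header, seq in read_fasta(text)
    ]


def write_inputs(fasta_dir, out_path):
    """Collect every *.fa file in fasta_dir (hypothetical layout) into one JSON list."""
    entries = []
    for path in sorted(Path(fasta_dir).glob("*.fa")):
        entries.extend(fasta_to_entries(path.read_text()))
    Path(out_path).write_text(json.dumps(entries, indent=2))
    return len(entries)
```

Note that this sketch writes exactly one component per sequence, so a target chain present in the design would only be folded if it were added as a second component.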
The CIF files generated by RF3 contain more atoms than the CIF files generated by ProteinMPNN and RFD3, while the RFD3 and ProteinMPNN CIF files have the same length as each other. Is this because RF3 used chain B from the ProteinMPNN output as input, which would make the RF3 CIF model larger?
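One way the difference could be narrowed down is to count atoms per chain and per element in each CIF file. The snippet below is a minimal sketch that reads only the `_atom_site` loop with plain string splitting. It assumes the files carry the standard mmCIF columns `label_asym_id` and `type_symbol` and that no value contains whitespace; the function name is made up for the example.

```python
from collections import Counter


def atom_site_counts(cif_text):
    """Count atoms per chain and per element from the _atom_site loop.

    Assumes a simple mmCIF layout: column names listed one per line,
    followed by whitespace-separated rows with no embedded spaces.
    """
    columns = []
    per_chain, per_element = Counter(), Counter()
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line.startswith("_atom_site."):
            # Column declaration, e.g. "_atom_site.label_asym_id".
            columns.append(line.split()[0][len("_atom_site."):])
            continue
        if not columns:
            continue  # still before the _atom_site loop
        if not line or line.startswith(("#", "loop_", "_", "data_")):
            break  # end of the _atom_site rows
        fields = line.split()
        if len(fields) != len(columns):
            continue  # skip rows this simple splitter cannot handle
        row = dict(zip(columns, fields))
        per_chain[row.get("label_asym_id", "?")] += 1
        per_element[row.get("type_symbol", "?")] += 1
    return per_chain, per_element
```

Running this on the RFD3, ProteinMPNN, and RF3 outputs for the same design and comparing the two counters would show whether the extra atoms sit in an additional chain or in additional atoms (for example side-chain atoms or hydrogens) on the same chain.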
Many thanks!!!