Replies: 5 comments
-
I think this encapsulates the main difference well. There's a bit of a trend currently to assume that designs which the ML models do well on are the ones more likely to be successful. (And conversely, if the ML program doesn't do well on a design, it's likely out-of-distribution for "native-like" systems, and thus much less likely to be a good design.) As such, to increase the success rate, you want to find designs where the prediction programs have high confidence and produce results consistent with the input design.

Hence the iterated convergence approach: you keep feeding the results of the design/repredict pipeline back on themselves until you come up with a design that none of the programs being used have issues with. (They all agree the design will turn out how you expect it to.)

Your suggested approach is one way to get a diversity of structures, but you might be missing out on that consistency validation. You're not necessarily vetting that ProteinMPNN thinks the updated backbone is compatible with the current sequence. This could be fine: there's no guarantee that self-consistent structures are better than ones generated by other methods. (It's an assumption rather than an iron-clad fact.) But it may mean you need additional stringent filters/selection on the results. You'll generate a diversity of structures, but are they good structures? Will they fold to what you want them to and be active how you want?

Also keep in mind that the info you're feeding into the experiment is just the sequence of the design -- you can't specify the structure other than by specifying the sequence. As such, any relax step which happens after the final sequence design step is only valuable to the extent it helps you select which sequences you take forward to experimental testing.
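The iterated-convergence idea described above can be sketched as a simple loop. This is only an illustration of the control flow under stated assumptions: `design_sequence`, `relax_structure`, and `models_agree` are hypothetical stand-ins for ProteinMPNN, FastRelax, and a prediction-consistency check, not real APIs.

```python
# Hypothetical stand-ins for the real tools; names and signatures are
# illustrative only, not actual ProteinMPNN/Rosetta interfaces.
def design_sequence(backbone):
    """Mock sequence design: return a sequence for the given backbone."""
    return f"SEQ<{backbone}>"

def relax_structure(backbone, sequence):
    """Mock relax: return an updated backbone for the sequence."""
    return f"relaxed({sequence})"

def models_agree(backbone, sequence, round_idx):
    """Mock consistency check: pretend the predictors agree after a few rounds."""
    return round_idx >= 2

def iterate_until_consistent(backbone, max_rounds=4):
    """Feed the design/repredict results back on themselves until every
    model is consistent with the current design, or the round budget runs out."""
    sequence = None
    for i in range(max_rounds):
        sequence = design_sequence(backbone)          # design on current backbone
        backbone = relax_structure(backbone, sequence)  # repredict/relax the structure
        if models_agree(backbone, sequence, i):         # stop once everything is consistent
            break
    return backbone, sequence
```

The `max_rounds=4` cap mirrors the 4-cycle budget discussed in this thread; in practice the stopping criterion would be an actual agreement metric (e.g. a confidence or RMSD threshold), not a round counter.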
-
@roccomoretti Thank you so much! I have one more question: is there any particular reason behind using exactly 4 loops, rather than 3, 5, or any other number?
-
I don't know why they chose 4 cycles -- my guess is that's what they found to generally work well in practice to give decently convergent results without spending too much computational time.
-
Hello! First of all, thank you very much for sharing this repository and the associated work. We are also working on protein design inspired by the same paper, and I have two questions about the current pipeline:

1. ProteinMPNN's sampling temperature
2. Adapting the ProteinMPNN-Relax pipeline for cyclic peptides

I repeat this ProteinMPNN → Rosetta threading (onto the RFDiffusion backbone or the relaxed structure) → FastRelax process for 4 cycles.

2-1) Conceptually, is this cyclic workflow (ProteinMPNN → threading → Relax, repeated 4 times) consistent with the principles and intent of your original FastRelax-based protocol?

Thank you for your kind responses and for your valuable work on this research. I sincerely appreciate your time and help.
-
The original developers of this tool have moved on to other roles/projects and do not regularly check these issues/discussions. I recommend reaching out to the corresponding author(s) of the paper for their recommendation. I am also going to move this from being an 'Issue' to being a 'Discussion', as I believe the content of these questions matches that forum's purpose better.
-
Hello everyone, I'm currently implementing a workflow based on the recent RFpeptides paper (Rettie et al., 2025, Nature Chemical Biology) and had a question about the sequence design step. I'd appreciate any insights from the community.
The Paper's Workflow: The authors describe an iterative, 4-round process for each diffused backbone, which (as I understand it) looks like this: design a sequence on the current backbone with ProteinMPNN, thread it onto the backbone with Rosetta, run FastRelax, then feed the relaxed backbone back into ProteinMPNN, for 4 rounds total.
My Alternative Workflow Idea: I was considering an alternative, and potentially computationally cheaper, approach to achieve sequence diversity:

1. Take the original, single backbone from RFdiffusion and generate 4 sequences on this same fixed backbone using LigandMPNN, but with a higher sampling temperature (e.g., T = 0.1, 0.2, ...?).
2. Take each of these 4 sequences and run Rosetta FastRelax on them once.
My Questions:
Any thoughts or experiences with these different design strategies would be extremely helpful.