GitHub - Robaina/Proteus: A python tool to optimize protein properties through AI-guided directed evolution

In silico protein optimization through AI-guided directed evolution

💡 Overview

Proteus aims to accelerate protein engineering by using deep learning to guide directed evolution. The framework predicts and optimizes protein function in silico, shortening research and development cycles in the biotechnology and pharmaceutical industries.

Protein optimization through in silico directed evolution guided by deep learning. The process starts from a wild-type protein sequence from a natural source (env. protein), such as a marine metagenomic sample collected within a given MPA. The sequence is replicated into a population of sequences, which undergo a masking procedure where specific amino acids are targeted for mutation. The ESM-2 protein language model then suggests new amino acids for these positions (mask filling), creating a population of mutated sequences. These sequences are folded into 3D structures and evaluated with a suite of fitness models, including affinity to a ligand and additional metrics for stability and other desired properties. The process is repeated over multiple rounds, each time selecting the sequences with improved (higher) fitness scores. The optimized protein with enhanced properties is then subject to experimental validation.
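The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Proteus implementation: `fill_masks` draws random residues where Proteus would query ESM-2, and `toy_fitness` (fraction of hydrophobic residues) stands in for the folding and fitness-model stage. All function names, parameters, and the choice to carry parents over between rounds are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mask_positions(sequence, n_masks, rng):
    """Pick the positions to mutate (stand-in for the masking step)."""
    return sorted(rng.sample(range(len(sequence)), n_masks))


def fill_masks(sequence, positions, rng):
    """Stand-in for ESM-2 mask filling: draw a random, different residue."""
    residues = list(sequence)
    for pos in positions:
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)


def toy_fitness(sequence):
    """Hypothetical fitness (larger is better): hydrophobic fraction.

    A real pipeline would fold the sequence and score affinity, stability, etc.
    """
    return sum(aa in "AILMFVW" for aa in sequence) / len(sequence)


def evolve(wild_type, n_rounds=5, population_size=20, n_masks=2,
           n_survivors=5, seed=0):
    """Run mask -> fill -> score -> select for a fixed number of rounds."""
    rng = random.Random(seed)
    parents = [wild_type]
    best_per_round = [toy_fitness(wild_type)]
    for _ in range(n_rounds):
        # Replicate the parents into a population of mutated sequences.
        children = []
        for i in range(population_size):
            parent = parents[i % len(parents)]
            positions = mask_positions(parent, n_masks, rng)
            children.append(fill_masks(parent, positions, rng))
        # Deduplicate (order-preserving), score, and keep the fittest.
        # Parents stay in the pool, so the best score never decreases.
        pool = list(dict.fromkeys(parents + children))
        parents = sorted(pool, key=toy_fitness, reverse=True)[:n_survivors]
        best_per_round.append(toy_fitness(parents[0]))
    return parents[0], best_per_round


if __name__ == "__main__":
    best, history = evolve("MKTAYIAKQRQISFVKSHFSRQ")
    print(best, [round(score, 3) for score in history])
```

Swapping `fill_masks` for a language-model call and `toy_fitness` for structure-based scoring would leave the outer loop unchanged, which is the main point of the sketch.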

🔅 Features

High-throughput Screening Simulation: Simulate the process of directed evolution with our deep learning models to predict the most promising protein variants.

Protein Function Prediction: Utilize state-of-the-art algorithms to predict the function of protein sequences.

Optimization Algorithms: Implement various optimization algorithms to find the optimal protein sequence for a desired function.

Directed evolution guided by deep learning, applied to optimize bacterial Lipase A. Shown are selected variants generated in the last optimization step, together with the affinity of the best-performing variant (the one with the lowest, i.e., most negative, affinity value) at each step; the wild-type affinity is included for comparison (colored in red). Over the course of the optimization, affinity decreased from -3.06 kcal/mol in the wild type to -4.46 kcal/mol in the best-performing variant at step 14, corresponding to an affinity improvement of 45.6%. The total ΔΔG of each protein variant, computed with ThermoMPNN, is displayed as bar plots. During the optimization, total ΔΔG was constrained to be less than 1 kcal/mol.
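The reported improvement is a relative change with respect to the wild-type affinity, which can be checked with a short calculation. With the rounded affinities quoted above the result is about 45.8%, slightly above the reported 45.6%; the gap presumably comes from rounding of the quoted values, which is an assumption here. The function name is illustrative only.

```python
def relative_improvement(wild_type, variant):
    """Percent change of the variant affinity relative to the wild type."""
    return (variant - wild_type) / wild_type * 100


wt_affinity = -3.06    # kcal/mol, wild type (value quoted in the text)
best_affinity = -4.46  # kcal/mol, best variant at step 14 (quoted in the text)

improvement = relative_improvement(wt_affinity, best_affinity)
print(f"{improvement:.1f}%")  # prints 45.8% with these rounded inputs
```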

🔧 Getting Started

To get started with Proteus, follow these steps:

Clone the repository:

git clone https://github.com/Robaina/Proteus.git
cd Proteus

Install the required dependencies:

pip install -r requirements.txt

Build and install Proteus:

poetry build
pip install dist/proteus-*-py3-none-any.whl

Explore the Jupyter notebooks in the notebooks directory to see examples and tutorials on how to use the framework.

🚀 Usage

Proteus provides a command-line interface (CLI) for setting up and running optimizations. To list the available options, run:

proteus --help
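When driving Proteus from a script, it can help to confirm that the CLI is on the `PATH` before calling it. The following is a minimal sketch that only relies on the `proteus --help` command shown above; the helper name `cli_help` is hypothetical, and no other Proteus options are assumed.

```python
import shutil
import subprocess


def cli_help(executable="proteus"):
    """Return the CLI help text, or None if the executable is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--help"], capture_output=True, text=True, check=False
    )
    # Some CLIs print help to stderr, so fall back to it if stdout is empty.
    return result.stdout or result.stderr


if __name__ == "__main__":
    help_text = cli_help()
    print(help_text if help_text is not None else "proteus is not installed")
```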

Contributing

We welcome contributions! Please read our CONTRIBUTING.md for details on how to submit pull requests, report issues, and contribute to the code.

🔓 License

This project is licensed under the GNU General Public License v3 (GPL-3.0); see the LICENSE file for details.

💜 Acknowledgments

This project, Proteus, started as a fork of the DirectedEvolution project. We have since made substantial modifications to adapt it to our goals. We extend our sincere gratitude to the creators and contributors of DirectedEvolution for laying the groundwork that inspired our project. Their innovative approach and dedication to advancing the field have been instrumental in shaping our development path.

