- My public talk on AlphaFold2 paper reading, by Xingqiang Chen (.key/.pptx in the AF2-PPT folder).
- Sergey Ovchinnikov's talk on AF2 (slides, .pptx in the AF2-PPT folder).
We provide 32 Jupyter Notebooks covering every algorithm from the AlphaFold2 supplementary materials. Each notebook includes:
- Algorithm pseudocode/image reference
- Source code location mapping
- NumPy implementation (see the sketch after this list)
- Executable test cases with verification
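For a taste of the NumPy style, here is a minimal sketch of AF2's relative-position features (Algorithm 4) built on the nearest-bin one-hot encoder (Algorithm 5). The trailing linear projection is omitted, and the function names are illustrative rather than the notebooks' own:

```python
import numpy as np

def one_hot(x, v_bins):
    # Algorithm 5: encode each value by its nearest bin centre
    idx = np.argmin(np.abs(x[..., None] - v_bins), axis=-1)
    return np.eye(len(v_bins))[idx]

def relpos(residue_index, v_max=32):
    # Algorithm 4: clipped pairwise residue offsets, one-hot encoded
    # (the linear projection into the pair representation is omitted)
    d = residue_index[:, None] - residue_index[None, :]
    d = np.clip(d, -v_max, v_max)
    v_bins = np.arange(-v_max, v_max + 1)  # 2 * v_max + 1 = 65 bins
    return one_hot(d, v_bins)

feats = relpos(np.arange(8))
assert feats.shape == (8, 8, 65)
```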
## Full Algorithm Index
| Category | Algorithms | Notebooks |
|---|---|---|
| Data Preprocessing | MSA Block Deletion | Alg 1 |
| Embedding | Input Embedder, relpos, one_hot | Alg 3, Alg 4, Alg 5 |
| Evoformer | Stack, MSA Attention, Triangle Ops (sketch below) | Alg 6-15 |
| Templates | Pair Stack, Pointwise Attention | Alg 16, Alg 17 |
| Extra MSA | Stack, Global Attention | Alg 18, Alg 19 |
| Structure Module | IPA, Backbone, Atom Coords | Alg 20-25 |
| Losses | FAPE, Torsion, pLDDT | Alg 26-29 |
| Recycling | Inference, Training, Embedder | Alg 30, Alg 31, Alg 32 |
| Main Pipeline | Full Inference | Alg 2 |
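The Triangle Ops row refers to the triangle multiplicative updates on the pair representation. Below is a minimal NumPy sketch of the outgoing-edges variant (Algorithm 11); the weights are random and untrained, and the gate and value projections share weights for brevity, so this shows the data flow only:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    # channel-wise normalisation without learned gain/bias
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def triangle_mult_outgoing(z, c=16):
    # Algorithm 11 sketch: edge (i, j) is updated from edges (i, k) and (j, k)
    n, _, d = z.shape
    Wa = 0.1 * rng.standard_normal((d, c))   # toy, untrained weights
    Wb = 0.1 * rng.standard_normal((d, c))
    Wg = 0.1 * rng.standard_normal((d, d))
    Wo = 0.1 * rng.standard_normal((c, d))
    z = layer_norm(z)
    a = sigmoid(z @ Wa) * (z @ Wa)           # gated "left" edge projection
    b = sigmoid(z @ Wb) * (z @ Wb)           # gated "right" edge projection
    upd = np.einsum('ikc,jkc->ijc', a, b)    # combine over the third node k
    return sigmoid(z @ Wg) * (layer_norm(upd) @ Wo)  # gated output, back to width d

z = rng.standard_normal((5, 5, 8))
print(triangle_mult_outgoing(z).shape)       # (5, 5, 8)
```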
We now include AlphaFold3 algorithm notebooks! AF3 introduces significant architectural changes, including diffusion-based structure prediction.
## AlphaFold3 Algorithm Index
| Category | Key Algorithms | Notebooks |
|---|---|---|
| Input | MSA Features, Templates, Atom Features | Alg 1-4 |
| MSA Module | Outer Product, MSA Attention | Alg 5-7 |
| Pairformer | Triangle Ops, Single Attention | Alg 8-14 |
| Diffusion | Diffusion Module, AdaLN (sketch below), Transformer | Alg 15, Alg 16 |
| Confidence | Distogram, Confidence, LDDT | Alg 20-23 |
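AdaLN (adaptive layer norm) conditions the diffusion module by replacing LayerNorm's learned gain and bias with values predicted from a conditioning signal. A minimal NumPy sketch with illustrative shapes and untrained weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ada_ln(a, s):
    # gain and bias of the normalised activations come from the conditioning s
    Wg = 0.1 * rng.standard_normal((s.shape[-1], a.shape[-1]))  # toy weights
    Wb = 0.1 * rng.standard_normal((s.shape[-1], a.shape[-1]))
    a = layer_norm(a)            # no learned gain/bias here
    s = layer_norm(s)
    return sigmoid(s @ Wg) * a + s @ Wb

a = rng.standard_normal((4, 32))   # token activations
s = rng.standard_normal((4, 64))   # per-token conditioning signal
print(ada_ln(a, s).shape)          # (4, 32)
```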
# Official AlphaFold3
AF3-Ref-src/alphafold3-official/
# PyTorch Implementation (lucidrains)
AF3-Ref-src/alphafold3-pytorch/
# Architecture Walkthrough
AF3-Ref-src/alphafold3-walkthrough/

We now include Boltz algorithm notebooks! Boltz is a family of models for biomolecular interaction prediction:
- Boltz-1: First fully open source model to approach AlphaFold3 accuracy
- Boltz-2: Adds binding affinity prediction, approaching FEP accuracy 1000x faster
| Category | Key Algorithms | Notebooks |
|---|---|---|
| Input Processing | Input Embedder, Atom Encoder, RelPos | Alg 1-3 |
| MSA Module | MSA Module, Outer Product, Pair Averaging | Alg 4-6 |
| Pairformer | Pairformer, Triangle Ops, Attention | Alg 7-11 |
| Diffusion | Diffusion Module, Transformer, Fourier (sketch below) | Alg 12-15 |
| Confidence & Affinity | Confidence, Distogram, Affinity (Boltz-2) | Alg 16-18 |
| Loss Functions | Diffusion Loss, Confidence Loss | Alg 19-20 |
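The Fourier row refers to the AF3-style random-Fourier embedding of the diffusion noise level. A minimal sketch; in the real models the random weights are drawn once at initialization and then kept fixed:

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_embedding(t, c=256):
    # embed a scalar noise level t into c random cosine features
    w = rng.standard_normal(c)   # fixed after initialization in the real models
    b = rng.standard_normal(c)
    return np.cos(2.0 * np.pi * (t * w + b))

emb = fourier_embedding(0.5)
print(emb.shape)  # (256,)
```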
# Official Boltz Repository
Boltz-Ref-src/boltz-official/

Papers:
Boltz-2 introduces binding affinity prediction: the first deep-learning model to approach FEP accuracy while running roughly 1000x faster.
| Category | Key Algorithms | Notebooks |
|---|---|---|
| Affinity Prediction | Affinity Module, Gaussian Smearing (sketch below) | Alg 1-2 |
| Contact Guidance | Contact Conditioning | Alg 3 |
| Enhanced v2 Modules | Input v2, Template v2, Diffusion v2 | Alg 5-7 |
| Improved Confidence | Confidence v2, B-Factor | Alg 8, 10 |
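Gaussian smearing expands a scalar distance into a soft radial basis so downstream layers see smooth geometric features rather than raw distances. A generic sketch; the range and bin count are illustrative, not Boltz-2's actual hyperparameters:

```python
import numpy as np

def gaussian_smearing(d, d_min=0.0, d_max=15.0, n_bins=32):
    # place n_bins Gaussian bumps along [d_min, d_max] and evaluate each distance
    centers = np.linspace(d_min, d_max, n_bins)
    sigma = (d_max - d_min) / n_bins               # roughly one bin width
    return np.exp(-((d[..., None] - centers) ** 2) / (2.0 * sigma ** 2))

d = np.array([[0.0, 3.8], [3.8, 0.0]])            # toy pairwise distances in Å
print(gaussian_smearing(d).shape)                 # (2, 2, 32)
```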
# Official Repository (contains both Boltz-1 and Boltz-2)
Boltz-Ref-src/boltz-official/
# Boltzina - Virtual Screening with Boltz-2
Boltz-Ref-src/boltzina/

- DeepMind: AlphaFold-Using-AI-for-scientific-discovery
- DeepMind: alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology
- DeepMind: putting-the-power-of-alphafold-into-the-worlds-hands
- Reference papers are listed here; you can download them via the Baidu Cloud Drive link with extraction code 9w2p.
- Reference papers' source code is managed via git submodules under AF2-Ref-src/:
# Official AlphaFold (DeepMind)
AF2-Ref-src/alphafold-official/
# OpenFold (PyTorch implementation)
AF2-Ref-src/openfold/
# ColabFold (Colab-friendly version)
AF2-Ref-src/colabfold/
# MMseqs2 (Sequence search)
AF2-Ref-src/mmseqs2/
# HH-suite (Template search)
AF2-Ref-src/hh-suite/
# trRosetta2 (Predecessor model)
AF2-Ref-src/trRosetta2/
# ESM (Facebook protein language model)
AF2-Ref-src/esm/
# UniRep (Protein representations)
AF2-Ref-src/unirep/
# SeqVec (Sequence embeddings)
AF2-Ref-src/seqvec/

To initialize submodules after cloning:

git submodule update --init --recursive

All input data are freely available from public sources.
Structures from the PDB were used for training and as templates (https://www.wwpdb.org/ftp/pdb-ftp-sites; for the associated sequence data and 40% sequence clustering see also https://ftp.wwpdb.org/pub/pdb/derived_data/ and https://cdn.rcsb.org/resources/sequence/clusters/bc-40.out).
Training used a version of the PDB downloaded 28/08/2019, while CASP14 template search used a version downloaded 14/05/2020. Template search also used the PDB70 database, downloaded 13/05/2020 (https://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/).
We show experimental structures from the PDB with accessions 6Y4F, 6YJ1, 6VR4, 6SK0, 6FES, 6W6W, 6T1Z, and 7JTL.
For MSA lookup at both training and prediction time,
we used UniRef90 v2020_01 (https://ftp.ebi.ac.uk/pub/databases/uniprot/previous_releases/release-2020_01/uniref/),
BFD (https://bfd.mmseqs.com), Uniclust30 v2018_08 (https://wwwuser.gwdg.de/~compbiol/uniclust/2018_08/),
and MGnify clusters v2018_12 (https://ftp.ebi.ac.uk/pub/databases/metagenomics/peptide_database/2018_12/). Uniclust30 v2018_08 was further used as input for constructing a distillation structure dataset.
Source code for the AlphaFold model, trained weights, and an inference script are available under an open-source license at https://github.com/deepmind/alphafold.
Neural networks were developed with
- TensorFlow v1 (https://github.com/tensorflow/tensorflow),
- Sonnet v1 (https://github.com/deepmind/sonnet),
- JAX v0.1.69 (https://github.com/google/jax/),
- Haiku v0.0.4 (https://github.com/deepmind/dm-haiku).
For MSA search on UniRef90, MGnify clusters, and reduced BFD we used jackhmmer, and for template search against the PDB SEQRES we used hmmsearch, both from HMMER v3.3 (http://eddylab.org/software/hmmer/).
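As a hedged illustration of driving jackhmmer from Python (the file paths are placeholders, and the parameter values shown are not necessarily the exact settings used in the AlphaFold2 pipeline):

```python
import subprocess

# -N, -E and -A are standard jackhmmer options (iterations, reporting E-value,
# save the alignment); the paths below are placeholders.
subprocess.run(
    [
        "jackhmmer",
        "-N", "1",             # single search iteration
        "-E", "0.0001",        # reporting E-value threshold
        "-A", "query.sto",     # write the MSA in Stockholm format
        "query.fasta",         # query sequence
        "uniref90.fasta",      # target database
    ],
    check=True,
)
```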
For template search against PDB70, we used HHsearch from HH-suite v3.0-beta.3 14/07/2017 (https://github.com/soedinglab/hh-suite). For constrained relaxation of structures, we used OpenMM v7.3.1 (https://github.com/openmm/openmm) with the Amber99sb force field.
Docking analysis on DGAT used
- P2Rank v2.1 (https://github.com/rdk/p2rank),
- MGLTools v1.5.6 (https://ccsb.scripps.edu/mgltools/)
- and AutoDock Vina v1.1.2 (http://vina.scripps.edu/download/) on a workstation running Debian GNU/Linux rodete 5.10.40-1rodete1-amd64 x86_64.
Data analysis used
- Python v3.6 (https://www.python.org/),
- NumPy v1.16.4 (https://github.com/numpy/numpy),
- SciPy v1.2.1 (https://www.scipy.org/),
- seaborn v0.11.1 (https://github.com/mwaskom/seaborn),
- scikit-learn v0.24.0 (https://github.com/scikit-learn/),
- Matplotlib v3.3.4 (https://github.com/matplotlib/matplotlib),
- pandas v1.1.5 (https://github.com/pandas-dev/pandas),
- and Colab (https://research.google.com/colaboratory).
- TM-align v20190822 (https://zhanglab.dcmb.med.umich.edu/TM-align) was used for computing TM-scores.
Structure analysis used PyMOL v2.3.0 (https://github.com/schrodinger/pymol-open-source).

