GitHub - sarkarsrijon/ipae: Computational Protein Binding Affinity Calculation

How to Use AlphaFold PAE Data in Python

Run AlphaFold Predictions
Submit your protein/peptide sequences to the AlphaFold Server.
Download Results
After completion, download the results ZIP file from the server.
Extract PAE Data
- Open any full_data_n.json file from the results.
- Locate the "pae" key and copy the entire array (including all nested brackets).
Paste into Python
- Choose the provided Python analysis script that fits your purpose and find the line where the PAE matrix is defined, e.g.:
```
pae = np.array([...])
```
- Replace the [...] with the array you copied from the JSON file.
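As an alternative to copying and pasting, the matrix can be read straight from the downloaded file. This is a minimal sketch, assuming the JSON has a top-level "pae" key as described above. The function name `load_pae` is illustrative and is not part of the repository scripts.

```python
import json

import numpy as np


def load_pae(json_path):
    """Read the "pae" array from an AlphaFold full_data_n.json file."""
    with open(json_path) as handle:
        data = json.load(handle)
    pae = np.array(data["pae"], dtype=float)
    # A PAE matrix is square: one row and one column per residue.
    if pae.ndim != 2 or pae.shape[0] != pae.shape[1]:
        raise ValueError(f"expected a square matrix, got shape {pae.shape}")
    return pae
```

For example, `pae = load_pae("full_data_0.json")` would replace the hard-coded `np.array([...])` line, where the file name stands in for whichever full_data_n.json file you extracted.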
Set Chain Lengths
- Define chain_lengths as a list holding the amino acid sequence length of each peptide/protein chain, in the order the chains appear in the PAE matrix:
```
chain_lengths = [length1, length2, ...]
```
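Since the chains sit one after another along both axes of the PAE matrix, the lengths should add up to the matrix size. The sketch below checks this and turns the lengths into index ranges. The helper name `chain_slices` is hypothetical and not taken from the repository scripts.

```python
import numpy as np


def chain_slices(chain_lengths, matrix_size):
    """Turn chain lengths into one index slice per chain."""
    if sum(chain_lengths) != matrix_size:
        raise ValueError(
            f"chain lengths sum to {sum(chain_lengths)}, "
            f"but the PAE matrix has {matrix_size} rows"
        )
    ends = np.cumsum(chain_lengths)           # end index of each chain (exclusive)
    starts = ends - np.asarray(chain_lengths)  # start index of each chain
    return [slice(int(s), int(e)) for s, e in zip(starts, ends)]
```

With `chain_lengths = [3, 2]` and a 5 x 5 matrix, this yields the ranges 0 to 3 and 3 to 5, so `pae[slices[0], slices[1]]` would be the block from the first chain to the second.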

This workflow lets you analyze inter-chain PAE values from AlphaFold PAE data for further inference. Interface Predicted Alignment Error, also referred to as iPAE or PAE_i, is used as a surrogate for protein-protein interaction (PPI) binding affinities. Thanks to Brian Coventry for showing me the n = 2 case. I extended it to dimer-ligand interactions for specific utility and to n peptides for generalizability.

The underlying principle is to exclude each peptide's self-interaction contribution when calculating the interface PAE metric. We therefore consider only the off-diagonal blocks of the full matrix, which hold the residue-residue values between different peptide chains. Below is an example for a three-peptide interaction; the n = 2 case and larger cases follow the same pattern.

|   | A | B | C |
|---|---|---|---|
| A | A→A | A→B | A→C |
| B | B→A | B→B | B→C |
| C | C→A | C→B | C→C |
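The block picture above can be sketched in code by labelling each residue with its chain and keeping only entries whose row and column belong to different chains. This sketch assumes the aggregate is a plain mean over all inter-chain entries; the repository scripts may aggregate differently, and the function name `interface_pae` is illustrative.

```python
import numpy as np


def interface_pae(pae, chain_lengths):
    """Mean PAE over all off-diagonal (inter-chain) blocks."""
    pae = np.asarray(pae, dtype=float)
    if sum(chain_lengths) != pae.shape[0]:
        raise ValueError("chain_lengths must sum to the matrix size")
    # Label every residue with the index of the chain it belongs to.
    chain_id = np.repeat(np.arange(len(chain_lengths)), chain_lengths)
    # True where the row residue and column residue are in different chains.
    inter_chain = chain_id[:, None] != chain_id[None, :]
    return float(pae[inter_chain].mean())


# Toy 3 x 3 matrix: chain A has two residues, chain B has one.
toy = [[0, 1, 10],
       [1, 0, 20],
       [30, 40, 0]]
print(interface_pae(toy, [2, 1]))  # mean of 10, 20, 30, 40
```

For n = 2 the two off-diagonal blocks have the same number of entries, so this mean equals the average of the two block means.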

The values under the chain_pair_pae_min key in summary_confidences_n.json give, for each pair of chains being modeled, the lowest PAE the model predicts between them, i.e. the likely best case. The iPAE calculations above, by contrast, boil down to a single aggregate measure. The metrics serve different purposes and should be used judiciously.
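To compare the two kinds of metric side by side, the pairwise minima can be read from the summary file. The sketch below assumes that chain_pair_pae_min holds a square nested list in which entry [i][j] belongs to chains i and j. The function name `min_interchain_pae` is hypothetical.

```python
import json

import numpy as np


def min_interchain_pae(summary_path):
    """Return {(i, j): minimum PAE} for every pair of different chains."""
    with open(summary_path) as handle:
        summary = json.load(handle)
    pair_min = np.array(summary["chain_pair_pae_min"], dtype=float)
    n = pair_min.shape[0]
    # Skip the diagonal, which describes a chain against itself.
    return {
        (i, j): float(pair_min[i, j])
        for i in range(n)
        for j in range(n)
        if i != j
    }
```

The keys are zero-based chain indices in file order, so `(0, 1)` would be the first chain against the second.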

iPAE or PAE values can be biased, for example by short sequence lengths, but final values below 15 are usually the ones considered. Lastly, these measures are only approximations of binding affinity, not exact substitutes for it.

Repository files (14 commits):
- README.md
- dimer_ligand_interaction.py
- n_peptides.py
- two_peptides.py
