Commit ec1ea74

typos, small fixes
1 parent 0adc87b commit ec1ea74

1 file changed

+57
-20
lines changed
  • education/HADDOCK3/HADDOCK3-protein-peptide
education/HADDOCK3/HADDOCK3-protein-peptide/index.md

Lines changed: 57 additions & 20 deletions
@@ -65,7 +65,7 @@ pip install haddock3
6565
Further, we are providing pre-processed haddock-compatible PDB and configuration files, as well as pre-computed docking results. Please download and unzip the provided [zip archive](https://surfdrive.surf.nl/files/index.php/s/Io1JF9FYiXz9NTb) and make sure to note the location of the extracted folder on your system. There is also a linux command for it:
6666

6767
<a class="prompt prompt-cmd">
68-
wget https://surfdrive.surf.nl/files/index.php/s/Io1JF9FYiXz9NTb -O protein-peptide.zip<br>
68+
wget https://surfdrive.surf.nl/files/index.php/s/Io1JF9FYiXz9NTb/download -O protein-peptide.zip<br>
6969
unzip protein-peptide.zip
7070
</a>
7171

@@ -108,7 +108,7 @@ You should see the entry "P23804 · MDM2_MOUSE". Feel free to take you time and
108108
</a>
109109

110110
This protein has no experimentally solved 3D structure; only an AlphaFold model is available.
111-
This model model covers the full-length sequence of MDM2, but for docking we only need a its p53-binding domain.
111+
This model model covers the full-length sequence of MDM2, but for docking we only need its p53-binding domain.
112112
This domain corresponds to residues 26 to 109. Check out [Family & Domains](https://www.uniprot.org/uniprotkb/P23804/entry#family_and_domains){:target="_blank"} section of the UniProt to see all other regions of the protein.
113113
The remaining regions, particularly the disordered one, are known not to interact with the peptide, so it's a good idea to remove them, both to make the docking problem easier and to reduce the computational cost of the docking.
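
As an illustration of the trimming step described above, here is a minimal Python sketch that keeps only the atoms whose residue number falls in the 26 to 109 range. It is not necessarily how the tutorial produces `AF_MDM2_26_109.pdb`; the function names `trim_pdb` and `make_atom` are hypothetical, and the sketch assumes standard fixed-column PDB records with the residue number in columns 23 to 26.

```python
def trim_pdb(lines, first=26, last=109):
    """Keep ATOM/HETATM records whose residue number is within [first, last].

    Assumes fixed-column PDB formatting: the residue sequence number
    occupies columns 23-26, i.e. line[22:26] with 0-based slicing.
    All other record types (TER, END, REMARK, ...) are dropped.
    """
    kept = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            resnum = int(line[22:26])
            if first <= resnum <= last:
                kept.append(line)
    return kept


def make_atom(serial, resname, resnum):
    """Hypothetical helper: build a truncated ATOM record for the demo only."""
    return f"ATOM  {serial:5d}  N   {resname} A{resnum:4d}"


# Demo with made-up records for residues 1, 25, 26, 109 and 110
demo = [make_atom(i, "ALA", i) for i in (1, 25, 26, 109, 110)]
kept = trim_pdb(demo)
for record in kept:
    print(record)
```

On a real file, you would read the lines of `AF-P23804-F1-model_v4.pdb` and write the returned records to a new PDB file.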
114114

@@ -143,7 +143,7 @@ File menu -> Open -> select protein-peptide/AF-P23804-F1-model_v4.pdb <br>
143143
</a>
144144
And to align two models:
145145
<a class="prompt prompt-pymol">
146-
align AF-P23804-F1-model_v4.pdb, AF_MDM2_26_109.pdb
146+
align AF-P23804-F1-model_v4, AF_MDM2_26_109
147147
</a>
148148

149149
If starting PyMOL directly from the command line, you can load multiple PDB files in one go:
@@ -166,6 +166,8 @@ The sequence of interest (residues 18-32 of the p53_mouse) is:
166166
SQETFSGLWKLLPPE
167167
</pre>
168168

169+
_**Note**_ this sequence can be found on the UniProt page for `P02340 · P53_MOUSE`, under the 'Structure' section.
170+
169171
We will generate three idealized peptide conformations: α-helix, β-sheet, and polyproline II (ppII).
170172
This can be done using PyMOL’s built-in fab command. To see a usage example:
171173

@@ -191,13 +193,13 @@ Or, alternatively:
191193

192194
Once all 3 PDB files are saved in your work directory, combine them into a single ensemble PDB (`pdb_mkensemble`) and assign chain B (`pdb_chain`):
193195
<a class="prompt prompt-cmd">
194-
pdb_mkensemble peptide_helix.pdb peptide_sheet.pdb peptide_ppii.pdb | pdb_chain -B | pdb_tidy -strict > peptide_ensemble.pdb
196+
pdb_mkensemble peptide_helix.pdb peptide_sheet.pdb peptide_ppii.pdb | pdb_chain -B | pdb_tidy -strict > peptide_ens.pdb
195197
</a>
196198

197199
To quickly inspect the contents of the generated ensemble, you can look at the header of the file with:
198200

199201
<a class="prompt prompt-cmd">
200-
head peptide_ensemble.pdb
202+
head peptide_ens.pdb
201203
</a>
202204

203205
<hr>
@@ -309,7 +311,7 @@ Typically, active residues are complemented by nearby passive residues on the sa
309311
After generating `protein-peptide_ambig.tbl`, one can validate the syntax of this file using:
310312

311313
<a class="prompt prompt-cmd">
312-
haddock3-restraints validate_tbl protein-peptide_ambig.tbl --silent
314+
haddock3-restraints validate_tbl protein-peptide_ambig.tbl \-\-silent
313315
</a>
314316
If the file is valid, there will be no output.
315317

@@ -426,7 +428,8 @@ haddock3-cfg -m clustrmsd
426428
</a>
427429

428430

429-
This workflow is ready-to-run, and can be executed as-is, using pre-made PDB and restraint files. To use your own files, make sure you provide correct relative or absolute path for each file used during the run (`molecules`, `ambig_fname` and `reference_fname`).
431+
This workflow is ready-to-run, and can be executed as-is, using pre-made PDB and restraint files. To use your own files, make sure you provide correct relative or absolute path for each file used during the run (`molecules`, `ambig_fname` and `reference_fname`).
432+
If you are using your own reference, make sure the PDB file is adequately preprocessed.
430433

431434
### Running HADDOCK3
432435

@@ -502,9 +505,9 @@ Once your run has completed (or oncw you open precomputed `runs/run1/`), inspect
502505

503506
In addition to the various modules defined in the workflow file, you will also find a `log` file (text file) and three additional directories:
504507

505-
* the `data` directory containing the input data (PDB and restraint files) for the various modules, as well as original workflow configuration file.
506-
* the `analysis`directory containing various plots to visualise the results for each caprieval step.
507-
* the `traceback` directory containing the names of the generated models for each step, allowing to trace back a model and it's rank throughout the various stages.
508+
* `data` directory containing the input data (PDB and restraint files) for the various modules, as well as original workflow configuration file;
509+
* `analysis`directory containing various plots to visualise the results for each caprieval step;
510+
* `traceback` directory containing the names of the generated models for each step, allowing to trace back a model and it's rank throughout the various stages.
508511

509512
You can find information about the duration of the run at the bottom of the log file. Each sampling/refinement/selection module will contain PDB files - models produced by this module.
510513

@@ -625,7 +628,7 @@ This script extracts CAPRI statistics per model and reports the number of models
625628
To use it, run the script with the path to the run directory you want to analyse as its argument:
626629

627630
<a class="prompt prompt-cmd">
628-
./scripts/extract-capri-stats.sh ./runs/run1
631+
bash ./scripts/extract-capri-stats.sh ./runs/run1
629632
</a>
630633
<details style="background-color:#DAE4E7">
631634
<summary>
@@ -712,7 +715,7 @@ _**Note:**_ To extract similar statistics per cluster, use `scripts/extract-capr
712715
It’s time to visualise some of the docking models! This part is not only nice and colorful, but also quite important.
713716
Model visualisation allows you to check whether the models look as expected, whether the clusters are well defined, zoom in on the interface, etc.
714717

715-
To visualize the models from top cluster of your favorite run, start PyMOL and load the cluster representatives you want to view, e.g. this could be the top models from cluster1. These can be found in the `runs/run1/07_seletopclusts/` directory. Each run has a similar directory. Alternatively, in `analysis/XX_caprieval_analysis` you can find `summary.tgz` with either top-models of best clusters (decompress with `tar -xf summary.tgz`), or top-10 models among all unclustered ones.
718+
To visualize the models from top cluster of your favorite run, start PyMOL and load the cluster representatives you want to view, e.g. this could be the top models from cluster1. These can be found in the `runs/run1/12_seletopclusts/` directory. Each run has a similar directory. Alternatively, in `analysis/XX_caprieval_analysis` you can find `summary.tgz` with either top-models of best clusters (decompress with `tar -xf summary.tgz`), or top-10 models among all unclustered ones.
716719

717720

718721
<a class="prompt prompt-info">
@@ -754,7 +757,7 @@ To maximize the differences you can superimpose all models using a single chain.
754757
alignto 1YCR and chain A
755758
</a>
756759

757-
_**Note:**_You can hide or display a model by clicking on its name in the right panel of the PyMOL window.
760+
_**Note:**_ You can hide or display a model by clicking on its name in the right panel of the PyMOL window.
758761

759762
<details style="background-color:#DAE4E7">
760763
<summary style="bold">
@@ -845,20 +848,54 @@ We used probability threshold of 0.5 to select candidates for active residues, w
845848
Since our docking input is a mouse MDM2 model, not the human reference structure, we should align both structures in PyMOL and map residues from the ARCTIC-3D structure to the mouse MDM2 model (`AF_MDM2_26_109.pdb`).
846849

847850
As you may remember from the definition of active residues, they should be solvent accessible.
848-
Relative solvent accessibility (RSA) measures which percentage of the surface of a residue that is accessible to a solvent (usually water), which is directly related to how exposed a residue is.
849-
Buried residues are unlikely to contribute directly to binding, as they are often simply unreachabe for the docking partner.
850-
851-
851+
Relative solvent accessibility (RSA) measures the percentage of a residue’s surface that is exposed to solvent, typically water.
852+
It reflects how accessible a residue is to potential binding partners.
853+
Buried residues are unlikely to contribute directly to binding, as they are often simply unreachabe for the docking partner.
852854
The default RSA threshold for active residues is 40%; for passive residues, 15%. These values are suggestions, not hard rules.
855+
853856
In our case, we chose a cutoff of 25% for the active residues.
854-
We used [FreeSASA](http://freesasa.github.io/){:target="_blank"}, an open-source tool that computes RSA and relates solvent accessibility values directly from PDB structure:
857+
Many tools are available for calculating RSA, e.g. PyMOL’s built-in function `get_sasa_relative`, the Biopython module `Bio.PDB.SASA` etc.
858+
We used [FreeSASA](http://freesasa.github.io/){:target="_blank"}, an open-source tool that computes RSA and related solvent accessibility values directly from PDB structures.
859+
860+
After installing FreeSASA, you can run it with the following command:
855861
<a class="prompt prompt-cmd">
856-
freesasa --format=rsa AF_MDM2_26_109.pdb
862+
freesasa --format=rsa AF_MDM2_26_109.pdb
857863
</a>
864+
<details style="background-color:#DAE4E7">
865+
<summary>
866+
<i>View freesasa output</i> <i class="material-icons">expand_more</i>
867+
</summary>
868+
The column of interest is `All-atoms`, sub-column `REL`
869+
<pre>
870+
REM FreeSASA 2.1.2
871+
REM Absolute and relative SASAs for AF_MDM2_26_109.pdb
872+
REM Atomic radii and reference values for relative SASA: ProtOr
873+
REM Chains: A
874+
REM Algorithm: Lee & Richards
875+
REM Probe-radius: 1.40
876+
REM Slices: 20
877+
REM RES _ NUM All-atoms Total-Side Main-Chain Non-polar All polar
878+
REM ABS REL ABS REL ABS REL ABS REL ABS REL
879+
RES PRO A 30 29.26 21.3 5.87 5.4 23.39 85.0 5.87 4.8 23.39 145.4
880+
RES LYS A 31 87.65 42.8 87.13 53.5 0.52 1.2 45.91 41.3 41.74 44.5
881+
RES PRO A 32 109.77 80.0 102.81 93.7 6.96 25.3 104.24 86.1 5.53 34.4
882+
RES LEU A 33 95.62 53.3 95.57 68.4 0.05 0.1 95.57 67.1 0.05 0.1
883+
RES LEU A 34 0.00 0.0 0.00 0.0 0.00 0.0 0.00 0.0 0.00 0.0
884+
RES LEU A 35 32.31 18.0 32.31 23.1 0.00 0.0 32.31 22.7 0.00 0.0
885+
RES LYS A 36 131.57 64.2 122.34 75.1 9.23 22.0 79.53 71.6 52.04 55.4
886+
RES LEU A 37 0.00 0.0 0.00 0.0 0.00 0.0 0.00 0.0 0.00 0.0
887+
RES LEU A 38 0.00 0.0 0.00 0.0 0.00 0.0 0.00 0.0 0.00 0.0
888+
RES LYS A 39 87.79 42.8 69.98 42.9 17.81 42.4 45.42 40.9 42.37 45.1
889+
...
890+
</pre>
891+
</details>
892+
<br>
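
To turn this output into a list of candidate residues, the `REL` value of the `All-atoms` column can be compared against the chosen cutoff. The Python sketch below is one possible way to do this, assuming whitespace-separated `RES` lines laid out as in the excerpt above (residue name, chain, residue number, absolute SASA, relative SASA). The function name `exposed_residues` is hypothetical, and lines where the relative value is not numeric are simply skipped.

```python
def exposed_residues(rsa_text, cutoff=25.0):
    """Return (chain, resnum, resname, rel) for residues with All-atoms REL >= cutoff.

    Assumes each residue line looks like:
    RES <resname> <chain> <resnum> <All-atoms ABS> <All-atoms REL> ...
    """
    selected = []
    for line in rsa_text.splitlines():
        fields = line.split()
        # Skip REM header lines and anything too short to hold the REL value
        if len(fields) < 6 or fields[0] != "RES":
            continue
        try:
            rel = float(fields[5])
        except ValueError:
            continue  # non-numeric relative value, ignore this residue
        if rel >= cutoff:
            selected.append((fields[2], int(fields[3]), fields[1], rel))
    return selected


# First residue lines of the excerpt above (All-atoms columns only)
sample = """\
REM RES _ NUM All-atoms
RES PRO A 30 29.26 21.3
RES LYS A 31 87.65 42.8
RES PRO A 32 109.77 80.0
RES LEU A 33 95.62 53.3
RES LEU A 34 0.00 0.0
RES LEU A 35 32.31 18.0
RES LYS A 36 131.57 64.2
RES LEU A 37 0.00 0.0
RES LEU A 38 0.00 0.0
RES LYS A 39 87.79 42.8
"""

hits = exposed_residues(sample, cutoff=25.0)
print([resnum for _, resnum, _, _ in hits])
```

With the 25% cutoff used in this tutorial, residues 31, 32, 33, 36 and 39 from the excerpt pass the filter, while the buried leucines (34, 37, 38) do not.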
893+
894+
858895

859896
<hr>
860897
<hr>
861898

862899
## Congratulations!
863900
You’ve reached the end of this basic protein-peptide docking tutorial! We hope it has been informative and helps you get started with your own docking projects.
864-
What more protein-pepdide docking workflow examples, this time with explisit flexibility? Check [this page](https://www.bonvinlab.org/haddock3-user-manual/docking_scenarios/prot-peptide.html){:target="_blank"}.
901+
What more protein-peptide docking workflow examples, this time with explicit flexibility? Check [this page](https://www.bonvinlab.org/haddock3-user-manual/docking_scenarios/prot-peptide.html){:target="_blank"}.
