scripts/gold_multi/README.md (26 additions, 7 deletions)
@@ -1,12 +1,10 @@
-# GOLD and multiprocessing
+# GOLD and Multiprocessing

 ## Introduction

 This repo contains a script, `gold_multi.py`, which is designed to illustrate how to use the [CSD Docking API](https://downloads.ccdc.cam.ac.uk/documentation/API/descriptive_docs/docking.html) and the standard Python [multiprocessing](https://docs.python.org/3.7/library/multiprocessing.html) module to parallelize GOLD docking. Also included is a simple example system to demonstrate the operation of the script.

-In order to run the script, you will need to have both [GOLD](https://www.ccdc.cam.ac.uk/solutions/csd-discovery/components/gold/) and the [CSD Python API](https://downloads.ccdc.cam.ac.uk/documentation/API/) installed. A suitable licence will also be required: the CSD-Discovery, CSD-Enterprise and Research Partner suites would all be sufficient. Please cpontact [[email protected]](mailto:[email protected]) for further details on licencing.
-
-On a multi-core workstation, this approach should be suitable for docking some hundreds or thousands of ligands depending on the rigour of the docking protocol used; please consult the GOLD USer Guide for information about speed/accuracy tradeoffs in GOLD. Noet that the script is not useful for running GOLD on an HPC compute cluster or on the Cloud: the CCDC provides the GOLD Cluster and GOLD Cloud tools for those use-cases. For further details, please contact [[email protected]](mailto:[email protected]).
+On a multi-core workstation, this approach should be suitable for docking some hundreds or thousands of ligands depending on the rigour of the docking protocol used; please consult the GOLD User Guide for information about speed/accuracy tradeoffs in GOLD. Note that the script is not useful for running GOLD on an HPC compute cluster or on the Cloud: the CCDC provides the GOLD Cluster and GOLD Cloud tools for those use cases. For further details, please contact [[email protected]](mailto:[email protected]).

 As ever when using multiprocessing techniques, increasing the number of processes will at some point begin to degrade performance as the available cores become saturated. The point at which this happens depends on the machine and the workload, so it can only really be determined by experimentation. A default of six processes was selected because the script was developed on an eight-core workstation, where this gave decent performance while leaving cores free for other work.
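
One way to encode the trade-off described above is to cap the requested process count by the cores actually present, keeping a few in reserve. The following is a minimal sketch only: the function name `choose_process_count` and the `reserve` parameter are hypothetical and are not taken from `gold_multi.py`; only the default of six comes from the text.

```python
import os


def choose_process_count(requested=6, reserve=2, available=None):
    """Return a worker count that never exceeds the usable cores.

    requested -- desired number of processes (six is the default in the text)
    reserve   -- cores left free for other work (an assumption of this sketch)
    available -- core count; detected with os.cpu_count() when not given
    """
    if available is None:
        # os.cpu_count() may return None on unusual platforms.
        available = os.cpu_count() or 1
    usable = max(1, available - reserve)
    # Always run at least one process, never more than the usable cores.
    return max(1, min(requested, usable))
```

On an eight-core machine this keeps the default of six; on a four-core laptop it drops to two. The best value for a given docking protocol still has to be found by timing real runs.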
@@ -17,13 +15,21 @@ The script writes output to the directory specified in the GOLD configuration fi
 The script partitions the input ligand file into chunks and uses the Docking API and multiprocessing to dock these chunks in parallel, using named subdirectories for their output. The solution files for the chunks are then copied to the main output directory and the full `bestranking.lst` file is compiled from the partial chunk versions. The intermediate subdirectories are currently kept, but the script could easily be modified to delete them or to use anonymous temporary directories if disk usage were an issue.
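
The partition-and-dispatch pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in `gold_multi.py`: the function names, the `chunk_NN` directory naming, the `ligands.sdf` file name and the contiguous split are all hypothetical, and the worker only writes its chunk to disk where the real script would go on to run GOLD through the Docking API.

```python
from multiprocessing import Pool
from pathlib import Path


def split_sdf_records(sdf_text):
    """Split SDF text into records, each ending with its '$$$$' line."""
    records, current = [], []
    for line in sdf_text.splitlines():
        current.append(line)
        if line.strip() == "$$$$":
            records.append("\n".join(current) + "\n")
            current = []
    if any(line.strip() for line in current):
        # Tolerate a final record that lacks the terminating '$$$$' line.
        records.append("\n".join(current) + "\n$$$$\n")
    return records


def partition(records, n_chunks):
    """Split records into at most n_chunks contiguous, near-equal chunks."""
    if not records:
        return []
    n_chunks = max(1, min(n_chunks, len(records)))
    size, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first 'extra' chunks take one additional record each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def write_chunk(job):
    """Stand-in worker: write one chunk into its own named subdirectory.

    A real worker would then dock this file; that step is omitted here.
    """
    index, records, out_dir = job
    chunk_dir = Path(out_dir) / f"chunk_{index:02d}"
    chunk_dir.mkdir(parents=True, exist_ok=True)
    chunk_file = chunk_dir / "ligands.sdf"
    chunk_file.write_text("".join(records))
    return str(chunk_file)


def run_parallel(sdf_path, out_dir, n_processes=6):
    """Partition the ligand file and process the chunks in a worker pool."""
    records = split_sdf_records(Path(sdf_path).read_text())
    jobs = [(i, chunk, out_dir)
            for i, chunk in enumerate(partition(records, n_processes))]
    with Pool(processes=n_processes) as pool:
        return pool.map(write_chunk, jobs)
```

When calling `run_parallel` from a script, do so under an `if __name__ == "__main__":` guard; on Windows, `multiprocessing` starts workers by re-importing the main module, and the guard prevents each worker from launching its own pool.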

 ---
+## Requirements
+
+- [GOLD](https://www.ccdc.cam.ac.uk/solutions/csd-discovery/components/gold/) and the [CSD Python API](https://downloads.ccdc.cam.ac.uk/documentation/API/) installed.
+- Configuration File: `gold.conf`
+
+## Licensing Requirements
+
+CSD-Discovery, CSD-Enterprise and Research Partner suites would all be sufficient.

-## Running the script
+## Instructions on Running

 To run the script, an environment with the CCDC Python API installed must be active. Further information is available in
 the [API installation notes](https://downloads.ccdc.cam.ac.uk/documentation/API/installation_notes.html).

-The script is designed to be run from the command line only (and not, for example, from within Hermes). The path to a GOLD configuration file may be provided as a command argument; if no arugunebt is provided, it is assumed there will be a file `gold.conf` in the current working directory.
+The script is designed to be run from the command line only (and not, for example, from within Hermes). The path to a GOLD configuration file may be provided as a command argument; if no argument is provided, it is assumed there will be a file `gold.conf` in the current working directory.
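
The argument handling described above might look like the following `argparse` sketch. The argument name `conf_file` and the function `parse_args` are assumptions made for illustration, not names from `gold_multi.py`; the only behaviour taken from the text is the optional path that falls back to `gold.conf` in the current working directory. `argparse` supplies a `--help` option automatically.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Return the GOLD configuration file path given on the command line."""
    parser = argparse.ArgumentParser(
        description="Dock ligands in parallel (illustrative sketch)."
    )
    # nargs="?" makes the positional argument optional, so the default
    # applies when the script is run with no arguments at all.
    parser.add_argument(
        "conf_file",
        nargs="?",
        default="gold.conf",
        help="path to a GOLD configuration file (default: ./gold.conf)",
    )
    args = parser.parse_args(argv)
    return Path(args.conf_file)
```

Passing `argv=None` makes `argparse` read `sys.argv`, while passing an explicit list keeps the function easy to exercise without a command line.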

 On Windows, the command would be (in the folder where this archive was unzipped)...

@@ -41,10 +47,23 @@ $ ./gold_multi.py

 In either case, add the option `--help` to show more information.
 The example target provided (see the directory `target/`) is SYK tyrosine kinase ([5LMA](https://www.ebi.ac.uk/pdbe/entry/pdb/5lma)).

 The ligands in `input.sdf` were built from SMILES. If the name is a PDB code, it means the SMILES corresponded to the crystallographic ligand from that structure (with conventional ionization states assigned). If the name has a suffix, the SMILES is a manually-generated analogue. Note that not all these ligands can be correctly cross-docked into 5LMA, as there are induced-fit effects in SYK that GOLD cannot reproduce.