Commit 1daeb26

Updated ReadMe NO_JIRA
1 parent e6fb814 commit 1daeb26

1 file changed

+26
-7
lines changed

scripts/gold_multi/README.md

Lines changed: 26 additions & 7 deletions
@@ -1,12 +1,10 @@
1-
# GOLD and multiprocessing
1+
# GOLD and Multiprocessing
22

33
## Introduction
44

55
This repo contains a script, `gold_multi.py`, which is designed to illustrate how to use the [CSD Docking API](https://downloads.ccdc.cam.ac.uk/documentation/API/descriptive_docs/docking.html) and the standard Python [multiprocessing](https://docs.python.org/3.7/library/multiprocessing.html) module to parallelize GOLD docking. Also included is a simple example system to demonstrate the operation of the script.
66

7-
In order to run the script, you will need to have both [GOLD](https://www.ccdc.cam.ac.uk/solutions/csd-discovery/components/gold/) and the [CSD Python API](https://downloads.ccdc.cam.ac.uk/documentation/API/) installed. A suitable licence will also be required: the CSD-Discovery, CSD-Enterprise and Research Partner suites would all be sufficient. Please cpontact [[email protected]](mailto:[email protected]) for further details on licencing.
8-
9-
On a multi-core workstation, this approach should be suitable for docking some hundreds or thousands of ligands depending on the rigour of the docking protocol used; please consult the GOLD USer Guide for information about speed/accuracy tradeoffs in GOLD. Noet that the script is not useful for running GOLD on an HPC compute cluster or on the Cloud: the CCDC provides the GOLD Cluster and GOLD Cloud tools for those use-cases. For further details, please contact [[email protected]](mailto:[email protected]).
7+
On a multi-core workstation, this approach should be suitable for docking some hundreds or thousands of ligands depending on the rigour of the docking protocol used; please consult the GOLD User Guide for information about speed/accuracy tradeoffs in GOLD. Note that the script is not useful for running GOLD on an HPC compute cluster or on the Cloud: the CCDC provides the GOLD Cluster and GOLD Cloud tools for those use-cases. For further details, please contact [[email protected]](mailto:[email protected]).
108

119
As ever when using multiprocessing techniques, increasing the number of processes will at some point begin to degrade performance as available cores are saturated. At what point this happens will depend on the machine and the workload and thus can only really be determined by experimentation. A default of six was selected as the script was developed on an eight-core workstation and this seemed to give decent performance while leaving cores for other processes.
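
As a rough illustration of the trade-off described above, the sketch below caps a requested process count so that some cores stay free. The helper `pick_n_processes` and its `reserve` parameter are hypothetical and not part of `gold_multi.py`; the right values still need to be found by experimentation.

```python
import os


def pick_n_processes(requested=6, reserve=2):
    """Cap the requested process count so that `reserve` cores stay free.

    Hypothetical helper for illustration; gold_multi.py itself simply
    takes the count from --n_processes.
    """
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, available - reserve))
```

On a machine that reports eight cores, the defaults here give six, which matches the script's default.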
1210

@@ -17,13 +15,21 @@ The script writes output to the directory specified in the GOLD configuration fi
1715
The script partitions the input ligand file into chunks and uses the Docking API and multiprocessing to dock these chunks in parallel using named subdirectories for their output. The solution files for the chunks are then copied to the main output directory and the full `bestranking.lst` file compiled from the partial chunk versions. The intermediate subdirectories are currently kept, but the script could easily be modified to delete them or use anonymous temporary directories if disk usage were an issue.
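
The partition-and-dispatch pattern described above might look roughly like the following sketch. It is not the code from `gold_multi.py`: the function names are hypothetical, the round-robin partitioning is an assumption, and `dock_chunk` is a stand-in that only counts records where a real worker would run GOLD through the Docking API in its own subdirectory.

```python
from multiprocessing import Pool


def split_sdf(text):
    """Split SDF text into records; each record ends with a '$$$$' line."""
    records, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "$$$$":
            records.append("\n".join(current) + "\n")
            current = []
    return records


def partition(items, n_chunks):
    """Deal items round-robin into at most n_chunks non-empty chunks."""
    chunks = [items[i::n_chunks] for i in range(n_chunks)]
    return [chunk for chunk in chunks if chunk]


def dock_chunk(args):
    """Stand-in worker (assumption): returns a subdirectory name and ligand count."""
    index, records = args
    return f"chunk_{index:02d}", len(records)


def dock_in_parallel(records, n_processes=6):
    """Dock each chunk in its own worker process and collect the results."""
    chunks = partition(records, n_processes)
    with Pool(n_processes) as pool:
        return pool.map(dock_chunk, list(enumerate(chunks)))
```

Call `dock_in_parallel` from under an `if __name__ == "__main__":` guard; on Windows, `multiprocessing` starts workers by re-importing the main module, so unguarded top-level calls would be re-executed in every worker.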
1816

1917
---
18+
## Requirements
19+
20+
- [GOLD](https://www.ccdc.cam.ac.uk/solutions/csd-discovery/components/gold/) and the [CSD Python API](https://downloads.ccdc.cam.ac.uk/documentation/API/) installed.
21+
- Configuration File: `gold.conf`
22+
23+
## Licensing Requirements
24+
25+
A suitable licence is required: the CSD-Discovery, CSD-Enterprise and Research Partner suites would all be sufficient.
2026

21-
## Running the script
27+
## Instructions on Running
2228

2329
To run the script, an environment with the CCDC Python API installed must be active. Further information is available in
2430
the [API installation notes](https://downloads.ccdc.cam.ac.uk/documentation/API/installation_notes.html).
2531

26-
The script is designed to be run from the command line only (and not, for example, from within Hermes). The path to a GOLD configuration file may be provided as a command argument; if no arugunebt is provided, it is assumed there will be a file `gold.conf` in the current working directory.
32+
The script is designed to be run from the command line only (and not, for example, from within Hermes). The path to a GOLD configuration file may be provided as a command argument; if no argument is provided, it is assumed there will be a file `gold.conf` in the current working directory.
2733

2834
On Windows, the command would be (in the folder where this archive was unzipped)...
2935

@@ -41,10 +47,23 @@ $ ./gold_multi.py
4147

4248
In either case, add the option `--help` to show more information.
4349

44-
---
50+
```cmd
51+
usage: gold_multi.py [-h] [--n_processes N_PROCESSES] [conf_file]
52+
53+
positional arguments:
54+
conf_file GOLD configuration file (default='gold.conf')
4555
56+
optional arguments:
57+
-h, --help show this help message and exit
58+
--n_processes N_PROCESSES
59+
No. of processes (default=6)
60+
```
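
For readers who want to see how such a command line could be defined, here is a minimal `argparse` sketch reconstructed from the help text above. The script's actual parser is not shown in this diff, so the function name `build_parser` and the exact argument definitions are assumptions.

```python
import argparse


def build_parser():
    """Build a parser matching the usage text (a reconstruction, not the script's own code)."""
    parser = argparse.ArgumentParser(prog="gold_multi.py")
    # Optional positional argument: falls back to gold.conf in the working directory.
    parser.add_argument(
        "conf_file",
        nargs="?",
        default="gold.conf",
        help="GOLD configuration file (default='gold.conf')",
    )
    parser.add_argument(
        "--n_processes",
        type=int,
        default=6,
        help="No. of processes (default=6)",
    )
    return parser
```

Calling `build_parser().parse_args(["my.conf", "--n_processes", "4"])` yields a namespace with `conf_file="my.conf"` and `n_processes=4`. Python 3.10 and later label the second help section `options:` rather than `optional arguments:`.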
61+
62+
---
4663
## Note on the input files provided
4764

4865
The example target provided (see the directory `target/`) is SYK tyrosine kinase ([5LMA](https://www.ebi.ac.uk/pdbe/entry/pdb/5lma)).
4966

5067
The ligands in `input.sdf` were built from SMILES. If the name is a PDB code, it means the SMILES corresponded to the crystallographic ligand from that structure (with conventional ionization states assigned). If the name has a suffix, the SMILES is a manually-generated analogue. Note that not all these ligands can be correctly cross-docked into 5LMA, as there are induced-fit effects in SYK that GOLD cannot reproduce.
68+
69+
> For feedback or to report any issues please contact [[email protected]](mailto:[email protected])
