Commit 832486d

Merge pull request #561 from timholy/teh/docs
Minor tweaks to the documentation
2 parents 0c228c1 + fd69c81 commit 832486d

2 files changed: 6 additions and 6 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ boltz predict input_path --use_msa_server
 
 
 ### Binding Affinity Prediction
-There are two main predictions in the affinity output: `affinity_pred_value` and `affinity_probability_binary`. They are trained on largely different datasets, with different supervisions, and should be used in different contexts. The `affinity_probability_binary` field should be used to detect binders from decoys, for example in a hit-discovery stage. It's value ranges from 0 to 1 and represents the predicted probability that the ligand is a binder. The `affinity_pred_value` aims to measure the specific affinity of different binders and how this changes with small modifications of the molecule. This should be used in ligand optimization stages such as hit-to-lead and lead-optimization. It reports a binding affinity value as `log(IC50)`, derived from an `IC50` measured in `μM`. More details on how to run affinity predictions and parse the output can be found in our [prediction instructions](docs/prediction.md).
+There are two main predictions in the affinity output: `affinity_pred_value` and `affinity_probability_binary`. They are trained on largely different datasets, with different supervisions, and should be used in different contexts. The `affinity_probability_binary` field should be used to detect binders from decoys, for example in a hit-discovery stage. Its value ranges from 0 to 1 and represents the predicted probability that the ligand is a binder. The `affinity_pred_value` aims to measure the specific affinity of different binders and how this changes with small modifications of the molecule. This should be used in ligand optimization stages such as hit-to-lead and lead-optimization. It reports a binding affinity value as `log10(IC50)`, derived from an `IC50` measured in `μM`. More details on how to run affinity predictions and parse the output can be found in our [prediction instructions](docs/prediction.md).
 
 ## Authentication to MSA Server
 
docs/prediction.md

Lines changed: 5 additions & 5 deletions
@@ -19,10 +19,10 @@ Below is the full schema (each section is described in detail afterward):
 sequences:
   - ENTITY_TYPE:
       id: CHAIN_ID
-      sequence: SEQUENCE # only for protein, dna, rna
+      sequence: SEQUENCE # only for protein, dna, rna
       smiles: 'SMILES' # only for ligand, exclusive with ccd
-      ccd: CCD # only for ligand, exclusive with smiles
-      msa: MSA_PATH # only for protein
+      ccd: CCD # only for ligand, exclusive with smiles
+      msa: MSA_PATH # only for protein
       modifications:
         - position: RES_IDX # index of residue, starting from 1
           ccd: CCD # CCD code of the modified residue
@@ -73,7 +73,7 @@ The sequences section has one entry per unique chain or molecule.
 For proteins:
 * By default, an `msa` must be provided.
 * If `--use_msa_server` is set, the MSA is auto-generated (so `msa` can be omitted).
-* To use a precomputed custom MSA, set `msa: MSA_PATH` pointing to a `.a3m` file. To indicate pairing keys across chains, use a CSV format instead of a3m with two columns: `sequence` (protein sequence) and `key` (a unique identifier for matching rows across chains).
+* To use a precomputed custom MSA, set `msa: MSA_PATH` pointing to a `.a3m` file. If you have more than one protein chain, use a CSV format instead of a3m with two columns: `sequence` (protein sequence) and `key` (a unique identifier for matching rows across chains). Sequences with the same key are mutually aligned.
 * To force single-sequence mode (not recommended, as it reduces accuracy), set `msa: empty`.
 
 The `modifications` field is optional and allows specification of modified residues in polymers (`protein`, `dna`, or `rna`).
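
The CSV layout described in the protein MSA bullet (two columns, `sequence` and `key`, with matching keys pairing rows across chains) can be sketched in Python as follows. This is only an illustration under stated assumptions: `write_paired_msa_csv` is a hypothetical helper and not part of Boltz, the toy sequences are invented, and the column order and the use of integer keys are guesses that the documentation does not confirm.

```python
import csv
import io


def write_paired_msa_csv(handle, rows):
    """Write (sequence, key) pairs as a two-column CSV with a header row.

    Hypothetical helper for illustration; column order is an assumption.
    """
    writer = csv.writer(handle)
    writer.writerow(["sequence", "key"])
    for sequence, key in rows:
        writer.writerow([sequence, key])


# Invented toy alignments for two protein chains. Rows that share a key
# (0 with 0, 1 with 1) are the ones meant to be matched across chains.
chain_a_rows = [("MKTAYIAK", 0), ("MKSAYIAR", 1)]
chain_b_rows = [("GSHMLEDP", 0), ("GSHMIEDP", 1)]

# In-memory buffers keep the sketch self-contained; a real run would
# open one .csv file per chain and point each chain's `msa:` field at it.
chain_a_csv = io.StringIO()
chain_b_csv = io.StringIO()
write_paired_msa_csv(chain_a_csv, chain_a_rows)
write_paired_msa_csv(chain_b_csv, chain_b_rows)

print(chain_a_csv.getvalue())
print(chain_b_csv.getvalue())
```

Writing one file per chain with a shared key space keeps the pairing explicit, which is the point of using CSV instead of a3m here.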
@@ -241,7 +241,7 @@ There are two main predictions in the affinity output: `affinity_pred_value` and
 
 The `affinity_probability_binary` field should be used to detect binders from decoys, for example in a hit-discovery stage. It's value ranges from 0 to 1 and represents the predicted probability that the ligand is a binder.
 
-The `affinity_pred_value` aims to measure the specific affinity of different binders and how this changes with small modifications of the molecule (*note that this implies that it should only be used when comparing different active molecules, not inactives*). This should be used in ligand optimization stages such as hit-to-lead and lead-optimization. It reports a binding affinity value as `log(IC50)`, derived from an `IC50` measured in `μM`. Lower values indicate stronger predicted binding, for instance:
+The `affinity_pred_value` aims to measure the specific affinity of different binders and how this changes with small modifications of the molecule (*note that this implies that it should only be used when comparing different active molecules, not inactives*). This should be used in ligand optimization stages such as hit-to-lead and lead-optimization. It reports a binding affinity value as `log10(IC50)`, derived from an `IC50` measured in `μM`. Lower values indicate stronger predicted binding, for instance:
 - IC50 of $10^{-9}$ M $\longrightarrow$ our model outputs $-3$ (strong binder)
 - IC50 of $10^{-6}$ M $\longrightarrow$ our model outputs $0$ (moderate binder)
 - IC50 of $10^{-4}$ M $\longrightarrow$ our model outputs $2$ (weak binder / decoy)
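
The `log10(IC50)` convention that this commit makes explicit can be checked with a few lines of Python. The sketch below inverts the logarithm to recover an IC50 in μM and in M from an `affinity_pred_value`. `ic50_from_affinity_pred` is a hypothetical helper written for this illustration, not a Boltz function, and it assumes only what the text states: the value is a base-10 logarithm of an IC50 expressed in μM.

```python
def ic50_from_affinity_pred(affinity_pred_value):
    """Convert a log10(IC50 in uM) value back to IC50 in uM and in M.

    Hypothetical helper for illustration; not part of the Boltz API.
    """
    ic50_um = 10.0 ** affinity_pred_value  # undo the base-10 logarithm
    ic50_molar = ic50_um * 1e-6            # 1 uM = 1e-6 M
    return ic50_um, ic50_molar


# The three example outputs listed in the documentation.
for value in (-3.0, 0.0, 2.0):
    ic50_um, ic50_molar = ic50_from_affinity_pred(value)
    print(f"affinity_pred_value={value:+.1f}: {ic50_um:g} uM, {ic50_molar:g} M")
```

With a natural logarithm the same outputs would map to different concentrations, which is why stating the base matters.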
