There are two main predictions in the affinity output: `affinity_pred_value` and `affinity_probability_binary`. They are trained on largely different datasets, with different supervisions, and should be used in different contexts. The `affinity_probability_binary` field should be used to detect binders from decoys, for example in a hit-discovery stage. It's value ranges from 0 to 1 and represents the predicted probability that the ligand is a binder. The `affinity_pred_value` aims to measure the specific affinity of different binders and how this changes with small modifications of the molecule. This should be used in ligand optimization stages such as hit-to-lead and lead-optimization. It reports a binding affinity value as `log(IC50)`, derived from an `IC50` measured in `μM`. More details on how to run affinity predictions and parse the output can be found in our [prediction instructions](docs/prediction.md).
52
+
There are two main predictions in the affinity output: `affinity_pred_value` and `affinity_probability_binary`. They are trained on largely different datasets, with different supervisions, and should be used in different contexts. The `affinity_probability_binary` field should be used to detect binders from decoys, for example in a hit-discovery stage. Its value ranges from 0 to 1 and represents the predicted probability that the ligand is a binder. The `affinity_pred_value` aims to measure the specific affinity of different binders and how this changes with small modifications of the molecule. This should be used in ligand optimization stages such as hit-to-lead and lead-optimization. It reports a binding affinity value as `log10(IC50)`, derived from an `IC50` measured in `μM`. More details on how to run affinity predictions and parse the output can be found in our [prediction instructions](docs/prediction.md).
docs/prediction.md
5 additions & 5 deletions
@@ -19,10 +19,10 @@ Below is the full schema (each section is described in detail afterward):
19
19
sequences:
20
20
- ENTITY_TYPE:
21
21
id: CHAIN_ID
22
-
sequence: SEQUENCE # only for protein, dna, rna
22
+
sequence: SEQUENCE # only for protein, dna, rna
23
23
smiles: 'SMILES' # only for ligand, exclusive with ccd
24
-
ccd: CCD # only for ligand, exclusive with smiles
25
-
msa: MSA_PATH # only for protein
24
+
ccd: CCD # only for ligand, exclusive with smiles
25
+
msa: MSA_PATH # only for protein
26
26
modifications:
27
27
- position: RES_IDX # index of residue, starting from 1
28
28
ccd: CCD # CCD code of the modified residue
@@ -73,7 +73,7 @@ The sequences section has one entry per unique chain or molecule.
73
73
For proteins:
74
74
* By default, an `msa` must be provided.
75
75
* If `--use_msa_server` is set, the MSA is auto-generated (so `msa` can be omitted).
76
-
* To use a precomputed custom MSA, set `msa: MSA_PATH` pointing to a `.a3m` file. To indicate pairing keys across chains, use a CSV format instead of a3m with two columns: `sequence` (protein sequence) and `key` (a unique identifier for matching rows across chains).
76
+
* To use a precomputed custom MSA, set `msa: MSA_PATH` pointing to a `.a3m` file. If you have more than one protein chain, use a CSV format instead of a3m with two columns: `sequence` (protein sequence) and `key` (a unique identifier for matching rows across chains). Sequences with the same key are mutually aligned.
77
77
* To force single-sequence mode (not recommended, as it reduces accuracy), set `msa: empty`.
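As an illustration of the CSV option described in the bullets above, here is a minimal sketch that builds a two-column MSA table for each of two protein chains. The helper name `write_paired_msa_csv`, the example sequences, the integer keys, and the presence of a header row are all assumptions made for illustration; only the column names `sequence` and `key` come from the text.

```python
import csv
import io


def write_paired_msa_csv(rows, handle):
    """Write (sequence, key) pairs as a two-column CSV.

    Hypothetical helper: the header row and key type are assumptions,
    only the column names `sequence` and `key` come from the docs.
    """
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["sequence", "key"])
    for sequence, key in rows:
        writer.writerow([sequence, key])


# Made-up aligned sequences for two chains. Rows that share a key
# (0 with 0, 1 with 1) are the ones meant to be matched across chains.
chain_a_rows = [("MKTAYIAK", 0), ("MKTAYLAK", 1)]
chain_b_rows = [("GSHMLEDP", 0), ("GSHMLEEP", 1)]

chain_a_csv = io.StringIO()
chain_b_csv = io.StringIO()
write_paired_msa_csv(chain_a_rows, chain_a_csv)
write_paired_msa_csv(chain_b_rows, chain_b_csv)

print(chain_a_csv.getvalue())
```

To produce a file for `msa: MSA_PATH`, the same helper can be passed a handle from `open(path, "w", newline="")` instead of an in-memory buffer.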
78
78
79
79
The `modifications` field is optional and allows specification of modified residues in polymers (`protein`, `dna`, or `rna`).
@@ -241,7 +241,7 @@ There are two main predictions in the affinity output: `affinity_pred_value` and
241
241
242
242
The `affinity_probability_binary` field should be used to detect binders from decoys, for example in a hit-discovery stage. It's value ranges from 0 to 1 and represents the predicted probability that the ligand is a binder.
243
243
244
-
The `affinity_pred_value` aims to measure the specific affinity of different binders and how this changes with small modifications of the molecule (*note that this implies that it should only be used when comparing different active molecules, not inactives*). This should be used in ligand optimization stages such as hit-to-lead and lead-optimization. It reports a binding affinity value as `log(IC50)`, derived from an `IC50` measured in `μM`. Lower values indicate stronger predicted binding, for instance:
244
+
The `affinity_pred_value` aims to measure the specific affinity of different binders and how this changes with small modifications of the molecule (*note that this implies that it should only be used when comparing different active molecules, not inactives*). This should be used in ligand optimization stages such as hit-to-lead and lead-optimization. It reports a binding affinity value as `log10(IC50)`, derived from an `IC50` measured in `μM`. Lower values indicate stronger predicted binding, for instance:
245
245
- IC50 of $10^{-9}$ M $\longrightarrow$ our model outputs $-3$ (strong binder)
246
246
- IC50 of $10^{-6}$ M $\longrightarrow$ our model outputs $0$ (moderate binder)
247
247
- IC50 of $10^{-4}$ M $\longrightarrow$ our model outputs $2$ (weak binder / decoy)
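The unit conversion behind these three examples can be sketched in a few lines of Python. The function names below are hypothetical and not part of the project's API, and the pIC50 form is a common medicinal-chemistry convention added here for convenience rather than something the model reports.

```python
import math


def ic50_molar_from_pred(affinity_pred_value: float) -> float:
    """Convert a predicted log10(IC50 in μM) to an IC50 in molar (M)."""
    ic50_micromolar = 10.0 ** affinity_pred_value
    return ic50_micromolar * 1e-6  # 1 μM = 1e-6 M


def pic50_from_pred(affinity_pred_value: float) -> float:
    """pIC50 = -log10(IC50 in M) = 6 - log10(IC50 in μM)."""
    return 6.0 - affinity_pred_value


# Reproduce the three examples from the list above.
for pred in (-3.0, 0.0, 2.0):
    ic50_m = ic50_molar_from_pred(pred)
    print(f"pred={pred:+.1f}  IC50={ic50_m:.1e} M  pIC50={pic50_from_pred(pred):.1f}")

# Sanity check: the conversion round-trips back to the model's units.
assert math.isclose(math.log10(ic50_molar_from_pred(-3.0) / 1e-6), -3.0)
```

Because the value is a base-10 logarithm, a difference of 1 between two predictions corresponds to a tenfold difference in predicted IC50.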