Pilot1/ST1/README.md
9 additions & 10 deletions
@@ -32,7 +32,7 @@ smiles_class_transformer.py
32
32
33
33
The example data sets are the same as for the CANDLE versions, and allow one to predict whether a small molecule is "drug-like" based on Lipinski criteria (classification problem), or to predict the molecular weight (regression), from a SMILES string as input.
34
34
The example data sets are downloadable using the information in the `regress_default_model.txt` or `class_default_model.txt` files.
35
-
These data files must be downloaded manually and specified on the command line for execution.
35
+
These data files must be downloaded manually and specified on the command line for execution of the original versions.
36
36
37
37
```
38
38
# for regression
@@ -160,17 +160,16 @@ Epoch 00025: val_loss did not improve from 800.85254
160
160
```
161
161
162
162
163
-
## Background on the example classification problem
164
-
165
-
CHEMBL -- 1.5M training examples.. for Lipinski (1/0) (Lipinski criteria for drug likeness) validation 100K samples non-overlapping
166
-
163
+
## Example classification problem metrics
164
+
CHEMBL -- 1.5M training examples
165
+
Predicting Lipinski criteria for drug likeness (1/0)
166
+
Validation 100K samples non-overlapping
167
167
Classification validation accuracy is about 91% after 10-20 epochs
168
168
169
-
## Background on the example regression problem
170
-
171
-
CHEMBL -- 1.5M training examples (shuffled and resampled so not same 1.5M as classification) .. predicting molecular Weight validation
172
-
is also 100K samples non-overlapping.
173
-
169
+
## Example regression problem metrics
170
+
CHEMBL -- 1.5M training examples (shuffled and resampled so not same 1.5M as classification)
171
+
Predicting molecular weight
172
+
Validation is also 100K samples non-overlapping.
174
173
Regression problem achieves R^2 about .95 after ~20 epochs.
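
For readers unfamiliar with the two validation metrics quoted above (classification accuracy and R^2), here is a minimal sketch of how they are conventionally computed. The function names `accuracy` and `r_squared` are hypothetical and the toy values are illustrative only, not taken from the CHEMBL data. The diff does not show how the example scripts compute their reported numbers, so this should not be read as their implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted 1/0 labels that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the targets around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy drug-likeness labels (1/0), for illustration only.
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    # Toy molecular weights, for illustration only.
    print(r_squared([180.2, 310.4, 450.5], [182.0, 305.0, 455.1]))
```

An R^2 of 1 means the predictions match the targets exactly, while 0 means the model does no better than always predicting the mean target value.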