Pilot1/ST1/README.md
+11 -3 (11 additions & 3 deletions)
@@ -1,9 +1,9 @@
# Simple transformers for classification and regression using SMILE string input

## Introduction
-The ST1 benchmark represent two versions of a simple transformer, one that can perform regression and the other classification. We chose the transformer architecture to see if we could train directly on SMILE strings. This benchmark brings novel capability to the suite of Pilot1 benchmarks in two ways. First, the featureization of a small molecule is simple its SMILE string. The secone novel aspect to the set of Pilot1 benchmarks is that the model is based on the Transformer architecture, albeit this benchmark is a simpler version of the large Transformer models that train on billions and greater parameters.
+The ST1 benchmark represent two versions of a simple transformer, one that can perform regression and the other classification. We chose the transformer architecture to see if we could train directly on SMILE strings. This benchmark brings novel capability to the suite of Pilot1 benchmarks in two ways. First, the featureization of a small molecule is simply its SMILE string. The second novel aspect to the set of Pilot1 benchmarks is that the model is based on the Transformer architecture, albeit this benchmark is a simpler version of the large Transformer models that train on billions and greater parameters.

-Both the original code and the CANDLE versions are available. The original examples are retained and can be run as noted below. The CANDLE versions make use of the common network design in smiles_transformer.py, and implement the models in `sct_baseline2_keras.py` and `srt_baseline_keras2.py`, for classification and regression, respectively.
+Both the original code and the CANDLE versions are available. The original examples are retained and can be run as noted below. The CANDLE versions make use of the common network design in `smiles_transformer.py`, and implement the models in `sct_baseline_keras2.py` and `srt_baseline_keras2.py`, for classification and regression, respectively.

The example classification problem takes as input SMILE strings and trains a model to predict whether or not a compound is 'drug-like' based on Lipinski criteria. The example regression problem takes as input SMILE strings and trains a model to predict the molecular weight. Data are freely downloadable and automatically downloaded by the CANDLE versions.
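
For readers unfamiliar with the Lipinski criteria mentioned above, the following is a minimal sketch of a rule-of-five check on precomputed descriptors. It is not taken from the benchmark: how the benchmark's labels were actually generated is not shown in this diff, and the `Descriptors` container, the function names, and the `max_violations=1` tolerance are assumptions made for illustration. The benchmark models themselves take SMILE strings as input, not descriptors.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int      # number of hydrogen-bond donors
    h_bond_acceptors: int   # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed labelling rule: tolerate at most `max_violations` violations."""
    return lipinski_violations(d) <= max_violations


# Illustrative descriptor values for a small molecule (roughly those of aspirin).
small = Descriptors(mol_weight=180.16, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
# Made-up values that exceed all four limits.
large = Descriptors(mol_weight=900.0, log_p=7.5, h_bond_donors=6, h_bond_acceptors=12)

print(is_drug_like(small))  # True
print(is_drug_like(large))  # False
```

In a classification setup like the one described, a boolean of this kind would serve as the target label, while the regression example would use the molecular weight itself as the target.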
@@ -12,8 +12,10 @@ For the CANDLE versions, all the relevant arguments are contained in the respect
class_default_model.txt
python sct_baseline_keras2.py

+```
and

+```
regress_default_model.txt
python srt_baseline_keras2.py
```
@@ -23,12 +25,16 @@ The original code demonstrating a simple transformer regressor and a simple tran
```
smiles_regress_transformer.py

+```
and

+```
smiles_class_transformer.py
```

-The example data sets are the same as for the CANDLE versions, and allow one to predict whether a small molecule is "drug-like" based on Lipinski criteria (classification problem), or predict the molecular weight (regression) from a SMILE string as input. The example data sets are downloadable using the information in the regress_default_model.txt or class_default_model.txt files. These data files must be downloaded manually and specified on the command line for execution.
+The example data sets are the same as for the CANDLE versions, and allow one to predict whether a small molecule is "drug-like" based on Lipinski criteria (classification problem), or predict the molecular weight (regression) from a SMILE string as input.
+The example data sets are downloadable using the information in the `regress_default_model.txt` or `class_default_model.txt` files.
+These data files must be downloaded manually and specified on the command line for execution.
examples/IGTD/Scripts/Examples_Of_Table_To_Image_Conversion.py
+1 -1 (1 addition & 1 deletion)
@@ -7,7 +7,7 @@
num_col = 30  # Number of pixel columns in image representation
num = num_row * num_col  # Number of features to be included for analysis, which is also the total number of pixels in image representation
save_image_size = 3  # Size of pictures (in inches) saved during the execution of IGTD algorithm.
-max_step = 10000  # The maximum number of iterations to run the IGTD algorithm, if it does not converge.
+max_step = 1000  # The maximum number of iterations to run the IGTD algorithm, if it does not converge.
val_step = 300  # The number of iterations for determining algorithm convergence. If the error reduction rate is smaller than a pre-set threshold for val_step itertions, the algorithm converges.

# Import the example data and linearly scale each feature so that its minimum and maximum values are 0 and 1, respectively.