**README.md** (+10 −2)

@@ -15,10 +15,11 @@ An open-source, end-to-end software pipeline for data curation, model building,
The ATOM Modeling PipeLine (AMPL) extends the functionality of DeepChem and supports an array of machine learning and molecular featurization tools to predict key potency, safety and pharmacokinetic-relevant parameters. AMPL has been benchmarked on a large collection of pharmaceutical datasets covering a wide range of parameters. This is a living software project with active development. Check back for continued updates. Feedback is welcomed and appreciated, and the project is open to contributions! An [article describing the AMPL project](https://pubs.acs.org/doi/abs/10.1021/acs.jcim.9b01053) was published in JCIM. The AMPL pipeline documentation is available [here](https://ampl.readthedocs.io/en/latest/pipeline.html).
Check out our new tutorial series that walks through AMPL's end-to-end modeling pipeline to build a machine learning model! View them in our [docs](https://ampl.readthedocs.io/en/latest/) or as Jupyter notebooks in our [repo](https://github.com/ATOMScience-org/AMPL/tree/master/atomsci/ddm/examples/tutorials).

In addition to our written tutorials, we now provide a series of video tutorials on our YouTube channel, [ATOMScience-org](https://www.youtube.com/channel/UCOF6zZ7ltGwopYCoOGIFM-w). These videos are created to assist users in exploring and leveraging AMPL's robust capabilities.
- Install pytest and plotting packages for development and test use.

```bash
cd AMPL/pip
pip install -r dev_requirements.txt
```
#### 6. *(Optional) LLNL LC only*: if you use [model_tracker](https://ampl.readthedocs.io/en/latest/pipeline.html#module-pipeline.model_tracker), install atomsci.clients
```bash
# LLNL only: required for ATOM model_tracker
```

@@ -145,6 +152,7 @@ cd AMPL/pip

```bash
# If using CUDA:
# module load cuda/11.8
pip install -r cpu_requirements.txt # install cpu_requirements.txt OR cuda_requirements.txt
```
**atomsci/ddm/docs/PARAMETERS.md** (+90 −1)

@@ -276,6 +276,14 @@ The AMPL pipeline contains many parameters and options to fit models and make pr

|*Description:*|True/False flag for setting verbosity|
|*Default:*|FALSE|
|*Type:*|Bool|

- **seed**

|||
|-|-|
|*Description:*|Seed used for initializing a random number generator to ensure results are reproducible. Default is None, in which case a random seed is generated.|
|*Default:*|None|
|*Type:*|int|
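The effect of a fixed seed can be illustrated with Python's standard `random` module. This is a sketch of the reproducibility idea only, not AMPL's internal seeding code; the `sample_rows` helper is hypothetical:

```python
import random

def sample_rows(n_rows, n_picks, seed=None):
    """Pick n_picks row indices out of n_rows.
    A fixed seed makes the selection repeatable across runs;
    seed=None draws a fresh seed from system entropy instead."""
    rng = random.Random(seed)
    return rng.sample(range(n_rows), n_picks)

# Same seed -> identical split every run; no seed -> a new split each time.
split_a = sample_rows(1000, 5, seed=42)
split_b = sample_rows(1000, 5, seed=42)
assert split_a == split_b
```

The same principle applies to AMPL's dataset splits and model initialization: record the seed alongside the model if you need to reproduce a training run exactly.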
- **production**

@@ -529,6 +537,30 @@ the model will train for max_epochs regardless of validation error.|

|*Default:*|scaffold|
|*Type:*|str|

- **sampling_method**

|||
|-|-|
|*Description:*|The sampling method for addressing class imbalance in classification datasets. Options include 'undersampling' and 'SMOTE'.|
|*Default:*|None|
|*Type:*|str|

- **sampling_ratio**

|||
|-|-|
|*Description:*|The desired ratio of the minority class to the majority class after sampling. May be a string (e.g. 'minority', 'not minority') or a float (e.g. 0.2, 1.0).|
|*Default:*|auto|
|*Type:*|str|

- **sampling_k_neighbors**

|||
|-|-|
|*Description:*|The number of nearest neighbors to consider when generating synthetic samples (e.g. 5, 7, 9). Used only with the SMOTE sampling method.|
|*Default:*|5|
|*Type:*|int|
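SMOTE builds synthetic minority-class samples by interpolating between a minority point and one of its `k` nearest minority-class neighbors. AMPL delegates the actual sampling to a library; the sketch below (all names ours, stdlib only) just shows the interpolation step that `sampling_k_neighbors` controls:

```python
import random

def smote_point(x, neighbors, k=5, rng=None):
    """Return one synthetic sample on the line segment between x and a
    randomly chosen one of its k nearest minority-class neighbors."""
    rng = rng or random.Random()
    # k nearest neighbors by squared Euclidean distance
    nearest = sorted(neighbors,
                     key=lambda n: sum((a - b) ** 2 for a, b in zip(x, n)))[:k]
    nb = rng.choice(nearest)
    u = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + u * (b - a) for a, b in zip(x, nb))

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (4.0, 4.0)]
synth = smote_point(minority[0], minority[1:], k=2, rng=random.Random(0))
# The synthetic point lies between (0,0) and one of its 2 nearest
# neighbors, so it never lands near the outlier (4,4).
assert all(0.0 <= c <= 1.0 for c in synth)
```

A larger `k` draws neighbors from a wider region of the minority class, producing more varied (but potentially noisier) synthetic samples.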
- **mtss\_num\_super\_scaffolds**

|||

@@ -605,6 +637,14 @@ the model will train for max_epochs regardless of validation error.|

|*Description:*|type of transformation for the response column (defaults to "normalization") TODO: Not currently implemented|
|*Default:*|normalization|
- **weight\_transform\_type**

|||
|-|-|
|*Description:*|Type of transformation applied to class weights in a classification model's loss function. Use the "balancing" option to offset the effect of imbalanced datasets. Works with NN, random forest and XGBoost models.|
|*Default:*|None|
|*Type:*|Choice|
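A common "balancing" heuristic (used, for example, by scikit-learn's `class_weight='balanced'`; whether AMPL uses exactly this formula is our assumption) weights each class inversely to its frequency, `w_c = n_samples / (n_classes * count_c)`:

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency:
    w_c = n_samples / (n_classes * count_c).
    Rare classes get weights > 1, common classes < 1."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

# 80/20 imbalance: minority examples count 4x more per sample in the loss
w = balanced_class_weights([0] * 80 + [1] * 20)
assert w[0] == 0.625 and w[1] == 2.5
```

Multiplying each example's loss term by its class weight makes the total contribution of every class equal, which is why this option counteracts imbalanced training sets.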
- **transformer\_bucket**

|||
@@ -692,6 +732,20 @@ the model will train for max_epochs regardless of validation error.|
|*Description:*|Minimum loss reduction required to make a further partition on a leaf node of the tree. Can be input as a comma-separated list for hyperparameter search (e.g. '0.0,0.1,0.2')|
|*Default:*|0.0|
- **xgb\_alpha**

|||
|-|-|
|*Description:*|L1 regularization term on weights. Increasing this value makes the model more conservative. Can be input as a comma-separated list for hyperparameter search (e.g. '0.0,0.1,0.2')|
|*Default:*|0.0|

- **xgb\_lambda**

|||
|-|-|
|*Description:*|L2 regularization term on weights. Increasing this value makes the model more conservative. Can be input as a comma-separated list for hyperparameter search (e.g. '0.0,0.1,0.2')|
|*Default:*|1.0|
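In XGBoost's regularized objective, the leaf weights w enter through a penalty term Ω(w) = α·Σ|wⱼ| + ½λ·Σwⱼ², so raising `xgb_alpha` (L1) or `xgb_lambda` (L2) pushes leaf weights toward zero and makes trees more conservative. The arithmetic can be sketched directly (illustrative only, not XGBoost source code):

```python
def xgb_penalty(weights, alpha=0.0, lam=1.0):
    """Regularization term of XGBoost's objective for one tree's
    leaf weights: alpha * sum(|w|)  (L1)  +  0.5 * lam * sum(w^2)  (L2)."""
    l1 = alpha * sum(abs(w) for w in weights)
    l2 = 0.5 * lam * sum(w * w for w in weights)
    return l1 + l2

w = [0.5, -1.0, 2.0]
assert xgb_penalty(w, alpha=0.0, lam=1.0) == 0.5 * (0.25 + 1.0 + 4.0)  # pure L2
assert xgb_penalty(w, alpha=0.1, lam=0.0) == 0.1 * 3.5                 # pure L1
```

Because a split is only kept when its loss reduction beats the added penalty, larger `xgb_alpha`/`xgb_lambda` values effectively prune aggressive splits.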
- **xgb\_learning\_rate**

|||

@@ -710,7 +764,7 @@ the model will train for max_epochs regardless of validation error.|

|||
|-|-|
-|*Description:*|Minimum sum of instance weight(hessian) needed in a child. Can be input as a comma separated list for hyperparameter search (e.g. '1.0,1.1,1.2')|
+|*Description:*|Minimum sum of instance weights (hessian) needed in a child. Can be input as a comma separated list for hyperparameter search (e.g. '1.0,1.1,1.2')|
|*Default:*|1.0|

- **xgb\_n\_estimators**
@@ -1057,6 +1111,27 @@ tied to a specific model parameter. Only a subset of model parameters may be opt

|*Description:*|Search domain for NN model `layer_sizes` parameter in Bayesian Optimization. The format is `scheme\|num_layers\|parameters`, e.g. `uniformint\|3\|8,512`. Note that the number of layers (the number between the two \|) cannot be changed during optimization; to try a different number of layers, run several separate optimizations.|
|*Default:*|None|
- **ls_ratio**

|||
|-|-|
|*Description:*|Alternative way to set the search domain for the NN model `layer_sizes` parameter in Bayesian Optimization, by specifying layer_size/previous_layer_size ratios. The format is `scheme\|ratios`, e.g. `uniform\|0.1,0.9`; the number of layers and starting layer sizes are taken from the `ls` parameter.|
|*Default:*|None|

- **wdp**

|||
|-|-|
|*Description:*|Search domain for NN model `weight_decay_penalty` parameter in Bayesian Optimization. The format is `scheme\|parameters`, e.g. `loguniform\|-6.908,-4.605`.|
|*Default:*|None|

- **wdt**

|||
|-|-|
|*Description:*|Search domain for NN model `weight_decay_penalty_type` parameter in Bayesian Optimization. The format is `scheme\|parameters`, e.g. `choice\|l1,l2`.|
|*Default:*|None|
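These `scheme|parameters` strings map naturally onto hyperopt-style distributions. Note that `loguniform` bounds are natural-log values, so `loguniform|-6.908,-4.605` searches roughly the range [1e-3, 1e-2]. A hypothetical parser sketch (the helper name is ours, not AMPL's API):

```python
import math

def parse_search_domain(spec):
    """Split a 'scheme|parameters' string, e.g. 'loguniform|-6.908,-4.605',
    into the scheme name and its numeric (or choice) parameters."""
    scheme, _, params = spec.partition("|")
    if scheme == "choice":
        return scheme, params.split(",")          # e.g. ['l1', 'l2']
    return scheme, [float(p) for p in params.split(",")]

scheme, (lo, hi) = parse_search_domain("loguniform|-6.908,-4.605")
# Bounds are in log space: exp() recovers the actual parameter range.
assert scheme == "loguniform"
assert abs(math.exp(lo) - 1e-3) < 1e-5 and abs(math.exp(hi) - 1e-2) < 1e-4
```

So to search `weight_decay_penalty` between A and B on a log scale, supply `loguniform|ln(A),ln(B)`.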
- **rfe**

|||

@@ -1085,6 +1160,20 @@ tied to a specific model parameter. Only a subset of model parameters may be opt

|*Description:*|Search domain for XGBoost model `xgb_gamma` parameter in Bayesian Optimization. The format is `scheme\|parameters`, e.g. `loguniform\|-9.2,-4.6`.|
|*Default:*|None|
- **xgba**

|||
|-|-|
|*Description:*|Search domain for XGBoost model `xgb_alpha` parameter in Bayesian Optimization. The format is `scheme\|parameters`, e.g. `uniform\|0,0.4`.|
|*Default:*|None|

- **xgbb**

|||
|-|-|
|*Description:*|Search domain for XGBoost model `xgb_lambda` parameter in Bayesian Optimization. The format is `scheme\|parameters`, e.g. `uniform\|0,0.4`.|
**atomsci/ddm/docs/source/tutorials/ampl_tutorials_intro.rst** (+3 −1)
@@ -16,6 +16,8 @@ properties. We have created easy to follow tutorials that walk through the steps

`AMPL <https://github.com/ATOMScience-org/AMPL>`_, curate a dataset, effectively train and evaluate a machine
learning model, and use that model to make predictions.

In addition to our written tutorials, we now provide a series of video tutorials on our YouTube channel, `ATOMScience-org <https://www.youtube.com/channel/UCOF6zZ7ltGwopYCoOGIFM-w>`_. These videos are created to assist users in exploring and leveraging AMPL's robust capabilities.

End-to-End Modeling Pipeline Tutorial Series
********************************************
@@ -50,4 +52,4 @@ Although the tutorials are designed to be run in sequence, using an example data

provided within `AMPL <https://github.com/ATOMScience-org/AMPL>`_, we have also provided copies of the intermediate files generated by each tutorial that are
required by subsequent tutorials, so that you can run them in any order.

Also, if you have issues or questions about the tutorials, please create an issue `here <https://github.com/ATOMScience-org/AMPL/issues>`_.