Commit 70ea830

Merge pull request #243 from daisybio/cli
Command Line Interfaces
2 parents 5fbd5db + fd01238 commit 70ea830

30 files changed (+691, −30417 lines)

Dockerfile

Lines changed: 0 additions & 2 deletions
@@ -37,9 +37,7 @@ COPY --from=builder /usr/local/bin /usr/local/bin
 # Copy all relevant code
 
 COPY drevalpy ./drevalpy
-COPY create_report.py ./
 COPY README.md ./
-COPY run_suite.py ./
 COPY pyproject.toml ./
 COPY poetry.lock ./

README.md

Lines changed: 33 additions & 45 deletions
@@ -72,15 +72,21 @@ pip install poetry-plugin-export
 poetry install
 ```
 
+Check your installation by running in your console:
+
+```bash
+drevalpy --help
+```
+
 ## Quickstart
 
 To run models from the catalog, you can run:
 
 ```bash
-python run_suite.py --run_id my_first_run --models NaiveTissueMeanPredictor NaiveDrugMeanPredictor --baselines NaiveMeanEffectsPredictor --dataset TOYv1 --test_mode LCO
+drevalpy --run_id my_first_run --models NaiveTissueMeanPredictor NaiveDrugMeanPredictor --dataset TOYv1 --test_mode LCO
 ```
 
-This will train our baseline models, which just predict the drug or tissue means or the mean drug and cell line effects.
+This will download a small toy drug response dataset and train our baseline models, which just predict the drug or tissue means or the mean drug and cell line effects.
 It will evaluate in "LCO" which is the leave-cell-line-out splitting strategy using 7 fold cross validation.
 The results will be stored in
@@ -91,10 +97,10 @@ results/my_first_run/TOYv1/LCO
 You can visualize them using
 
 ```bash
-python create_report.py --run_id my_first_run --dataset TOYv1
+drevalpy-report --run_id my_first_run --dataset TOYv1
 ```
 
-This will create an index.html file which you can open in your web browser.
+This will create an index.html file in the results directory which you can open in your web browser.
 
 You can also run a drug response experiment using Python:
 
@@ -103,56 +109,38 @@ from drevalpy.experiment import drug_response_experiment
 from drevalpy.models import MODEL_FACTORY
 from drevalpy.datasets import AVAILABLE_DATASETS
 
-naive_mean = MODEL_FACTORY["NaiveMeanEffectsPredictor"]
-rf = MODEL_FACTORY["RandomForest"]
-simple_nn = MODEL_FACTORY["SimpleNeuralNetwork"]
+from drevalpy.experiment import drug_response_experiment
+
+naive_mean = MODEL_FACTORY["NaivePredictor"]  # a naive model that just predicts the training mean
+enet = MODEL_FACTORY["ElasticNet"]  # an Elastic Net based on drug fingerprints and gene expression of 1000 landmark genes
+simple_nn = MODEL_FACTORY["SimpleNeuralNetwork"]  # a neural network based on drug fingerprints and gene expression of 1000 landmark genes
 
-toyv2 = AVAILABLE_DATASETS["TOYv2"](path_data="data", measure="LN_IC50_curvecurator")
+toyv1 = AVAILABLE_DATASETS["TOYv1"](path_data="data")
 
 drug_response_experiment(
-    models=[rf, simple_nn],
-    baselines=[naive_mean],
-    response_data=toyv2,
-    metric="RMSE",
-    n_cv_splits=7,
-    test_mode="LCO",
-    run_id="my_second_run",
-    path_data="data",
-    hyperparameter_tuning=False,
-)
+    models=[enet, simple_nn],
+    baselines=[naive_mean],  # ablation studies and robustness tests are not run for baselines
+    response_data=toyv1,
+    n_cv_splits=2,  # the number of cross-validation splits; should be higher in practice :)
+    test_mode="LCO",  # LCO means leave-cell-line-out: the test and validation splits only contain unseen cell lines
+    run_id="my_first_run",
+    path_data="data",  # where the downloaded drug response and feature data are stored
+    path_out="results",  # results are stored here :)
+    hyperparameter_tuning=False,  # if True (default), hyperparameters of the models and baselines are tuned
+)
 ```
 
 This will run the Random Forest and Simple Neural Network models on the CTRPv2 dataset, using the Naive Mean Effects Predictor as a baseline. The results will be stored in `results/my_second_run/CTRPv2/LCO`.
 To obtain evaluation metrics, you can use:
 
 ```python
-from drevalpy.visualization.utils import parse_results, prep_results, write_results
-import pathlib
-
-# load data, evaluate per CV run
-(
-    evaluation_results,
-    evaluation_results_per_drug,
-    evaluation_results_per_cell_line,
-    true_vs_pred,
-) = parse_results(path_to_results="results/my_second_run", dataset="TOYv2")
-# reformat, calculate normalized metrics
-(
-    evaluation_results,
-    evaluation_results_per_drug,
-    evaluation_results_per_cell_line,
-    true_vs_pred,
-) = prep_results(
-    evaluation_results, evaluation_results_per_drug, evaluation_results_per_cell_line, true_vs_pred, pathlib.Path("data")
-)
-
-write_results(
-    path_out="results/my_second_run",
-    eval_results=evaluation_results,
-    eval_results_per_drug=evaluation_results_per_drug,
-    eval_results_per_cl=evaluation_results_per_cell_line,
-    t_vs_p=true_vs_pred,
-)
+from drevalpy.visualization.create_report import create_report
+
+create_report(
+    run_id="my_first_run",
+    dataset=toyv1.dataset_name,
+    path_data="data",
+    result_path="results",
+)
 ```
 
 We recommend the use of our Nextflow pipeline for computationally demanding runs and for improved reproducibility.
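For readers migrating their own wrapper scripts, the flag handling of the new `drevalpy` entry point can be mimicked with `argparse`. The following is a hypothetical sketch based only on the flags shown in the README diff above; the real CLI's option set and defaults may differ:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical mock of the `drevalpy` flags seen in the README diff.

    This only mirrors the flags used in the examples above; it is not
    DrEvalPy's actual argument parser.
    """
    parser = argparse.ArgumentParser(prog="drevalpy")
    parser.add_argument("--run_id", required=True)
    parser.add_argument("--models", nargs="+", required=True)
    parser.add_argument("--baselines", nargs="*", default=[])
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--test_mode", default="LCO")
    return parser


# Parse the README's example invocation.
args = build_parser().parse_args(
    [
        "--run_id", "my_first_run",
        "--models", "NaiveTissueMeanPredictor", "NaiveDrugMeanPredictor",
        "--dataset", "TOYv1",
        "--test_mode", "LCO",
    ]
)
print(args.models)  # ['NaiveTissueMeanPredictor', 'NaiveDrugMeanPredictor']
```

With `nargs="+"`, `--models` collects one or more space-separated names into a list, matching how multiple models are passed on the command line above.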

create_report.py

Lines changed: 0 additions & 115 deletions
This file was deleted.

docs/contributing.rst

Lines changed: 2 additions & 2 deletions
@@ -40,13 +40,13 @@ How to set up your development environment
 
 .. code:: console
 
-   $ python run_suite.py --run_id my_first_run --models NaiveDrugMeanPredictor ElasticNet --dataset TOYv1 --test_mode LCO
+   $ drevalpy --run_id my_first_run --models NaiveDrugMeanPredictor ElasticNet --dataset TOYv1 --test_mode LCO
 
 6. Visualize the results by running the following command:
 
 .. code:: console
 
-   $ python create_report.py --run_id my_first_run --dataset TOYv1
+   $ drevalpy-report --run_id my_first_run --dataset TOYv1
 
 How to test the project
 -----------------------

docs/installation.rst

Lines changed: 1 addition & 1 deletion
@@ -78,4 +78,4 @@ To install DrEvalPy from source, clone the repository and install the package us
   pip install poetry-plugin-export
   poetry install
 
-Now, you can test the functionality by referring to the `Quickstart <./quickstart.html>`_ documentation.
+Now you can test the functionality quickly via ``drevalpy --help``, or take a look at the `Quickstart <./quickstart.html>`_ documentation.
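The `drevalpy --help` smoke test suggested above can also be scripted. Here is a minimal, generic sketch (not part of DrEvalPy) that checks whether a console script is resolvable before invoking it:

```python
import shutil
import sys


def cli_available(name: str) -> bool:
    """Return True if `name` resolves to an executable on PATH (or is a direct path)."""
    return shutil.which(name) is not None


# The running interpreter always resolves; `drevalpy` resolves only
# after `poetry install` has registered the console script.
print(cli_available(sys.executable))  # True
```

This is handy in CI setup scripts, where a missing entry point is easier to diagnose from an explicit check than from a shell "command not found" error.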

docs/quickstart.rst

Lines changed: 3 additions & 2 deletions
@@ -8,7 +8,7 @@ dataset with the LCO test mode.
 
 .. code-block:: bash
 
-   python run_suite.py --run_id my_first_run --models NaiveTissueMeanPredictor NaiveDrugMeanPredictor --baselines NaiveMeanEffectsPredictor --dataset TOYv1 --test_mode LCO
+   drevalpy --run_id my_first_run --models NaiveTissueMeanPredictor NaiveDrugMeanPredictor --baselines NaiveMeanEffectsPredictor --dataset TOYv1 --test_mode LCO
 
 This will train the three baseline models to predict LN_IC50 values of our Toy dataset, which is a subset of CTRPv2.
 It will evaluate in "LCO" which is the leave-cell-line-out splitting strategy
@@ -23,14 +23,15 @@ You can visualize them using
 
 .. code-block:: bash
 
-   python create_report.py --run_id my_first_run --dataset TOYv1
+   drevalpy-report --run_id my_first_run --dataset TOYv1
 
 This creates an index.html file which you can open in your browser to see the results of your run.
 
 We recommend the use of our Nextflow pipeline for computationally demanding runs and for improved reproducibility. No
 knowledge of Nextflow is required to run it. The Nextflow pipeline is available on the `nf-core GitHub
 <https://github.com/nf-core/drugresponseeval.git>`_, the documentation can be found `here <https://nf-co.re/drugresponseeval/dev/>`_.
 
+- Want to test if your own model outperforms the baselines? See `Run Your Model <./runyourmodel.html>`_.
 - Discuss usage, development and issues on `GitHub <https://github.com/daisybio/drevalpy>`_.
 - Check the `Contributor Guide <./contributing.html>`_ if you want to participate in developing.
 - If you use drevalpy for your work, `please cite us <./reference.html>`_.

docs/runyourmodel.rst

Lines changed: 2 additions & 2 deletions
@@ -217,7 +217,7 @@ Update the ``MULTI_DRUG_MODEL_FACTORY`` if your model is a global model for mult
 Now you can run your model using the DrEvalPy pipeline. cd to the drevalpy root directory and run the following command:
 
 .. code-block:: shell
 
-   python -m run_suite.py --model YourModel --dataset CTRPv2 --data_path data
+   drevalpy --model YourModel --dataset CTRPv2 --data_path data
 
 To contribute the model, so that the community can build on it, please also write appropriate tests in ``tests/models`` and documentation in ``docs/``
@@ -543,4 +543,4 @@ Now you can run the model using the DrEvalPy pipeline.
 To run the model, navigate to the DrEvalPy root directory and execute the following command:
 
 .. code-block:: shell
 
-   python -m run_suite.py --model ProteomicsRandomForest --dataset CTRPv2 --data_path data
+   drevalpy --model ProteomicsRandomForest --dataset CTRPv2 --data_path data
