Commit 431d252

Merge pull request #124 from daisybio/development
New release
2 parents 9fd758b + 621e2d8 commit 431d252

21 files changed, +532 -33 lines changed

.github/workflows/labeler.yml

Lines changed: 1 addition & 1 deletion
@@ -13,6 +13,6 @@ jobs:
         uses: actions/checkout@v4
 
       - name: Run Labeler
-        uses: crazy-max/ghaction-github-labeler@v5.1.0
+        uses: crazy-max/ghaction-github-labeler@v5.2.0
         with:
           skip-delete: true

docs/conf.py

Lines changed: 2 additions & 2 deletions
@@ -56,9 +56,9 @@
 # the built documents.
 #
 # The short X.Y version.
-version = "1.1.3"
+version = "1.1.4"
 # The full version, including alpha/beta/rc tags.
-release = "1.1.3"
+release = "1.1.4"
 
 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.

docs/usage.rst

Lines changed: 5 additions & 0 deletions
@@ -100,6 +100,9 @@ prediction. This is why we also offer the possibility to compare your model to a
 the mean IC50 of all drugs in the training set. We also offer two more advanced naive predictors:
 **NaiveCellLineMeanPredictor** and **NaiveDrugMeanPredictor**. The former predicts the mean IC50 of a cell line in
 the training set and the latter predicts the mean IC50 of a drug in the training set.
+Finally, as the strongest naive baseline, we offer the **NaiveMeanEffectsPredictor**,
+which combines the effects of cell lines and drugs.
+It is equivalent to the **NaiveCellLineMeanPredictor** and **NaiveDrugMeanPredictor** in the LDO and LCO settings, respectively.
 
 Available Models
 ------------------
@@ -119,6 +122,8 @@ For ``--models``, you can also perform randomization and robustness tests. The `
119122
+----------------------------+----------------------------+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
120123
| NaiveDrugMeanPredictor | Baseline Method | Multi-Drug Model | Predicts the mean response of a drug in the training set. |
121124
+----------------------------+----------------------------+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
125+
| NaiveMeanEffectPredictor | Baseline Method | Multi-Drug Model | Predicts using ANOVA-like mean effect model of cell lines and drugs |
126+
+----------------------------+----------------------------+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
122127
| ElasticNet | Baseline Method | Multi-Drug Model | Fits an `Sklearn Elastic Net <https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html>`_, `Lasso <https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html>`_, or `Ridge <https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html>`_ model on gene expression data and drug fingerprints (concatenated input matrix). |
123128
+----------------------------+----------------------------+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
124129
| GradientBoosting | Baseline Method | Multi-Drug Model | Fits an `Sklearn Gradient Boosting Regressor <https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html>`_ gene expression data and drug fingerprints. |
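
The additive mean-effect baseline described in the docs/usage.rst hunks above can be illustrated with a minimal sketch. It assumes that each effect is the group mean minus the overall mean and that an unseen cell line or drug contributes an effect of zero. The function names and the toy data are hypothetical and are not taken from drevalpy.

```python
from collections import defaultdict


def fit_mean_effects(responses):
    """Fit an overall mean plus additive cell-line and drug effects.

    `responses` is a list of (cell_line_id, drug_id, response) tuples.
    """
    overall = sum(r for _, _, r in responses) / len(responses)
    by_cell, by_drug = defaultdict(list), defaultdict(list)
    for cell, drug, r in responses:
        by_cell[cell].append(r)
        by_drug[drug].append(r)
    # Effect = group mean minus overall mean (assumption of this sketch).
    cell_effect = {c: sum(v) / len(v) - overall for c, v in by_cell.items()}
    drug_effect = {d: sum(v) / len(v) - overall for d, v in by_drug.items()}
    return overall, cell_effect, drug_effect


def predict_mean_effects(model, cell, drug):
    """Predict overall mean + cell-line effect + drug effect (0 if unseen)."""
    overall, cell_effect, drug_effect = model
    return overall + cell_effect.get(cell, 0.0) + drug_effect.get(drug, 0.0)


# Toy training data with two cell lines (A, B) and two drugs (x, y).
train = [("A", "x", 1.0), ("A", "y", 3.0), ("B", "x", 2.0), ("B", "y", 6.0)]
model = fit_mean_effects(train)
```

Under these assumptions, the prediction for an unseen drug reduces to the training mean of the cell line, and the prediction for an unseen cell line reduces to the training mean of the drug.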

drevalpy/.DS_Store

-6 KB
Binary file not shown.

drevalpy/datasets/utils.py

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,6 @@
 """Utility functions for datasets."""
 
+import os
 import zipfile
 from pathlib import Path
 from typing import Any
@@ -60,7 +61,7 @@ def download_dataset(
         with zipfile.ZipFile(file_path, "r") as z:
             for member in z.infolist():
                 if not member.filename.startswith("__MACOSX/"):
-                    z.extract(member, data_path)
+                    z.extract(member, os.path.join(data_path, dataset_name))
         file_path.unlink()  # Remove zip file after extraction
 
         print(f"{dataset_name} data downloaded and extracted to {data_path}")
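
The extraction change above can be tried in isolation. The following is a minimal sketch, assuming a local zip file rather than a download; `extract_dataset` and the toy archive contents are hypothetical names used only for illustration.

```python
import os
import tempfile
import zipfile


def extract_dataset(zip_path, data_path, dataset_name):
    """Extract an archive into data_path/dataset_name, skipping macOS metadata."""
    target_dir = os.path.join(data_path, dataset_name)
    with zipfile.ZipFile(zip_path, "r") as archive:
        for member in archive.infolist():
            # Entries under __MACOSX/ are resource-fork metadata, not data.
            if not member.filename.startswith("__MACOSX/"):
                archive.extract(member, target_dir)
    return target_dir


def demo():
    """Build a tiny archive and return the file names that were extracted."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "toy.zip")
        with zipfile.ZipFile(zip_path, "w") as archive:
            archive.writestr("response.csv", "cell_line,drug,ic50\n")
            archive.writestr("__MACOSX/._response.csv", "metadata")
        target = extract_dataset(zip_path, os.path.join(tmp, "data"), "toy")
        return sorted(os.listdir(target))


print(demo())
```

`ZipFile.extract` creates missing parent directories, so the dataset subfolder does not need to exist beforehand.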

drevalpy/experiment.py

Lines changed: 7 additions & 7 deletions
@@ -94,7 +94,7 @@ def drug_response_experiment(
     if baselines is None:
         baselines = []
     cross_study_datasets = cross_study_datasets or []
-    result_path = os.path.join(path_out, run_id, test_mode)
+    result_path = os.path.join(path_out, run_id, response_data._name, test_mode)
     split_path = os.path.join(result_path, "splits")
     result_folder_exists = os.path.exists(result_path)
     if result_folder_exists and overwrite:
@@ -903,10 +903,11 @@ def train_and_predict(
 
     train_dataset.reduce_to(cell_line_ids=cell_lines_to_keep, drug_ids=drugs_to_keep)
     prediction_dataset.reduce_to(cell_line_ids=cell_lines_to_keep, drug_ids=drugs_to_keep)
-    print(f"Reduced training dataset from {len_train_before} to {len(train_dataset)}, because of missing features")
-    print(
-        f"Reduced prediction dataset from {len_pred_before} to {len(prediction_dataset)}, because of missing features"
-    )
+    if len(train_dataset) < len_train_before or len(prediction_dataset) < len_pred_before:
+        print(f"Reduced training dataset from {len_train_before} to {len(train_dataset)}, due to missing features")
+        print(
+            f"Reduced prediction dataset from {len_pred_before} to {len(prediction_dataset)}, due to missing features"
+        )
 
     if early_stopping_dataset is not None:
         len_es_before = len(early_stopping_dataset)
@@ -1142,8 +1143,7 @@ def make_model_list(models: list[type[DRPModel]], response_data: DrugResponseDat
 
 @pipeline_function
 def get_model_name_and_drug_id(model_name: str) -> tuple[str, str | None]:
-    """
-    Get the model name and drug id from the model name.
+    """Get the model name and drug id from the model name.
 
     :param model_name: model name, e.g., SimpleNeuralNetwork or MOLIR.Afatinib
     :returns: tuple of model name and, potentially drug id if it is a single drug model
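
The first drevalpy/experiment.py hunk adds the dataset name as a level in the results directory. A minimal sketch of the resulting layout follows, assuming the dataset name is a plain string; `build_result_path` and the values passed to it are hypothetical.

```python
import os


def build_result_path(path_out, run_id, dataset_name, test_mode):
    """Join the path components in the order used by the new result_path."""
    return os.path.join(path_out, run_id, dataset_name, test_mode)


# Hypothetical values: two datasets evaluated under the same run id and test mode.
first = build_result_path("results", "my_run", "DatasetA", "LPO")
second = build_result_path("results", "my_run", "DatasetB", "LPO")
print(first)
print(second)
```

With the dataset name in the path, runs on different datasets that share a run id and test mode write to separate folders.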

drevalpy/models/.DS_Store

-8 KB
Binary file not shown.

drevalpy/models/SimpleNeuralNetwork/simple_neural_network.py

Lines changed: 4 additions & 0 deletions
@@ -94,6 +94,10 @@ def train(
                 "ignore",
                 message=".*does not have many workers which may be a bottleneck.*",
             )
+            warnings.filterwarnings(
+                "ignore",
+                message="Starting from v1\\.9\\.0, `tensorboardX` has been removed.*",
+            )
             self.model.fit(
                 output_train=output,
                 cell_line_input=cell_line_input,
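
The `message` argument of `warnings.filterwarnings` is a regular expression matched against the start of the warning text, which is why the dots in the version number are escaped. The following minimal sketch shows the filter in isolation; `emit_and_collect` and the shortened warning texts are hypothetical.

```python
import warnings


def emit_and_collect(message_text):
    """Emit one warning and return the texts of the warnings that were not filtered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Added after simplefilter, so this filter is checked first.
        warnings.filterwarnings(
            "ignore",
            message="Starting from v1\\.9\\.0, `tensorboardX` has been removed.*",
        )
        warnings.warn(message_text)
    return [str(w.message) for w in caught]


suppressed = emit_and_collect("Starting from v1.9.0, `tensorboardX` has been removed as a dependency")
kept = emit_and_collect("Some unrelated warning")
```

Here `suppressed` is empty because the filter matched, while `kept` still contains the unrelated warning.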

drevalpy/models/SimpleNeuralNetwork/utils.py

Lines changed: 9 additions & 5 deletions
@@ -1,5 +1,6 @@
 """Utility functions for the simple neural network models."""
 
+import os
 import secrets
 from typing import Any
 
@@ -229,11 +230,14 @@ def fit(
         monitor = "train_loss" if (val_loader is None) else "val_loss"
 
         early_stop_callback = EarlyStopping(monitor=monitor, mode="min", patience=patience)
-        name = "version-" + "".join(
-            [secrets.choice("0123456789abcdef") for i in range(20)]
-        )  # preventing conflicts of filenames
+
+        unique_subfolder = os.path.join(model_checkpoint_dir, "run_" + secrets.token_hex(8))
+        os.makedirs(unique_subfolder, exist_ok=True)
+
+        # prevent conflicts
+        name = "version-" + "".join([secrets.choice("0123456789abcdef") for _ in range(10)])
         self.checkpoint_callback = pl.callbacks.ModelCheckpoint(
-            dirpath=model_checkpoint_dir,
+            dirpath=unique_subfolder,
             monitor=monitor,
             mode="min",
             save_top_k=1,
@@ -262,7 +266,7 @@ def fit(
 
         # load best model
         if self.checkpoint_callback.best_model_path is not None:
-            checkpoint = torch.load(self.checkpoint_callback.best_model_path)  # noqa: S614
+            checkpoint = torch.load(self.checkpoint_callback.best_model_path, weights_only=True)  # noqa: S614
             self.load_state_dict(checkpoint["state_dict"])
         else:
             print("checkpoint_callback: No best model found, using the last model.")
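
The per-run checkpoint folder introduced above can be sketched with the standard library alone. This is a minimal illustration that assumes the aim is to keep parallel training runs from writing into the same directory; `make_unique_run_dir` is a hypothetical helper and no Lightning objects are involved.

```python
import os
import secrets
import tempfile


def make_unique_run_dir(checkpoint_root):
    """Create and return a run-specific subfolder below checkpoint_root."""
    # token_hex(8) returns 16 hexadecimal characters from a secure random source.
    run_dir = os.path.join(checkpoint_root, "run_" + secrets.token_hex(8))
    os.makedirs(run_dir, exist_ok=True)
    return run_dir


def demo():
    """Create two run folders and report their names and whether both exist."""
    with tempfile.TemporaryDirectory() as root:
        first = make_unique_run_dir(root)
        second = make_unique_run_dir(root)
        both_exist = os.path.isdir(first) and os.path.isdir(second)
        return os.path.basename(first), os.path.basename(second), both_exist
```

Each call yields a different folder name, so a checkpoint callback pointed at that folder would not share files with another run.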

drevalpy/models/__init__.py

Lines changed: 13 additions & 1 deletion
@@ -4,13 +4,16 @@
     "NaivePredictor",
     "NaiveDrugMeanPredictor",
     "NaiveCellLineMeanPredictor",
+    "NaiveMeanEffectsPredictor",
     "ElasticNetModel",
     "RandomForest",
     "SVMRegressor",
     "SimpleNeuralNetwork",
     "MultiOmicsNeuralNetwork",
     "MultiOmicsRandomForest",
     "SingleDrugRandomForest",
+    "SingleDrugElasticNet",
+    "SingleDrugProteomicsElasticNet",
     "SRMF",
     "GradientBoosting",
     "MOLIR",
@@ -22,7 +25,13 @@
 ]
 
 from .baselines.multi_omics_random_forest import MultiOmicsRandomForest
-from .baselines.naive_pred import NaiveCellLineMeanPredictor, NaiveDrugMeanPredictor, NaivePredictor
+from .baselines.naive_pred import (
+    NaiveCellLineMeanPredictor,
+    NaiveDrugMeanPredictor,
+    NaiveMeanEffectsPredictor,
+    NaivePredictor,
+)
+from .baselines.singledrug_elastic_net import SingleDrugElasticNet, SingleDrugProteomicsElasticNet
 from .baselines.singledrug_random_forest import SingleDrugRandomForest
 from .baselines.sklearn_models import ElasticNetModel, GradientBoosting, RandomForest, SVMRegressor
 from .DIPK.dipk import DIPKModel
@@ -38,13 +47,16 @@
     "SingleDrugRandomForest": SingleDrugRandomForest,
     "MOLIR": MOLIR,
     "SuperFELTR": SuperFELTR,
+    "SingleDrugElasticNet": SingleDrugElasticNet,
+    "SingleDrugProteomicsElasticNet": SingleDrugProteomicsElasticNet,
 }
 
 # MULTI_DRUG_MODEL_FACTORY is used in the pipeline!
 MULTI_DRUG_MODEL_FACTORY: dict[str, type[DRPModel]] = {
     "NaivePredictor": NaivePredictor,
     "NaiveDrugMeanPredictor": NaiveDrugMeanPredictor,
     "NaiveCellLineMeanPredictor": NaiveCellLineMeanPredictor,
+    "NaiveMeanEffectsPredictor": NaiveMeanEffectsPredictor,
     "ElasticNet": ElasticNetModel,
     "RandomForest": RandomForest,
     "SVR": SVMRegressor,
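
The factory dictionaries above map a model name string to a class. A minimal sketch of that lookup pattern follows, using toy stand-in classes; `ToyBaseModel`, `ToyMeanEffects`, `TOY_MODEL_FACTORY`, and `get_model_class` are hypothetical and do not reproduce the drevalpy API.

```python
class ToyBaseModel:
    """Hypothetical stand-in for a common model base class."""


class ToyMeanEffects(ToyBaseModel):
    """Hypothetical stand-in for a newly registered baseline."""


# Registering a model means adding one name -> class entry.
TOY_MODEL_FACTORY: dict[str, type[ToyBaseModel]] = {
    "ToyMeanEffects": ToyMeanEffects,
}


def get_model_class(name: str) -> type[ToyBaseModel]:
    """Look up a model class by name and fail with a readable error if unknown."""
    try:
        return TOY_MODEL_FACTORY[name]
    except KeyError:
        known = ", ".join(sorted(TOY_MODEL_FACTORY))
        raise ValueError(f"Unknown model '{name}'. Available models: {known}") from None


model_cls = get_model_class("ToyMeanEffects")
```

With this pattern, a name that is documented but spelled differently from the dictionary key fails at lookup time, so the documented names and the factory keys need to match exactly.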
