|
| 1 | +## Uno: Predicting Tumor Dose Response across Multiple Data Sources |
| 2 | + |
| 3 | +#### Example output |
| 4 | +Uno can be trained with a subset of dose response data sources. Here is a command-line example of training with all 6 sources: CCLE, CTRP, gCSI, GDSC, NCI60 single drug response, and ALMANAC drug pair response. |
| 5 | + |
| 6 | +``` |
| 7 | +uno_baseline_keras2.py --train_sources all --cache cache/all --use_landmark_genes --preprocess_rnaseq source_scale --no_feature_source --no_response_source |
| 8 | +Using TensorFlow backend. |
| 9 | +Params: {'activation': 'relu', 'batch_size': 32, 'dense': [1000, 1000, 1000], 'dense_feature_layers': [1000, 1000, 1000], 'drop': 0, 'epochs': 10, 'learning_rate': None, 'loss': |
| 10 | +'mse', 'optimizer': 'adam', 'residual': False, 'rng_seed': 2018, 'save': 'save/uno', 'scaling': 'std', 'feature_subsample': 0, 'validation_split': 0.2, 'solr_root': '', 'timeout' |
| 11 | +: -1, 'train_sources': ['all'], 'test_sources': ['train'], 'cell_types': None, 'cell_features': ['rnaseq'], 'drug_features': ['descriptors', 'fingerprints'], 'cv': 1, 'max_val_lo |
| 12 | +ss': 1.0, 'base_lr': None, 'reduce_lr': False, 'warmup_lr': False, 'batch_normalization': False, 'no_gen': False, 'config_file': '/raid/fangfang/Benchmarks/Pilot1/Uno/uno_default |
| 13 | +_model.txt', 'verbose': False, 'logfile': None, 'train_bool': True, 'shuffle': True, 'alpha_dropout': False, 'gpus': [], 'experiment_id': 'EXP.000', 'run_id': 'RUN.000', 'by_cell |
| 14 | +': None, 'by_drug': None, 'drug_median_response_min': -1, 'drug_median_response_max': 1, 'no_feature_source': True, 'no_response_source': True, 'use_landmark_genes': True, 'use_f |
| 15 | +iltered_genes': False, 'preprocess_rnaseq': 'source_scale', 'cp': False, 'tb': False, 'partition_by': None, 'cache': 'cache/ALL', 'single': False, 'export_data': None, 'growth_bi |
| 16 | +ns': 0, 'datatype': <class 'numpy.float32'>} |
| 17 | +Cache parameter file does not exist: cache/ALL.params.json |
| 18 | +Loading data from scratch ... |
| 19 | +Loaded 27769716 single drug dose response measurements |
| 20 | +Loaded 3686475 drug pair dose response measurements |
| 21 | +Combined dose response data contains sources: ['CCLE' 'CTRP' 'gCSI' 'GDSC' 'NCI60' 'SCL' 'SCLC' 'ALMANAC.FG' |
| 22 | + 'ALMANAC.FF' 'ALMANAC.1A'] |
| 23 | +Summary of combined dose response by source: |
| 24 | + Growth Sample Drug1 Drug2 MedianDose |
| 25 | +Source |
| 26 | +ALMANAC.1A 208605 60 102 102 7.000000 |
| 27 | +ALMANAC.FF 2062098 60 92 71 6.698970 |
| 28 | +ALMANAC.FG 1415772 60 100 29 6.522879 |
| 29 | +CCLE 93251 504 24 0 6.602060 |
| 30 | +CTRP 6171005 887 544 0 6.585027 |
| 31 | +GDSC 1894212 1075 249 0 6.505150 |
| 32 | +NCI60 18862308 59 52671 0 6.000000 |
| 33 | +SCL 301336 65 445 0 6.908485 |
| 34 | +SCLC 389510 70 526 0 6.908485 |
| 35 | +gCSI 58094 409 16 0 7.430334 |
| 36 | +Combined raw dose response data has 3070 unique samples and 53520 unique drugs |
| 37 | +Limiting drugs to those with response min <= 1, max >= -1, span >= 0, median_min <= -1, median_max >= 1 ... |
| 38 | +Selected 47005 drugs from 53520 |
| 39 | +Loaded combined RNAseq data: (15198, 943) |
| 40 | +Loaded combined dragon7 drug descriptors: (53507, 5271) |
| 41 | +Loaded combined dragon7 drug fingerprints: (53507, 2049) |
| 42 | +Filtering drug response data... |
| 43 | + 2375 molecular samples with feature and response data |
| 44 | + 46837 selected drugs with feature and response data |
| 45 | +Summary of filtered dose response by source: |
| 46 | + Growth Sample Drug1 Drug2 MedianDose |
| 47 | +Source |
| 48 | +ALMANAC.1A 206580 60 101 101 7.000000 |
| 49 | +ALMANAC.FF 2062098 60 92 71 6.698970 |
| 50 | +ALMANAC.FG 1293465 60 98 27 6.522879 |
| 51 | +CCLE 80213 474 22 0 6.602060 |
| 52 | +CTRP 3397103 812 311 0 6.585027 |
| 53 | +GDSC 1022204 672 213 0 6.505150 |
| 54 | +NCI60 17190561 59 46272 0 6.000000 |
| 55 | +gCSI 50822 357 16 0 7.430334 |
| 56 | +Grouped response data by drug_pair: 51763 groups |
| 57 | +Input features shapes: |
| 58 | + dose1: (1,) |
| 59 | + dose2: (1,) |
| 60 | + cell.rnaseq: (942,) |
| 61 | + drug1.descriptors: (5270,) |
| 62 | + drug1.fingerprints: (2048,) |
| 63 | + drug2.descriptors: (5270,) |
| 64 | + drug2.fingerprints: (2048,) |
| 65 | +Total input dimensions: 15580 |
| 66 | +Saved data to cache: cache/all.pkl |
| 67 | +Combined model: |
| 68 | +__________________________________________________________________________________________________ |
| 69 | +Layer (type) Output Shape Param # Connected to |
| 70 | +================================================================================================== |
| 71 | +input.cell.rnaseq (InputLayer) (None, 942) 0 |
| 72 | +__________________________________________________________________________________________________ |
| 73 | +input.drug1.descriptors (InputL (None, 5270) 0 |
| 74 | +__________________________________________________________________________________________________ |
| 75 | +input.drug1.fingerprints (Input (None, 2048) 0 |
| 76 | +__________________________________________________________________________________________________ |
| 77 | +input.drug2.descriptors (InputL (None, 5270) 0 |
| 78 | +__________________________________________________________________________________________________ |
| 79 | +input.drug2.fingerprints (Input (None, 2048) 0 |
| 80 | +__________________________________________________________________________________________________ |
| 81 | +input.dose1 (InputLayer) (None, 1) 0 |
| 82 | +__________________________________________________________________________________________________ |
| 83 | +input.dose2 (InputLayer) (None, 1) 0 |
| 84 | +__________________________________________________________________________________________________ |
| 85 | +cell.rnaseq (Model) (None, 1000) 2945000 input.cell.rnaseq[0][0] |
| 86 | +__________________________________________________________________________________________________ |
| 87 | +drug.descriptors (Model) (None, 1000) 7273000 input.drug1.descriptors[0][0] |
| 88 | + input.drug2.descriptors[0][0] |
| 89 | +__________________________________________________________________________________________________ |
| 90 | +drug.fingerprints (Model) (None, 1000) 4051000 input.drug1.fingerprints[0][0] |
| 91 | + input.drug2.fingerprints[0][0] |
| 92 | +__________________________________________________________________________________________________ |
| 93 | +concatenate_1 (Concatenate) (None, 5002) 0 input.dose1[0][0] |
| 94 | + input.dose2[0][0] |
| 95 | + cell.rnaseq[1][0] |
| 96 | + drug.descriptors[1][0] |
| 97 | + drug.fingerprints[1][0] |
| 98 | + drug.descriptors[2][0] |
| 99 | + drug.fingerprints[2][0] |
| 100 | +__________________________________________________________________________________________________ |
| 101 | +dense_10 (Dense) (None, 1000) 5003000 concatenate_1[0][0] |
| 102 | +__________________________________________________________________________________________________ |
| 103 | +dense_11 (Dense) (None, 1000) 1001000 dense_10[0][0] |
| 104 | +__________________________________________________________________________________________________ |
| 105 | +dense_12 (Dense) (None, 1000) 1001000 dense_11[0][0] |
| 106 | +__________________________________________________________________________________________________ |
| 107 | +dense_13 (Dense) (None, 1) 1001 dense_12[0][0] |
| 108 | +================================================================================================== |
| 109 | +Total params: 21,275,001 |
| 110 | +Trainable params: 21,275,001 |
| 111 | +Non-trainable params: 0 |
| 112 | +__________________________________________________________________________________________________ |
| 113 | +Between random pairs in y_val: |
| 114 | + mse: 0.6069 |
| 115 | + mae: 0.5458 |
| 116 | + r2: -0.9998 |
| 117 | + corr: 0.0001 |
| 118 | +Data points per epoch: train = 20158325, val = 5144721 |
| 119 | +Steps per epoch: train = 629948, val = 160773 |
| 120 | +Epoch 1/10 |
| 121 | + 8078/629948 [..............................] - ETA: 50:20:54 - loss: 0.1955 - mae: 0.2982 - r2: 0.2964 |
| 122 | +``` |
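The shapes and counts in the log above can be cross-checked with a few lines of arithmetic. The sketch below is not part of Uno. It assumes every dense layer carries a bias term, that the drug1 and drug2 inputs share one encoder per feature type (as the "Connected to" column suggests), and that steps per epoch are the data point counts divided by the batch size, rounded up. The helper names `dense_params` and `mlp_params` are hypothetical.

```python
import math


def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer (bias assumed)."""
    return n_in * n_out + n_out


def mlp_params(n_in, widths):
    """Total parameters of a stack of dense layers with the given widths."""
    total = 0
    for width in widths:
        total += dense_params(n_in, width)
        n_in = width
    return total


# Values copied from the log above.
feature_shapes = {
    "dose1": 1,
    "dose2": 1,
    "cell.rnaseq": 942,
    "drug1.descriptors": 5270,
    "drug1.fingerprints": 2048,
    "drug2.descriptors": 5270,
    "drug2.fingerprints": 2048,
}
dense_feature_layers = [1000, 1000, 1000]  # 'dense_feature_layers' in Params
dense = [1000, 1000, 1000]                 # 'dense' in Params
batch_size = 32

total_input_dims = sum(feature_shapes.values())  # log reports 15580

# One encoder per feature type; drug1 and drug2 reuse the same drug encoders.
encoder_params = {
    "cell.rnaseq": mlp_params(942, dense_feature_layers),
    "drug.descriptors": mlp_params(5270, dense_feature_layers),
    "drug.fingerprints": mlp_params(2048, dense_feature_layers),
}

# Two raw dose inputs plus five 1000-wide encoded feature vectors.
concat_width = 2 + 5 * dense_feature_layers[-1]  # log reports 5002

# Three dense layers followed by a single-unit output layer.
head_params = mlp_params(concat_width, dense + [1])
total_params = sum(encoder_params.values()) + head_params  # log: 21,275,001

# Assumed rule: steps per epoch = ceil(data points / batch size).
train_steps = math.ceil(20158325 / batch_size)  # log reports 629948
val_steps = math.ceil(5144721 / batch_size)     # log reports 160773

print(total_input_dims, encoder_params, concat_width, total_params)
print(train_steps, val_steps)
```

Because the drug encoders are shared, their parameters are counted once even though each is applied to both drug inputs, which is why the total is about 21.3 million rather than roughly 32.6 million.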
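The "Between random pairs in y_val" lines in the log read like a chance-level baseline: r2 close to -1 and corr close to 0 are what one would expect when validation targets are scored against a random permutation of themselves. The sketch below illustrates that effect on synthetic data. It assumes the baseline is formed by shuffling, which the log itself does not confirm, and the `evaluate` helper and the Gaussian targets are hypothetical stand-ins rather than Uno code or data.

```python
import random


def evaluate(y_true, y_pred):
    """Return mse, mae, r2 and Pearson correlation for two equal-length lists."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    return {
        "mse": ss_res / n,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "r2": 1 - ss_res / ss_tot,
        "corr": cov / (ss_tot * ss_pred) ** 0.5,
    }


rng = random.Random(2018)

# Synthetic stand-in for the validation targets (not real growth values).
y_val = [rng.gauss(0.0, 0.5) for _ in range(20000)]

# Pair each target with a randomly chosen other target.
y_shuf = y_val[:]
rng.shuffle(y_shuf)

baseline = evaluate(y_val, y_shuf)
for name, value in baseline.items():
    print(f"{name}: {value:.4f}")
```

For independent pairs drawn from the same distribution, the expected squared difference is twice the variance, so r2 = 1 - mse/var lands near -1. A trained model should be judged against this floor, not against zero.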