examples/M16/README.md (13 additions & 4 deletions)
@@ -10,7 +10,7 @@ And the performance can be evaluated from multiple aspects, such as the predicti
 The data preprocessing methods can generate data partitions to enable flexible cross-validation analysis, normalize and remove batch effects from gene expression data of cancer cells, and generate genomic representations at the gene set level for cancer cells.
 The feature selection methods can filter features based on missing values and variations, and perform feature decorrelation.
 Features without much variation might not be useful for prediction and highly-correlated features are not necessary to be all included in the prediction model.
-We also implement and extend the co-expression extrapolation (COXEN) gene selection method for Pilot 1 project, which can select predictive and generalizable genes for predicting drug response in the precision oncology applications.
+We also implement and extend the co-expression extrapolation (COXEN) gene selection method for Pilot 1 project[3], which can select predictive and generalizable genes for predicting drug response in the precision oncology applications.

 ## General Data Preprocessing Functions

@@ -22,11 +22,11 @@ To flexibly generate data partitions for cross-validation analysis, such as part

 ```quantile_normalizationa```

-To perform quantile normalization of genomic data [8] with tolerance of missing values.
+To perform quantile normalization of genomic data [1] with tolerance of missing values.

 ```combat_batch_effect_removal```

-To perform ComBat analysis [9] on gene expression data to remove batch effects.
+To perform ComBat analysis [2] on gene expression data to remove batch effects.

 ```generate_gene_set_data```

@@ -50,7 +50,7 @@ To select a subset of features that are not identical or highly correlated with

 ```coxen_single_drug_gene_selection```

-To perform co-expression extrapolation (COXEN) analysis that selects predictive and generalizable genes for predicting the response of tumor cells to a specific drug.
+To perform co-expression extrapolation (COXEN) analysis [3]that selects predictive and generalizable genes for predicting the response of tumor cells to a specific drug.

 ```coxen_multi_drug_gene_selection```

@@ -430,3 +430,12 @@ Average third quartile of CCLE cell lines is 4.83
 Average median of CCLE cell lines is 2.72
 Average first quartile of CCLE cell lines is 0.13
 ```
+
+# References
+
+1. Bolstad BM, Irizarry RA, Astrand M, et al. \(2003\) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003 Jan 22;19\(2\):185-93.
+
+2. Johnson WE, Rabinovic A, and Li C \(2007\) Adjusting batch effects in microarray expression data using Empirical Bayes methods. Biostatistics 8\(1\):118-127.
+
+3. Lee JK, Havaleshko DM, Cho H, et al. \(2007\) A strategy for predicting the chemosensitivity of human cancers and its application to drug discovery. Proc Natl Acad Sci USA, 2007 Aug 7; 104\(32\):13086-91. Epub 2007 Jul 31
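
The README text in this diff describes quantile normalization "with tolerance of missing values" [1]. As a rough illustration of what that can mean, here is a minimal NumPy sketch. The function name `quantile_normalize_with_nan`, the genes-by-samples layout, and the interpolation onto a shared quantile grid are all assumptions made for this example; this is not the repository's implementation and its actual signature may differ.

```python
import numpy as np


def quantile_normalize_with_nan(data):
    """Quantile-normalize the columns (samples) of a 2-D array.

    Hypothetical helper for illustration only. Rows are assumed to be
    genes and columns samples. NaN entries are skipped when building the
    reference distribution and stay NaN in the output.
    """
    data = np.asarray(data, dtype=float)
    n_rows, n_cols = data.shape
    grid = np.linspace(0.0, 1.0, n_rows)  # shared quantile grid

    # Reference distribution: average of each column's observed values,
    # interpolated onto the shared grid so columns with different numbers
    # of missing entries can still be averaged.
    ref = np.zeros(n_rows)
    n_used = 0
    for j in range(n_cols):
        col = data[:, j]
        vals = np.sort(col[~np.isnan(col)])
        if vals.size == 0:
            continue  # column is entirely missing
        pos = np.linspace(0.0, 1.0, vals.size) if vals.size > 1 else np.array([0.5])
        ref += np.interp(grid, pos, vals)
        n_used += 1
    if n_used == 0:
        return data.copy()
    ref /= n_used

    # Map every observed value to the reference value at its rank quantile.
    out = np.full_like(data, np.nan)
    for j in range(n_cols):
        col = data[:, j]
        mask = ~np.isnan(col)
        k = int(mask.sum())
        if k == 0:
            continue
        # Ties receive distinct consecutive ranks (a simplification).
        ranks = np.argsort(np.argsort(col[mask], kind="stable"), kind="stable")
        pos = ranks / (k - 1) if k > 1 else np.array([0.5])
        out[mask, j] = np.interp(pos, grid, ref)
    return out
```

When no values are missing, this reduces to the usual procedure of replacing each value with the mean of the values sharing its rank across samples. How ties and heavily incomplete columns should be handled is a design choice the README does not specify.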
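
The COXEN description in this diff ("selects predictive and generalizable genes") can be read as a two-stage filter in the spirit of [3]: first rank genes by association with drug response in one data set, then keep those whose co-expression pattern with the other candidates is preserved in a second data set. The sketch below follows that reading. The name `coxen_sketch`, its parameters, and the use of Pearson correlation at every step are assumptions for illustration; it is not the repository's `coxen_single_drug_gene_selection`.

```python
import numpy as np


def coxen_sketch(expr_a, response_a, expr_b, n_predictive, n_generalizable):
    """Illustrative COXEN-style gene selection (hypothetical helper).

    expr_a:     (samples_a, genes) expression in the set with drug response
    response_a: (samples_a,) drug response values for set A
    expr_b:     (samples_b, genes) expression in the target set
    Returns the column indices of the selected genes.
    """
    expr_a = np.asarray(expr_a, dtype=float)
    expr_b = np.asarray(expr_b, dtype=float)
    response_a = np.asarray(response_a, dtype=float)
    if n_predictive < 3:
        raise ValueError("need at least 3 candidate genes to compare patterns")

    # Step 1: rank genes by |Pearson correlation| with response in set A.
    xa = expr_a - expr_a.mean(axis=0)
    ya = response_a - response_a.mean()
    denom = np.sqrt((xa ** 2).sum(axis=0) * (ya ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (xa * ya[:, None]).sum(axis=0) / denom
    r = np.nan_to_num(r)  # constant genes get zero correlation
    predictive = np.argsort(-np.abs(r), kind="stable")[:n_predictive]

    # Step 2: gene-gene co-expression among the candidates, per data set.
    corr_a = np.corrcoef(expr_a[:, predictive], rowvar=False)
    corr_b = np.corrcoef(expr_b[:, predictive], rowvar=False)

    # Step 3: for each candidate, compare its co-expression profile in A
    # with its profile in B, excluding the gene's correlation with itself.
    m = len(predictive)
    score = np.empty(m)
    for k in range(m):
        others = np.arange(m) != k
        score[k] = np.corrcoef(corr_a[k, others], corr_b[k, others])[0, 1]

    # Step 4: keep the candidates with the most concordant profiles.
    order = np.argsort(-np.nan_to_num(score, nan=-1.0), kind="stable")
    return predictive[order[:n_generalizable]]
```

A call such as `coxen_sketch(expr_a, response_a, expr_b, n_predictive=200, n_generalizable=100)` would return 100 gene indices; the two cut-offs are arbitrary here and would normally be tuned.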