examples/M16/README.md
27 additions & 4 deletions
@@ -10,15 +10,15 @@ And the performance can be evaluated from multiple aspects, such as the predicti
The data preprocessing methods can generate data partitions to enable flexible cross-validation analysis, normalize and remove batch effects from gene expression data of cancer cells, and generate genomic representations at the gene set level for cancer cells.
The feature selection methods can filter features based on missing values and variation, and perform feature decorrelation.
Features with little variation might not be useful for prediction, and highly correlated features need not all be included in the prediction model.
-We also implement and extend the co-expression extrapolation (COXEN) gene selection method for Pilot 1 project[10], which can select predictive and generalizable genes for predicting drug response in the precision oncology applications.
+We also implement and extend the co-expression extrapolation (COXEN) gene selection method for the Pilot 1 project, which can select predictive and generalizable genes for predicting drug response in precision oncology applications.

-# General Data Preprocessing Functions
+## General Data Preprocessing Functions

```generate_cross_validation_partition```

To flexibly generate data partitions for cross-validation analysis, such as partitioning of grouped samples into sets that do not share groups.
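The idea of partitioning grouped samples into sets that do not share groups can be sketched in plain Python. This is only an illustration, not the CANDLE implementation: the function name `grouped_k_fold`, its parameters, and the balancing heuristic are assumptions made for this example.

```python
import random
from collections import defaultdict


def grouped_k_fold(groups, n_folds, seed=0):
    """Assign sample indices to folds so that no group is split across folds.

    groups: one group label per sample (for example, the cell line of a sample).
    Returns a list of n_folds lists of sample indices.
    """
    # Collect the sample indices that belong to each group.
    members = defaultdict(list)
    for index, label in enumerate(groups):
        members[label].append(index)
    if n_folds > len(members):
        raise ValueError("n_folds cannot exceed the number of distinct groups")

    # Shuffle the groups, then place the largest groups first, each into the
    # currently smallest fold, so that fold sizes stay roughly balanced.
    labels = list(members)
    random.Random(seed).shuffle(labels)
    labels.sort(key=lambda label: len(members[label]), reverse=True)

    folds = [[] for _ in range(n_folds)]
    for label in labels:
        smallest = min(folds, key=len)
        smallest.extend(members[label])
    return [sorted(fold) for fold in folds]


# Toy example: nine samples that come from five groups (labels are made up).
groups = ["A", "A", "B", "B", "B", "C", "D", "D", "E"]
folds = grouped_k_fold(groups, n_folds=3)
```

Each fold can then serve once as the held-out set. Because whole groups move together, samples from the same group never appear on both sides of a split.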

-# Data Preprocessing Functions Specific to Pilot 1 Applications
+## Data Preprocessing Functions Specific to Pilot 1 Applications

```quantile_normalization```
@@ -32,8 +32,31 @@ To perform ComBat analysis [9] on gene expression data to remove batch effects.
To calculate genomic representations at gene set level, such as the average expression values of genes in a pathway and the total number of SNP mutations in a genetic pathway.
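As a rough illustration of a gene set level representation, the sketch below averages expression values and sums mutation counts over the genes of each pathway. It is not the CANDLE function: the name `gene_set_representation`, the dictionary-based data layout, and all gene and pathway names are assumptions for this example.

```python
from statistics import fmean


def gene_set_representation(gene_values, gene_sets, aggregate):
    """Collapse gene-level values into one value per gene set and sample.

    gene_values: dict mapping gene name -> list of per-sample values.
    gene_sets:   dict mapping gene set name -> list of member gene names.
    aggregate:   function applied to the member values of one sample,
                 e.g. fmean for expression or sum for mutation counts.
    """
    n_samples = len(next(iter(gene_values.values())))
    result = {}
    for set_name, genes in gene_sets.items():
        # Ignore gene set members that were not measured in this data set.
        present = [g for g in genes if g in gene_values]
        if not present:
            continue
        result[set_name] = [
            aggregate([gene_values[g][s] for g in present])
            for s in range(n_samples)
        ]
    return result


# Toy data: three genes measured in two samples (all names are made up).
expression = {"G1": [2.0, 4.0], "G2": [4.0, 8.0], "G3": [1.0, 1.0]}
mutation_counts = {"G1": [0, 2], "G2": [1, 0], "G3": [3, 1]}
pathways = {"pathway_x": ["G1", "G2"], "pathway_y": ["G2", "G3", "G9"]}

mean_expression = gene_set_representation(expression, pathways, fmean)
total_mutations = gene_set_representation(mutation_counts, pathways, sum)
# mean_expression -> {"pathway_x": [3.0, 6.0], "pathway_y": [2.5, 4.5]}
# total_mutations -> {"pathway_x": [1, 2], "pathway_y": [4, 1]}
```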

+## General Feature Selection Functions

-# Feature Selection examples
+```select_features_by_missing_values```
+
+To remove features with (many) missing values.
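A minimal sketch of filtering by missing values follows. It assumes a row-per-sample layout and a hypothetical `max_missing_rate` threshold; the helper names are illustrative and are not the signature of the CANDLE function.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def select_by_missing_rate(rows, max_missing_rate):
    """Return indices of features whose missing fraction is at most the limit.

    rows: list of samples, each a list of feature values.
    """
    n_samples = len(rows)
    n_features = len(rows[0])
    kept = []
    for j in range(n_features):
        n_missing = sum(is_missing(row[j]) for row in rows)
        if n_missing / n_samples <= max_missing_rate:
            kept.append(j)
    return kept


nan = float("nan")
data = [
    [1.0, nan, 3.0, None],
    [2.0, nan, nan, 0.5],
    [3.0, 0.1, 2.0, 0.7],
    [4.0, nan, 1.0, 0.9],
]
# Feature 1 is missing in 3 of 4 samples and is dropped.
kept_columns = select_by_missing_rate(data, max_missing_rate=0.25)
# kept_columns -> [0, 2, 3]
```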
+
+```select_features_by_variation```
+
+To remove features with little or no variation.
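One way to picture this filter is a variance threshold, sketched below. Using the population variance as the measure of variation and the name `select_by_variance` are assumptions for this example; the CANDLE function may use a different measure or interface.

```python
from statistics import pvariance


def select_by_variance(rows, min_variance):
    """Return indices of features whose variance across samples exceeds the limit.

    rows: list of samples, each a list of numeric feature values.
    """
    n_features = len(rows[0])
    kept = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        if pvariance(column) > min_variance:
            kept.append(j)
    return kept


data = [
    [1.0, 5.0, 0.50],
    [2.0, 5.0, 0.51],
    [3.0, 5.0, 0.49],
    [4.0, 5.0, 0.50],
]
# Feature 1 is constant and feature 2 barely changes.
kept_strict = select_by_variance(data, min_variance=0.01)  # -> [0]
kept_nonconstant = select_by_variance(data, min_variance=0.0)  # -> [0, 2]
```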
+
+```select_decorrelated_features```
+
+To select a subset of features that are not identical or highly correlated with each other.
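Decorrelation can be sketched as a greedy pass that keeps a feature only if it is not too strongly correlated with any feature kept so far. The greedy strategy, the Pearson correlation, and the names `pearson` and `select_decorrelated` are assumptions for illustration, not a description of how CANDLE implements it.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_decorrelated(columns, max_abs_corr):
    """Greedily keep features whose |correlation| with every kept feature is below the limit.

    columns: list of features, each a list of per-sample values.
    The result depends on feature order, because earlier features win.
    """
    kept = []
    for j, column in enumerate(columns):
        if all(abs(pearson(column, columns[k])) < max_abs_corr for k in kept):
            kept.append(j)
    return kept


columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],   # feature 0
    [2.0, 4.0, 6.0, 8.0, 10.0],  # feature 1: exact multiple of feature 0
    [5.0, 1.0, 4.0, 2.0, 3.0],   # feature 2: weakly related to feature 0
    [1.1, 1.9, 3.2, 3.9, 5.1],   # feature 3: feature 0 plus small noise
]
kept = select_decorrelated(columns, max_abs_corr=0.9)
# kept -> [0, 2]
```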
+
+## Feature (Gene) Selection Functions Specific to Pilot 1 Applications
+
+```coxen_single_drug_gene_selection```
+
+To perform co-expression extrapolation (COXEN) analysis that selects predictive and generalizable genes for predicting the response of tumor cells to a specific drug.
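The sketch below illustrates one common reading of the two COXEN steps: first rank genes by how strongly their expression correlates with drug response in one data set ("predictive"), then keep the candidates whose co-expression pattern with the other candidates is best preserved in a second data set ("generalizable"). It is a simplified illustration under those assumptions. The function `coxen_sketch`, its parameters, the use of Pearson correlation, and the toy data are all hypothetical and do not reflect the CANDLE signature.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def coxen_sketch(train_expr, response, target_expr, n_predictive, n_final):
    """train_expr, target_expr: dict gene -> list of per-sample expression values.
    response: drug response of each sample in train_expr.
    """
    # Step 1: among genes measured in both data sets, keep those whose
    # expression is most strongly correlated with drug response.
    shared = [g for g in train_expr if g in target_expr]
    ranked = sorted(
        shared, key=lambda g: abs(pearson(train_expr[g], response)), reverse=True
    )
    candidates = ranked[:n_predictive]

    # Step 2: for each candidate, compare its correlations with the other
    # candidates in the training set against the same correlations in the
    # target set, and keep the genes with the most similar patterns.
    def coexpression(expr, gene):
        return [pearson(expr[gene], expr[o]) for o in candidates if o != gene]

    concordance = {
        g: pearson(coexpression(train_expr, g), coexpression(target_expr, g))
        for g in candidates
    }
    return sorted(candidates, key=concordance.get, reverse=True)[:n_final]


# Toy data (made up): six training samples with response, five target samples.
response = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train = {
    "G1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "G2": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    "G3": [1.0, 3.0, 2.0, 5.0, 4.0, 6.0],
    "G4": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "NOISE": [1.0, -1.0, 0.0, 0.0, -1.0, 1.0],  # uncorrelated with response
}
target = {
    "G1": [2.0, 4.0, 1.0, 5.0, 3.0],
    "G2": [4.0, 2.0, 5.0, 1.0, 3.0],
    "G3": [1.0, 5.0, 2.0, 4.0, 3.0],
    "G4": [3.0, 3.5, 1.0, 4.0, 2.0],
    "NOISE": [0.0, 1.0, 0.0, 1.0, 0.0],
}
selected = coxen_sketch(train, response, target, n_predictive=4, n_final=2)
```

The second data set needs no response values, which is what allows the selected genes to be chosen with the target population in mind.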
+
+```coxen_multi_drug_gene_selection```
+
+To extend the COXEN approach for selecting genes to predict the response of tumor cells to multiple drugs in precision oncology applications.
+
+# Running the example

The code demonstrates feature selection methods that CANDLE provides.