Commit 5d88d54

More detail on functions
1 parent ab0ca03 commit 5d88d54

1 file changed

+27
-4
lines changed

examples/M16/README.md

Lines changed: 27 additions & 4 deletions
@@ -10,15 +10,15 @@ And the performance can be evaluated from multiple aspects, such as the predicti
1010
The data preprocessing methods can generate data partitions to enable flexible cross-validation analysis, normalize and remove batch effects from gene expression data of cancer cells, and generate genomic representations at the gene set level for cancer cells.
1111
The feature selection methods can filter features based on missing values and variations, and perform feature decorrelation.
1212
Features with little variation might not be useful for prediction, and highly correlated features do not all need to be included in the prediction model.
13-
We also implement and extend the co-expression extrapolation (COXEN) gene selection method for Pilot 1 project [10], which can select predictive and generalizable genes for predicting drug response in the precision oncology applications.
13+
We also implement and extend the co-expression extrapolation (COXEN) gene selection method for the Pilot 1 project, which can select predictive and generalizable genes for predicting drug response in precision oncology applications.
1414

15-
# General Data Preprocessing Functions
15+
## General Data Preprocessing Functions
1616

1717
```generate_cross_validation_partition```
1818

1919
To flexibly generate data partitions for cross-validation analysis, such as partitioning of grouped samples into sets that do not share groups.
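
The idea of partitioning grouped samples into sets that do not share groups can be sketched as follows. This is a minimal illustration under stated assumptions, not the CANDLE implementation: the function name `group_k_fold`, its parameters, and the greedy balancing strategy are all hypothetical.

```python
import random
from collections import defaultdict


def group_k_fold(groups, n_folds, seed=0):
    """Assign sample indices to folds so that no group is split across folds.

    Hypothetical helper for illustration; not the CANDLE function.
    """
    # Collect the sample indices that belong to each group label.
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)
    if n_folds > len(members):
        raise ValueError("n_folds cannot exceed the number of distinct groups")

    # Shuffle the group labels reproducibly, then order them largest first.
    # The sort is stable, so ties keep their shuffled order.
    labels = sorted(members)
    random.Random(seed).shuffle(labels)
    labels.sort(key=lambda g: len(members[g]), reverse=True)

    # Greedily place each whole group into the currently smallest fold.
    folds = [[] for _ in range(n_folds)]
    for label in labels:
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[label])
    return [sorted(fold) for fold in folds]


# Example: nine samples from five groups (e.g. cell lines), split into three folds.
groups = ["A", "A", "B", "B", "B", "C", "D", "D", "E"]
folds = group_k_fold(groups, n_folds=3)
```

Each fold can serve in turn as the held-out set; because groups are never split, samples from the same group cannot appear in both training and held-out data.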
2020

21-
# Data Preprocessing Functions Specific to Pilot 1 Applications
21+
## Data Preprocessing Functions Specific to Pilot 1 Applications
2222

2323
```quantile_normalization```
2424

@@ -32,8 +32,31 @@ To perform ComBat analysis [9] on gene expression data to remove batch effects.
3232

3333
To calculate genomic representations at gene set level, such as the average expression values of genes in a pathway and the total number of SNP mutations in a genetic pathway.
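
A gene-set-level representation of this kind can be sketched with NumPy as below. The function name `gene_set_features`, its arguments, and the example gene sets are assumptions made for illustration and do not reflect the CANDLE function's signature.

```python
import numpy as np


def gene_set_features(values, gene_names, gene_sets, reduce="mean"):
    """Collapse a samples x genes matrix into a samples x gene-sets matrix.

    Hypothetical helper: reduce="mean" averages expression values,
    reduce="sum" counts mutations when `values` is a 0/1 indicator matrix.
    """
    column = {gene: j for j, gene in enumerate(gene_names)}
    reducer = {"mean": np.mean, "sum": np.sum}[reduce]
    names, cols = [], []
    for name, genes in gene_sets.items():
        idx = [column[g] for g in genes if g in column]
        if not idx:
            continue  # skip gene sets with no measured genes
        names.append(name)
        cols.append(reducer(values[:, idx], axis=1))
    if not cols:
        return names, np.empty((values.shape[0], 0))
    return names, np.column_stack(cols)


genes = ["TP53", "KRAS", "EGFR"]
pathways = {"P1": ["TP53", "KRAS"], "P2": ["EGFR", "MYC"]}  # illustrative sets

# Average expression per pathway (2 samples x 3 genes).
expression = np.array([[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]])
names, avg_expr = gene_set_features(expression, genes, pathways, reduce="mean")

# Total number of mutated genes per pathway from a 0/1 mutation matrix.
mutations = np.array([[1, 0, 1], [0, 0, 1]])
_, mut_counts = gene_set_features(mutations, genes, pathways, reduce="sum")
```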
3434

35+
## General Feature Selection Functions
3536

36-
# Feature Selection examples
37+
```select_features_by_missing_values```
38+
39+
To remove features with (many) missing values.
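
As a rough sketch of this kind of filter, assuming missing values are encoded as NaN in a samples x features array (the name `keep_by_missing_rate` and the threshold parameter are hypothetical, not the CANDLE signature):

```python
import numpy as np


def keep_by_missing_rate(data, max_missing_rate=0.1):
    """Return indices of columns whose fraction of NaN entries is at most the threshold."""
    missing_rate = np.isnan(data).mean(axis=0)
    return np.flatnonzero(missing_rate <= max_missing_rate)


data = np.array([
    [1.0, np.nan, np.nan],
    [2.0, 5.0, np.nan],
    [3.0, 6.0, np.nan],
    [4.0, 7.0, 8.0],
])
# Missing rates per column are 0, 0.25 and 0.75.
kept = keep_by_missing_rate(data, max_missing_rate=0.25)
filtered = data[:, kept]
```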
40+
41+
```select_features_by_variation```
42+
43+
To remove features with little or no variation.
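
One simple way to express such a filter is a threshold on the per-feature standard deviation. The sketch below assumes that measure; the name `keep_by_variation` and the `min_std` parameter are hypothetical, and the CANDLE function may use a different variation measure.

```python
import numpy as np


def keep_by_variation(data, min_std=1e-8):
    """Return indices of columns whose standard deviation (ignoring NaN) exceeds min_std."""
    std = np.nanstd(data, axis=0)
    return np.flatnonzero(std > min_std)


data = np.array([
    [1.0, 0.0, 5.00],
    [1.0, 2.0, 5.01],
    [1.0, 4.0, 4.99],
])
# Column 0 is constant and column 2 barely varies, so only column 1 passes.
kept = keep_by_variation(data, min_std=0.1)
```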
44+
45+
```select_decorrelated_features```
46+
47+
To select a subset of features that are not identical or highly correlated with each other.
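
Decorrelation can be done in several ways; the sketch below uses a greedy pass over the absolute Pearson correlation matrix, which is an assumption and not necessarily what the CANDLE function does. The name `select_decorrelated` and the threshold parameter are hypothetical.

```python
import numpy as np


def select_decorrelated(data, max_abs_corr=0.9):
    """Greedily keep columns whose |Pearson r| with every kept column is <= max_abs_corr.

    Assumes zero-variance columns were removed beforehand, since their
    correlation is undefined (NaN).
    """
    corr = np.abs(np.corrcoef(data, rowvar=False))
    kept = []
    for j in range(data.shape[1]):
        if all(corr[j, k] <= max_abs_corr for k in kept):
            kept.append(j)
    return kept


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
data = np.column_stack([
    x,                                   # feature 0
    2 * x + 1,                           # feature 1: linear copy of feature 0
    np.array([5.0, 1.0, 4.0, 2.0, 3.0]), # feature 2: weakly correlated with feature 0
])
kept = select_decorrelated(data, max_abs_corr=0.9)
```

The greedy order matters: earlier columns win ties, so features can be pre-sorted by any preference (for example, variance) before calling it.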
48+
49+
## Feature (Gene) Selection Functions Specific to Pilot 1 Applications
50+
51+
```coxen_single_drug_gene_selection```
52+
53+
To perform co-expression extrapolation (COXEN) analysis that selects predictive and generalizable genes for predicting the response of tumor cells to a specific drug.
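
The general COXEN idea can be sketched in two stages: first rank genes by how strongly they relate to drug response in a source dataset, then keep the candidates whose co-expression patterns are best preserved in a target dataset. The code below is a simplified illustration under assumptions (Pearson correlation as the predictiveness score, hypothetical function and parameter names) and is not the CANDLE implementation.

```python
import numpy as np


def coxen_sketch(source_expr, source_response, target_expr,
                 n_predictive, n_generalizable):
    """Return indices of genes that are predictive in the source data and
    whose co-expression pattern is concordant between source and target.

    source_expr, target_expr: samples x genes arrays over the same genes.
    source_response: drug response for each source sample.
    """
    if n_predictive < 3:
        raise ValueError("need at least 3 candidate genes to compare profiles")

    # Stage 1: rank genes by |Pearson r| with drug response in the source data.
    n_genes = source_expr.shape[1]
    score = np.array([
        abs(np.corrcoef(source_expr[:, j], source_response)[0, 1])
        for j in range(n_genes)
    ])
    candidates = np.argsort(-score)[:n_predictive]

    # Stage 2: gene-gene co-expression matrices among candidates in each dataset.
    src_cc = np.corrcoef(source_expr[:, candidates], rowvar=False)
    tgt_cc = np.corrcoef(target_expr[:, candidates], rowvar=False)

    # For each candidate, correlate its co-expression profile across the two
    # datasets, leaving out the trivial self-correlation entry.
    k = len(candidates)
    concordance = np.empty(k)
    for i in range(k):
        others = np.arange(k) != i
        concordance[i] = np.corrcoef(src_cc[i, others], tgt_cc[i, others])[0, 1]

    keep = np.argsort(-concordance)[:n_generalizable]
    return candidates[keep]


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
source = rng.normal(size=(40, 20))   # e.g. cell-line expression
target = rng.normal(size=(30, 20))   # e.g. tumor expression
response = source[:, 0] + source[:, 1] + 0.1 * rng.normal(size=40)
selected = coxen_sketch(source, response, target,
                        n_predictive=10, n_generalizable=5)
```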
54+
55+
```coxen_multi_drug_gene_selection```
56+
57+
To extend the COXEN approach for selecting genes to predict the response of tumor cells to multiple drugs in precision oncology applications.
58+
59+
# Running the example
3760

3861
The code demonstrates feature selection methods that CANDLE provides.
3962
