|
1 | 1 | # GSEA with clusterProfiler |
2 | 2 |
|
3 | 3 |
|
4 | | -[clusterProfiler](https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html) is a comprehensive suite of enrichment tools. It has inbuilt functions to run ORA (`enrich<DB>`) or GSEA (`gse<DB>`) over commonly used databases (GO, KEGG, KEGG Modules, DAVID, Pathway Commons, WikiPathways) as well as generic functions to perform ORA (`enricher`) or GSEA (`gsea`) with custom gene sets. |
| 4 | +[clusterProfiler](https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html) is a comprehensive suite of enrichment tools. It has functions to run ORA or GSEA over commonly used databases (GO, KEGG, KEGG Modules, DAVID, Pathway Commons, WikiPathways) as well as generic functions to perform ORA or GSEA with custom gene sets. |
5 | 5 |
|
6 | | -It has a companion plotting package `enrichPlot` dedicated to plotting `clusterProfiler` results. |
| 6 | +It has a companion plotting package [enrichplot](https://www.bioconductor.org/packages/release/bioc/html/enrichplot.html) dedicated to plotting enrichment results. |
7 | 7 |
|
8 | 8 | The `clusterProfiler` user guide can be found [here](https://bioconductor.org/packages/devel/bioc/manuals/clusterProfiler/man/clusterProfiler.pdf). |
9 | 9 |
|
10 | 10 | The `enrichplot` user guide can be found [here](https://www.bioconductor.org/packages/devel/bioc/manuals/enrichplot/man/enrichplot.pdf). |
11 | 11 |
|
| 12 | +One of the challenges when working with `clusterProfiler` for functional enrichment analysis (FEA) is that each enrichment function supports different organisms and has different namespace requirements, so you cannot necessarily use all of the functions on the same gene list. In this activity, we will review the FEA functions and investigate their requirements, before performing a gene ID conversion with the `bitr` function to enable compatibility with the [Pezzini et al 2016](https://link.springer.com/article/10.1007/s10571-016-0403-y) dataset. |
12 | 13 |
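As a rough illustration of how the supported namespaces and organisms could be checked before choosing a function, the sketch below lists the ID types an annotation package can translate between and looks up a KEGG organism code. It assumes a human dataset and the `org.Hs.eg.db` annotation package, neither of which is stated in this README, and it only runs the lookups when the packages are installed.

```r
# Assumption: human data annotated with org.Hs.eg.db (not stated in this README).
have_pkgs <- requireNamespace("clusterProfiler", quietly = TRUE) &&
  requireNamespace("org.Hs.eg.db", quietly = TRUE)

if (have_pkgs) {
  # Namespaces (key types) the annotation package can translate between,
  # i.e. the values accepted as fromType/toType by bitr
  print(AnnotationDbi::keytypes(org.Hs.eg.db::org.Hs.eg.db))

  # KEGG organism code required by the KEGG functions (enrichKEGG, gseKEGG)
  print(clusterProfiler::search_kegg_organism("Homo sapiens",
                                              by = "scientific_name"))
} else {
  message("clusterProfiler and/or org.Hs.eg.db not installed; skipping lookups")
}
```

For another species, the annotation package and the search string would need to be swapped accordingly.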
|
13 | | -## Input data |
14 | | - |
15 | | -We will use the same RNAseq dataset from the previous activity ([Pezzini et al 2016](https://link.springer.com/article/10.1007/s10571-016-0403-y)). |
16 | | - |
| 14 | +<p> </p> <!-- insert blank line --> |
17 | 15 |
|
18 | 16 | ## Activity overview |
19 | 17 |
|
20 | | -** THINKING TO DROP GSEGO ENTIERLY, AND JUST DO GESKEGG |
21 | | - |
22 | | -PAT A: A REVIEW OF SUPPORTED DBS, DIFFERENT SUPPORTED ORGANISMS AND NAMESPACES DEPENDING ON DB, AND HOW TO FIND OUT WHAT IS REQUIRED/APPLICABLE PER EACH FUNCTION |
23 | | -PART B - GSEKEGG WHICH REQUIRES A GENE ID CONVERSION USIN GBITR, THEN RUN GSEA, THEN SOME PPLOTS |
24 | | -PART C - INCLUDE HERE OR IN NOTEBOOK 4 - TERM2GENE AND TERM2NAME |
25 | | - |
26 | | -REASON: GO IS DONE TO DEATH, AND THE TRICK OF THIS PACKAGE IS THAT EACH FUNCTION HAS ITS OWN LIST OF SUPPORTED SPECIES AND NAMESPACES DEPENDING ON THE DATABASE, THE PDF IS NOT SUPER CLEAR ON WHICH IS WHICH. |
| 18 | +1. Explore the functions of `clusterProfiler` including which FEA functions support which organisms and which namespaces |
| 19 | +2. Load input dataset (a gene matrix with adjusted P values and log2 fold change values) |
| 20 | +3. Extract the gene IDs and sort by log2 fold change to create the GSEA gene list R object |
| 21 | +4. Use `bitr` to convert gene IDs from ENSEMBL to ENTREZ for compatibility with `gseKEGG` |
| 22 | +5. Perform GSEA with `gseKEGG` |
| 23 | +6. Visualise results with `enrichplot` |
27 | 24 |
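Steps 3 to 5 of the overview can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the toy data frame, its column names (`ensembl_id`, `log2FC`, `padj`), the helper `run_kegg_gsea`, the human organism code `hsa` and the `org.Hs.eg.db` annotation package are all assumptions. The helper is defined but not called, because it needs the Bioconductor packages and internet access to KEGG.

```r
# Toy stand-in for the differential expression table.
# Column names are hypothetical; the real input file may differ.
de <- data.frame(
  ensembl_id = c("ENSG00000141510", "ENSG00000012048",
                 "ENSG00000146648", "ENSG00000171862"),
  log2FC     = c(-1.2, 2.5, 0.3, -0.4),
  padj       = c(0.01, 0.001, 0.6, 0.2)
)

# GSEA takes all genes as a named numeric vector sorted in decreasing order
ranked <- sort(setNames(de$log2FC, de$ensembl_id), decreasing = TRUE)

# Hypothetical helper (not run here): convert IDs, then run GSEA over KEGG.
# Assumes human data ("hsa", org.Hs.eg.db).
run_kegg_gsea <- function(ranked) {
  ids <- clusterProfiler::bitr(names(ranked),
                               fromType = "ENSEMBL",
                               toType   = "ENTREZID",
                               OrgDb    = "org.Hs.eg.db")
  # One simple way to handle ambiguous mappings: keep the first match only
  ids <- ids[!duplicated(ids$ENSEMBL) & !duplicated(ids$ENTREZID), ]
  entrez_ranked <- setNames(ranked[ids$ENSEMBL], ids$ENTREZID)
  entrez_ranked <- sort(entrez_ranked, decreasing = TRUE)
  clusterProfiler::gseKEGG(geneList = entrez_ranked,
                           organism = "hsa",
                           keyType  = "ncbi-geneid")
}
```

On a full-size ranked list, the returned object could then be passed to an `enrichplot` function such as `dotplot()` for step 6. Genes that `bitr` cannot map are dropped, so it is worth checking how many IDs survive the conversion.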
|
28 | | -WOULD LIKE MORE TIME TO DO THE TERM2GENE ADN TERM2NAME AS MANY APPLICANTS MENTIONED THIS EITHER INDIRECTLY VIA REQUEST FOR NON MODEL OR DIRECTLY. |
| 25 | +<p> </p> <!-- insert blank line --> |
29 | 26 |
|
30 | | -1. Load input dataset (a gene matrix with adjusted P values and log2 fold change values) |
31 | | -2. Extract the gene IDs and sort by log2 fold change to create the GSEA gene list |
32 | | -3. Use the `gseGO` function to run GSEA over GO MF |
33 | | -4. Visualise GO results with `enrichplot` |
34 | | -5. Use `bitr` to convert gene IDs then use the `gseKEGG` function to run GSEA over KEGG |
35 | | -6. Visualise KEGG results with `enrichplot` |
36 | | -7. Review `enricher` and `gsea` generic functions to perform ORA and GSEA |
| 27 | +➤ Go back to your RStudio interface and clear your environment by selecting `Session` → `Quit session` → `Don't save` → `Start new session` |
37 | 28 |
|
38 | 29 |
|
39 | | -➤ Refresh your Rstudio workspace with option 1 or option 2 |
| 30 | +<p> </p> <!-- insert blank line --> |
40 | 31 |
|
41 | | -***Option 1: close and re-open RStudio*** |
| 32 | +➤ Open the `clusterProfiler.Rmd` notebook using `File` → `Open file`, or use the keyboard shortcut `ctrl + o`. |
42 | 33 |
|
43 | | -Close RStudio, and if asked `Save workspace image to ~/R.Data?` select `Don't Save`. Then, re-open RStudio. |
44 | 34 |
|
45 | | -***Option 2: manualy clear environment and history*** |
| 35 | +**Instructions for the analysis will continue from the notebook.** |
46 | 36 |
|
47 | | -Close the Rmd file, clear the command history by selecting the broom icon in the history pane, then clear all objects from the environment by entering the following R command in the console: |
| 37 | +<p> </p> <!-- insert blank line --> |
48 | 38 |
|
49 | | -```{r} |
50 | | -rm(list = ls()) |
51 | | -``` |
52 | | - |
53 | | -➤ Open the `clusterProfiler.Rmd` notebook in RStudio |
54 | | - |
55 | | -**Instructions and information for the rest of this activity will continue from the notebook.** |
| 39 | +## End of activity summary |
56 | 40 |
|
| 41 | +- We have explored the supported organisms and namespaces of the `clusterProfiler` enrichment functions |
| 42 | +- We have extracted a ranked gene list for GSEA and converted the gene IDs for compatibility with `gseKEGG` |
| 43 | +- We have performed GSEA on the KEGG database with `gseKEGG` and visualised the results with multiple plot types |
| 44 | +- We have captured all version details relevant to the session within the R notebook |
57 | 45 |
|
58 | | -## End of activity summary |
| 46 | +<p> </p> <!-- insert blank line --> |
59 | 47 |
|
| 48 | +## Poll |
60 | 49 |
|
| 50 | +:question: What was your favourite plot? :thinking: |
61 | 51 |
|
| 52 | +This may be the one you found most informative, easiest to interpret, most eye-catching... |
62 | 53 |
|