Commit 753bd5a

Added more details
1 parent c204b19 commit 753bd5a

5 files changed: 63 additions and 45 deletions

09-r-environment-setup.Rmd

Lines changed: 6 additions & 4 deletions
@@ -168,7 +168,7 @@ Saving the workspace image saves all objects from the session such as your varia
168168

169169
Now that we have a clear workspace, we will prepare for the first analysis activity by opening the notebook and checking our working directory.
170170

171-
You have previosuly downloaded an unzipped `Functional_enrichment_workshop_2024`. This contains a folder `day_2`.
171+
You have previously downloaded `Functional_enrichment_workshop_2024`. This contains a folder `day_2`.
172172

173173
➤ On the `Files` pane, open the `day_2` folder and confirm that it contains the input data file `Pezzini_DE.txt`
174174

@@ -194,12 +194,14 @@ Scroll down to the code chunk labelled `Load input data`. Note that the filepath
194194

195195
Immediately above the `Load input data` is a code chunk labelled `Load R packages`. This contains all of the R packages required to run the analysis contained within the workbook. Loading all required packages within the notebook, rather than directly via the console, ensures that anyone running your notebook does not encounter errors if they forget to load a required package.
196196

197-
Note that the packages that are loaded to the session with the R `library` command must first be installed; this has already been been done for you on these VMs. Attempts to load a package that is not installed will meet a fatal error, and installation can then be peformed (not difficult in R) before resuming.
197+
Note that packages loaded into the session with the R `library` command must first be installed; this has already been done for you on these VMs. Attempting to load a package that is not installed will produce an error, and the installation can then be performed (not difficult in R) before resuming.
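
The note above about loading packages that are not installed can be illustrated with a minimal sketch. The package names below are hypothetical placeholders, not the list used in the workshop notebook; `requireNamespace()` is used because it returns `FALSE` instead of raising an error when a package is missing.

```r
# Hypothetical package list for illustration only; the notebook's real list may differ.
required_pkgs <- c("stats", "utils", "notARealPackage123")

# requireNamespace() returns TRUE/FALSE rather than stopping with an error.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)   # CRAN packages
  # BiocManager::install(missing_pkgs)  # Bioconductor packages
}

# Load only the packages that are available.
for (pkg in required_pkgs[is_installed]) {
  library(pkg, character.only = TRUE)
}
```

On the workshop VMs this check should report nothing missing, since the required packages are pre-installed.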
198198

199-
Note that the code chunk label also contains the text `include=FALSE`. This prevents the loading of libraries, which can at times have verbose output, from cluttering up your rendered notebook when it is previewed or knit.
199+
Note that the code chunk label also contains the text `include=FALSE`. This prevents the loading of libraries (which can at times have verbose output) from cluttering up your rendered notebook when it is previewed or knit.
200200

201201
<p>&nbsp;</p> <!-- insert blank line -->
202202

203203
&#x27A4; Run the `Load R packages` code chunk.
204204

205-
Please let us know if you have any errors loading the packages. Don't be alarmed that the output is red! :relaxed:
205+
Please let us know if you have any errors loading the packages :raised_hand:
206+
207+
Don't be alarmed that the output is <span style="color: red;">red</span>! :slightly_smiling_face:

10-gprofiler2.Rmd

Lines changed: 19 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,16 +1,23 @@
11
# ORA with gprofiler2
22

3-
[gprofiler2](https://cran.r-project.org/web/packages/gprofiler2/index.html) is the R interface to the 'g:Profiler' toolset that you used in day 1 of the workshop.
3+
[gprofiler2](https://cran.r-project.org/web/packages/gprofiler2/index.html) is the R interface to the `g:Profiler` web-based toolset that you used in day 1 of the workshop.
44

5-
Like the web interface, gprofiler2 performs ORA with `g:GOSt` against multiple databases simultaneously.
5+
Like the web interface, `gprofiler2` performs ORA with `g:GOSt` against multiple databases simultaneously.
66

77
It supports all the same organisms, namespaces and data sources as the web tool. The list of organisms and corresponding data sources is available [here](https://biit.cs.ut.ee/gprofiler/page/organism-list) (n = 984).
8-
The full list of namespaces that g:Profiler recognizes is available [here](https://biit.cs.ut.ee/gprofiler/page/namespaces-list).
8+
9+
The full list of supported namespaces is available [here](https://biit.cs.ut.ee/gprofiler/page/namespaces-list).
10+
11+
The `gprofiler2` user guide can be found [here](https://cran.r-project.org/web/packages/gprofiler2/gprofiler2.pdf).
12+
13+
<p>&nbsp;</p> <!-- insert blank line -->
914

1015
## Input data
1116

1217
Since we are doing ORA, we will need a filtered gene list, and a background gene list. We will continue with the RNAseq dataset from [Pezzini et al 2016](https://link.springer.com/article/10.1007/s10571-016-0403-y) introduced yesterday.
1318

19+
<p>&nbsp;</p> <!-- insert blank line -->
20+
1421
## Activity overview
1522

1623
1. Load input dataset (a gene matrix with adjusted P values and log2 fold change values)
@@ -19,13 +26,16 @@ Since we are doing ORA, we will need a filtered gene list, and a background gene
1926
4. Run ORA with `gost` function
2027
5. Save the tabular results to a file
2128
6. Visualise the results
22-
7. Run a gost multi-query for up-regulated and down-regulated genes
23-
8. Compare gprofiler2 R results to the g:Profiler web results
29+
7. Run a `gost` multi-query for up-regulated and down-regulated genes
30+
8. Compare `gprofiler2` R results to the `g:Profiler` web results
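
Steps 1 to 4 of the overview above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's code: the toy table, its column names (`gene_id`, `log2FC`, `padj`), the thresholds, the organism and the chosen sources are all hypothetical. The `gost` call is guarded so that the block runs without internet access; the toy IDs would need to be replaced by real gene IDs to return enrichments.

```r
# Toy differential expression table; column names and values are hypothetical.
de <- data.frame(
  gene_id = c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"),
  log2FC  = c(2.5, -0.3, -1.8, 0.1, 3.1),
  padj    = c(0.001, 0.20, 0.01, 0.80, 0.04),
  stringsAsFactors = FALSE
)

# Background list: every gene that was tested in the experiment.
background <- de$gene_id

# Filtered list: significant genes with a large enough fold change
# (illustrative thresholds).
is_deg <- de$padj < 0.05 & abs(de$log2FC) >= 1
deg <- de$gene_id[is_deg]

run_online <- FALSE  # set to TRUE with gprofiler2 installed and internet access
if (run_online && requireNamespace("gprofiler2", quietly = TRUE)) {
  ora <- gprofiler2::gost(
    query        = deg,
    organism     = "hsapiens",        # assumed organism
    custom_bg    = background,
    domain_scope = "custom",          # restrict statistics to the custom background
    sources      = c("GO:BP", "KEGG") # assumed subset of data sources
  )
  head(ora$result)
}
```

Supplying the background through `custom_bg` keeps the test restricted to genes that could have been detected in the experiment.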
31+
32+
<p>&nbsp;</p> <!-- insert blank line -->
2433

2534
&#x27A4; Go back to your RStudio interface, where we have opened the `gprofiler2.Rmd` notebook and loaded the required R packages.
2635

27-
**Instructions and information for the rest of this activity will continue from the notebook.**
36+
**Instructions for the analysis will continue from the notebook.**
2837

38+
<p>&nbsp;</p> <!-- insert blank line -->
2939

3040
## End of activity summary
3141

@@ -37,8 +47,10 @@ Since we are doing ORA, we will need a filtered gene list, and a background gene
3747

3848
The last task is to `knit` the notebook. Our notebook is editable, and can be changed. Deleting code deletes the output, so we could lose valuable details. If we knit the notebook to HTML, we have a permanent static copy of the work.
3949

50+
<p>&nbsp;</p> <!-- insert blank line -->
51+
4052
&#x27A4; Knit the notebook to HTML
4153

42-
Note that the notebook will only knit if there are no errors in the code. If your knit fails, please ask for assistance resolving the errors.
54+
Note that the notebook will only knit if there are no errors in the code. If your knit fails, please ask for assistance resolving the errors :raised_hand:
4355

4456

11-clusterprofiler.Rmd

Lines changed: 25 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -1,62 +1,53 @@
11
# GSEA with clusterProfiler
22

33

4-
[clusterProfiler](https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html) is a comprehensive suite of enrichment tools. It has inbuilt functions to run ORA (`enrich<DB>`) or GSEA (`gse<DB>`) over commonly used databases (GO, KEGG, KEGG Modules, DAVID, Pathway Commons, WikiPathways) as well as generic functions to perform ORA (`enricher`) or GSEA (`gsea`) with custom gene sets.
4+
[clusterProfiler](https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html) is a comprehensive suite of enrichment tools. It has functions to run ORA or GSEA over commonly used databases (GO, KEGG, KEGG Modules, DAVID, Pathway Commons, WikiPathways) as well as generic functions to perform ORA or GSEA with custom gene sets.
55

6-
It has a companion plotting package `enrichPlot` dedicated to plotting `clusterProfiler` results.
6+
It has a companion plotting package [enrichplot](https://www.bioconductor.org/packages/release/bioc/html/enrichplot.html) dedicated to plotting enrichment results.
77

88
The `clusterProfiler` user guide can be found [here](https://bioconductor.org/packages/devel/bioc/manuals/clusterProfiler/man/clusterProfiler.pdf).
99

1010
The `enrichplot` user guide can be found [here](https://www.bioconductor.org/packages/devel/bioc/manuals/enrichplot/man/enrichplot.pdf).
1111

12+
One of the challenges when working with `clusterProfiler` for FEA is that each enrichment function supports different organisms and has different namespace requirements, so you cannot necessarily use all of the functions over the same gene list. In this activity, we will review the FEA functions and investigate their requirements, before performing a gene ID conversion with the `bitr` function to enable compatibility with our dataset ([Pezzini et al 2016](https://link.springer.com/article/10.1007/s10571-016-0403-y)).
1213

13-
## Input data
14-
15-
We will use the same RNAseq dataset from the previous activity ([Pezzini et al 2016](https://link.springer.com/article/10.1007/s10571-016-0403-y)).
16-
14+
<p>&nbsp;</p> <!-- insert blank line -->
1715

1816
## Activity overview
1917

20-
** THINKING TO DROP GSEGO ENTIERLY, AND JUST DO GESKEGG
21-
22-
PAT A: A REVIEW OF SUPPORTED DBS, DIFFERENT SUPPORTED ORGANISMS AND NAMESPACES DEPENDING ON DB, AND HOW TO FIND OUT WHAT IS REQUIRED/APPLICABLE PER EACH FUNCTION
23-
PART B - GSEKEGG WHICH REQUIRES A GENE ID CONVERSION USIN GBITR, THEN RUN GSEA, THEN SOME PPLOTS
24-
PART C - INCLUDE HERE OR IN NOTEBOOK 4 - TERM2GENE AND TERM2NAME
25-
26-
REASON: GO IS DONE TO DEATH, AND THE TRICK OF THIS PACKAGE IS THAT EACH FUNCTION HAS ITS OWN LIST OF SUPPORTED SPECIES AND NAMESPACES DEPENDING ON THE DATABASE, THE PDF IS NOT SUPER CLEAR ON WHICH IS WHICH.
18+
1. Explore the functions of `clusterProfiler` including which FEA functions support which organisms and which namespaces
19+
2. Load input dataset (a gene matrix with adjusted P values and log2 fold change values)
20+
3. Extract the gene IDs and sort by log2 fold change to create the GSEA gene list R object
21+
4. Use `bitr` to convert gene IDs from ENSEMBL to ENTREZ for compatibility with `gseKEGG`
22+
5. Perform GSEA with `gseKEGG`
23+
6. Visualise results with `enrichplot`
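
Steps 3 to 5 of the overview above might look roughly like the sketch below. The toy table and its column names are hypothetical, and `org.Hs.eg.db` and `organism = "hsa"` are assumptions about the annotation database and KEGG organism code. The `bitr` and `gseKEGG` calls are guarded because they need the Bioconductor packages, internet access and real Ensembl IDs in place of the toy IDs.

```r
# Toy table; column names and values are hypothetical.
de <- data.frame(
  gene_id = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  log2FC  = c(0.4, 2.1, -1.7, -0.2),
  stringsAsFactors = FALSE
)

# GSEA input: a named numeric vector sorted in decreasing order.
gene_list <- de$log2FC
names(gene_list) <- de$gene_id
gene_list <- sort(gene_list, decreasing = TRUE)

run_online <- FALSE  # set to TRUE with the packages installed and real Ensembl IDs
if (run_online &&
    requireNamespace("clusterProfiler", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  # Convert ENSEMBL IDs to ENTREZ IDs.
  id_map <- clusterProfiler::bitr(
    names(gene_list),
    fromType = "ENSEMBL",
    toType   = "ENTREZID",
    OrgDb    = "org.Hs.eg.db"
  )
  # Keep one-to-one mappings only, then rename the ranked vector.
  id_map <- id_map[!duplicated(id_map$ENSEMBL), ]
  id_map <- id_map[!duplicated(id_map$ENTREZID), ]
  kegg_list <- gene_list[id_map$ENSEMBL]
  names(kegg_list) <- id_map$ENTREZID
  kegg_list <- sort(kegg_list, decreasing = TRUE)

  kegg_res <- clusterProfiler::gseKEGG(geneList = kegg_list, organism = "hsa")
}
```

Dropping duplicated mappings is one simple way to handle many-to-many ID conversions; the notebook may handle them differently.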
2724

28-
WOULD LIKE MORE TIME TO DO THE TERM2GENE ADN TERM2NAME AS MANY APPLICANTS MENTIONED THIS EITHER INDIRECTLY VIA REQUEST FOR NON MODEL OR DIRECTLY.
25+
<p>&nbsp;</p> <!-- insert blank line -->
2926

30-
1. Load input dataset (a gene matrix with adjusted P values and log2 fold change values)
31-
2. Extract the gene IDs and sort by log2 fold change to create the GSEA gene list
32-
3. Use the `gseGO` function to run GSEA over GO MF
33-
4. Visualise GO results with `enrichplot`
34-
5. Use `bitr` to convert gene IDs then use the `gseKEGG` function to run GSEA over KEGG
35-
6. Visualise KEGG results with `enrichplot`
36-
7. Review `enricher` and `gsea` generic functions to perform ORA and GSEA
27+
&#x27A4; Go back to your RStudio interface and clear your environment by selecting `Session` &rarr; `Quit session` &rarr; `Don't save` &rarr; `Start new session`
3728

3829

39-
&#x27A4; Refresh your Rstudio workspace with option 1 or option 2
30+
<p>&nbsp;</p> <!-- insert blank line -->
4031

41-
***Option 1: close and re-open RStudio***
32+
&#x27A4; Open the `clusterProfiler.Rmd` notebook using `File` &rarr; `Open file`, or use the keyboard shortcut `ctrl + o`.
4233

43-
Close RStudio, and if asked `Save workspace image to ~/R.Data?` select `Don't Save`. Then, re-open RStudio.
4434

45-
***Option 2: manualy clear environment and history***
35+
**Instructions for the analysis will continue from the notebook.**
4636

47-
Close the Rmd file, clear the command history by selecting the broom icon in the history pane, then clear all objects from the environment by entering the following R command in the console:
37+
<p>&nbsp;</p> <!-- insert blank line -->
4838

49-
```{r}
50-
rm(list = ls())
51-
```
52-
53-
&#x27A4; Open the `clusterProfiler.Rmd` notebook in RStudio
54-
55-
**Instructions and information for the rest of this activity will continue from the notebook.**
39+
## End of activity summary
5640

41+
- We have explored the supported organisms and namespaces of the `clusterProfiler` enrichment functions
42+
- We have extracted a ranked gene list for GSEA and converted the gene IDs for compatibility with `gseKEGG`
43+
- We have performed GSEA on the KEGG database with `gseKEGG` and visualised the results with multiple plot types
44+
- We have captured all version details relevant to the session within the R notebook
5745

58-
## End of activity summary
46+
<p>&nbsp;</p> <!-- insert blank line -->
5947

48+
## Poll
6049

50+
:question: What was your favourite plot? :thinking:
6151

52+
This may be the one you found most informative, easiest to interpret, most eye-catching...
6253

12-webgestaltr.Rmd

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
# WebGestaltR
2+
3+
LIST OF AMAZING THINGS ABOUT WEBGESTALTR:
4+
- makes great html reports with interactive plots and links to external dbs
5+
- saves the results to disk when running, no need to export stuff and save files manually
6+
- many dbs and gene lists supported (n = 70)
7+
- supports metabolomics, with 15 different ID types, see new paper https://academic.oup.com/nar/article/52/W1/W415/7684598#google_vignette
8+
- can be used for novel species, but I haven't tried it yet...
9+
- does ORA, GSEA, and NTA. I wonder if the NTA works at all for novel species???!
10+
- super easy to run. many supported namespaces (n = 73), does not require conversions for different functions like clusterProfiler, can even have a different namespace for the ORA gene list and the background list
11+
- "Multiple databases in a vector are supported for ORA and GSEA"

13-novel-species.Rmd

Lines changed: 2 additions & 0 deletions
@@ -74,6 +74,8 @@ There are then provided to the universal enrichment functions `GSEA` and `enrich
7474

7575
In RStudio, we will extract these file formats from the eggNOG annotations file for axolotl and proceed with FEA.
7676

77+
Acknowledgement to [Armin Dadras](https://github.com/dadrasarmin) for sharing his [code](https://github.com/dadrasarmin/enrichment_analysis_for_non_model_organism) to extract `TERM2GENE` and `TERM2NAME` from `emapper` output.
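
To give a feel for the two objects, here is a minimal base R sketch. It is not the acknowledged code: the toy annotation table, its column names and the term IDs are hypothetical stand-ins for real `emapper` output. `TERM2GENE` is a two-column table of term and gene, and `TERM2NAME` is a two-column table of term and description.

```r
# Toy annotation table standing in for emapper output (hypothetical columns).
# Each gene has a comma-separated list of terms; "-" means no annotation.
annot <- data.frame(
  gene  = c("g1", "g2", "g3"),
  terms = c("T1,T2", "T2", "-"),
  stringsAsFactors = FALSE
)

# Drop unannotated genes, then expand to one row per term-gene pair.
annot <- annot[annot$terms != "-", ]
terms_per_gene <- strsplit(annot$terms, ",", fixed = TRUE)
term2gene <- data.frame(
  term = unlist(terms_per_gene),
  gene = rep(annot$gene, lengths(terms_per_gene)),
  stringsAsFactors = FALSE
)

# Term descriptions (toy values).
term_names <- c(T1 = "toy term one", T2 = "toy term two")
term2name <- data.frame(
  term = names(term_names),
  name = unname(term_names),
  stringsAsFactors = FALSE
)

# These tables can then be passed to the universal functions, for example:
# clusterProfiler::enricher(gene = my_genes, TERM2GENE = term2gene, TERM2NAME = term2name)
# where my_genes is a hypothetical vector of gene IDs of interest.
```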
78+
7779
### WebGestaltR
7880

7981
This tool can perform ORA or GSEA for any organism with the provision of custom `GMT` and `description` file.
