Commit 81ddf23

committed
gProfiler adde
1 parent 734e8c1 commit 81ddf23

File tree

4 files changed

+631
-4
lines changed

.DS_Store

0 Bytes
Binary file not shown.

.Rhistory

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,3 @@
1-
"Hands-on with gProfiler (breakout rooms)",
2-
"Break",
3-
"STRING (https://string-db.org/)",
41
"Reactome (https://reactome.org/)",
52
"GSEA (GenePattern) (https://cloud.genepattern.org/gp/pages/index.jsf)",
63
"3 hrs",
@@ -510,3 +507,6 @@ knitr::include_graphics("images/go_structure.png")
510507
knitr::include_graphics("images/NOTCH_signaling_pathway_reactome.png")
511508
knitr::include_graphics("images/GSEA-homegraphic.gif")
512509
getwd()
510+
knitr::include_graphics("images/rstudio-new-notebook.jpg")
511+
library(gprofiler)
512+
library(gprofiler2)

06-web-tools.Rmd

Lines changed: 96 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,2 +1,97 @@
1-
# Online tools
1+
# Online Tools
2+
3+
Functional enrichment analysis can be performed using various web-based tools, each of which is designed to meet specific analytical needs. These tools often vary in the databases they use, their statistical approaches, and their capabilities to perform different types of analysis, such as Over-Representation Analysis (ORA) or Gene Set Enrichment Analysis (GSEA).
4+
5+
In this workshop, we will explore several popular tools for functional enrichment analysis: g:Profiler, STRING, Reactome, and MSigDB GSEA. Each tool offers different features, so you can choose the method that best suits your dataset and research question.
6+
7+
8+
## FEA in [g:Profiler](https://biit.cs.ut.ee/gprofiler/gost)
9+
g:Profiler is known for its integration of numerous species and databases. It supports both ORA and GSEA, enabling users to test for enrichment against Gene Ontology (GO) terms, biological pathways, regulatory motifs, and protein databases.
10+
11+
### Steps to perform ORA in g:Profiler:
12+
13+
<span style="color:orange;">- Prepare Input List:</span> Ensure your input is formatted as one gene per line or in a suitable format for g:Profiler.
14+
15+
<span style="color:orange;">- Input Gene List:</span> Paste your prepared gene list directly into the input box on the g:Profiler web page or upload a file containing your list.
16+
17+
<span style="color:orange;">- Select Organism:</span> Choose the appropriate organism from the `Organism` dropdown menu (e.g., *Homo sapiens* for human data).
18+
19+
<span style="color:orange;">- Choose Statistical Domain Scope:</span> Under `Advanced options`, select your preferred statistical background from the `Statistical domain scope` menu. If you choose the "Custom" background, provide your background gene list by pasting it or uploading a file.
20+
21+
<span style="color:orange;">- Set Significance Threshold:</span> Select the desired significance threshold method, such as *g:SCS*, *Bonferroni*, or *Benjamini-Hochberg*.
22+
- Specify the threshold value (e.g., 0.05, 0.1, etc.).
23+
24+
<span style="color:orange;">- Select Functional Annotation Databases: </span> Navigate to the `Data sources` tab and choose one or more databases for analysis. Available options include:
25+
26+
- *Gene Ontology (GO)*: Biological Process, Molecular Function, and Cellular Component.
27+
- *KEGG Pathways*
28+
- *Reactome Pathways*
29+
- *WikiPathways*
30+
- *TRANSFAC*
31+
- *miRTarBase*
32+
- *Human Protein Atlas*
33+
- *CORUM*
34+
- *Human Phenotype Ontology (HP)*
35+
36+
<span style="color:orange;">- Run Query:</span> Run the analysis and review the enriched terms, pathways, and visual outputs. Download the results as needed for further exploration.
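The statistical idea behind the ORA steps above can be sketched offline. The Python example below is a minimal illustration, assuming a one-sided hypergeometric test and toy gene identifiers (`g0` to `g19`); the function names `hypergeom_sf` and `ora_p_value` are hypothetical helpers and this is not g:Profiler's own implementation. It shows why the query list, the term's gene set, and the background all enter the calculation.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K are annotated."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def ora_p_value(query, term_genes, background):
    """Return (overlap size, one-sided over-representation p-value)."""
    background = set(background)
    # Genes outside the background cannot contribute to the test.
    query = set(query) & background
    term = set(term_genes) & background
    overlap = len(query & term)
    return overlap, hypergeom_sf(overlap, len(background), len(term), len(query))


# Toy data (made up for illustration): 20 background genes, a 5-gene term,
# and a 4-gene query that shares 3 genes with the term.
background = [f"g{i}" for i in range(20)]
term_genes = ["g0", "g1", "g2", "g3", "g4"]
query = ["g0", "g1", "g2", "g10"]

overlap, p_value = ora_p_value(query, term_genes, background)
print(overlap, round(p_value, 4))
```

In a real analysis this test is repeated for every term in the selected data sources, which is why the significance threshold step applies a multiple testing correction.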
37+
38+
#### Browse the Results
39+
40+
- **Overview**:
41+
The overview lists the enriched terms across the selected databases and highlights the significant GO terms. It gives a high-level summary of the pathways or terms most relevant to the input data.
42+
43+
- **Detailed Results**:
44+
The detailed results section includes a tabulated format with enriched terms, adjusted p-values, and relevant statistics. Each entry provides information such as the enrichment score, associated genes, and functional annotations, allowing for an in-depth understanding of biological significance.
45+
46+
- **GO Context**:
47+
The Gene Ontology (GO) context is divided into three main categories: Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). The analysis identifies which GO terms are significantly enriched, offering insights into the broader biological implications of the gene set. This helps in pinpointing processes such as cellular responses, metabolic pathways, and molecular interactions.
48+
49+
- **Query Info**:
50+
This section includes specifics about the input data, including the total number of queried genes and any identifiers not recognized or mapped. It also details the statistical background used, the chosen organism, and other analysis settings, ensuring transparency and reproducibility of the results.
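To make the adjusted p-values in the detailed results more concrete, the sketch below implements two of the correction methods named in the threshold step, Bonferroni and Benjamini-Hochberg, in plain Python. The raw p-values are made-up toy numbers, and g:SCS (g:Profiler's own method) is deliberately not reproduced here.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # toy p-values for four terms
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
print(bonf)
print(bh)
```

With a threshold of 0.05, Bonferroni keeps two of these four toy terms while Benjamini-Hochberg keeps all four, which illustrates how the choice of method changes the length of the result table.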
51+
52+
53+
#### {-}
54+
55+
#### Different Backgrounds
56+
57+
#### **Challenge:** How do different backgrounds impact the output? {- .challenge}
58+
59+
Use 'All known genes' in one analysis and a 'Custom' background in another. Download the results of each run by clicking the CSV button. Browse the results in the spreadsheets and identify the differences between the two.
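If you prefer to compare the two downloads programmatically rather than by eye, the following Python sketch shows one way to do it. It assumes the CSV files contain columns named `term_id` and `adjusted_p_value`; check the header of your own download and adjust the `id_col` and `p_col` arguments if they differ. The inline CSV text is made-up placeholder data, and `load_terms` and `compare_runs` are hypothetical helper names.

```python
import csv
import io


def load_terms(handle, id_col="term_id", p_col="adjusted_p_value"):
    """Map each term ID to its adjusted p-value (column names are assumptions)."""
    return {row[id_col]: float(row[p_col]) for row in csv.DictReader(handle)}


def compare_runs(run_a, run_b):
    """Return terms unique to each run and the p-value pairs of shared terms."""
    only_a = sorted(set(run_a) - set(run_b))
    only_b = sorted(set(run_b) - set(run_a))
    shared = {t: (run_a[t], run_b[t]) for t in sorted(set(run_a) & set(run_b))}
    return only_a, only_b, shared


# Placeholder content standing in for the two downloaded files.
all_known_csv = "term_id,adjusted_p_value\nTERM:1,0.001\nTERM:2,0.03\n"
custom_csv = "term_id,adjusted_p_value\nTERM:1,0.01\nTERM:3,0.04\n"

run_all = load_terms(io.StringIO(all_known_csv))
run_custom = load_terms(io.StringIO(custom_csv))
only_all, only_custom, shared = compare_runs(run_all, run_custom)
print(only_all, only_custom, shared)
```

For real files, replace the `io.StringIO(...)` objects with `open("your_file.csv", newline="")`.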
60+
61+
#### **Question 1** {- .rationale}
62+
63+
Which background would you use in your analysis?
64+
65+
#### **Question 2** {- .rationale}
66+
67+
How can one perform Under-Representation Analysis in g:Profiler?
68+
69+
### Steps to perform GSEA in g:Profiler:
70+
<span style="color:orange;">- Prepare Your Pre-ranked List:</span> Steps to provide a ranked gene list are given [here](degust.html).
71+
72+
<span style="color:orange;">- Input Gene List:</span> Paste your prepared gene list directly into the input box on the g:Profiler web page or upload a file containing your list.
73+
74+
<span style="color:orange;">- Select Organism:</span> Same as above.
75+
76+
<span style="color:orange;">- Select Ordered query:</span> The "Ordered query" option in g:Profiler is designed to work with pre-ranked gene lists.
77+
78+
<span style="color:orange;">- Set Significance Threshold:</span> Same as above.
79+
80+
<span style="color:orange;">- Provide a Custom GMT:</span> This GMT file can be downloaded from MSigDB.
81+
82+
<span style="color:orange;">- Run Query:</span> Same as above.
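It helps to know what the "Ordered query" option does with a ranked list. As far as the g:Profiler documentation describes it, the tool tests increasingly large prefixes of the list, starting from the top, which differs from the running-sum statistic used by the GSEA software. The Python sketch below is a simplified illustration of that prefix idea under stated assumptions: a hypergeometric test, unique gene identifiers, a fixed background size, and toy data. The names `hypergeom_sf` and `best_prefix` are hypothetical, and the selection of the best prefix is not corrected for multiple testing.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K are annotated."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def best_prefix(ranked_genes, term_genes, background_size):
    """Test every top-n prefix of the ranked list; return (best p, prefix length)."""
    term = set(term_genes)
    best_p, best_n = 1.0, 0
    hits = 0
    for n, gene in enumerate(ranked_genes, start=1):
        if gene in term:
            hits += 1
        p = hypergeom_sf(hits, background_size, len(term), n)
        if p < best_p:
            best_p, best_n = p, n
    return best_p, best_n


# Toy ranked list: the three strongest genes all belong to the 5-gene term.
ranked = ["g0", "g1", "g2", "g10", "g11", "g12"]
term_genes = ["g0", "g1", "g2", "g3", "g4"]

p_value, prefix_length = best_prefix(ranked, term_genes, background_size=20)
print(prefix_length, round(p_value, 4))
```

In this toy case the strongest signal comes from the top three genes, so the order of the input list matters: shuffling the same six genes would give a different result.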
83+
84+
#### **Challenge:** GSEA with g:Profiler {- .challenge}
85+
Download the Hallmark gene sets (h.all.v2024.1.Hs.symbols.gmt) from [MSigDB](https://www.gsea-msigdb.org/gsea/msigdb/index.jsp) and use it as Custom GMT.
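Before uploading the file, it can be useful to look inside it. A GMT file is tab-delimited, with one gene set per line: the set name, a description field, and then the member genes. The Python sketch below parses that layout; the two example lines are made-up placeholders, not real Hallmark sets, and `parse_gmt` is a hypothetical helper name.

```python
def parse_gmt(lines):
    """Parse GMT lines into {gene set name: [gene, ...]}."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            # Skip blank or malformed lines (need name, description, >= 1 gene).
            continue
        name, _description, *genes = fields
        gene_sets[name] = [g for g in genes if g]
    return gene_sets


# Placeholder lines in GMT layout: name <TAB> description <TAB> genes...
example_lines = [
    "SET_A\thttp://example.org/a\tG1\tG2\tG3\n",
    "SET_B\tna\tG2\tG4\n",
    "\n",
]

gene_sets = parse_gmt(example_lines)
for name, genes in gene_sets.items():
    print(name, len(genes))
```

To inspect the downloaded file, pass an open file handle instead, for example `parse_gmt(open("h.all.v2024.1.Hs.symbols.gmt"))`.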
86+
87+
88+
## STRING
89+
STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) is a resource for exploring protein-protein interaction (PPI) networks. It combines experimental data, predictions, and curated information to build networks that highlight functional relationships, helping to reveal shared pathways or biological processes within gene or protein lists.
90+
91+
## Reactome
92+
Reactome is an open-source database of curated biological pathways across species, offering pathway maps and enrichment tools to analyze gene lists in a pathway-focused context. It’s ideal for visualising data within established biochemical and cellular processes.
93+
94+
## MSigDB GSEA
95+
MSigDB (Molecular Signatures Database) is a collection of gene sets for Gene Set Enrichment Analysis (GSEA), representing pathways and gene signatures linked to biological states or diseases. It helps identify enriched gene sets, aiding the analysis of gene expression changes and key pathways in experimental data.
96+
297

degust.html

Lines changed: 532 additions & 0 deletions
Large diffs are not rendered by default.
