MonashBioinformaticsPlatform
diff --git a/.DS_Store b/.DS_Store
0 Bytes
diff --git a/06-web-tools.Rmd b/06-web-tools.Rmd
Lines changed: 139 additions & 8 deletions
diff --git a/degust.html b/degust.html
Lines changed: 1215 additions & 60 deletions
diff --git a/images/Browse_Modules_gsea.png b/images/Browse_Modules_gsea.png
152 KB
diff --git a/images/GSEA-logo.gif b/images/GSEA-logo.gif
8.49 KB
diff --git a/images/GenePattern-Jobs.png b/images/GenePattern-Jobs.png
14.8 KB
diff --git a/images/GenePattern-Run.png b/images/GenePattern-Run.png
79.6 KB
diff --git a/images/GenePattern-logo.png b/images/GenePattern-logo.png
9.08 KB
diff --git a/images/MSigDB_GSEA_GUI.png b/images/MSigDB_GSEA_GUI.png
1.17 MB
diff --git a/style.css b/style.css
Lines changed: 11 additions & 0 deletions
@@ -37,7 +37,7 @@ gProfiler is known for its integration of numerous species and databases. It sup
 
 <span style="color:orange;">- Run Query:</span> Run the analysis and review the enriched terms, pathways, and visual outputs. Download the results as needed for further exploration.
 
-#### Browse the Results
+#### Browse the gProfiler Results
 
 - **Overview**:
   The analysis provided a comprehensive list of enriched terms across selected databases, highlighting significant GO. The results give a high-level summary of pathways or terms most relevant to the input data.
@@ -64,7 +64,7 @@ Use 'All known genes' in one analysis and 'Custom' background in another. Downlo
 
 Which background would you use in your analysis?
 
-How is multi-query provided to gProfiler?
+How is multi-query support implemented in gProfiler?
 
 How can one perform Under Representation Analysis in gProfiler?
 
@@ -111,9 +111,9 @@ NOTE: In cases where long list of features is provided, STRING may chnage some o
  - previews of protein structures are not shown
  - the network edges show interaction confidence only
 
-### Browse the Results
+### Browse the STRING Results
 
-STRING come up with a number of tabs as the outputs.
+STRING generates multiple tabs as output, shown here:
 
 ![](images/string-results-tabs.png){ width=100% }
 
@@ -177,11 +177,142 @@ The `Clusters` tab essentially provides three different types of clustering algo
 
 Clusters can be downloaded in `.tsv` format.
 
+#### **Question** {- .rationale}
+What was the overlap in enrichment terms between gProfiler and STRING at FDR ≤ 0.05?
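+
+One way to approach this question is to export both result tables and intersect the term identifiers. The sketch below uses toy data frames; the column names (`term_id`, `fdr`) and all values are illustrative assumptions, not the actual export headers of either tool, so rename them to match the downloaded files.
+
+```r
+# Toy stand-ins for the exported result tables (hypothetical values).
+gprofiler <- data.frame(
+  term_id = c("GO:0007399", "GO:0030182", "GO:0006412"),
+  fdr     = c(0.001, 0.030, 0.200)
+)
+string <- data.frame(
+  term_id = c("GO:0007399", "GO:0006412", "GO:0048666"),
+  fdr     = c(0.004, 0.010, 0.020)
+)
+
+# Keep the terms that pass the FDR cut-off in each tool.
+cutoff <- 0.05
+gp_sig <- gprofiler$term_id[gprofiler$fdr <= cutoff]
+st_sig <- string$term_id[string$fdr <= cutoff]
+
+# Terms reported by both tools, and terms unique to each.
+shared         <- intersect(gp_sig, st_sig)
+only_gprofiler <- setdiff(gp_sig, st_sig)
+only_string    <- setdiff(st_sig, gp_sig)
+```
+
+Comparing identifiers rather than term names avoids mismatches caused by differences in how each tool spells or truncates descriptions.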
 
-## Reactome
-Reactome is an open-source database of curated biological pathways across species, offering pathway maps and enrichment tools to analyze gene lists in a pathway-focused context. It’s ideal for visualising data within established biochemical and cellular processes.
+<!-- ## FEA in [GenePattern](https://www.genepattern.org/#gsc.tab=0) -->
+## FEA in GenePattern <a href="https://www.genepattern.org/#gsc.tab=0" target="_blank"><img src="images/GenePattern-logo.png" alt="GenePattern Logo" style="height:35px; vertical-align:middle;"></a>
+GenePattern, an online platform developed by the Broad Institute, offers a suite of tools for analyzing and visualizing genomic data, making bioinformatics accessible to researchers through a user-friendly, no-programming interface. Among its supported tools is Gene Set Enrichment Analysis (GSEA), which implements [MSigDB GSEA](https://www.gsea-msigdb.org/gsea/index.jsp) analysis for identifying enriched gene sets in genomic data. 
+
+MSigDB (Molecular Signatures Database) is a collection of gene sets for Gene Set Enrichment Analysis, representing pathways and gene signatures linked to biological states or diseases. It helps identify enriched gene sets, aiding the analysis of gene expression changes and key pathways in experimental data.
+
+### Steps to Locate GSEA Module in GenePattern:
+
+- Click the Run button and then select Public Server
+
+![](images/GenePattern-Run.png)
+
+- Sign in to GenePattern or enter as a guest
+
+- Under the `Modules` tab, click `Browse Modules`
+
+- Find `gsea` on the Browse Modules by Category page and click GSEA
+
+![](images/Browse_Modules_gsea.png){ width=100% }
+
+### Steps to Perform GSEA:
+<!-- https://cloud.genepattern.org/gp/pages/index.jsf?jobid=613752&openVisualizers=true&openNewWindow=false -->
+
+1. Basic Parameters
+
+\- Create both `.gct` and `.cls` files following [this script in R](degust.html)
+
+\- Load the `.gct` input file in the `expression dataset` tab and the `.cls` file in the `phenotype labels` tab
+
+\- Select a `.gmt` file (Gene Matrix Transposed) from the `gene sets database` tab
+
+\- Set the number of permutations under the `number of permutations` tab
+
+\- Set the permutation type under the `permutation type` tab
+
+\- Select an appropriate DNA Chip annotation file from the `chip platform file` tab
+
+\- Name the output file in the `output file name` tab
+
+2. Advanced Parameters
+
+\- Scoring Scheme:
+
+  - K-S: The score increment is the same for all genes in *S* regardless of their ranking or correlation strength.
+ 
+  - Weighted: The score increment for each gene in *S* is weighted by its correlation with the phenotype, typically the absolute value of the correlation or ranking metric.
+
+\- Metric for ranking genes: The ranking metric of interest can be chosen from the drop-down menu. A detailed description of the metrics is given in the [GSEA-MSigDB Documentation](https://docs.gsea-msigdb.org/#GSEA/GSEA_User_Guide/#metrics-for-ranking-genes).
+
+  - Categorical Phenotypes: Signal-to-Noise Ratio, t-Test, Ratio of Classes, Log2 Ratio of Classes
+  
+  - Continuous Phenotypes: Pearson Correlation, Spearman Correlation
 
-## MSigDB GSEA
-MSigDB (Molecular Signatures Database) is a collection of gene sets for Gene Set Enrichment Analysis (GSEA), representing pathways and gene signatures linked to biological states or diseases. It helps identify enriched gene sets, aiding the analysis of gene expression changes and key pathways in experimental data.
+\- The minimum and maximum size of gene sets can be set using the `min gene set size` and `max gene set size` tabs
+<!-- <GeneSetName> <Description> <Gene1> <Gene2> <Gene3> ... -->
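+
+To make the two input formats from the Basic Parameters step concrete, here is a minimal sketch that writes a toy expression matrix as `.gct` and its sample labels as `.cls`. It is not the script linked above; the helper names `write_gct` and `write_cls`, the gene and sample names, and all values are hypothetical, and the layouts follow the GCT and CLS formats as documented by GenePattern.
+
+```r
+# Toy expression matrix: genes in rows, samples in columns (hypothetical values).
+expr <- matrix(
+  c(5.1, 4.8, 9.2, 9.5,
+    7.0, 7.3, 2.1, 2.4),
+  nrow = 2, byrow = TRUE,
+  dimnames = list(c("GENE_A", "GENE_B"), c("d1", "d2", "n1", "n2"))
+)
+phenotype <- c("Diff", "Diff", "Nodiff", "Nodiff")
+
+write_gct <- function(mat, path) {
+  con <- file(path, "w")
+  on.exit(close(con))
+  writeLines("#1.2", con)                                   # format version line
+  writeLines(paste(nrow(mat), ncol(mat), sep = "\t"), con)  # number of genes, samples
+  tab <- data.frame(NAME = rownames(mat), Description = "na", mat,
+                    check.names = FALSE)
+  write.table(tab, con, sep = "\t", quote = FALSE, row.names = FALSE)
+}
+
+write_cls <- function(labels, path) {
+  classes <- unique(labels)
+  writeLines(c(
+    paste(length(labels), length(classes), 1),   # samples, classes, constant 1
+    paste("#", paste(classes, collapse = " ")),  # class names in order of appearance
+    paste(labels, collapse = " ")                # one label per sample, in column order
+  ), path)
+}
+
+gct_file <- tempfile(fileext = ".gct")
+cls_file <- tempfile(fileext = ".cls")
+write_gct(expr, gct_file)
+write_cls(phenotype, cls_file)
+```
+
+The order of labels in the `.cls` file must match the order of sample columns in the `.gct` file, which is why both are derived from the same objects here.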
 
+#### Browse the GSEA results
+
+Once the job has been queued and has run successfully, the output will be listed in the left panel under the `Jobs` tab:
+
+```{r, echo=FALSE, fig.align = "center", fig.cap="Job status in GenePattern"} 
+#  out.width="50%",
+knitr::include_graphics("images/GenePattern-Jobs.png")
+```
+
+One of the most important files is the `.zip` archive, named earlier under the `output file name` tab in the Basic Parameters section, which contains all the results. The results can also be browsed through the individual files listed under the job ID.
+
+For the Pezzini experiment, two `html` reports are generated, one for each of the up- and down-regulated gene sets, named something like:
+
+  - gsea_report_for_Diff_1731388275794.html
+  
+  - gsea_report_for_Nodiff_1731388275794.html
+
+The tabulated versions of the results are given in `.tsv` format:
+
+  - gsea_report_for_Diff_1731388275794.tsv
+  
+  - gsea_report_for_Nodiff_1731388275794.tsv
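+
+The `.tsv` reports can be loaded into R for filtering. The sketch below first writes a small stand-in file so that it is self-contained; the abbreviated header (including the `NAME` column) and the second row are assumptions made for illustration, so check them against the real report before reusing the code.
+
+```r
+# Write a tiny stand-in for a GSEA report (second row is made up).
+report_file <- tempfile(fileext = ".tsv")
+writeLines(c(
+  "NAME\tSIZE\tES\tNES\tNOM p-val\tFDR q-val\tFWER p-val",
+  "REACTOME_FRS_MEDIATED_FGFR2_SIGNALING\t16\t0.839\t1.713\t0\t0.039\t0.648",
+  "TOY_GENE_SET_B\t40\t0.412\t1.102\t0.21\t0.480\t1"
+), report_file)
+
+# check.names = FALSE keeps headers such as "FDR q-val" unchanged.
+report <- read.delim(report_file, check.names = FALSE)
+
+# Keep gene sets below a chosen FDR threshold, strongest |NES| first.
+fdr_cutoff <- 0.05
+hits <- report[report[["FDR q-val"]] < fdr_cutoff, ]
+hits <- hits[order(-abs(hits$NES)), ]
+```
+
+To use it on real output, point `report_file` at one of the downloaded `gsea_report_for_*.tsv` files instead of the temporary file.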
+
+
+<!-- ```{css, echo=FALSE} -->
+<!-- table { -->
+<!--   width: 75%;                   -->
+<!--   margin: auto;                 -->
+<!--   border-collapse: collapse;    -->
+<!-- } -->
+
+<!-- th, td { -->
+<!--   padding: 5px;                 -->
+<!--   text-align: left; -->
+<!--   border: 1px solid #ddd;       -->
+<!-- } -->
+<!-- ``` -->
+
+The GSEA result tables have the following columns, shown here with the values for one gene set:
+
+```{r, echo=FALSE, message=FALSE}
+data <- data.frame(
+  Parameter = c("GS (follow link to MSigDB)", "GS DETAILS", "SIZE", "ES", "NES", "NOM p-val", "FDR q-val", "FWER p-val", "RANK AT MAX", "LEADING EDGE"),
+  Value = c(
+    # Gene set name, formatted in Markdown as a link to its MSigDB page
+    "[REACTOME_FRS_MEDIATED_FGFR2_SIGNALING](https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/REACTOME_FRS_MEDIATED_FGFR2_SIGNALING)", 
+    "Details ...",
+    "16", 
+    "0.83905387", 
+    "1.7128055", 
+    "0", 
+    "0.03902518", 
+    "0.648", 
+    "995", 
+    "tags=38%, list=7%, signal=40%"
+  )
+)
+
+# Display the data as a table
+knitr::kable(data, caption = "Summary of GSEA Results for REACTOME_FRS_MEDIATED_FGFR2_SIGNALING Gene Set")
+```
+
+The leading edge column reports three statistics:
+  
+  - tags: the percentage of gene hits before (for a positive ES) or after (for a negative ES) the peak in the running enrichment score; here, 38% of the genes in the gene set contribute to the enrichment result.
+  - list: the percentage of genes in the ranked gene list before (positive ES) or after (negative ES) the peak; here, the peak is reached within the first 7% of the ranked list.
+  - signal: the enrichment signal strength, which combines the two statistics above; here it is 40%.
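+
+The following is a simplified sketch of how a running enrichment score and two of the leading-edge figures can be derived for a ranked list. It illustrates the idea on toy data and is not GSEA's implementation: it performs no permutations or normalisation, it handles only a positive ES, and the function name `running_es` and all values are hypothetical. Here `tags_pct` is the share of gene-set members found up to the peak and `list_pct` is the share of the ranked list covered up to the peak.
+
+```r
+# Running enrichment score for a ranked gene list.
+# p = 0 gives equal steps for every hit (K-S style); p = 1 weights each
+# hit by the absolute value of its ranking metric.
+running_es <- function(metric, in_set, p = 1) {
+  ord    <- order(metric, decreasing = TRUE)
+  metric <- metric[ord]
+  in_set <- in_set[ord]
+  hit_w  <- ifelse(in_set, abs(metric)^p, 0)
+  step_hit  <- hit_w / sum(hit_w)          # increments at gene-set members
+  step_miss <- (!in_set) / sum(!in_set)    # decrements at all other genes
+  list(es = cumsum(step_hit - step_miss), in_set = in_set)
+}
+
+# Toy data: 10 ranked genes, 3 of them in the gene set (hypothetical).
+metric <- c(2.5, 2.0, 1.5, 1.0, 0.5, -0.5, -1.0, -1.5, -2.0, -2.5)
+in_set <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)
+
+res  <- running_es(metric, in_set, p = 1)
+peak <- which.max(abs(res$es))   # position of the maximum deviation from zero
+es   <- res$es[peak]
+
+# Leading-edge summaries for a positive ES: hits and list positions up to the peak.
+tags_pct <- 100 * sum(res$in_set[seq_len(peak)]) / sum(res$in_set)
+list_pct <- 100 * peak / length(metric)
+```
+
+Calling `running_es(metric, in_set, p = 0)` on the same data shows how the unweighted scheme changes the step sizes while the positions of the hits stay the same.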
+
+
+#### **Challenge:** How do different ranking metrics impact the output? {- .challenge}
+
+Run GSEA analysis using Hallmark gene sets with two metrics (SignaltoNoise and tTest). How do these differ in reporting enriched terms?
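+
+As background for the comparison, the sketch below computes both metrics per gene from their textbook definitions: signal-to-noise as the difference of class means over the sum of class standard deviations, and the t-test statistic as the same difference over the square root of the summed per-class variances divided by class size. It deliberately omits the minimum-variance adjustment that GSEA applies, so the numbers will not match GSEA output; the function name `rank_metrics` and the toy values are hypothetical.
+
+```r
+# Two class-comparison ranking metrics, computed per gene (rows of `expr`).
+# Assumption: no adjustment for very small standard deviations is applied.
+rank_metrics <- function(expr, is_a) {
+  a <- expr[, is_a, drop = FALSE]
+  b <- expr[, !is_a, drop = FALSE]
+  mu_a <- rowMeans(a)
+  mu_b <- rowMeans(b)
+  sd_a <- apply(a, 1, sd)
+  sd_b <- apply(b, 1, sd)
+  data.frame(
+    signal2noise = (mu_a - mu_b) / (sd_a + sd_b),
+    t_test       = (mu_a - mu_b) / sqrt(sd_a^2 / ncol(a) + sd_b^2 / ncol(b))
+  )
+}
+
+expr <- rbind(
+  GENE_A = c(9, 11, 13, 5.0, 5.0, 5.0),   # all variability sits in one class
+  GENE_B = c(10, 11, 12, 4.5, 5.5, 6.5)   # variability split evenly between classes
+)
+is_a <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
+
+m <- rank_metrics(expr, is_a)
+top_by_s2n <- rownames(m)[which.max(m$signal2noise)]
+top_by_t   <- rownames(m)[which.max(m$t_test)]
+```
+
+In this toy case the two metrics put a different gene at the top of the ranking, because they penalise unequal class variances differently; a changed ranking is what can change which gene sets are reported as enriched.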
+
+#### **Question** {- .rationale}
+
+Which gene set category (or categories) offers the most valuable insights for a cell differentiation experiment?
+
+
+## Reactome
+Reactome is an open-source database of curated biological pathways across species, offering pathway maps and enrichment tools to analyse gene lists in a pathway-focused context. It’s ideal for visualising data within established biochemical and cellular processes.
 
@@ -1,3 +1,14 @@
+/*
+table {
+  max-width: 500px;
+  border-spacing: 0px;
+  width: auto;
+}
+
+th, td {
+  padding: 2px 5px;
+}
+*/
 
 /* Target links inside paragraphs and lists (i.e., typical body text links) */
 p a, li a {