Description
This issue is part of the nf-core hackathon.
I've taken the challenge of connecting the gsea output files (as they come from the nf-core differentialabundance pipeline) to the shinyngs app so they appear as enrichment results.
There are a few minor changes that I need to implement, and I would like your feedback/blessing if possible.
Reading GSEA input files.
The GSEA input files that I get are split by direction, so I have two tsv files for each contrast analyzed.
The parsing code currently only accepts one csv file.
Lines 683 to 692 in d2ca950
if ("gene_set_analyses" %in% names(exp)) {
  ese_list$gene_set_analyses <- lapply(exp$gene_set_analyses, function(assay) {
    lapply(assay, function(gene_set_type) {
      lapply(gene_set_type, function(contrast) {
        read.csv(contrast, check.names = FALSE, stringsAsFactors = FALSE, row.names = 1)
      })
    })
  })
}
My suggested change is to allow contrast in the code fragment above to hold either one or two file paths. If it is of length one, we use the current code path. If it is of length two, it is expected to be a named list (or a named character vector) like:
contrast <- c("Up" = "/path/to/up.csv", "Down" = "/path/to/down.csv")
Both CSV files would get loaded, the Direction column would be added and the dataframes would be concatenated.
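A minimal sketch of how that loading step could look. The helper name `read_gene_set_table` and its `sep` argument are hypothetical and not part of shinyngs; the sketch also assumes both files share the same columns and that the names of `contrast` are the values wanted in the Direction column.

```r
# Hypothetical helper (not part of shinyngs): read one enrichment table, or
# two direction-specific tables that are combined into a single data frame.
read_gene_set_table <- function(contrast, sep = ",") {
  read_one <- function(path) {
    read.csv(path, sep = sep, check.names = FALSE, stringsAsFactors = FALSE, row.names = 1)
  }

  # Current behaviour: a single file per contrast
  if (length(contrast) == 1) {
    return(read_one(contrast[[1]]))
  }

  if (length(contrast) != 2 || is.null(names(contrast)) || any(names(contrast) == "")) {
    stop("Expected one path, or two paths named by direction (e.g. Up and Down)")
  }

  # Read each file, tag its rows with the direction name, then stack the tables
  tables <- lapply(names(contrast), function(direction) {
    tab <- read_one(contrast[[direction]])
    tab$Direction <- rep(direction, nrow(tab))
    tab
  })
  do.call(rbind, tables)
}

# Usage with two small temporary files standing in for the real GSEA output
up <- tempfile(fileext = ".csv")
down <- tempfile(fileext = ".csv")
write.csv(data.frame(`p value` = 0.01, FDR = 0.04, row.names = "SET_A", check.names = FALSE), up)
write.csv(data.frame(`p value` = 0.02, FDR = 0.05, row.names = "SET_B", check.names = FALSE), down)
combined <- read_gene_set_table(c(Up = up, Down = down))
```

For tab-separated input, the same call with `sep = "\t"` should work, since `read.csv` forwards that argument to `read.table`.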
Using/Parsing the GSEA tables
The app filters the enrichment tables depending on the p-value, the FDR and Direction columns.
However, it expects specific column names, so interoperability is limited. I believe you already faced this issue and used a quick fix here:
shinyngs/R/genesetanalysistable.R
Lines 230 to 245 in d2ca950
getGeneSetAnalysis <- reactive({
  validate(need(input$pval, "Waiting for p value"), need(input$fdr, "Waiting for FDR value"))
  ese <- getExperiment()
  assay <- getAssay()
  gene_set_types <- getGeneSetTypes()
  selected_contrasts <- getSelectedContrastNumbers()[[1]]
  gst <- ese@gene_set_analyses[[assay]][[gene_set_types]][[as.numeric(selected_contrasts)]]
  # Rename p value if we have PValue from mroast etc()
  colnames(gst) <- sub("PValue", "p value", colnames(gst))
  # Select out specific gene sets if they've been provided
I would like to be able to provide the column names for the p-value, FDR and Direction columns. This is similar to what is going on with --pval_column and --qval_column arguments but for enrichment tables instead of differential expression tables.
My suggestion to implement this is to add arguments and pass them along in the same way as the --pval_column argument in make_app_from_files.
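As a rough illustration of the idea, the renaming could be done by a small helper like the one below. Everything here is an assumption made for the sketch: the function name, its three arguments, and the target names. "p value" is taken from the `sub()` call above, while "FDR" and "Direction" are guesses at what the app filters on. The input column names in the usage example are only there to show the mapping.

```r
# Hypothetical helper (not part of shinyngs): rename user-specified columns of
# an enrichment table to the names the app is assumed to filter on.
standardise_gene_set_columns <- function(gst,
                                         pval_column = "p value",
                                         fdr_column = "FDR",
                                         direction_column = "Direction") {
  # names = assumed target column names, values = column names in the input table
  mapping <- c("p value" = pval_column, "FDR" = fdr_column, "Direction" = direction_column)

  missing_columns <- setdiff(mapping, colnames(gst))
  if (length(missing_columns) > 0) {
    stop("Columns not found in gene set table: ", paste(missing_columns, collapse = ", "))
  }

  colnames(gst)[match(mapping, colnames(gst))] <- names(mapping)
  gst
}

# Usage with a toy table whose p-value and FDR columns have different names
gsea <- data.frame(
  `NOM p-val` = c(0.001, 0.2),
  `FDR q-val` = c(0.01, 0.5),
  Direction = c("Up", "Down"),
  row.names = c("SET_A", "SET_B"),
  check.names = FALSE
)
renamed <- standardise_gene_set_columns(gsea, pval_column = "NOM p-val", fdr_column = "FDR q-val")
```

With the default arguments the table is returned unchanged, so existing inputs would keep working.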
With these changes, I can load GSEA tables in the app.
Roadmap
- Get these changes implemented and merged here
- Update the nf-core module for shinyngs
- Update the differentialabundance pipeline
Your feedback @pinin4fjords would be very welcome :-)
(Hi! @suzannejin and @SusiJo!)