Commit 2361091

added rummagene
1 parent 3f09262 commit 2361091

1 file changed: +10 -0 lines changed

src/pages/tools/Rummagene.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+---
+icon: https://cfde-drc.s3.us-east-2.amazonaws.com/assets/img/rummagene_logo.webp
+image: https://cfde-drc.s3.us-east-2.amazonaws.com/assets/img/rummagene-screenshot.png
+label: Rummagene
+layout: '@/layouts/Tools.astro'
+short_description: Access automatically extracted gene sets from supporting tables of PMC publications
+url: https://rummagene.com/
+doi: https://doi.org/10.1038/s42003-024-06177-7
+---
+Many biomedical research papers are published every day, and a portion of them contain supporting tables with data about genes, transcripts, variants, and proteins. For example, supporting tables may contain differentially expressed genes and proteins from transcriptomics and proteomics assays, targets of transcription factors from ChIP-seq experiments, hits from genome-wide CRISPR screens, or genes identified to harbor mutations from GWAS studies. Because these gene sets are commonly buried in the supplemental tables of research publications, they are not widely available for search and reuse. Rummagene is a web server application that provides access to hundreds of thousands of human and mouse gene sets extracted from supporting materials of publications listed on PubMed Central (PMC). To create Rummagene, we first developed a softbot that extracts human and mouse gene sets from supporting tables of PMC publications. So far, the softbot has scanned 6,859,227 PMC articles and found 164,319 articles that contain 878,345 gene sets. These gene sets are served for enrichment analysis, free-text search, and table-title search. Users of Rummagene can submit their own gene sets to find matching gene sets ranked by their overlap with the input gene set. In addition to providing the extracted gene sets for search, we investigated the massive corpus of these gene sets for statistical patterns. We show how Rummagene can be used for transcription factor and kinase enrichment analyses, for universal predictions of cell types for single-cell RNA-seq data, and for gene function predictions. Finally, by combining gene set similarity with abstract similarity, Rummagene can be used to find surprising relationships between unexpected biological processes, concepts, and named entities.
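
The added page text says that submitted gene sets are matched against stored gene sets and ranked by overlap. The sketch below illustrates that idea only. It is not Rummagene's implementation and does not call its API. The function names, the toy library, and the choice of overlap count with a Jaccard tie-break are all assumptions made for illustration. Upper-casing symbols is a simplification that merges human and mouse naming conventions.

```python
from typing import Dict, Iterable, List, Set, Tuple


def normalize(genes: Iterable[str]) -> Set[str]:
    """Strip and upper-case gene symbols (a simplifying assumption)."""
    return {g.strip().upper() for g in genes if g.strip()}


def rank_by_overlap(
    query: Iterable[str], library: Dict[str, Iterable[str]]
) -> List[Tuple[str, int, float]]:
    """Return (name, overlap size, Jaccard index) for each library set
    sharing at least one gene with the query, best match first."""
    q = normalize(query)
    results = []
    for name, genes in library.items():
        s = normalize(genes)
        overlap = len(q & s)
        if overlap == 0:
            continue  # skip sets with no shared genes
        jaccard = overlap / len(q | s)
        results.append((name, overlap, jaccard))
    # Sort by overlap size, then Jaccard index, then name for stable output.
    results.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return results


# Hypothetical toy library; real extracted gene sets would be much larger.
toy_library = {
    "table_A": ["STAT3", "JAK2", "IL6", "SOCS3"],
    "table_B": ["TP53", "MDM2", "CDKN1A"],
    "table_C": ["STAT3", "IL6", "TP53", "MYC", "EGFR", "KRAS"],
}

ranked = rank_by_overlap(["stat3", "il6", "tp53"], toy_library)
for name, overlap, jaccard in ranked:
    print(f"{name}\toverlap={overlap}\tjaccard={jaccard:.2f}")
```

A production enrichment tool would typically also report a significance statistic against a background gene universe; that step is left out here to keep the example short.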
