DE-LIMP: Differential Expression & Limpa Proteomics

Find which proteins are significantly different between your experimental conditions -- upload a DIA-NN output file and get interactive volcano plots, heatmaps, pathway enrichment, and AI-powered interpretation, all without writing code.

Built on R Shiny with the limpa pipeline for normalization and protein quantification, and limma for statistical testing with FDR correction. See USER_GUIDE.md for methodology details.

Input: DIA-NN report.parquet | Not for: DDA data, TMT/iTRAQ, Spectronaut/MaxQuant output

Not sure if your data is DIA? If your core facility used DIA-NN to process your samples, you have DIA data. Look for a report.parquet file in your results folder. If your data was processed with MaxQuant, Spectronaut, or Proteome Discoverer, or if you used isobaric labels (TMT, iTRAQ), DE-LIMP is not the right tool.

Try it now: huggingface.co/spaces/brettsp/de-limp-proteomics -- no installation required

Project Website: bsphinney.github.io/DE-LIMP | Docs: USER_GUIDE.md | CLAUDE.md

What's New in v3.7.0

NCBI Proteome Download -- Search and download RefSeq protein FASTA databases from NCBI Datasets, with automatic gene symbol mapping via E-utilities. Supports all organisms with NCBI reference proteomes, complementing the existing UniProt download for non-model organisms.

Contaminant Analysis -- New subtab in Data Overview with summary cards (contaminant count, % of total, median intensity ratio, keratin count), per-sample stacked bar chart, top contaminants table with keratin flagging, and contaminant heatmap. Signal Distribution and Expression Grid also highlight contaminants.

Data Explorer -- Quartile-based abundance profiles and sample-sample scatter plots for exploring data without requiring DE analysis. Variable proteins that shift 2+ quartiles across samples are flagged. Works with no-replicates mode.
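
As a rough illustration of the quartile-shift flag described above, the sketch below bins each protein into an abundance quartile per sample and flags proteins whose quartile changes by two or more across samples. The matrix values, the per-sample ranking scheme, and the helper name quartile_of are assumptions made for illustration; DE-LIMP's own implementation may define quartiles differently.

```r
# Hypothetical log2 abundance matrix: rows = proteins, columns = samples
log2_expr <- cbind(
  S1 = c(30, 28, 26, 24, 22, 20, 18, 16),
  S2 = c(30.1, 27.9, 26.2, 24.1, 21.8, 20.2, 18.1, 15.9),
  S3 = c(30, 17, 26, 24, 22, 20, 18, 16)
)
rownames(log2_expr) <- paste0("P", 1:8)

# Assign each protein to an abundance quartile (1 = lowest, 4 = highest)
# within a single sample, based on its rank in that sample
quartile_of <- function(x) ceiling(4 * rank(x) / length(x))

quartiles <- apply(log2_expr, 2, quartile_of)
rownames(quartiles) <- rownames(log2_expr)

# A protein is "variable" if its quartile spans 2 or more steps across samples
quartile_shift <- apply(quartiles, 1, function(q) max(q) - min(q))
flagged <- names(quartile_shift)[quartile_shift >= 2]

print(quartiles)
print(flagged)
```

Because this only compares ranks within each sample, it needs no replicates or group labels, which is consistent with the feature working in no-replicates mode.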

SSH File Browser -- Visual directory browser for remote HPC navigation. Clickable breadcrumbs, color-coded entries, file type filtering. Replaces manual path entry for raw data and FASTA directories.

Load from HPC -- One-click button to download and analyze completed search results from the cluster via the SSH file browser.

Docker Launcher for Windows -- One-click batch file (Launch_DE-LIMP_Docker.bat) handles SSH key detection, shared PC accounts, container startup, and browser launch. Docker + SSH to HPC is now the recommended Windows deployment.

No-Replicates Mode -- Quantification completes normally with n=1 per group (normalization, protein aggregation, PCA, Expression Grid). DE analysis is gracefully skipped with an informational message.

SSH Auto-Connect & Environment Badge -- Auto-connects to HPC on startup when an SSH key is detected. Colored navbar badge shows deployment mode (Docker/HPC/Local/HF).

Previous highlights: v3.5: Run Comparator, Search & Analysis History, Chromatography QC, smart HPC partitions. v3.1: UI overhaul, Core Facility Mode. v3.0: MOFA2, Docker search, phosphoproteomics, GSEA.

See CHANGELOG.md for full release history.

Key Features

Analysis & Visualization

Volcano Plots -- Interactive (Plotly), click or box-select proteins to highlight across all views; all pairwise contrasts available
Heatmaps -- Z-score heatmaps of selected or significant proteins (ComplexHeatmap)
Contaminant Analysis -- Summary cards, per-sample stacked bar chart, top contaminants table with keratin flagging, and contaminant heatmap; Signal Distribution and Expression Grid also highlight contaminants
Data Explorer -- Quartile-based abundance profiles and sample-sample scatter plots for exploring data without DE analysis
QC Sample Metrics -- Faceted trend plot (Precursors, Proteins, MS1 Signal, Data Completeness) with LOESS smoother for drift detection and group average lines
MDS & DPC Plots -- Sample clustering and normalization diagnostics
Covariates -- Include batch, sex, diet, or custom covariates in the linear model
XIC Chromatogram Viewer -- Fragment-level chromatogram validation, MS2 intensity alignment (Spectronaut-style), ion mobility/mobilogram support for timsTOF, DIA-NN v1/v2 formats (local/HPC only)
CV Analysis (Robust Changes) -- Identify highly reproducible DE proteins via coefficient of variation analysis across replicates
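
To make the CV Analysis idea concrete, here is a minimal sketch of a per-protein coefficient of variation across replicates. It assumes the intensities are stored on the log2 scale and converts them back to the linear scale before computing sd / mean. The example values and the 20% cutoff are illustrative assumptions, not DE-LIMP's actual defaults.

```r
# Hypothetical log2 protein intensities: rows = proteins, columns = replicates
log2_expr <- rbind(
  P1 = c(20.0, 20.1, 19.9),
  P2 = c(18.0, 19.5, 16.8)
)

# CV (%) on the linear scale: sd / mean of the back-transformed intensities
cv_percent <- function(log2_values) {
  linear <- 2^log2_values
  100 * sd(linear) / mean(linear)
}

cv <- apply(log2_expr, 1, cv_percent)

# Assumed cutoff for illustration only
reproducible <- names(cv)[cv < 20]

print(round(cv, 1))
print(reproducible)
```

In practice the CV would be computed within each experimental group, so that a genuine difference between groups is not mistaken for poor reproducibility.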

Phosphoproteomics

Auto-detection of phospho-enriched data on upload (scans for UniMod:21 in Modified.Sequence)
Phosphosite-level DE via limma (independent from protein-level analysis); supports DIA-NN site_matrix_*.parquet or parsed from report.parquet
KSEA (Kinase-Substrate Enrichment Analysis) -- infer upstream kinase activity from phosphosite fold-changes using PhosphoSitePlus + NetworKIN databases
Motif analysis -- sequence logos (ggseqlogo) of flanking residues around regulated phosphosites
Abundance correction -- subtract protein-level logFC from site logFC to isolate phosphorylation stoichiometry changes
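
The detection and abundance-correction steps above can be sketched in a few lines of base R. The sequences, site identifiers, and logFC values below are hypothetical and only illustrate the arithmetic; they are not taken from DE-LIMP's code.

```r
# Hypothetical Modified.Sequence values as they might appear in a DIA-NN report
modified_sequence <- c("AAS(UniMod:21)PEPTIDEK", "PEPTIDER", "M(UniMod:35)PEPTIDEK")

# Flag precursors that carry a phospho modification (UniMod:21)
is_phospho <- grepl("UniMod:21", modified_sequence, fixed = TRUE)
phospho_fraction <- mean(is_phospho)

# Hypothetical site-level and protein-level DE results
site_de <- data.frame(
  site = c("PROT1_S15", "PROT2_T88"),
  protein = c("PROT1", "PROT2"),
  logFC = c(2.0, 1.0),
  stringsAsFactors = FALSE
)
protein_logfc <- c(PROT1 = 1.5, PROT2 = -0.5)

# Abundance correction: site logFC minus the parent protein's logFC
site_de$corrected_logFC <- site_de$logFC - unname(protein_logfc[site_de$protein])

print(phospho_fraction)
print(site_de)
```

A site whose corrected logFC stays large after subtraction is changing beyond what the change in protein abundance alone would explain.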

Gene Set Enrichment & Multi-Omics

GSEA -- GO (BP/MF/CC) and KEGG pathways via clusterProfiler; per-ontology caching; automatic organism detection (12 species via UniProt REST API or protein ID suffix)
MOFA2 (Multi-Omics Factor Analysis) -- unsupervised integration of 2-6 data views (e.g., proteomics + phosphoproteomics + transcriptomics). Import from RDS, CSV, TSV, or Parquet. Variance explained heatmap, factor weights, sample scores, Factor-DE correlation. Built-in example datasets (Mouse Brain, TCGA Breast Cancer)

AI-Powered Analysis (Google Gemini)

Requires a free Gemini API key. Get one at Google AI Studio and paste it into the DE-LIMP sidebar.

AI Summary -- Analyzes all contrasts simultaneously, identifying top DE proteins per comparison, cross-comparison biomarkers, and CV-based stability metrics. Data sent to Gemini: AI Summary sends only summary statistics (protein names, logFC, adj.P.Val), whereas Data Chat sends per-sample expression data for top DE proteins to enable interactive Q&A
Export for Claude -- Download your complete analysis as a .zip optimized for deep analysis with Claude, ChatGPT, or other AI assistants (includes DE results, expression matrix, QC metrics, GSEA, methods text, and more)
AI Summary HTML Export -- Styled standalone HTML report with gradient header and markdown formatting, suitable for sharing with collaborators
Interactive Data Chat -- Conversational interface with Google Gemini, auto-injecting QC stats and 100-800 top DE proteins as context. Phospho context (top 20 sites + KSEA kinase results) auto-included when phospho analysis is active
Interactive AI + plot connection -- Select proteins in volcano/table to set AI context; AI can highlight proteins in plots via [[SELECT: protein1; protein2]] syntax
Auto-Analyze button for one-click dataset analysis; Save Chat to download conversation as plain text
Auto-generated methodology text for methods sections
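
For readers curious how a reply containing the [[SELECT: protein1; protein2]] syntax could be turned into a list of proteins to highlight, here is a minimal parsing sketch. The function name parse_select and the example reply are hypothetical; this is not DE-LIMP's actual parser.

```r
# Extract protein identifiers from every [[SELECT: ...]] directive in a reply
parse_select <- function(reply) {
  # Non-greedy match so that multiple directives are captured separately
  hits <- regmatches(
    reply,
    gregexpr("\\[\\[SELECT:.*?\\]\\]", reply, perl = TRUE)
  )[[1]]
  if (length(hits) == 0) return(character(0))

  # Drop the leading "[[SELECT:" (9 characters) and the trailing "]]"
  inner <- substring(hits, 10, nchar(hits) - 2)

  # Split on semicolons, trim whitespace, and drop empties and duplicates
  ids <- trimws(unlist(strsplit(inner, ";", fixed = TRUE)))
  unique(ids[nzchar(ids)])
}

reply <- "Two candidates stand out: [[SELECT: TP53; EGFR]]. Also see [[SELECT:MYC]]."
print(parse_select(reply))
```

Replies without a directive return an empty character vector, so a caller can skip the highlighting step without special-casing.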

Run Comparator

Cross-tool comparison -- Compare your DE-LIMP analysis against a second DE-LIMP run, Spectronaut export, or FragPipe output to understand how tool choice affects your results
4 diagnostic layers -- Settings Diff (parameter-by-parameter comparison), Protein Universe (overlap analysis), Quantification (log2 intensity correlation, per-sample concordance, systematic bias detection), DE Concordance (3x3 Up/Down/NS matrix, volcano overlay, discordant protein table)
7-rule hypothesis engine -- For each discordant protein, assigns a tool-aware hypothesis explaining why the tools disagree (direction reversal, normalization offset, variance estimation, missing values, peptide count, FC magnitude, or borderline significance)
Optional DIA-NN log upload -- Enrich Mode A comparisons with search-derived parameters (pg-level quantification, proteoforms, library precursor counts, pipeline step)
Optional MOFA2 decomposition -- Treats the two runs as views and decomposes joint variance to find hidden patterns among discordant proteins
AI integration -- Tool-aware Gemini prompt and Claude ZIP export for deeper analysis
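
The 3x3 Up/Down/NS matrix in the DE Concordance layer can be pictured with a small sketch. It assumes each run provides a logFC and an adjusted p-value per protein. The data frames, the column names, and the 0.05 significance threshold are illustrative assumptions.

```r
# Classify each protein as Up, Down, or NS (not significant)
classify <- function(logFC, adj_p, alpha = 0.05) {
  status <- ifelse(adj_p >= alpha, "NS", ifelse(logFC > 0, "Up", "Down"))
  factor(status, levels = c("Up", "Down", "NS"))
}

# Hypothetical DE results from two runs of the same samples
run_a <- data.frame(
  protein = c("P1", "P2", "P3", "P4"),
  logFC = c(1.2, -0.8, 0.3, 2.0),
  adj_p = c(0.01, 0.02, 0.40, 0.001),
  stringsAsFactors = FALSE
)
run_b <- data.frame(
  protein = c("P1", "P2", "P3", "P4"),
  logFC = c(1.0, 0.9, 0.2, 0.1),
  adj_p = c(0.03, 0.04, 0.60, 0.50),
  stringsAsFactors = FALSE
)

# Keep only proteins present in both runs
merged <- merge(run_a, run_b, by = "protein", suffixes = c("_a", "_b"))
merged$status_a <- classify(merged$logFC_a, merged$adj_p_a)
merged$status_b <- classify(merged$logFC_b, merged$adj_p_b)

# 3x3 concordance table: rows = run A call, columns = run B call
concordance <- table(A = merged$status_a, B = merged$status_b)

# Proteins on which the two runs disagree
discordant <- merged$protein[
  as.character(merged$status_a) != as.character(merged$status_b)
]

print(concordance)
print(discordant)
```

Off-diagonal cells are the discordant proteins; in this toy data P2 reverses direction and P4 is significant in only one run.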

Chromatography QC

Pre-search quality check -- Extract TIC traces from timsTOF .d files before committing to hours-long DIA-NN searches
Three views -- Faceted panels (per-run with median overlay), Overlay (all runs normalized 0-1 on one axis), Metrics (AUC bar chart + diagnostics table)
Automated diagnostics -- Shape deviation (Pearson r vs median trace), RT shift, loading anomaly (AUC outlier), file size outlier, late elution, elevated baseline, narrow gradient
SSH support -- SCP downloads analysis.tdf from remote .d directories, extracts locally
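
As an illustration of the shape-deviation diagnostic (Pearson r vs median trace), the sketch below scales simulated TIC traces to 0-1, builds a median trace, and correlates each run against it. The simulated traces and the 0.9 cutoff are assumptions made for illustration; DE-LIMP's actual thresholds are not stated here.

```r
# Simulated TIC traces on a shared retention-time grid (hypothetical data)
rt <- seq(0, 1, length.out = 50)
peak <- function(center) exp(-((rt - center)^2) / 0.02)

tic <- cbind(
  run1 = 1.0e9 * peak(0.50),
  run2 = 1.2e9 * peak(0.50),
  run3 = 0.9e9 * peak(0.50),
  run4 = 1.0e9 * peak(0.80)  # elutes late, so its shape differs from the rest
)

# Scale every run to the 0-1 range so that loading differences do not matter
scaled <- apply(tic, 2, function(x) (x - min(x)) / (max(x) - min(x)))

# Median trace across runs at each retention-time point
median_trace <- apply(scaled, 1, median)

# Pearson correlation of each run against the median trace
shape_r <- apply(scaled, 2, function(x) cor(x, median_trace))

# Assumed cutoff for illustration only
flagged <- names(shape_r)[shape_r < 0.9]

print(round(shape_r, 3))
print(flagged)
```

Scaling first means run2, which has 20% more signal but the same shape, is not flagged; a loading difference like that would be caught by the separate AUC outlier check.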

DIA-NN Search Integration

Three backends -- Local, Docker, and HPC (SSH/SLURM)
Parallel 5-step SLURM pipeline -- Optimized search with dependency chaining and array jobs for maximum HPC throughput
SSH file browser -- Visual directory browser for navigating remote HPC filesystems with clickable breadcrumbs, color-coded entries, and file type filtering
SSH auto-connect -- Automatically connects to HPC on startup when an SSH key is detected; environment badge shows deployment mode
UniProt FASTA download -- Search and download proteome databases directly; 6 bundled contaminant libraries
NCBI proteome download -- Download RefSeq protein FASTA from NCBI Datasets with automatic gene symbol mapping for non-model organisms
Load from HPC -- One-click button to browse, download, and analyze completed search results from the cluster
Spectral library caching -- Reuse predicted libraries across searches to save compute time
Custom FASTA sequences -- Add custom protein sequences inline when submitting searches
Smart partition selection -- Detects per-user SLURM CPU limits, auto-switches to public queue when at capacity
FASTA database library -- Shared catalog with auto-upload to HPC, fragment m/z range tracking, path validation
Cluster resource indicator -- Real-time HPC CPU usage monitoring with traffic-light display (green/yellow/red)
Windows Docker launcher -- One-click .bat file runs DE-LIMP + DIA-NN with zero R installation, shared PC support (see WINDOWS_DOCKER_INSTALL.md)
Non-blocking job queue -- Submit multiple searches, results auto-load on completion
Phospho mode -- Auto-configures DIA-NN for phospho analysis (STY modification, --phospho-output)
Organized search logs -- SLURM .out/.err and local .log files written to {output_dir}/logs/

DIA-NN License: DIA-NN is developed by Vadim Demichev and is free for academic/non-commercial use. It is not open source and cannot be redistributed. DE-LIMP does not bundle DIA-NN. See the DIA-NN license.

Core Facility Mode (Optional)

Staff YAML profiles auto-fill SSH, SLURM, and instrument settings
SQLite job tracking with searchable history (6 filters), one-click result loading and report generation
Instrument QC dashboard with protein/precursor/TIC trends and control lines
Quarto HTML reports with QC bracket, volcano plots, DE stats, and top proteins

Activated by setting DELIMP_CORE_DIR. Not visible on standard installations.

Session Management & History

Unified activity log -- Single audit trail for all DIA-NN searches and pipeline runs, with remote activity log via SSH for multi-user visibility
Search History -- Full audit trail for every DIA-NN search (26 parameters). Import Settings to reuse parameters; Import Results to load completed search output directly. View Log shows search metadata. Cross-reference links to Analysis History.
Analysis History & Projects -- Track every pipeline run with expandable detail rows. Assign analyses to projects for organized grouping with summary cards.
About tab -- Community stats dashboard with GitHub stars, forks, visitors, and clones (14-day trend sparklines), GitHub Discussions feed, version info, and project links
No-replicates mode -- Quantification without DE for n=1 experiments; PCA, Expression Grid, and Data Explorer still available
Save/load full analysis state as .rds; export reproducibility R code log
One-click example data (Affinisep vs Evosep comparison)
Group assignment templates (CSV export/import)
Embedded proteomics resources, UC Davis Proteomics videos, short course links

Which Installation Should I Use?

Platform	Method	DIA-NN Search?	Guide
Any (just exploring)	Web browser	No	Hugging Face
Windows	Docker + SSH to HPC	Yes (via HPC)	WINDOWS_DOCKER_INSTALL.md
Mac / Linux	R/RStudio (native)	Via HPC or Docker	See Installation below
HPC cluster	Apptainer/Singularity	Via SLURM	HPC_DEPLOYMENT.md

Installation

Requirements: R 4.5+ (for limpa), Bioconductor 3.22+ (auto-configured with R 4.5+)

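# In a terminal: fetch the source
git clone https://github.com/bsphinney/DE-LIMP.git
cd DE-LIMP

# In an R session started from the DE-LIMP directory: launch the app
shiny::runApp('.', port=3838, launch.browser=TRUE)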

All dependencies install automatically on first run:

# Core: shiny, bslib, plotly, DT, rhandsontable, shinyjs
# Data: dplyr, tidyr, stringr, readr, arrow
# Stats: limpa, limma, ComplexHeatmap, clusterProfiler
#        org.Hs.eg.db, org.Mm.eg.db, AnnotationDbi
#        KSEAapp, ggseqlogo, MOFA2, basilisk, callr
# Viz:  ggplot2, ggrepel, ggridges, enrichplot
# AI:   httr2, curl

Usage

Load Data -- Upload a DIA-NN report.parquet output file, or click "Load Example Data" for a demo HeLa dataset
Assign Groups & Run -- Auto-guess groups from filenames or manually assign; optionally add covariates (batch, etc.); click "Run Pipeline" to execute DPC-CN normalization, DPC-Quant protein quantification, and limma DE
Explore Results -- Data Overview, QC, DE Dashboard (Volcano/Table/PCA/CV Analysis), Phospho, GSEA, MOFA2, AI Analysis, XIC Viewer (local/HPC)
Export -- Download reproducibility log (.R), save session (.rds), export tables and plots

Methodology

Step	Method
Normalization	DPC-CN (Detection Probability Curve, complete-normal model) via `limpa::dpcCN()`
Quantification	DPC-Quant (Detection Probability Curve Quantification): precursor-to-protein rollup via probabilistic missing-value modelling, via `limpa::dpcQuant()`
DE model	Linear model fit via `limpa::dpcDE()` + `limma::contrasts.fit()`
Moderation	Empirical Bayes moderated t-statistics via `limma::eBayes()`
FDR	Benjamini-Hochberg adjusted p-values
Phospho DE	Same limma pipeline at the phosphosite level (independent from protein-level)
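
The FDR row can be illustrated with base R alone. The sketch below applies the Benjamini-Hochberg adjustment with stats::p.adjust() and reproduces it by hand to show what the adjustment does. The p-values are made up for illustration and the 0.05 threshold is an assumption.

```r
# Hypothetical raw p-values for five proteins
p_values <- c(0.001, 0.008, 0.039, 0.041, 0.60)

# Benjamini-Hochberg adjustment using base R
adj <- p.adjust(p_values, method = "BH")

# The same adjustment by hand: scale each p-value by n / rank, then enforce
# monotonicity by taking a running minimum from the largest p-value down
n <- length(p_values)
o <- order(p_values, decreasing = TRUE)
manual <- numeric(n)
manual[o] <- pmin(1, cummin(n / (n:1) * p_values[o]))

stopifnot(isTRUE(all.equal(adj, manual)))

# Assumed significance threshold for illustration
significant <- adj < 0.05

print(adj)          # 0.00500 0.02000 0.05125 0.05125 0.60000
print(significant)  # TRUE TRUE FALSE FALSE FALSE
```

Note that the raw p-values 0.039 and 0.041 fall below 0.05 but are no longer significant after adjustment, which is why the adjusted column is the one to filter on.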

Key Citations:

limpa -- Bioconductor package for DIA proteomics (bioconductor.org/packages/limpa)
limma -- Ritchie ME et al. (2015) Nucleic Acids Res 43(7):e47 (doi:10.1093/nar/gkv007)
DIA-NN -- Demichev V et al. (2020) Nat Methods 17:41-44 (doi:10.1038/s41592-019-0638-x)
MOFA2 -- Argelaguet R et al. (2020) Genome Biol 21:111 (doi:10.1186/s13059-020-02015-1)
KSEA -- Wiredja DD et al. (2017) Bioinformatics 33:3489-3491; Casado P et al. (2013) Sci Signaling 6:rs6
clusterProfiler -- Wu T et al. (2021) Innovation 2(3):100141

Resources

Project Website: bsphinney.github.io/DE-LIMP
Discussions: github.com/bsphinney/DE-LIMP/discussions -- Q&A, feature ideas, and announcements
Video Tutorials: UC Davis Proteomics YouTube
Training: Hands-On Proteomics Short Course
Core Facility: proteomics.ucdavis.edu

License

This project is open source. See repository for license details.

Contributing

Issues, pull requests, and Discussions welcome! See CLAUDE.md for development documentation.

Developer: Brett Phinney, UC Davis Proteomics Core Facility | Contact: GitHub Issues

Example Data

Demo dataset: Affinisep vs Evosep SPE column comparison using 50 ng Thermo HeLa protein digest standard (DIA, Orbitrap). Available at github.com/bsphinney/DE-LIMP/releases.
