Find which proteins are significantly different between your experimental conditions -- upload a DIA-NN output file and get interactive volcano plots, heatmaps, pathway enrichment, and AI-powered interpretation, all without writing code.
Built on R Shiny with the limpa pipeline for normalization and protein quantification, and limma for statistical testing with FDR correction. See USER_GUIDE.md for methodology details.
Input: DIA-NN report.parquet | Not for: DDA data, TMT/iTRAQ, Spectronaut/MaxQuant output
Not sure if your data is DIA? If your core facility used DIA-NN to process your samples, you have DIA data. Look for a `report.parquet` file in your results folder. If your data was processed with MaxQuant, Spectronaut, or Proteome Discoverer, or if you used isobaric labels (TMT, iTRAQ), DE-LIMP is not the right tool.
Try it now: huggingface.co/spaces/brettsp/de-limp-proteomics -- no installation required
Project Website: bsphinney.github.io/DE-LIMP | Docs: USER_GUIDE.md | CLAUDE.md
NCBI Proteome Download -- Search and download RefSeq protein FASTA databases from NCBI Datasets, with automatic gene symbol mapping via E-utilities. Supports all organisms with NCBI reference proteomes, complementing the existing UniProt download for non-model organisms.
Contaminant Analysis -- New subtab in Data Overview with summary cards (contaminant count, % of total, median intensity ratio, keratin count), per-sample stacked bar chart, top contaminants table with keratin flagging, and contaminant heatmap. Signal Distribution and Expression Grid also highlight contaminants.
Data Explorer -- Quartile-based abundance profiles and sample-sample scatter plots for exploring data without requiring DE analysis. Variable proteins that shift 2+ quartiles across samples are flagged. Works with no-replicates mode.
SSH File Browser -- Visual directory browser for remote HPC navigation. Clickable breadcrumbs, color-coded entries, file type filtering. Replaces manual path entry for raw data and FASTA directories.
Load from HPC -- One-click button to download and analyze completed search results from the cluster via the SSH file browser.
Docker Launcher for Windows -- One-click batch file (Launch_DE-LIMP_Docker.bat) handles SSH key detection, shared PC accounts, container startup, and browser launch. Docker + SSH to HPC is now the recommended Windows deployment.
No-Replicates Mode -- Quantification completes normally with n=1 per group (normalization, protein aggregation, PCA, Expression Grid). DE analysis is gracefully skipped with an informational message.
SSH Auto-Connect & Environment Badge -- Auto-connects to HPC on startup when an SSH key is detected. Colored navbar badge shows deployment mode (Docker/HPC/Local/HF).
Previous highlights: v3.5 Run Comparator, Search & Analysis History, Chromatography QC, smart HPC partitions. v3.1 UI overhaul, Core Facility Mode. v3.0 MOFA2, Docker search, phosphoproteomics, GSEA.
See CHANGELOG.md for full release history.
- Volcano Plots -- Interactive (Plotly), click or box-select proteins to highlight across all views; all pairwise contrasts available
- Heatmaps -- Z-score heatmaps of selected or significant proteins (ComplexHeatmap)
- Contaminant Analysis -- Summary cards, per-sample stacked bar chart, top contaminants table with keratin flagging, and contaminant heatmap; Signal Distribution and Expression Grid also highlight contaminants
- Data Explorer -- Quartile-based abundance profiles and sample-sample scatter plots for exploring data without DE analysis
- QC Sample Metrics -- Faceted trend plot (Precursors, Proteins, MS1 Signal, Data Completeness) with LOESS smoother for drift detection and group average lines
- MDS & DPC Plots -- Sample clustering and normalization diagnostics
- Covariates -- Include batch, sex, diet, or custom covariates in the linear model
- XIC Chromatogram Viewer -- Fragment-level chromatogram validation, MS2 intensity alignment (Spectronaut-style), ion mobility/mobilogram support for timsTOF, DIA-NN v1/v2 formats (local/HPC only)
- CV Analysis (Robust Changes) -- Identify highly reproducible DE proteins via coefficient of variation analysis across replicates
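The idea behind the CV analysis can be illustrated with a minimal Python sketch. It is not DE-LIMP's R code: the protein names, the intensities, the use of linear-scale values, and the 20% cutoff are all assumptions made for illustration.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation (sample SD / mean), as a percentage."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical linear-scale intensities for the replicates of one group.
replicates = {
    "PROT_A": [1000.0, 1050.0, 980.0],  # tight replicates
    "PROT_B": [500.0, 900.0, 200.0],    # noisy replicates
}

CV_CUTOFF = 20.0  # illustrative threshold, not a DE-LIMP default

cvs = {protein: cv_percent(vals) for protein, vals in replicates.items()}
robust = [protein for protein, cv in cvs.items() if cv <= CV_CUTOFF]
print(robust)
```

A protein that is both significant in the DE test and below the CV cutoff in every group is the kind of candidate this view is meant to surface.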
- Auto-detection of phospho-enriched data on upload (scans for UniMod:21 in Modified.Sequence)
- Phosphosite-level DE via limma (independent from protein-level analysis); supports DIA-NN `site_matrix_*.parquet` or parsed from `report.parquet`
- KSEA (Kinase-Substrate Enrichment Analysis) -- infer upstream kinase activity from phosphosite fold-changes using PhosphoSitePlus + NetworKIN databases
- Motif analysis -- sequence logos (ggseqlogo) of flanking residues around regulated phosphosites
- Abundance correction -- subtract protein-level logFC from site logFC to isolate phosphorylation stoichiometry changes
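The abundance correction in the last bullet is a per-site subtraction on the log scale. A minimal Python sketch, assuming hypothetical site identifiers and a simple site-to-protein mapping (the function and variable names are illustrative, not DE-LIMP's):

```python
def abundance_corrected_logfc(site_logfc, protein_logfc):
    """Subtract the parent protein's logFC from each phosphosite's logFC.

    site_logfc:    {site_id: (protein_id, site logFC)}
    protein_logfc: {protein_id: protein logFC}
    Sites whose parent protein was not quantified are skipped.
    """
    corrected = {}
    for site, (protein, lfc) in site_logfc.items():
        if protein in protein_logfc:
            corrected[site] = lfc - protein_logfc[protein]
    return corrected

# Hypothetical values for illustration only.
sites = {
    "MAPK1_T185": ("MAPK1", 1.5),
    "AKT1_S473": ("AKT1", 2.0),
    "ORPHAN_S10": ("ORPHAN", 0.7),  # no protein-level estimate available
}
proteins = {"MAPK1": 0.5, "AKT1": 2.0}

result = abundance_corrected_logfc(sites, proteins)
print(result)
```

In this example the AKT1 site change is fully explained by the protein-level change (corrected logFC of 0), while the MAPK1 site retains a residual change of 1.0 that points to altered phosphorylation rather than altered abundance.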
- GSEA -- GO (BP/MF/CC) and KEGG pathways via clusterProfiler; per-ontology caching; automatic organism detection (12 species via UniProt REST API or protein ID suffix)
- MOFA2 (Multi-Omics Factor Analysis) -- unsupervised integration of 2-6 data views (e.g., proteomics + phosphoproteomics + transcriptomics). Import from RDS, CSV, TSV, or Parquet. Variance explained heatmap, factor weights, sample scores, Factor-DE correlation. Built-in example datasets (Mouse Brain, TCGA Breast Cancer)
Requires a free Gemini API key. Get one at Google AI Studio and paste it into the DE-LIMP sidebar.
- AI Summary -- Analyzes all contrasts simultaneously, identifying top DE proteins per comparison, cross-comparison biomarkers, and CV-based stability metrics. AI Summary sends only summary statistics (protein names, logFC, adj.P.Val); Data Chat sends per-sample expression data for top DE proteins to enable interactive Q&A
- Export for Claude -- Download your complete analysis as a .zip optimized for deep analysis with Claude, ChatGPT, or other AI assistants (includes DE results, expression matrix, QC metrics, GSEA, methods text, and more)
- AI Summary HTML Export -- Styled standalone HTML report with gradient header and markdown formatting, suitable for sharing with collaborators
- Interactive Data Chat -- Conversational interface with Google Gemini, auto-injecting QC stats and 100-800 top DE proteins as context. Phospho context (top 20 sites + KSEA kinase results) auto-included when phospho analysis is active
- Interactive AI + plot connection -- Select proteins in volcano/table to set AI context; AI can highlight proteins in plots via `[[SELECT: protein1; protein2]]` syntax
- Auto-Analyze button for one-click dataset analysis; Save Chat to download conversation as plain text
- Auto-generated methodology text for methods sections
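To show how the `[[SELECT: protein1; protein2]]` syntax can be read out of a chat reply, here is a small Python sketch. It is a hypothetical parser written for illustration; how DE-LIMP itself parses the tag is not documented here.

```python
import re

# Matches [[SELECT: ...]] and captures the semicolon-separated body.
SELECT_TAG = re.compile(r"\[\[SELECT:\s*(.*?)\]\]")

def parse_selected_proteins(reply):
    """Return protein names from every [[SELECT: ...]] tag, in order."""
    names = []
    for body in SELECT_TAG.findall(reply):
        names.extend(part.strip() for part in body.split(";") if part.strip())
    return names

reply = "The strongest hits are highlighted: [[SELECT: TP53; EGFR]] in the volcano plot."
selected = parse_selected_proteins(reply)
print(selected)
```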
- Cross-tool comparison -- Compare your DE-LIMP analysis against a second DE-LIMP run, Spectronaut export, or FragPipe output to understand how tool choice affects your results
- 4 diagnostic layers -- Settings Diff (parameter-by-parameter comparison), Protein Universe (overlap analysis), Quantification (log2 intensity correlation, per-sample concordance, systematic bias detection), DE Concordance (3x3 Up/Down/NS matrix, volcano overlay, discordant protein table)
- 7-rule hypothesis engine -- For each discordant protein, assigns a tool-aware hypothesis explaining why the tools disagree (direction reversal, normalization offset, variance estimation, missing values, peptide count, FC magnitude, or borderline significance)
- Optional DIA-NN log upload -- Enrich Mode A comparisons with search-derived parameters (pg-level quantification, proteoforms, library precursor counts, pipeline step)
- Optional MOFA2 decomposition -- Treats the two runs as views and decomposes joint variance to find hidden patterns among discordant proteins
- AI integration -- Tool-aware Gemini prompt and Claude ZIP export for deeper analysis
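The 3x3 Up/Down/NS concordance matrix mentioned above can be sketched as follows. This is an illustrative Python version only: the 0.05 cutoff on the adjusted p-value, the sign-based direction call, and the data are assumptions, not DE-LIMP's settings.

```python
from collections import Counter

def classify(logfc, adj_p, alpha=0.05):
    """Label a protein Up, Down, or NS (not significant)."""
    if adj_p >= alpha:
        return "NS"
    return "Up" if logfc > 0 else "Down"

def concordance_matrix(run_a, run_b):
    """Count (label in run A, label in run B) pairs over shared proteins."""
    counts = Counter()
    for protein in run_a.keys() & run_b.keys():
        counts[(classify(*run_a[protein]), classify(*run_b[protein]))] += 1
    return counts

# Hypothetical (logFC, adj.P.Val) results from two runs.
run_a = {"P1": (1.2, 0.01), "P2": (-0.8, 0.03), "P3": (0.9, 0.02), "P4": (0.1, 0.90)}
run_b = {"P1": (1.0, 0.02), "P2": (-0.5, 0.20), "P3": (-0.7, 0.04), "P5": (2.0, 0.001)}

matrix = concordance_matrix(run_a, run_b)
for row in ("Up", "Down", "NS"):
    print(row, [matrix[(row, col)] for col in ("Up", "Down", "NS")])
```

Off-diagonal cells are the discordant proteins; here P3 (Up in one run, Down in the other) would be a direction reversal and P2 (Down vs NS) a borderline-significance case.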
- Pre-search quality check -- Extract TIC traces from timsTOF .d files before committing to hours-long DIA-NN searches
- Three views -- Faceted panels (per-run with median overlay), Overlay (all runs normalized 0-1 on one axis), Metrics (AUC bar chart + diagnostics table)
- Automated diagnostics -- Shape deviation (Pearson r vs median trace), RT shift, loading anomaly (AUC outlier), file size outlier, late elution, elevated baseline, narrow gradient
- SSH support -- SCP downloads analysis.tdf from remote .d directories, extracts locally
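The 0-1 overlay normalization and the shape-deviation diagnostic (Pearson r against the median trace) can be sketched in Python. This is a minimal illustration that assumes all traces are sampled on a common retention-time grid; the run names, values, and the 0.9 flagging cutoff are hypothetical.

```python
from math import sqrt
from statistics import mean, median

def normalize01(trace):
    """Rescale a TIC trace to the 0-1 range (assumes a non-constant trace)."""
    lo, hi = min(trace), max(trace)
    return [(x - lo) / (hi - lo) for x in trace]

def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical TIC traces on a shared retention-time grid.
traces = {
    "run1": [1.0, 5.0, 9.0, 5.0, 1.0],
    "run2": [2.0, 6.0, 10.0, 6.0, 2.0],
    "run3": [9.0, 5.0, 1.0, 5.0, 9.0],  # inverted shape, should be flagged
}

normed = {run: normalize01(t) for run, t in traces.items()}
median_trace = [median(column) for column in zip(*normed.values())]
shape_r = {run: pearson_r(t, median_trace) for run, t in normed.items()}

R_CUTOFF = 0.9  # illustrative threshold
flagged = [run for run, r in shape_r.items() if r < R_CUTOFF]
print(flagged)
```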
- Three backends -- Local, Docker, and HPC (SSH/SLURM)
- Parallel 5-step SLURM pipeline -- Optimized search with dependency chaining and array jobs for maximum HPC throughput
- SSH file browser -- Visual directory browser for navigating remote HPC filesystems with clickable breadcrumbs, color-coded entries, and file type filtering
- SSH auto-connect -- Automatically connects to HPC on startup when an SSH key is detected; environment badge shows deployment mode
- UniProt FASTA download -- Search and download proteome databases directly; 6 bundled contaminant libraries
- NCBI proteome download -- Download RefSeq protein FASTA from NCBI Datasets with automatic gene symbol mapping for non-model organisms
- Load from HPC -- One-click button to browse, download, and analyze completed search results from the cluster
- Spectral library caching -- Reuse predicted libraries across searches to save compute time
- Custom FASTA sequences -- Add custom protein sequences inline when submitting searches
- Smart partition selection -- Detects per-user SLURM CPU limits, auto-switches to public queue when at capacity
- FASTA database library -- Shared catalog with auto-upload to HPC, fragment m/z range tracking, path validation
- Cluster resource indicator -- Real-time HPC CPU usage monitoring with traffic-light display (green/yellow/red)
- Windows Docker launcher -- One-click `.bat` file runs DE-LIMP + DIA-NN with zero R installation, shared PC support (guide)
- Non-blocking job queue -- Submit multiple searches, results auto-load on completion
- Phospho mode -- Auto-configures DIA-NN for phospho analysis (STY modification, `--phospho-output`)
- Organized search logs -- SLURM `.out`/`.err` and local `.log` files written to `{output_dir}/logs/`
DIA-NN License: DIA-NN is developed by Vadim Demichev and is free for academic/non-commercial use. It is not open source and cannot be redistributed. DE-LIMP does not bundle DIA-NN. See the DIA-NN license.
- Staff YAML profiles auto-fill SSH, SLURM, and instrument settings
- SQLite job tracking with searchable history (6 filters), one-click result loading and report generation
- Instrument QC dashboard with protein/precursor/TIC trends and control lines
- Quarto HTML reports with QC bracket, volcanos, DE stats, and top proteins
Activated by setting `DELIMP_CORE_DIR`. Not visible on standard installations.
- Unified activity log -- Single audit trail for all DIA-NN searches and pipeline runs, with remote activity log via SSH for multi-user visibility
- Search History -- Full audit trail for every DIA-NN search (26 parameters). Import Settings to reuse parameters; Import Results to load completed search output directly. View Log shows search metadata. Cross-reference links to Analysis History.
- Analysis History & Projects -- Track every pipeline run with expandable detail rows. Assign analyses to projects for organized grouping with summary cards.
- About tab -- Community stats dashboard with GitHub stars, forks, visitors, and clones (14-day trend sparklines), GitHub Discussions feed, version info, and project links
- No-replicates mode -- Quantification without DE for n=1 experiments; PCA, Expression Grid, and Data Explorer still available
- Save/load full analysis state as `.rds`; export reproducibility R code log
- One-click example data (Affinisep vs Evosep comparison)
- Group assignment templates (CSV export/import)
- Embedded proteomics resources, UC Davis Proteomics videos, short course links
| Platform | Method | DIA-NN Search? | Guide |
|---|---|---|---|
| Any (just exploring) | Web browser | No | Hugging Face |
| Windows | Docker + SSH to HPC | Yes (via HPC) | WINDOWS_DOCKER_INSTALL.md |
| Mac / Linux | R/RStudio (native) | Via HPC or Docker | See Installation below |
| HPC cluster | Apptainer/Singularity | Via SLURM | HPC_DEPLOYMENT.md |
Requirements: R 4.5+ (for limpa), Bioconductor 3.22+ (auto-configured with R 4.5+)
```bash
git clone https://github.com/bsphinney/DE-LIMP.git
cd DE-LIMP
```

```r
shiny::runApp('.', port=3838, launch.browser=TRUE)
```

All dependencies install automatically on first run:

```r
# Core: shiny, bslib, plotly, DT, rhandsontable, shinyjs
# Data: dplyr, tidyr, stringr, readr, arrow
# Stats: limpa, limma, ComplexHeatmap, clusterProfiler
# org.Hs.eg.db, org.Mm.eg.db, AnnotationDbi
# KSEAapp, ggseqlogo, MOFA2, basilisk, callr
# Viz: ggplot2, ggrepel, ggridges, enrichplot
# AI: httr2, curl
```

- Load Data -- Upload a DIA-NN `report.parquet` output file, or click "Load Example Data" for a demo HeLa dataset
- Assign Groups & Run -- Auto-guess groups from filenames or manually assign; optionally add covariates (batch, etc.); click "Run Pipeline" to execute DPC-CN normalization, DPC-Quant protein quantification, and limma DE
- Explore Results -- Data Overview, QC, DE Dashboard (Volcano/Table/PCA/CV Analysis), Phospho, GSEA, MOFA2, AI Analysis, XIC Viewer (local/HPC)
- Export -- Download reproducibility log (.R), save session (.rds), export tables and plots
| Step | Method |
|---|---|
| Normalization | Detection Probability Curve, complete-normal model (DPC-CN) via limpa::dpcCN() |
| Quantification | DPC-Quant (Detection Probability Curve Quantification): precursor-to-protein rollup with probabilistic missing-value modelling via limpa::dpcQuant() |
| DE model | Linear model fit via limpa::dpcDE() + limma::contrasts.fit() |
| Moderation | Empirical Bayes moderated t-statistics via limma::eBayes() |
| FDR | Benjamini-Hochberg adjusted p-values |
| Phospho DE | Same limma pipeline at the phosphosite level (independent from protein-level) |
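For readers unfamiliar with the FDR row, the Benjamini-Hochberg adjustment can be written out in a few lines. The Python sketch below is for illustration only (DE-LIMP gets adjusted p-values from limma in R); the input p-values are made up.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
adj = bh_adjust(raw)
print([round(p, 4) for p in adj])
```

Each raw p-value is scaled by m/rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase, which is the same result R's `p.adjust(method = "BH")` gives for this input.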
Key Citations:
- limpa -- Bioconductor package for DIA proteomics (bioconductor.org/packages/limpa)
- limma -- Ritchie ME et al. (2015) Nucleic Acids Res 43(7):e47 (doi:10.1093/nar/gkv007)
- DIA-NN -- Demichev V et al. (2020) Nat Methods 17:41-44 (doi:10.1038/s41592-019-0638-x)
- MOFA2 -- Argelaguet R et al. (2020) Genome Biol 21:111 (doi:10.1186/s13059-020-02015-1)
- KSEA -- Wiredja DD et al. (2017) Bioinformatics 33:3489-3491; Casado P et al. (2013) Sci Signaling 6:rs6
- clusterProfiler -- Wu T et al. (2021) Innovation 2(3):100141
- Project Website: bsphinney.github.io/DE-LIMP
- Discussions: github.com/bsphinney/DE-LIMP/discussions -- Q&A, feature ideas, and announcements
- Video Tutorials: UC Davis Proteomics YouTube
- Training: Hands-On Proteomics Short Course
- Core Facility: proteomics.ucdavis.edu
This project is open source. See repository for license details.
Issues, pull requests, and Discussions welcome! See CLAUDE.md for development documentation.
Developer: Brett Phinney, UC Davis Proteomics Core Facility | Contact: GitHub Issues
Demo dataset: Affinisep vs Evosep SPE column comparison using 50 ng Thermo HeLa protein digest standard (DIA, Orbitrap). Available at github.com/bsphinney/DE-LIMP/releases.