maven2306/Analysis-of-Protein-Expression

Analysis of False Discovery Rate Strategies in Shotgun Proteomics

This repository contains the R code and results for a project investigating the impact of different database search strategies on False Discovery Rate (FDR) estimation in shotgun proteomics. The analysis compares an emulated concatenated search strategy against a separate, un-concatenated target-decoy search.

The primary finding is that the concatenated approach, which incorporates a competition model, identifies approximately 1.25 times more confident peptide-spectrum matches (PSMs) at a 1% FDR threshold than the more conservative un-concatenated method.
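To make the difference between the two estimates concrete, here is a minimal sketch on toy data. It is not the repository's implementation: the data frame, its column names, and both helper functions are hypothetical, and it assumes each spectrum has one best target score and one best decoy score where higher is better.

```r
# Toy scores (hypothetical): one best target and one best decoy match per spectrum.
psms <- data.frame(
  spectrum     = 1:8,
  target_score = c(5.1, 4.8, 4.2, 3.9, 3.0, 2.4, 1.8, 1.2),
  decoy_score  = c(1.0, 1.3, 0.9, 2.1, 3.4, 1.1, 2.6, 2.0)
)

# Separate (un-concatenated) estimate: every target and every decoy PSM
# at or above the threshold is counted, with no competition between them.
fdr_separate <- function(psms, threshold) {
  n_target <- sum(psms$target_score >= threshold)
  n_decoy  <- sum(psms$decoy_score >= threshold)
  if (n_target == 0) return(NA_real_)
  n_decoy / n_target
}

# Emulated concatenated estimate: target and decoy compete for each spectrum,
# and only the higher-scoring match is counted.
fdr_concatenated <- function(psms, threshold) {
  target_wins <- psms$target_score >= psms$decoy_score
  best_score  <- pmax(psms$target_score, psms$decoy_score)
  n_target <- sum(target_wins & best_score >= threshold)
  n_decoy  <- sum(!target_wins & best_score >= threshold)
  if (n_target == 0) return(NA_real_)
  n_decoy / n_target
}

fdr_separate(psms, threshold = 2.0)
fdr_concatenated(psms, threshold = 2.0)
```

On this toy data the competition step removes decoy matches that lose to a better target match for the same spectrum, so the concatenated estimate is lower at the same score threshold.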

View the Final Report

Prerequisites

Software

  • SearchGUI (version 4.3.17 or similar), used to run the Comet search engine.
  • R (version 4.0 or later), used for data analysis.

R Packages

Run the following commands in your R console to install the necessary packages.

# For reading FASTA files
install.packages("microseq")

# For data manipulation and plotting
install.packages("dplyr")
install.packages("ggplot2")
install.packages("stringr")

# For reading mass spectrometry data files (from Bioconductor)
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

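# MsBackendMzR is a backend class provided by the Spectra package (it relies on mzR),
# not a separate Bioconductor package, so it is not listed here.
BiocManager::install(c("mzR", "Spectra", "Biostrings"))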
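After installation, a quick check such as the following can confirm that everything loads. This is only a sketch; the helper name find_missing_packages is hypothetical and not part of the repository.

```r
# Return the names of packages that cannot be loaded in the current R library.
find_missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

required <- c("microseq", "dplyr", "ggplot2", "stringr",
              "mzR", "Spectra", "Biostrings")

missing_pkgs <- find_missing_packages(required)
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```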

How to Reproduce the Analysis

Step 1: Data Acquisition

The raw mass spectrometry file (JD_06232014_sample1-A.raw) is too large to be included in this repository.

  1. Download the file from the ProteomeXchange repository: PXD015300.
  2. Convert the .raw file to .mzML format using a tool like ProteoWizard's msConvert.
  3. Place the resulting JD_06232014_sample1-A.mzML file in the root directory of this project.
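Once the converted file is in place, it can be worth confirming that R can read it before starting the searches. The sketch below assumes the Spectra and mzR packages are installed; the helper summarise_mzml is hypothetical and only counts spectra per MS level.

```r
mzml_path <- "JD_06232014_sample1-A.mzML"

# Open the mzML file with the MsBackendMzR backend and tabulate MS levels.
summarise_mzml <- function(path) {
  if (!file.exists(path)) {
    stop("mzML file not found: ", path)
  }
  sps <- Spectra::Spectra(path, source = Spectra::MsBackendMzR())
  table(sps$msLevel)
}

# Only attempt to read the file if it is actually present.
if (file.exists(mzml_path)) {
  print(summarise_mzml(mzml_path))
} else {
  message("Place ", mzml_path, " in the project root first.")
}
```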

Step 2: Running the SearchGUI Analysis

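Three separate searches must be performed using SearchGUI with the Comet search engine. Searches 2 and 3 depend on files produced by the R script (decoy.fasta and top100.mzML, respectively), so the relevant parts of the script must be run before those searches.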

Search 1: Target-Only (Global FDR)

  • Spectrum File(s): JD_06232014_sample1-A.mzML
  • Database File: data/iPRG2015.fasta
  • Output Folder: Create and select search_results/target_results/
  • Search Parameters:
    • Precursor Tolerance: 10.0 ppm
    • Fragment Tolerance: 0.02 Da
    • Enzyme: Trypsin (Specific), Max 2 Missed Cleavages
    • Fixed Modifications: Carbamidomethylation of C
    • Variable Modifications: Oxidation of M, Acetylation of protein N-term
    • Comet Advanced Settings -> Number of Spectrum Matches: 1

Search 2: Decoy-Only (Global FDR)

  • Spectrum File(s): JD_06232014_sample1-A.mzML
  • Database File: decoy.fasta (generated by the first chunk of the R script)
  • Output Folder: Create and select search_results/decoy_results/
  • Search Parameters: Use the exact same settings as the Target-Only search.
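The README does not say how the R script builds decoy.fasta. One common approach is to reverse each target protein sequence; the sketch below illustrates that under stated assumptions. The DECOY_ header prefix, both helper functions, and the toy sequences are hypothetical, and the input is assumed to have Header and Sequence columns as returned by microseq::readFasta().

```r
# Reverse each sequence string residue by residue.
reverse_sequences <- function(sequences) {
  vapply(
    strsplit(sequences, "", fixed = TRUE),
    function(residues) paste(rev(residues), collapse = ""),
    character(1)
  )
}

# Turn a table of target entries into decoy entries (assumed naming scheme).
make_decoys <- function(fasta) {
  fasta$Header   <- paste0("DECOY_", fasta$Header)
  fasta$Sequence <- reverse_sequences(fasta$Sequence)
  fasta
}

# Toy input standing in for the real database.
targets <- data.frame(
  Header   = c("protA", "protB"),
  Sequence = c("MKTAYIAK", "PEPTIDER"),
  stringsAsFactors = FALSE
)
decoys <- make_decoys(targets)
print(decoys)

# With microseq installed, the same step on the real database might look like:
# targets <- microseq::readFasta("data/iPRG2015.fasta")
# microseq::writeFasta(make_decoys(targets), out.file = "decoy.fasta")
```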

Search 3: Top-100 Target (Local FDR)

  • Note: This search can only be run after generating the top100.mzML file using the R script.
  • Spectrum File(s): top100.mzML
  • Database File: data/iPRG2015.fasta
  • Output Folder: Create and select search_results/top100_target_results/
  • Search Parameters (with changes):
    • Precursor Tolerance: 100.0 ppm (wider than the 10.0 ppm used in Searches 1 and 2)
    • Fragment Tolerance: 0.02 Da
    • Enzyme: Trypsin (Specific), Max 2 Missed Cleavages
    • Fixed/Variable Modifications: Same as in Search 1.
    • Comet Advanced Settings -> Number of Spectrum Matches: 100
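To see what the wider precursor tolerance means in absolute terms, the short calculation below converts a ppm tolerance into a mass window in Da. The helper ppm_to_da and the example masses are illustrative only and do not come from the repository.

```r
# A ppm tolerance is relative: tolerance in Da = mass * ppm / 1e6.
ppm_to_da <- function(mass, ppm) {
  mass * ppm / 1e6
}

masses <- c(500, 1000, 2000)  # example precursor masses in Da

windows <- data.frame(
  mass_da    = masses,
  tol_10ppm  = ppm_to_da(masses, 10),   # tolerance used in Searches 1 and 2
  tol_100ppm = ppm_to_da(masses, 100)   # tolerance used in Search 3
)
print(windows)
```

At any given mass the 100 ppm window is ten times wider than the 10 ppm window, so each spectrum is compared against more candidate peptides.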

Step 3: Running the R Script

  1. Open the APE_Project_Analysis.Rmd file in RStudio.
  2. Ensure all prerequisite packages are installed.
  3. Run the code chunks sequentially, or, once all three search result folders exist, click the "Knit" button to generate the HTML report and all figures. The script loads the search results, performs the FDR calculations, and generates the plots used in the final report.
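The report can also be rendered from the R console. The sketch below first checks that the inputs named in this README exist and only then calls rmarkdown::render(); the helper missing_inputs is hypothetical, and the check assumes the working directory is the project root.

```r
# Files and folders this README says the analysis depends on.
required_paths <- c(
  "APE_Project_Analysis.Rmd",
  "JD_06232014_sample1-A.mzML",
  "data/iPRG2015.fasta",
  "search_results/target_results",
  "search_results/decoy_results",
  "search_results/top100_target_results"
)

# Return the subset of paths that do not exist under the given root.
missing_inputs <- function(paths, root = ".") {
  paths[!file.exists(file.path(root, paths))]
}

missing <- missing_inputs(required_paths)
if (length(missing) > 0) {
  message("Missing inputs: ", paste(missing, collapse = ", "))
} else if (requireNamespace("rmarkdown", quietly = TRUE)) {
  # Knit the R Markdown file to its default output format.
  rmarkdown::render("APE_Project_Analysis.Rmd")
} else {
  message("Install the rmarkdown package to render the report.")
}
```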
