This repository contains a bioinformatics pipeline for analyzing and integrating Transcriptomics (TRC), regular RNA-Seq (TPM), and Proteomics data. The pipeline covers differential expression analysis, correlation analyses between omics layers, and time-shift/half-life analysis, and is applied primarily to time-course experimental data.
A clean, modular architecture allows for reproducible runs:
analysis_trc/
├── analysis_trc.Rproj # RStudio project file
├── config.R # Global paths and experimental parameters
├── utils.R # Common helper functions
├── environment.yml # Conda environment definition for reproducibility
├── README.md # Project documentation
├── LICENSE # MIT License
├── pipeline/ # Core analysis pipeline scripts
│ ├── 01_DESeq2_RNA_Seq.R
│ ├── 02_Limma_DEG_Analysis.R
│ ├── 03_Correlation_TPM_vs_Protein.R
│ ├── 04_Correlation_TRC_vs_Protein.R
│ ├── 05_Compare_Correlations.R
│ ├── 06_Correlation_With_Shifts.R
│ └── 07_Time_Shift_Analysis.R
├── analyses/ # Specialized and auxiliary analysis modules
├── data/ # [Ignored] Raw and processed datasets
├── results/ # [Ignored] Analysis outputs (TSV, RDA)
├── figures/ # [Ignored] Output figures (PDF, PNG)
└── archive/ # Historical script versions
We highly recommend using Conda to manage your dependencies to ensure reproducibility.
- Clone the repository:
  git clone https://github.com/jochotecoa/analysis_trc.git
  cd analysis_trc
- Create the Conda environment: We have provided an environment.yml to automatically install R, Bioconductor packages, and CRAN dependencies.
  conda env create -f environment.yml
- Activate the environment:
  conda activate analysis_trc
Before running the pipeline, update config.R with the correct paths for your local machine or high-performance computing (HPC) environment. You need to adjust variables like:
- BASE_ANALYSIS_PATH
- BASE_NGS_PATH
- PROTEOMICS_BASE_PATH
- SALMON_INPUT_PATH
- TRC_OUTPUT_PATH
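Before a long run it can help to confirm that config.R actually assigns each of these variables. The following is a minimal sketch, not part of the repository: the function name check_config is hypothetical, and it assumes each variable is assigned at the start of a line with `<-` or `=`. It only checks that an assignment exists, not that the path is valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): report which of the
# path variables named in this README have no assignment in config.R.
# Assumes assignments look like `NAME <- "..."` or `NAME = "..."`.
check_config() {
  local cfg="${1:-config.R}"
  local missing=0
  local var
  for var in BASE_ANALYSIS_PATH BASE_NGS_PATH PROTEOMICS_BASE_PATH \
             SALMON_INPUT_PATH TRC_OUTPUT_PATH; do
    if ! grep -Eq "^[[:space:]]*${var}[[:space:]]*(<-|=)" "$cfg"; then
      echo "missing assignment: $var" >&2
      missing=1
    fi
  done
  # Exit status 0 when every variable was found, 1 otherwise.
  return "$missing"
}

# Usage (from the repository root):
#   check_config config.R && echo "config.R looks complete"
```

Since the list above is introduced with "variables like", config.R may define further settings that this check does not cover.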
The scripts are designed to be run sequentially from 01 to 07.
- Differential Expression Analysis
  - Run pipeline/01_DESeq2_RNA_Seq.R and pipeline/02_Limma_DEG_Analysis.R to compute differentially expressed genes/transcripts.
- Correlation Profiling
  - Run the 03 and 04 scripts in the pipeline/ directory to measure basic correlation coefficients between the RNA layers and the proteome layer.
- Advanced Time-Shift Modeling
  - Run 05, 06, and 07 in pipeline/ to analyze how long it takes for a transcriptomic shift to be reflected at the proteomic level, integrating half-life calculations.
(You can run them in RStudio via analysis_trc.Rproj or from the terminal using Rscript pipeline/01_DESeq2_RNA_Seq.R, etc.)
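For terminal runs, the sequential order can be scripted. This is a sketch under assumptions rather than a script shipped with the repository: the function name run_pipeline and the RUNNER override are hypothetical, and it relies on the two-digit prefixes shown in the directory tree to determine the order. It stops at the first script that exits with an error, since later steps presumably depend on earlier outputs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): run the numbered
# pipeline scripts in order and stop at the first failure.
run_pipeline() {
  local dir="${1:-pipeline}"
  # RUNNER is an illustrative override; by default scripts run with Rscript.
  local runner="${RUNNER:-Rscript}"
  local script
  # The glob expands in sorted order, so 01_... runs before 02_..., etc.
  for script in "$dir"/[0-9][0-9]_*.R; do
    if [ ! -e "$script" ]; then
      echo "no numbered scripts found in $dir" >&2
      return 1
    fi
    echo ">>> $script"
    if ! "$runner" "$script"; then
      echo "failed: $script" >&2
      return 1
    fi
  done
}

# Usage (from the repository root, with the Conda environment active):
#   run_pipeline pipeline
```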
All generated tables, lists of DEGs/DEPs, and intermediate .Rda files are saved to the results/ folder (its location is configured in config.R). All plots are placed in figures/. (Note: these folders are ignored by git to keep the repository lightweight.)