micheladellalma/multiomics-fermentation-pipeline

multiomics-fermentation-pipeline

MultiOmicsFermentation is a workflow for identifying the dynamics of microorganisms, pathways, and metabolites throughout the fermentation of the Picolit grape variety. The final output is a multi-layered network in which each layer corresponds to a different time point during fermentation.

This workflow leverages multiple programming languages, including R, Python, and Bash.

Below is an image illustrating the initial inputs and the final table used to construct the network:

[Image: initial inputs and the final table used to construct the network]

🔀 Workflow overview

In practice, many more steps lie between the initial inputs and the final table. They are summarized in the following diagram:

[Image: workflow diagram]

The main steps of the pipeline are:

  1. Preprocessing of the data
  2. Taxonomic classification and differential abundance of taxa between time points
  3. Identification of potentially expressed pathways and of the pathways enriched at each time point
  4. Classification of metabolites into chemical groups and search for enriched metabolites per time point
  5. Integration of all the data and network analysis

📥 📤 Pipeline Inputs, Outputs and Dependencies

📥 Input

  • paired-end FASTQ files
  • metadata table
  • metabolomics tables
  • reference databases when needed
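
Because the FASTQ input is paired-end, each sample needs both mate files before preprocessing can start. The sketch below shows one way to check this. It assumes a hypothetical naming convention (`<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz`) that is not stated in this repository, and `pair_fastq` is an illustrative helper, not part of the pipeline.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) naming convention: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.f(?:ast)?q(?:\.gz)?$")


def pair_fastq(filenames):
    """Group file names into {sample: (R1, R2)}; raise if a mate is missing."""
    reads = defaultdict(dict)
    for name in filenames:
        match = READ_PATTERN.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed convention
        reads[match["sample"]][match["read"]] = name

    pairs = {}
    for sample, mates in sorted(reads.items()):
        if set(mates) != {"1", "2"}:
            raise ValueError(f"sample {sample!r} is missing a mate file")
        pairs[sample] = (mates["1"], mates["2"])
    return pairs
```

If your files use a different suffix (for example `_1` / `_2`), only `READ_PATTERN` needs to change.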

📤 Output

  • taxonomic abundance tables
  • KEGG/COG aggregated tables
  • diversity plots
  • differential abundance results
  • correlation and network objects/figures

🧰 Software/dependencies

  • FastQC
  • MultiQC
  • KneadData
  • Kraken2 + Bracken
  • MEGAHIT
  • Prodigal
  • CD-HIT
  • eggNOG-mapper
  • R (vegan, phyloseq, clusterProfiler)
  • Python
  • Bash
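
Before launching the workflow, it can help to confirm that the command-line tools listed above are reachable. The following is a minimal sketch using the standard library. The executable names are assumptions based on how these tools are commonly installed (for instance, eggNOG-mapper as `emapper.py`) and may differ on your system.

```python
import shutil

# Executable names are assumptions; they can differ between installations.
REQUIRED_TOOLS = [
    "fastqc", "multiqc", "kneaddata", "kraken2", "bracken",
    "megahit", "prodigal", "cd-hit", "emapper.py", "Rscript",
]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

This only checks that an executable exists. It does not verify versions, R packages, or reference databases.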

⚙️ How to use

The pipeline is controlled by a master script that orchestrates all analysis steps.

Before running the workflow, users must:

  • specify the input files within the master script
  • create the required directory structure starting from the working directory.

Detailed instructions for directory organization and input parameters are provided directly in the master script before each command.
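
As an illustration of the directory setup step, the sketch below creates a set of subdirectories under a working directory. The folder names are purely hypothetical placeholders. The actual layout is the one described in the master script, so replace `SUBDIRS` accordingly.

```python
from pathlib import Path

# Hypothetical layout: the real directory names are defined in the master script.
SUBDIRS = ["raw_fastq", "qc", "taxonomy", "assembly", "annotation", "metabolomics", "network"]


def create_layout(workdir, subdirs=SUBDIRS):
    """Create each subdirectory under workdir and return the resulting paths."""
    root = Path(workdir)
    created = []
    for name in subdirs:
        path = root / name
        # exist_ok=True makes the call safe to repeat on an existing layout
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_layout(".")` from the working directory would create the placeholder folders there. Running it twice does not raise an error.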

🗃️ Repository structure

  • /scripts folder contains only scripts that are executable from the command line.
  • /notebook folder contains R Markdown and Jupyter notebooks with descriptive text and figures.
  • /example_data folder contains all the data (available only in the private version of this repository).
  • /results

🔑 Key analyses implemented

  • taxonomic profiling of bacterial and fungal communities
  • functional pathway reconstruction from metagenomic data
  • integration of taxonomic, functional, and metabolic layers
  • diversity, ordination, and differential abundance analyses
  • microbial-metabolite-pathway network reconstruction
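
To make the idea of a multi-layered network concrete, here is a minimal sketch in which each time point becomes one layer of edges between features (taxa, pathways, metabolites). This README does not state how edges are defined. The sketch assumes Spearman correlation with an arbitrary threshold of 0.8, and the function names and input format are illustrative only.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


def build_layers(profiles, threshold=0.8):
    """Map {time_point: {feature: [value per sample]}} to {time_point: [(a, b, rho)]}."""
    layers = {}
    for time_point, features in profiles.items():
        edges = []
        for a, b in combinations(sorted(features), 2):
            rho = spearman(features[a], features[b])
            if abs(rho) >= threshold:  # assumed cutoff, not taken from the pipeline
                edges.append((a, b, round(rho, 3)))
        layers[time_point] = edges
    return layers
```

Each layer is a plain edge list, so it can be passed to a graph library or exported as a table. A real analysis would also need multiple-testing control, which this sketch omits.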

About

Integration of metagenomics and metabolomics data to explore the dynamics during the alcoholic fermentation process
