Overview

This repository predicts and analyses yeast growth phenotypes from genotype data across multiple yeast crosses. It brings together three analysis pipelines covering machine learning, metabolic modelling, and statistical and biological interpretation.

Repository Structure

Directory	Description
`Boosting/`	ML model training and evaluation (LightGBM, SVM, RF, Elastic Net, etc.) with Optuna hyperparameter tuning and SHAP analysis
`MATLAB/`	Genome-scale metabolic modelling — strain-specific GIMME models, pFBA, and flux sampling
`R_analysis/`	Statistical analysis, visualisation, and biological interpretation of results from the other pipelines

Each directory is self-contained with its own README, dependencies, and configuration. All scripts within a directory assume that directory is the working directory.
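
Because scripts assume their own directory is the working directory, a small guard can catch a wrong starting location early. The following is a minimal sketch, not part of the repository: the helper names `is_pipeline_dir` and `require_pipeline_cwd` are hypothetical, and the check only compares the directory name against the three names listed above.

```python
from pathlib import Path

# Pipeline directory names as listed in the Repository Structure table.
PIPELINE_DIRS = {"Boosting", "MATLAB", "R_analysis"}


def is_pipeline_dir(path: Path) -> bool:
    """Return True if the last component of `path` is a pipeline directory name.

    This checks the name only; it does not verify that the directory
    actually sits inside a clone of the repository.
    """
    return path.name in PIPELINE_DIRS


def require_pipeline_cwd() -> Path:
    """Hypothetical guard: raise if the current working directory is not
    one of the pipeline directories, otherwise return it."""
    cwd = Path.cwd()
    if not is_pipeline_dir(cwd):
        raise RuntimeError(
            f"Run this from one of {sorted(PIPELINE_DIRS)}, not from {cwd}"
        )
    return cwd
```

A Python script could call `require_pipeline_cwd()` as its first step; the MATLAB and R pipelines would need an equivalent check in their own languages.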

Getting Started

Clone the repo and navigate into the pipeline you want to run:

git clone https://github.com/Rajeeva-IITM/yeast_growth_analysis.git
cd yeast_growth_analysis

# Example: run the ML pipeline
cd Boosting
# follow Boosting/README.md

# Example: run metabolic modelling
cd MATLAB
# follow MATLAB/README.md

# Example: run R analysis
cd R_analysis
# follow R_analysis/README.md

Refer to the README inside each subdirectory for environment setup, dependencies, and usage instructions. Each pipeline must be run from its own subdirectory (Boosting/, MATLAB/, R_analysis/).

Note: Some scripts contain hardcoded values (e.g., number of parallel cores, LP solver choice, file indices) that were configured for the original development environment. Please review the configuration notes in each subdirectory's README before running.
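
As an illustration of adapting one such hardcoded value, the sketch below clamps a requested number of parallel cores to what the current machine reports. It is an assumption about how one might handle this, not the repository's own code; the function name `usable_cores` is hypothetical.

```python
import os


def usable_cores(requested: int) -> int:
    """Clamp a requested core count to the range [1, cores reported by the OS].

    `os.cpu_count()` can return None when the count is undetermined,
    in which case a single core is assumed.
    """
    available = os.cpu_count() or 1
    return max(1, min(requested, available))
```

Replacing a fixed core count with a call such as `usable_cores(16)` keeps the original setting on large machines while avoiding oversubscription on smaller ones.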

Data

All data required to reproduce the analyses will be made available on Zenodo. Once the archive is published, download it and extract it into the repository root.

Data Flow

Boosting (ML models, SHAP values)
    |
    v
R_analysis (statistical analysis, visualisation)
    ^
    |
MATLAB (metabolic models, flux sampling)

The R analysis pipeline consumes outputs from both the Boosting and MATLAB pipelines. Run those first if you need to regenerate results from scratch.
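
The ordering constraint above can be written down as a small dependency graph. This is a sketch using Python's standard-library `graphlib` (Python 3.9+); the `DEPENDS_ON` mapping and `run_order` function are illustrative names, and the graph encodes only what the diagram states.

```python
from graphlib import TopologicalSorter

# Each pipeline mapped to the pipelines whose outputs it consumes,
# following the Data Flow diagram.
DEPENDS_ON = {
    "Boosting": set(),
    "MATLAB": set(),
    "R_analysis": {"Boosting", "MATLAB"},
}


def run_order() -> list[str]:
    """Return the pipelines in an order that respects the dependencies.

    Boosting and MATLAB do not depend on each other, so their relative
    order is not fixed; R_analysis always comes last.
    """
    return list(TopologicalSorter(DEPENDS_ON).static_order())
```

Since Boosting and MATLAB are independent, they can be run in either order, or in parallel, before R_analysis.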
