🌾 SSR Genetic Diversity & Population Structure Analysis Pipeline

A reproducible computational framework for analyzing SSR (Simple Sequence Repeat) marker datasets in plant breeding and population genetics.

This repository provides a fully automated pipeline for:

Marker quality control
Genetic diversity estimation
Population structure analysis
Linkage disequilibrium network inference
Genetic differentiation and gene flow analysis

The workflow is designed for crop genetics, molecular breeding, and population genomics studies.

🔬 Scientific Applications

This pipeline is suitable for:

Genetic diversity analysis
Germplasm characterization
Population structure studies
Molecular breeding programs
Marker-assisted selection
Plant population genetics

Example organisms:

Rice
Wheat
Maize
Barley
Other crop species with SSR datasets

⚙️ Key Analytical Modules

Module	Description
Data Processing	Load and validate SSR marker datasets
Genetic Diversity	Estimate MAF, PIC, He, Shannon and Simpson indices
Population Structure	Compute genetic distances and multivariate ordination
Network Analysis	Construct linkage disequilibrium networks
Genetic Differentiation	Estimate Fst and gene flow (Nm)
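
To make the Genetic Diversity row concrete, here is a minimal sketch of how these indices are commonly computed for one locus from its allele frequencies. It is not taken from `genetic_diversity.py`. The function name `diversity_indices` and its return keys are hypothetical, and the formulas are the textbook ones (Nei's expected heterozygosity, the Botstein et al. PIC, Shannon entropy with natural log, Simpson's dominance), which the pipeline may or may not follow exactly. MAF is left out because the README does not say whether it means major or minor allele frequency.

```python
import numpy as np

def diversity_indices(allele_freqs):
    """Per-locus diversity statistics from allele frequencies (illustrative only)."""
    p = np.asarray(allele_freqs, dtype=float)
    p = p[p > 0]                 # drop alleles that were not observed
    p = p / p.sum()              # renormalise so the frequencies sum to 1

    sum_sq = np.sum(p ** 2)      # Simpson's dominance, sum of p_i^2
    he = 1.0 - sum_sq            # expected heterozygosity (gene diversity)

    # PIC = 1 - sum p_i^2 - sum_{i<j} 2 p_i^2 p_j^2.
    # The cross term equals (sum p_i^2)^2 - sum p_i^4.
    cross = sum_sq ** 2 - np.sum(p ** 4)
    pic = 1.0 - sum_sq - cross

    shannon = -np.sum(p * np.log(p))   # Shannon index, natural log

    return {"He": he, "PIC": pic, "Shannon": shannon, "Simpson": sum_sq}

# Hypothetical locus with three alleles
print(diversity_indices([0.6, 0.3, 0.1]))
```

For a locus with two equally frequent alleles this gives He = 0.5 and PIC = 0.375, which is a quick sanity check against published tables.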

📂 Repository Structure

SSR-Genetic-Diversity-Pipeline
│
├── data
│   └── example_ssr_dataset.xlsx
│
├── pipeline
│   ├── data_processing.py
│   ├── genetic_diversity.py
│   ├── population_structure.py
│   ├── network_analysis.py
│   └── genetic_differentiation.py
│
├── notebooks
│   └── SSR_analysis_colab.ipynb
│
├── results
│
├── README.md
├── requirements.txt
└── LICENSE

🧬 Computational Workflow

SSR Marker Dataset
        │
        ▼
Marker Quality Control
        │
        ▼
Genetic Diversity Analysis
        │
        ▼
Genetic Distance Estimation
        │
        ▼
Population Structure Analysis
   ├── PCA
   ├── MDS
   └── Hierarchical Clustering
        │
        ▼
Linkage Disequilibrium Network
        │
        ▼
Genetic Differentiation
   ├── Fst
   └── Gene Flow (Nm)
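
The distance and population-structure steps in the workflow above can be sketched with SciPy and NumPy. This is an illustration under stated assumptions, not the code in `population_structure.py`: the 0/1 band matrix is made up, average linkage (UPGMA) is assumed for the clustering, and the ordination shown is classical MDS computed by eigendecomposition.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical presence/absence matrix: rows are accessions, columns are SSR alleles
bands = np.array([
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
], dtype=bool)

# Pairwise Jaccard distances (condensed form, then a square matrix)
condensed = pdist(bands, metric="jaccard")
dist = squareform(condensed)

# Hierarchical clustering with average linkage (UPGMA), cut into two groups
tree = linkage(condensed, method="average")
groups = fcluster(tree, t=2, criterion="maxclust")

# Classical MDS: double-centre the squared distances and keep the top two axes
n = dist.shape[0]
centering = np.eye(n) - np.ones((n, n)) / n
gram = -0.5 * centering @ (dist ** 2) @ centering
eigvals, eigvecs = np.linalg.eigh(gram)
top = np.argsort(eigvals)[::-1][:2]
# Jaccard distances need not be Euclidean, so negative eigenvalues are clipped to 0
coords = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0, None))

print(groups)
print(coords)
```

The `tree` array is what `scipy.cluster.hierarchy.dendrogram` expects if a plot is wanted.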

📊 Expected Outputs

The pipeline automatically generates:

results
├── diversity_indices.csv
├── fst_nm_results.csv
├── dendrogram.png
├── PCA.png
├── MDS.png
└── LD_network.png

The CSV files hold the diversity indices and the Fst and Nm estimates; the PNG files show the dendrogram, the PCA and MDS ordinations, and the LD network.

📦 Installation

Clone the repository:

git clone https://github.com/yourusername/SSR-Genetic-Diversity-Pipeline.git
cd SSR-Genetic-Diversity-Pipeline

Install dependencies:

pip install -r requirements.txt

▶️ Running the Pipeline

Example execution in Python:

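# Import the functions of each pipeline module
from pipeline.data_processing import *
from pipeline.genetic_diversity import *
from pipeline.population_structure import *
from pipeline.network_analysis import *
from pipeline.genetic_differentiation import *

# Load the SSR marker dataset
df = load_ssr_data("data/example_ssr_dataset.xlsx")

# Marker quality control
df_clean, dropped, mono = validate_markers(df)

# Genetic diversity indices
diversity = analyze_ssr_diversity(df_clean)

# Genetic distances and hierarchical clustering
dist = compute_jaccard(df_clean)
plot_dendrogram(dist, df_clean.index, "results")

# Ordination: PCA on the marker data, MDS on the distance matrix
pca_analysis(df_clean, "results")
mds_analysis(dist, "results")

# Linkage disequilibrium network
ld_df = ld_analysis(df_clean)
plot_ld_network(ld_df, "results")

# Genetic differentiation (Fst) and gene flow (Nm)
fst_nm = calculate_fst_nm(df_clean)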
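
For readers unfamiliar with the last step, the following sketch shows one common way Fst and Nm are related. It is an assumption about the general approach, not the implementation of `calculate_fst_nm`: `fst_from_freqs` is a hypothetical helper that uses a Nei-style estimator, Fst = (Ht - Hs) / Ht, with equally weighted populations, and `gene_flow_from_fst` applies Wright's island-model approximation, Nm = (1 - Fst) / (4 Fst).

```python
import numpy as np

def fst_from_freqs(pop_freqs):
    """Fst for one locus from per-population allele frequencies.

    pop_freqs: one row per population, one column per allele.
    Populations are weighted equally (an assumption of this sketch).
    """
    p = np.asarray(pop_freqs, dtype=float)
    hs = np.mean(1.0 - np.sum(p ** 2, axis=1))   # mean within-population He
    ht = 1.0 - np.sum(p.mean(axis=0) ** 2)       # He of the pooled frequencies
    if ht == 0:
        raise ValueError("locus is monomorphic across all populations")
    return (ht - hs) / ht

def gene_flow_from_fst(fst):
    """Nm under Wright's island model: (1 - Fst) / (4 * Fst)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("Fst must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)

# Hypothetical locus: two populations, two alleles
fst = fst_from_freqs([[0.8, 0.2], [0.2, 0.8]])
print(fst, gene_flow_from_fst(fst))
```

Under this approximation, Nm below 1 is usually read as gene flow too weak to counteract drift, which is why Nm is often reported alongside Fst.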

📚 Citation

If you use this pipeline in your research, please cite:

SSR Genetic Diversity & Population Structure Pipeline
GitHub Repository

👨‍🔬 Author

Md Rezve
Research Assistant, Plant Protection Lab
Khulna University, Bangladesh

Research interests:

Plant molecular genetics
Population genomics
Omics-driven breeding
Computational biology

📜 License

This project is released under the MIT License.
