engkinandatama/primerlab-genomic

🧬 PrimerLab Genomic

A modular bioinformatics framework for automated primer and probe design, built with clean architecture and reproducible workflows.

🔰 Latest Release: v1.0.0 - Stable Release 🎉


📋 Overview

PrimerLab Genomic is a Python-based toolkit for automated primer and probe design in molecular biology workflows. It provides a structured and reproducible framework for:

  • PCR: Standard primer design with quality control
  • qPCR: Probe design with thermodynamic checks
  • Off-target Check: BLAST-based specificity analysis
  • In-silico PCR: Virtual PCR simulation and validation

PrimerLab emphasizes deterministic, transparent results and a strictly modular codebase.

🔑 Key Features

  • End-to-End Workflow: Sequence input → Primer/Probe design → QC → Report
  • Thermodynamic Validation: Secondary structure prediction via ViennaRNA
  • QC Framework: Hairpins, dimers, GC%, Tm ranges, amplicon checks
  • qPCR Support: TaqMan-style probe design with efficiency estimation
  • Safe Execution: Timeout protection for complex sequences
  • Structured Output: JSON + Markdown + HTML reports with interpretable metrics
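
To make two of the QC metrics above concrete, here is a minimal sketch of GC% and a rough Tm estimate. It uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a textbook approximation for short oligos, and is not PrimerLab's own thermodynamic model. The function names are hypothetical; the primer is the forward primer from the example primers.json in the Usage section.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C via the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Forward primer taken from the example primers.json
primer = "ATGGTGAGCAAGGGCGAGGAG"
print(f"GC%: {gc_percent(primer):.1f}")
print(f"Wallace Tm: {wallace_tm(primer)} C")
```

Real design tools generally rely on nearest-neighbor thermodynamics with salt correction, so treat these numbers only as a sanity check.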

📦 Feature Highlights

| Category | Features |
| --- | --- |
| Primer Design | PCR, qPCR, Nested PCR, Semi-Nested PCR |
| Analysis | BLAST off-target, In-silico PCR, Dimer matrix |
| qPCR Tools | TaqMan probe design, Melt curve, Efficiency calc |
| Quality Control | Hairpin, Homodimer, Heterodimer, Tm balance |
| Species Check | Cross-reactivity, Multi-species comparison |
| Batch Processing | Parallel processing, SQLite caching, CSV export |
| Visualization | Coverage maps, Melt curves, Dimer heatmaps |
| Export | JSON, Markdown, HTML, Excel, IDT plate format |

📚 Documentation

| Resource | Link |
| --- | --- |
| Getting Started | Installation & Quick Start |
| CLI Reference | Command Reference |
| API Reference | Python API |
| Tutorials | Step-by-Step Guides |
| Changelog | Version History |

🚀 Quick Start

Installation

Option 1: PyPI (Recommended)

pip install primerlab-genomic

Option 2: Docker (No setup required)

# Pull and run
docker pull ghcr.io/engkinandatama/primerlab-genomic:1.0.0
docker run ghcr.io/engkinandatama/primerlab-genomic:1.0.0 --version

# Run with your config (quote the mount path so directories containing spaces work)
docker run -v "$(pwd)":/data ghcr.io/engkinandatama/primerlab-genomic:1.0.0 run pcr --config /data/config.yaml

Option 3: From Source (For Development)

git clone https://github.com/engkinandatama/primerlab-genomic.git
cd primerlab-genomic
pip install -e .

Optional: ViennaRNA (for Secondary Structure)

# Via pip (recommended)
pip install viennarna

# Via Conda
conda install -c bioconda viennarna

Without ViennaRNA, PrimerLab uses a fallback estimation method.
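
If you want to know in advance which path will be taken, one option is to check whether the ViennaRNA Python bindings are importable. The sketch below relies only on the fact that ViennaRNA installs its bindings under the module name `RNA`; how PrimerLab itself detects ViennaRNA is not described here, and the helper name is hypothetical.

```python
import importlib.util


def viennarna_available() -> bool:
    """Return True if the ViennaRNA Python bindings (module `RNA`) can be found."""
    return importlib.util.find_spec("RNA") is not None


if viennarna_available():
    print("ViennaRNA bindings found")
else:
    print("ViennaRNA not found; expect the fallback estimation method")
```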

Once installed, verify the installation:

primerlab --version

🔧 Usage

Command-Line Interface (CLI)

PCR Workflow:

primerlab run pcr --config test_pcr.yaml

qPCR Workflow:

primerlab run qpcr --config test_qpcr.yaml

Sequence Stats (v0.1.6):

# Check sequence before design
primerlab stats input.fasta

# JSON output for pipelines
primerlab stats input.fasta --json
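
For scripted pipelines, the `--json` output can be consumed from Python. The following is a sketch that assumes the command prints a single JSON document to standard output; the keys of that document are not documented here, so the example only parses and prints it. The helper `run_json` is hypothetical.

```python
import json
import shutil
import subprocess
from pathlib import Path


def run_json(cmd: list) -> object:
    """Run a command and parse its standard output as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Only call the CLI when it is installed and the input file exists.
if shutil.which("primerlab") and Path("input.fasta").exists():
    stats = run_json(["primerlab", "stats", "input.fasta", "--json"])
    print(stats)
```

Passing the command as a list avoids shell quoting problems with unusual file names.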

Quiet Mode (v0.1.6):

# Suppress warnings for scripted pipelines
primerlab run pcr --config test_pcr.yaml --quiet

In-silico PCR Simulation (v0.2.0):

# Validate primers against template
primerlab insilico -p primers.json -t template.fasta

# With custom output directory
primerlab insilico -p primers.json -t template.fasta -o results/

# JSON output for pipelines
primerlab insilico -p primers.json -t template.fasta --json

Example primers.json:

{
  "forward": "ATGGTGAGCAAGGGCGAGGAG",
  "reverse": "TTACTTGTACAGCTCGTCCATGCC"
}
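
To illustrate what a virtual PCR checks, here is a deliberately simplified sketch that only handles exact matches: it locates the forward primer on the template, then the reverse complement of the reverse primer downstream, and returns the sequence in between. It is not PrimerLab's simulation, which may handle mismatches and binding energetics. The function names and the toy template are hypothetical.

```python
import json
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (non-ACGT symbols are kept as-is)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def exact_match_amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the amplicon for exact primer matches, or None if a primer has no site."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we search for its reverse complement downstream of the forward site.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


primers = json.loads(
    '{"forward": "ATGGTGAGCAAGGGCGAGGAG", "reverse": "TTACTTGTACAGCTCGTCCATGCC"}'
)
# Toy template (hypothetical): flanks plus filler between the two binding sites.
template = (
    "CCCC" + primers["forward"] + "ACGT" * 25
    + reverse_complement(primers["reverse"]) + "GGGG"
)
amplicon = exact_match_amplicon(template, primers["forward"], primers["reverse"])
print(len(amplicon) if amplicon else "no product")
```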

Primer Compatibility Check (v0.4.0):

# Check if multiple primer pairs can work together
primerlab check-compat --primers primer_set.json

# With custom output directory
primerlab check-compat --primers primer_set.json --output results/

# Integrated with PCR design (auto-check after design)
primerlab run pcr --config design.yaml --check-compat

Example primer_set.json:

[
  {"name": "GAPDH", "fwd": "ATGGGGAAGGTGAAGGTCGG", "rev": "GGATCTCGCTCCTGGAAGATG", "tm": 60.0},
  {"name": "ACTB", "fwd": "CATGTACGTTGCTATCCAGGC", "rev": "CTCCTTAATGTCACGCACGAT", "tm": 59.0}
]
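
One simple property worth checking before combining primer pairs is how close their melting temperatures are. The sketch below reads the same structure as primer_set.json and reports the Tm spread. The 2.0 °C threshold is an illustrative assumption, not a PrimerLab default, and this is not the check that `check-compat` performs.

```python
import json

PRIMER_SET = json.loads("""
[
  {"name": "GAPDH", "fwd": "ATGGGGAAGGTGAAGGTCGG", "rev": "GGATCTCGCTCCTGGAAGATG", "tm": 60.0},
  {"name": "ACTB", "fwd": "CATGTACGTTGCTATCCAGGC", "rev": "CTCCTTAATGTCACGCACGAT", "tm": 59.0}
]
""")

MAX_TM_SPREAD = 2.0  # degrees C; hypothetical threshold for illustration


def tm_spread(pairs: list) -> float:
    """Difference between the highest and lowest Tm in the set."""
    tms = [pair["tm"] for pair in pairs]
    return max(tms) - min(tms)


spread = tm_spread(PRIMER_SET)
verdict = "OK" if spread <= MAX_TM_SPREAD else "too wide"
print(f"Tm spread: {spread:.1f} C ({verdict})")
```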

qPCR Analysis Commands (v0.6.0):

# Check TaqMan probe binding
primerlab probe-check --probe ATGCGATCGATCGATCGATCG

# Predict SYBR melt curve
primerlab melt-curve --amplicon ATGCGATCGATCGATCGATCGATCGATCGATCG --format svg

# Validate qPCR amplicon quality
primerlab amplicon-qc --amplicon ATGCGATCGATCGATCGATCGATCGATCGATCG

# Generate melt plot during workflow (v0.6.1)
primerlab run qpcr --config design.yaml --plot-melt --plot-format png

PCR Variants (v0.7.0):

# Design Nested PCR primers
primerlab nested-design --sequence "ATGC..." --outer-size 400-600 --inner-size 150-250

# Design Semi-Nested PCR (shared forward primer)
primerlab seminested-design --sequence "ATGC..." --shared forward

Analysis Tools (v0.7.1):

# Analyze primer dimer matrix
primerlab dimer-matrix --primers primers.json --format svg

# Compare batch design runs
primerlab compare-batch result1.json result2.json --format markdown
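
As a rough picture of what a dimer matrix contains, the sketch below scores every primer against every other primer (and itself) by the longest ungapped run of Watson-Crick pairs in antiparallel orientation. This is a naive stand-in chosen for illustration; PrimerLab's scoring is likely thermodynamic and is not reproduced here. The function names and the input dictionary are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def max_complementary_run(a: str, b: str) -> int:
    """Longest ungapped run of complementary bases between a and b (antiparallel)."""
    a = a.upper()
    rb = b.upper()[::-1]  # reverse b so that index-wise comparison is antiparallel
    best = 0
    for offset in range(-(len(rb) - 1), len(a)):
        run = 0
        for i, base in enumerate(a):
            j = i - offset
            if 0 <= j < len(rb) and COMPLEMENT.get(base) == rb[j]:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


def dimer_matrix(primers: dict) -> dict:
    """Pairwise score table, including self-dimers on the diagonal."""
    return {
        n1: {n2: max_complementary_run(s1, s2) for n2, s2 in primers.items()}
        for n1, s1 in primers.items()
    }


primers = {
    "GAPDH_fwd": "ATGGGGAAGGTGAAGGTCGG",
    "GAPDH_rev": "GGATCTCGCTCCTGGAAGATG",
}
for name, row in dimer_matrix(primers).items():
    print(name, row)
```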

Visualization (v0.7.2):

# Generate coverage map
primerlab coverage-map --result result.json --format svg

qPCR Efficiency (v0.7.4):

# Calculate efficiency from standard curve
primerlab qpcr-efficiency calculate --data curve.json

# Predict primer efficiency
primerlab qpcr-efficiency predict --forward "ATGCATGC..." --reverse "GCATGCAT..."
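
For reference, efficiency from a standard curve is conventionally derived from the slope of Cq against log10(quantity): E = 10^(-1/slope) - 1, where a slope near -3.32 corresponds to about 100%. The sketch below applies that standard formula with an ordinary least-squares fit. The dilution data are invented for illustration, and the layout of curve.json is not assumed.

```python
import math


def standard_curve_efficiency(quantities, cq_values):
    """Return (efficiency in percent, slope) from a qPCR standard curve."""
    xs = [math.log10(q) for q in quantities]
    ys = list(cq_values)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope of Cq versus log10(quantity)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    efficiency = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return efficiency, slope


# Hypothetical ten-fold dilution series
quantities = [1e5, 1e4, 1e3, 1e2]
cq_values = [15.0, 18.3, 21.7, 25.0]
efficiency, slope = standard_curve_efficiency(quantities, cq_values)
print(f"slope={slope:.3f}, efficiency={efficiency:.1f}%")
```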

Programmatic API (Python)

For integration into your own Python scripts:

from primerlab.api.public import design_pcr_primers, design_qpcr_assays

# PCR primer design
sequence = "ATGAGTAAAGGAGAAGAACTTTTCACTGGAGT..."
result = design_pcr_primers(sequence)

print(f"Forward: {result.primers['forward'].sequence}")
print(f"Reverse: {result.primers['reverse'].sequence}")
print(f"Amplicon: {result.amplicons[0].length} bp")

# qPCR assay design (with custom parameters)
config = {
    "parameters": {
        "product_size_range": [[70, 200]],
        "probe": {"tm": {"min": 68.0, "opt": 70.0, "max": 72.0}}
    }
}
result = design_qpcr_assays(sequence, config)

print(f"Probe: {result.primers['probe'].sequence}")
print(f"Efficiency: {result.efficiency}%")

📖 Documentation

Full documentation is available in the docs/ directory:

| Section | Description |
| --- | --- |
| Getting Started | Installation and first steps |
| CLI Reference | All 25+ commands |
| Configuration | YAML config reference |
| Presets | Pre-configured parameter sets |
| API Reference | Programmatic interface |
| Features | Advanced features |
| Troubleshooting | Common issues and solutions |

🧪 Example Configurations

PCR Configuration

workflow: pcr

input:
  sequence: "ATGAGTAAAGGAGAAGAACTTTTCACTGGAGT..."  # Or use sequence_path: "input.fasta"

parameters:
  primer_size: {min: 18, opt: 20, max: 24}
  tm: {min: 58.0, opt: 60.0, max: 62.0}
  product_size: {min: 200, opt: 400, max: 600}  # v0.1.1: Simplified syntax

output:
  directory: "output_pcr"
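
Each `{min, opt, max}` entry is only meaningful when min <= opt <= max. The sketch below checks this before a run, using a plain dictionary that mirrors the `parameters` block of the PCR configuration. It is an illustration with a hypothetical helper name; whether PrimerLab performs the same validation is not stated here.

```python
def check_range(name: str, spec: dict) -> list:
    """Return a list of problems for a {min, opt, max} range (empty if consistent)."""
    missing = [key for key in ("min", "opt", "max") if key not in spec]
    if missing:
        return [f"{name}: missing {', '.join(missing)}"]
    if not spec["min"] <= spec["opt"] <= spec["max"]:
        return [f"{name}: expected min <= opt <= max, got {spec}"]
    return []


# Mirrors the `parameters` block of the PCR configuration
parameters = {
    "primer_size": {"min": 18, "opt": 20, "max": 24},
    "tm": {"min": 58.0, "opt": 60.0, "max": 62.0},
    "product_size": {"min": 200, "opt": 400, "max": 600},
}

problems = [
    message
    for name, spec in parameters.items()
    for message in check_range(name, spec)
]
print(problems or "all ranges are consistent")
```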

qPCR Configuration (TaqMan - Default)

workflow: qpcr
# mode: taqman (default - includes probe design)

input:
  sequence: "ATGGGGAAGGTGAAGGTCGGAGT..."

parameters:
  primer_size: {min: 18, opt: 20, max: 24}
  tm: {min: 55.0, opt: 60.0, max: 65.0}
  
  probe:
    size: {min: 18, opt: 24, max: 30}
    tm: {min: 68.0, opt: 70.0, max: 72.0}

output:
  directory: "output_qpcr"

qPCR Configuration (SYBR Green)

workflow: qpcr

parameters:
  mode: sybr  # v0.1.1: Disables probe design automatically
  
  primer_size: {min: 18, opt: 20, max: 24}
  tm: {min: 58.0, opt: 60.0, max: 62.0}
  product_size: {min: 70, opt: 100, max: 150}

output:
  directory: "output_qpcr_sybr"

📊 Output Overview

PrimerLab generates a structured report containing:

  • Primer & Probe Details: Sequences, GC%, Tm, positions
  • qPCR Metrics: Estimated amplification efficiency
  • Amplicon Properties: Length, GC%, suitability
  • QC Checks: Dimers, hairpins, Tm balance
  • Warnings: Optimization suggestions

Run a workflow to generate your own report!


🏗️ Project Structure

primerlab-genomic/
├── primerlab/
│   ├── cli/              # Command-line interface
│   ├── core/             # Reusable utilities
│   │   ├── insilico/     # In-silico PCR simulation (v0.2.0)
│   │   └── tools/        # Primer3, ViennaRNA wrappers
│   ├── workflows/        # Workflow modules
│   │   ├── pcr/          # PCR workflow
│   │   └── qpcr/         # qPCR workflow
│   ├── api/              # Public API
│   └── config/           # Default configurations
├── tests/                # 1286 automated tests
├── docs/                 # User documentation
├── examples/             # Example files
│   └── insilico/         # In-silico PCR examples
└── .dev/                 # Internal dev docs

📌 Development Status

✅ v1.0.0 (Current)

  • Performance Optimization (core/cache.py):
    • LRU caching for Tm, GC, and ΔG calculations
    • 2-5x speedup for repeated computations
  • Model Standardization (v0.8.2):
    • to_dict() methods for 10+ dataclasses
    • Comprehensive STRUCTURE.md documentation
  • Code Quality Foundation (v0.8.0):
    • Type hints infrastructure (mypy config)
    • Exception testing (20+ tests)
    • Flake8 fixes (8,600+ fixes)
  • 1286 Tests - Comprehensive test coverage
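
The LRU caching mentioned above can be pictured with Python's standard `functools.lru_cache`. This is a generic sketch of the technique on a simple GC calculation, not the contents of core/cache.py; the function name and cache size are assumptions.

```python
from functools import lru_cache


@lru_cache(maxsize=4096)
def gc_fraction(seq: str) -> float:
    """GC fraction of a sequence; results are memoized per distinct string."""
    return sum(base in "GC" for base in seq) / len(seq)


# The same two primers evaluated three times each: only the first
# evaluation of each sequence is actually computed.
primers = ["ATGGGGAAGGTGAAGGTCGG", "GGATCTCGCTCCTGGAAGATG"] * 3
values = [gc_fraction(p) for p in primers]

info = gc_fraction.cache_info()
print(f"hits={info.hits}, misses={info.misses}")
```

Because the cache key is the exact string, normalizing case before the call keeps "atgc" and "ATGC" from being cached separately.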

v0.7.x Features (PCR Variants & qPCR Advanced)

  • Nested PCR Design (core/variants/nested.py)
  • Semi-Nested PCR (core/variants/seminested.py)
  • Dimer Matrix Analysis (core/analysis/dimer_matrix.py)
  • Batch Comparison (core/analysis/batch_compare.py)
  • Coverage Map (core/visualization/coverage_map.py)
  • qPCR Efficiency (core/qpcr/efficiency.py)
  • Advanced qPCR (core/qpcr/advanced.py): HRM, dPCR, quencher recommendations

v0.6.x Features (Genotyping & Visualization)

  • Allele Discrimination (core/genotyping/)
  • RT-qPCR Validation (core/rtpcr/)
  • Melt Curve Visualization
  • CLI Commands: probe-check, melt-curve, amplicon-qc

v0.5.0 Features

  • Probe Binding Simulation (TaqMan Tm calculation)
  • qPCR Amplicon Validation (Length/GC/structure)
  • SYBR Melt Curve Prediction

v0.4.x Features

  • Primer Compatibility Check (v0.4.0)
  • Amplicon Analysis (v0.4.1)
  • Species Specificity (v0.4.2)
  • Tm Gradient Simulation (v0.4.3)

Earlier Versions

  • v0.3.x: BLAST off-target, reporting, Tm correction
  • v0.2.x: In-silico PCR simulation
  • v0.1.x: Core design, stats, batch processing

🛠️ Requirements

  • Python 3.10+
  • Primer3 (primer3-py)
  • ViennaRNA (optional) for secondary structure prediction; without it, a fallback estimation method is used
  • WSL recommended for Windows users

🀝 Contributing

We welcome contributions! Please read our guidelines first:

📄 CONTRIBUTING.md: How to contribute, coding standards, PR checklist

Key principles:

  • No cross-layer imports
  • Deterministic, reproducible outputs
  • All features need tests

📄 License

This project is licensed under the GNU General Public License v2.0 (GPL-2.0). See the LICENSE file for details.

Note on Dependencies: This project depends on primer3-py, which is licensed under GPL-2.0. PrimerLab Genomic therefore adopts the same GPL-2.0 license to remain compliant and to preserve the same freedoms for end users.

© 2025–present, Engki Nandatama


🙏 Acknowledgments

  • Primer3: Primary primer design engine
  • ViennaRNA: Thermodynamic folding & secondary structure analysis

📬 Contact

For issues, suggestions, or contributions:

➑️ Open an issue on GitHub
