HiC-sm

A Snakemake workflow for processing Hi-C sequencing data from raw FASTQ files to analysis-ready contact matrices and downstream analyses.

Overview

HiC-sm is a comprehensive, modular Snakemake pipeline designed for Hi-C data processing. It supports multiple alignment strategies, quality control, contact matrix generation, and downstream analyses including loop detection and distance-dependent contact analysis.

Features

Multiple Alignment Strategies: Supports BWA-MEM, BWA-MEM2, BWA-MEME, Bowtie2 (with rescue mapping), and Chromap
Quality Control: Integrated fastp for read trimming and MultiQC for comprehensive QC reporting
Flexible Processing:
- Automatic aggregation of multiple runs per sample
- Support for paired-end and single-end reads
- Configurable ligation site handling
Contact Matrix Generation:
- Cooler format (.mcool) with multiple resolutions
- Optional .hic format output
- Multiple filtering strategies
Downstream Analysis:
- Mustache loop detection
- Distance-dependent contact analysis
- Scaling analysis
Modular Architecture: Clean separation of concerns with domain-specific rule files
PEP Integration: Uses Portable Encapsulated Project (PEP) for sample metadata management

Requirements

Snakemake >= 8.20.1
Conda or Mamba for environment management
Python 3.11+
Genome reference files (FASTA, BWA/Bowtie2 indices, chromosome sizes)

Installation

Clone the repository:

git clone <repository-url>
cd hic_sm

Install Snakemake (if not already installed):

conda install -c conda-forge -c bioconda snakemake
# or
mamba install -c conda-forge -c bioconda snakemake

The workflow uses conda environments defined in workflow/envs/. These will be automatically created when you run the workflow.

Quick Start

1. Prepare Sample Sheet

Create a CSV file with sample information (see config/hg38_test_sample.csv for example):

sample_name,run,R1,R2,ligation_site,skip_ligation
sample1,1,/path/to/sample1_R1.fastq.gz,/path/to/sample1_R2.fastq.gz,GATCGATC,false
sample2,1,/path/to/sample2_R1.fastq.gz,/path/to/sample2_R2.fastq.gz,GATCGATC,false

Required columns:

sample_name: Unique sample identifier
run: Run number (integer)
R1, R2: Paths to forward and reverse FASTQ files (or fastq1, fastq2)

Optional columns:

ligation_site: Ligation site sequence (overrides config default)
skip_ligation: Boolean to skip ligation site processing
passqc: QC flag (1 = pass, 0 = fail)

2. Configure Workflow

Edit config/test_config.yml to set:

Genome assembly and reference paths
Mapper selection (bwa-mem, bwa-mem2, bwa-meme, bowtie2, chromap)
Output directories
Resolution bins
Filtering options

3. Set Up PEP Configuration

Configure your PEP project file (default: pep/project_config.yaml) to define:

Sample table path
Path variables (e.g., database_dir, process_dir)

4. Run the Workflow

Dry-run to check the workflow:

snakemake -n

Execute the workflow:

snakemake --use-conda --cores <number_of_cores>

For cluster execution:

snakemake --use-conda --profile <profile> --cores <number_of_cores>

Contributing

Contributions are welcome! Please ensure code follows the modular structure and includes appropriate documentation.

Name		Name	Last commit message	Last commit date
Latest commit History 135 Commits
config		config
pep		pep
schemas		schemas
workflow		workflow
.gitignore		.gitignore
.snakemake-workflow-catalog.yml		.snakemake-workflow-catalog.yml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HiC-sm

Overview

Features

Requirements

Installation

Quick Start

1. Prepare Sample Sheet

2. Configure Workflow

3. Set Up PEP Configuration

4. Run the Workflow

Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HiC-sm

Overview

Features

Requirements

Installation

Quick Start

1. Prepare Sample Sheet

2. Configure Workflow

3. Set Up PEP Configuration

4. Run the Workflow

Contributing

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages