A minimal pipeline: one input CSV, several processing steps, and a plot.
data/input.csv is a small table with columns category and value.
- **clean** (`scripts/clean.py`): Read `data/input.csv`, drop any nulls, and ensure `value` entries are numeric. Produces `data/cleaned.csv` as the clean dataset.
- **transform** (`scripts/transform.py`): Add a column with the normalized value and another column with the rank. Produces `results/transformed.csv` as the result.
- **summarize** (`scripts/summarize.py`): Write summary statistics to `results/summary.txt`.
- **plot** (`scripts/plot.py`): Create a bar chart of category vs. value as `results/plot.png` and `results/plot_unsorted.png`.
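As a rough illustration of the first step, the cleaning logic might look like the sketch below. This is not the actual contents of `scripts/clean.py`; the function name and row representation are assumptions based on the description above.

```python
def clean_rows(rows):
    """Drop rows with null fields or non-numeric 'value' entries
    (a sketch of what the clean step is described as doing)."""
    cleaned = []
    for row in rows:
        if not row.get("category") or not row.get("value"):
            continue  # drop rows with missing/blank fields
        try:
            row["value"] = float(row["value"])  # ensure value is numeric
        except ValueError:
            continue  # drop rows whose value is not a number
        cleaned.append(row)
    return cleaned

rows = [
    {"category": "a", "value": "3.5"},
    {"category": "b", "value": ""},       # dropped: null value
    {"category": "c", "value": "oops"},   # dropped: non-numeric
]
print(clean_rows(rows))  # → [{'category': 'a', 'value': 3.5}]
```

In the real script the rows would come from `data/input.csv` (e.g. via `csv.DictReader`) and the cleaned rows would be written to `data/cleaned.csv`.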
Each step is implemented in a Python script under scripts/; the Snakefile invokes each script with its input and output paths passed as command-line arguments.
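As a hedged sketch (the actual rule names and Snakefile contents may differ), one of these steps could be wired up in the Snakefile like this:

```
# Hypothetical Snakemake rule; see the repo's Snakefile for the real definitions.
rule clean:
    input:
        "data/input.csv"
    output:
        "data/cleaned.csv"
    shell:
        "python scripts/clean.py {input} {output}"
```

Snakemake fills in `{input}` and `{output}` from the rule's declared paths, which is how the scripts receive their arguments.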
init/setup.sh contains the steps to set up a conda environment with Python, Snakemake, and the other dependencies for this pipeline. Please note that the setup is configured specifically for the Roar Collab cluster; to run on other clusters, some minor modifications may be needed.
From this directory (workflowtools_intro/):

Local:

```
snakemake -j 1
```

Local with a profile: the profiles/local/config.yaml file sets some defaults to limit the resources used by the snakemake jobs.

```
snakemake --profile profiles/local
```

Slurm: the profiles/slurm/config.yaml file sets Slurm-specific settings that enable snakemake to submit jobs via sbatch.
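Snakemake profiles are plain YAML files of command-line defaults. As a hedged sketch only (the keys are standard Snakemake options, but these particular values are assumptions, not the repo's actual settings), a local profile might look like:

```
# Hypothetical profiles/local/config.yaml; the real file may differ.
jobs: 1                # run at most one job at a time
default-resources:
  mem_mb: 2000         # cap memory per job
```

Any option you could pass on the command line (e.g. `--jobs`) can instead live in the profile's config.yaml.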
```
snakemake --profile profiles/slurm
```

init/reset.sh removes all outputs (data/cleaned.csv, results/transformed.csv, results/summary.txt, results/plot.png) and puts the repo back to a clean state after a snakemake run. Run it with:

```
./init/reset.sh
```
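For intuition, a reset script like this typically just deletes the generated files. The following is a sketch of what init/reset.sh might contain, based on the outputs listed above; the real script may do more.

```shell
# Hypothetical sketch of init/reset.sh: remove generated outputs so the
# pipeline reruns every step from scratch.
rm -f data/cleaned.csv \
      results/transformed.csv \
      results/summary.txt \
      results/plot.png results/plot_unsorted.png
rmdir results 2>/dev/null || true   # drop results/ only if it is now empty
echo "repo reset"
```

After running it, `snakemake` sees every output as missing and rebuilds the whole DAG.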