Commit bb14d5c

fix: add missing template, minor formatting improvements (#36)
* fix: restored triggering catalog generation on push
* fix: re-added missing template, closes #35
* fix: a couple of minor formatting improvements, closes #34
* feat: added citation CFF file
* fix: restructured docs to have most information on a single page
1 parent eb34568 commit bb14d5c

12 files changed

+85
-98
lines changed

.github/workflows/generate.yml

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -3,6 +3,8 @@ name: Generate catalog
33
on:
44
schedule:
55
- cron: 0 5 * * 1
6+
push:
7+
branches: [main, dev]
68

79
jobs:
810
generate-catalog:

CITATION.cff

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
1+
cff-version: 1.2.0
2+
message: "If you use this workflow catalog in your research, please cite it using the following metadata."
3+
title: "Snakemake Workflow Catalog"
4+
authors:
5+
- family-names: "Koester"
6+
given-names: "Johannes"
7+
orcid: "https://orcid.org/0000-0001-9818-9320"
8+
- family-names: "Jahn"
9+
given-names: "Michael"
10+
orcid: "https://orcid.org/0000-0002-3913-153X"
11+
abstract: >
12+
The Snakemake Workflow Catalog is a collection of standardized workflows
13+
for reproducible and scalable data analysis using Snakemake. Workflows
14+
are automatically retrieved from Github and tested for compliance.
15+
repository-code: "https://github.com/snakemake/snakemake-workflow-catalog"
16+
version: "1.0.0"
17+
date-released: "2025-03-17"
18+
keywords:
19+
- snakemake
20+
- workflow
21+
- reproducibility
22+
- data Analysis
23+
license: "MIT"

scripts/common.py

Lines changed: 5 additions & 0 deletions
@@ -6,6 +6,7 @@
66
import time
77

88
from ratelimit import limits, sleep_and_retry
9+
from jinja2 import Environment, FileSystemLoader, select_autoescape
910
from github import Github
1011
from github.ContentFile import ContentFile
1112
from github.GithubException import UnknownObjectException, RateLimitExceededException
@@ -15,6 +16,10 @@
1516
test_repo = os.environ.get("TEST_REPO")
1617
offset = int(os.environ.get("OFFSET", 0))
1718

19+
env = Environment(
20+
autoescape=select_autoescape(["html"]), loader=FileSystemLoader("templates")
21+
)
22+
1823
# do not clone LFS files
1924
os.environ["GIT_LFS_SKIP_SMUDGE"] = "1"
2025
g = Github(os.environ["GITHUB_TOKEN"])
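A minimal sketch of what the `env` object added in this hunk does, assuming standard Jinja2 behavior. The template names and the `title` variable are hypothetical, and `DictLoader` stands in for `FileSystemLoader("templates")` so the example runs without files on disk. With `select_autoescape(["html"])`, only templates whose names end in `.html` have their variables HTML-escaped.

```python
from jinja2 import DictLoader, Environment, select_autoescape

# In-memory stand-in for the "templates" directory (hypothetical template names).
templates = {
    "page.html": "<h1>{{ title }}</h1>",
    "page.md": "# {{ title }}",
}

env = Environment(
    autoescape=select_autoescape(["html"]),
    loader=DictLoader(templates),
)

title = "Variant calling <beta>"

# Escaping applies to the .html template only; the .md template is rendered as-is.
html_page = env.get_template("page.html").render(title=title)
md_page = env.get_template("page.md").render(title=title)

print(html_page)
print(md_page)
```

Markdown templates are therefore rendered without escaping under this configuration.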

source/_templates/workflow_page.md

Lines changed: 19 additions & 1 deletion
@@ -10,6 +10,12 @@
1010
![](https://img.shields.io/github/{{ badge }}/{{ wf["full_name"] }}?style=flat&label={{ badge }})
1111
:::
1212
{% endfor %}
13+
:::{grid-item}
14+
:columns: auto
15+
:margin: 0
16+
:padding: 1
17+
[![](https://img.shields.io/badge/GitHub%20page-blue?style=flat)](https://github.com/{{ wf["full_name"] }})
18+
:::
1319
::::
1420

1521
{{ wf["description"] }}
@@ -29,7 +35,7 @@
2935
{bdg-danger}`linting: failed`
3036
{%- endif -%},
3137
**Formatting:**
32-
{%- if wf["formatting"] == None -%}
38+
{% if wf["formatting"] == None -%}
3339
{bdg-success}`formatting: passed`
3440
{%- else -%}
3541
{bdg-danger}`formatting: failed`
@@ -133,12 +139,24 @@ _The following section is imported from the workflow's `config/README.md`_.
133139

134140
### Linting results
135141

142+
{% if wf["linting"] == None %}
143+
```
144+
All tests passed!
145+
```
146+
{%- else -%}
136147
```
137148
{{ wf["linting"] }}
138149
```
150+
{% endif %}
139151

140152
### Formatting results
141153

154+
{% if wf["formatting"] == None %}
155+
```
156+
All tests passed!
157+
```
158+
{%- else -%}
142159
```
143160
{{ wf["formatting"] }}
144161
```
162+
{% endif %}
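The new `{% if ... == None %}` branches can be tried in isolation. The sketch below is a simplified stand-in for the template logic, not the template itself: the surrounding code fences are omitted, and `render_linting` and the sample lint message are made up for illustration. It assumes that a passing check is stored as `None` in `wf`, as the conditions in this diff suggest.

```python
from jinja2 import Template

# Simplified version of the branch added to workflow_page.md (code fences omitted).
RESULT_BLOCK = Template(
    '{% if wf["linting"] == None %}All tests passed!'
    '{%- else -%}{{ wf["linting"] }}{% endif %}'
)


def render_linting(wf):
    """Render the linting section for a workflow record (a plain dict here)."""
    return RESULT_BLOCK.render(wf=wf).strip()


# A record without lint output gets the fixed success message.
passed = render_linting({"linting": None})

# A record with lint output shows that output verbatim.
failed = render_linting({"linting": "Lints for rule foo: no log directive"})

print(passed)
print(failed)
```

The same pattern applies to the `wf["formatting"]` section.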

source/docs/about/adding_workflows.md

Lines changed: 5 additions & 5 deletions
@@ -1,9 +1,9 @@
11

2-
# Adding workflows
2+
## Adding workflows
33

44
Workflows are **automatically added** to the Workflow Catalog. This is done by regularly searching Github repositories for matching workflow structures. The catalog includes workflows based on the following criteria.
55

6-
## Generic workflows
6+
### Generic workflows
77

88
- The workflow is contained in a public Github repository.
99
- The repository has a `README.md` file, containing the words "snakemake" and "workflow" (case insensitive).
@@ -12,11 +12,11 @@ Workflows are **automatically added** to the Workflow Catalog. This is done by r
1212
- The repository is small enough to be cloned into a [Github Actions](https://docs.github.com/en/actions/about-github-actions/understanding-github-actions) job (very large files should be handled via [Git LFS](https://docs.github.com/en/repositories/working-with-files/managing-large-files), so that they can be stripped out during cloning).
1313
- The repository is not blacklisted here.
1414

15-
## *Standardized Usage* workflows
15+
### *Standardized Usage* workflows
1616

1717
In order to additionally appear in the "standardized usage" area, repositories additionally have to:
1818

19-
- have their main workflow definition named `workflow/Snakefile` (unlike for plain inclusion (see above), which also allows just `Snakefile` in the root of the repository),
19+
- have their main workflow definition named `workflow/Snakefile` (unlike for [plain inclusion](#generic-workflows), which also allows just `Snakefile` in the root of the repository),
2020
- provide configuration instructions under `config/README.md`
2121
- contain a `YAML` file `.snakemake-workflow-catalog.yml` in their root directory, which configures the usage instructions displayed by this workflow catalog.
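The file-based criteria visible in this diff can be summarized as a small check. This is an illustrative sketch only, not the catalog's actual implementation: `classify` and its arguments are hypothetical, and the criteria that depend on repository state (public visibility, size, blacklist) or that fall outside the shown hunks are not modeled.

```python
from typing import Iterable, Optional

# A generic workflow may keep its Snakefile in the root or under workflow/.
GENERIC_SNAKEFILES = {"Snakefile", "workflow/Snakefile"}

# Files required for the "standardized usage" area.
STANDARDIZED_FILES = {
    "workflow/Snakefile",
    "config/README.md",
    ".snakemake-workflow-catalog.yml",
}


def readme_mentions_workflow(readme_text: str) -> bool:
    """True if the README contains "snakemake" and "workflow" (case insensitive)."""
    text = readme_text.lower()
    return "snakemake" in text and "workflow" in text


def classify(readme_text: str, paths: Iterable[str]) -> Optional[str]:
    """Return "standardized", "generic", or None for a list of repository paths."""
    files = set(paths)
    if not readme_mentions_workflow(readme_text) or not (files & GENERIC_SNAKEFILES):
        return None
    if STANDARDIZED_FILES <= files:
        return "standardized"
    return "generic"


print(classify("A Snakemake Workflow for RNA-seq", ["Snakefile"]))
```

A repository that has `workflow/Snakefile` but lacks `config/README.md` or `.snakemake-workflow-catalog.yml` is classified as generic in this sketch.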
2222

@@ -44,6 +44,6 @@ The content of the `.snakemake-workflow-catalog.yml` file is subject to change.
4444

4545
Once included in the standardized usage area you can link directly to the usage instructions for your repository via the URL `https://snakemake.github.io/snakemake-workflow-catalog?usage=<owner>/<repo>`. Do not forget to replace the `<owner>` and `<repo>` tags at the end of the URL.
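As a small illustration of the URL pattern described above, the helper below fills in the two placeholders. The function name `usage_url` and the example owner and repository names are hypothetical; only the base URL and the `usage` query parameter come from the text.

```python
BASE_URL = "https://snakemake.github.io/snakemake-workflow-catalog"


def usage_url(owner: str, repo: str) -> str:
    """Build the direct link to a workflow's usage instructions."""
    return f"{BASE_URL}?usage={owner}/{repo}"


# Hypothetical owner and repository names, for illustration only.
print(usage_url("example-org", "example-workflow"))
```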
4646

47-
## Release handling
47+
### Release handling
4848

4949
If your workflow provides Github releases, the catalog will always just scrape the latest non-preview release. Hence, in order to update your workflow's records here, you need to release a new version on Github.

source/docs/about/contributions.md

Lines changed: 3 additions & 3 deletions
@@ -1,13 +1,13 @@
11

2-
# Contributions
2+
## Contributions
33

44
Contributions to the Snakemake Workflow Catalog are welcome!
55
Ideas can be discussed on the [catalog's Issues page](https://github.com/snakemake/snakemake-workflow-catalog/issues) first, and contributions made through Github Pull Requests.
66

7-
## License
7+
### License
88

99
The Snakemake Workflow Catalog is open-source and available under the [MIT License](https://choosealicense.com/licenses/mit/).
10-
For more information and to explore the available workflows, visit https://snakemake.github.io/snakemake-workflow-catalog/.
10+
For more information about the individual workflows, browse the [list of *standardized usage* workflows](<all_standardized_workflows>).
1111

1212
:::{note}
1313
All workflows collected and presented on the Catalog are licensed under their own terms!

source/docs/about/purpose.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11

2-
# Purpose
2+
## Purpose
33

44
This repository serves as a centralized collection of workflows designed to facilitate reproducible and scalable data analyses using the [**Snakemake**](https://snakemake.github.io/) workflow management system.
55

source/docs/about/using_workflows.md

Lines changed: 5 additions & 7 deletions
@@ -1,7 +1,7 @@
11

2-
# Using workflows
2+
## Using workflows
33

4-
## Basic usage
4+
### Basic usage
55

66
To get started with a workflow from the catalog:
77

@@ -29,15 +29,13 @@ cd <workflow-dir>
2929
snakemake --cores 2
3030
```
3131

32-
:::tip Dry-run
33-
32+
:::{tip}
3433
Use the `--dry-run` option first to check if all inputs are found.
35-
3634
:::
3735

38-
For more detailed instructions, please refer to the individual documentation for each [workflow](<docs/all_standardized_workflows>).
36+
For more detailed instructions, please refer to the individual documentation for each [workflow](<all_standardized_workflows>).
3937

40-
## Deployment options
38+
### Deployment options
4139

4240
The deployment method is controlled using the `--software-deployment-method` (short `--sdm`) argument.
4341

source/docs/catalog.md

Lines changed: 5 additions & 57 deletions
@@ -3,70 +3,18 @@
33

44
```{toctree}
55
:hidden:
6-
7-
about/purpose
8-
about/using_workflows
9-
about/adding_workflows
10-
about/contributions
116
```
127

138
Here you can find the most important information about the **Snakemake workflow catalog**.
149

15-
*Estimated reading time: 5 minutes*.
16-
17-
:::
18-
## Use a workflow from the catalog
19-
:::
20-
21-
1. Clone the repository or download the specific workflow directory.
22-
23-
```bash
24-
git clone https://github.com/<user>/<workflow>
10+
```{include} about/purpose.md
2511
```
2612

27-
2. Review the documentation provided with the workflow to understand its requirements and usage.
28-
29-
3. Configure the workflow by editing the `config.yml` files as needed.
30-
31-
4. Create an environment with access to Snakemake. It is recommended to use `mamba`.
32-
33-
```bash
34-
mamba create -n <env-name> -c <channels> snakemake
35-
mamba activate <env-name>
13+
```{include} about/using_workflows.md
3614
```
3715

38-
5. Execute the workflow using Snakemake.
39-
40-
```bash
41-
cd <workflow-dir>
42-
snakemake --cores 2
16+
```{include} about/adding_workflows.md
4317
```
4418

45-
:::tip Dry-run
46-
47-
Use the `--dry-run` option first to check if all inputs are found.
48-
49-
:::
50-
51-
For more detailed instructions, please refer to the individual documentation for each [workflow](workflows/top_wf_by_stars.mdx).
52-
53-
:::
54-
## Add a workflow to the catalog
55-
:::
56-
57-
Workflows are **automatically added** to the Workflow Catalog. This is done by regularly searching Github repositories for matching workflow structures. The catalog includes workflows based on the following criteria.
58-
59-
The catalog currently discriminates between two types of workflows based on their documentation:
60-
61-
**Generic workflows**
62-
63-
- all snakemake workflows in public Github repositories
64-
- repositories need to have a `README.md` file containing the words "snakemake" and "workflow"
65-
- also need to have a workflow definition named either `Snakefile` or `workflow/Snakefile`, and contain rules in `*.smk` format.
66-
67-
**Standardized Usage workflows**
68-
69-
- workflows that additionally adhere to standards of the workflow catalog
70-
- main workflow definition must be named `workflow/Snakefile`
71-
- provide configuration instructions under `config/README.md`
72-
- contain a `.snakemake-workflow-catalog.yml` file with instructions on deployment options
19+
```{include} about/contributions.md
20+
```

source/docs/snakemake.md

Lines changed: 5 additions & 21 deletions
@@ -5,31 +5,15 @@
55
:hidden:
66
```
77

8-
## What is 'snakemake'?
8+
## What is Snakemake?
99

10-
The Snakemake workflow management system is a tool to create reproducible and scalable data analyses. Workflows are described via a human readable, Python based language. They can be seamlessly scaled to server, cluster, grid and cloud environments, without the need to modify the workflow definition. Finally, Snakemake workflows can entail a description of required software, which will be automatically deployed to any execution environment.
10+
The Snakemake workflow management system is a tool to create reproducible and scalable data analyses. Workflows are described *via* a human readable, Python based language. They can be seamlessly scaled to server, cluster, grid and cloud environments, without the need to modify the workflow definition.
1111

12-
## Basic usage
12+
- To learn more about Snakemake, visit the [Snakemake homepage!](https://snakemake.github.io/)
1313

14-
Snakemake usage is [extensively documented here](https://snakemake.readthedocs.io/en/stable/).
14+
- To get an impression of the Snakemake architecture, [read the Snakemake paper](https://doi.org/10.12688/f1000research.29032.2)
1515

16-
Snakemake is organized in `rules`, which define specific input and output files.
17-
Files are processed using code which is for example directly deployed with the `shell` directive, or with external `python` and `R` scripts, or even directly rendered `markdown` based notebooks.
18-
19-
To get a first impression:
20-
21-
```bash
22-
rule select_by_country:
23-
input:
24-
"data/worldcitiespop.csv"
25-
output:
26-
"by-country/{country}.csv"
27-
shell:
28-
"xsv search -s Country '{wildcards.country}' "
29-
"{input} > {output}"
30-
```
31-
32-
In this code chunk, the input table `data/worldcitiespop.csv` is searched by the keyword `country`, which is used as a wildcard to construct new file names for the output. The result is that all lines from the original table are split by Country and saved as separare files in a new output directory.
16+
- To learn how to use Snakemake workflows, [read the documentation](https://snakemake.readthedocs.io/en/stable/).
3317

3418
## Create your own workflows
3519
