
Commit fc12b4e: Merge branch 'main' into synthetic
Parents: 20b1580 + 8d76766

File tree

8 files changed: +162 −12 lines

.github/workflows/publish.yml

Lines changed: 4 additions & 6 deletions

```diff
@@ -12,11 +12,6 @@ permissions:
   pages: write
   id-token: write
 
-# Allow one concurrent deployment
-concurrency:
-  group: 'pages'
-  cancel-in-progress: true
-
 jobs:
   pre-commit:
     name: Run pre-commit checks
@@ -61,7 +56,7 @@ jobs:
         run: sh run_snakemake.sh
       - name: Run Snakemake workflow for DMMMs
         shell: bash --login {0}
-        run: snakemake --cores 1 --configfile configs/dmmm.yaml --show-failed-logs -s spras/Snakefile
+        run: snakemake --cores 4 --configfile configs/dmmm.yaml --show-failed-logs -s spras/Snakefile
       # TODO: re-enable PRAs once RN/synthetic data PRs are merged.
       # - name: Run Snakemake workflow for PRAs
       #   shell: bash --login {0}
@@ -88,6 +83,9 @@ jobs:
     environment:
       name: github-pages
       url: ${{ steps.deployment.outputs.page_url }}
+    concurrency:
+      group: 'pages'
+      cancel-in-progress: true
     steps:
       - name: Download Artifacts
         uses: actions/download-artifact@v4
```

CONTRIBUTING.md

Lines changed: 4 additions & 3 deletions

```diff
@@ -11,10 +11,11 @@ To add a dataset (see `datasets/yeast-osmotic-stress` as an example of a dataset
 1. Check that your dataset provider isn't already added (some of these datasets act as providers for multiple datasets)
 1. Create a new folder under `datasets/<your-dataset>`
 1. Add a `raw` folder containing your data
-1. Add an attached Snakefile that converts your `raw` data to `processed` data
-1. Add your snakefile to the top-level `run_snakemake.sh` file.
-1. If your dataset is a paper reproduction, add a `reproduction/raw` and `reproduction/processed` folder
+1. Add an attached Snakefile that converts your `raw` data to `processed` data.
+   - Make sure to use `uv` here. See `yeast-osmotic-stress`'s Snakefile for an example.
+1. Add your Snakefile to the top-level `run_snakemake.sh` file.
 1. Add your datasets to the appropriate `configs`
+   - If your dataset has gold standards, make sure to include them here.
 
 ## Adding an algorithm
```
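The raw-to-processed step in the contributing checklist above can be sketched as a small conversion script that a dataset's Snakefile would invoke. Everything below is illustrative: the column layout (a node identifier plus a numeric prize, tab-separated) and the `NODEID`/`prize` header are assumptions for the sketch, not the actual SPRAS schema; check an existing dataset such as `yeast-osmotic-stress` for the real format.

```python
import csv
import io

def convert_raw_to_processed(raw_tsv: str) -> str:
    """Hypothetical raw -> processed conversion: keep node/score pairs,
    skip malformed rows, and emit a two-column TSV. A sketch only; the
    real output format is whatever SPRAS expects for node files."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["NODEID", "prize"])  # assumed header, not SPRAS-verified
    for row in csv.reader(io.StringIO(raw_tsv), delimiter="\t"):
        if len(row) < 2:
            continue  # drop rows without a score column
        node, score = row[0], row[1]
        try:
            writer.writerow([node, float(score)])
        except ValueError:
            continue  # drop the header row and non-numeric scores
    return out.getvalue()

# Example raw input with a header and a malformed row:
raw = "gene\tscore\nHOG1\t2.5\nPBS2\t1.0\nbad-row\n"
print(convert_raw_to_processed(raw))
```

In a real dataset the Snakefile's rule would read `raw/` files and write the result into `processed/`, rather than working on in-memory strings.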

README.md

Lines changed: 28 additions & 1 deletion

````diff
@@ -1,8 +1,9 @@
-# SPRAS benchmarking
+# [SPRAS benchmarking](https://reed-compbio.github.io/spras-benchmarking/)
 
 ![example workflow](https://github.com/Reed-CompBio/spras-benchmarking/actions/workflows/publish.yml/badge.svg)
 
 Benchmarking datasets for the [SPRAS](https://github.com/Reed-CompBio/spras) project. This repository contains gold standard datasets to evaluate on as well as paper reproductions & improvements incorporating new methodologies.
+The results of every benchmarking run are deployed on GitHub Pages. [(See the current web output)](https://reed-compbio.github.io/spras-benchmarking/).
 
 ## Setup
 
@@ -28,3 +29,29 @@ snakemake --cores 1 --configfile configs/dmmm.yaml --show-failed-logs -s spras/S
 > [!NOTE]
 > Each one of the dataset categories (at the time of writing, DMMM and PRA) is split into a different configuration file.
 > Run each one as you would want.
+
+## Organization
+
+There are four primary folders in this repository:
+
+```
+.
+├── configs
+├── datasets
+├── spras
+└── web
+```
+
+`spras` is the cloned submodule of [SPRAS](https://github.com/reed-compbio/spras), `web` is an
+[astro](https://astro.build/) app which generates the `spras-benchmarking` [output](https://reed-compbio.github.io/spras-benchmarking/),
+`configs` holds the YAML files used to talk to SPRAS, and `datasets` contains the raw data.
+
+The workflow runs as follows:
+
+1. For every dataset, run its inner `Snakefile` with [Snakemake](https://snakemake.readthedocs.io/en/stable/). This is orchestrated
+   through the top-level [`run_snakemake.sh`](./run_snakemake.sh) shell script.
+1. Run each config YAML file in `configs/` with SPRAS.
+1. Build the website in `web` with the generated `output` from all of the SPRAS runs, and deploy it on [GitHub Pages](https://pages.github.com/).
+   To see how to build the website, go to its [README](./web/README.md).
+
+For more information on how to add a dataset, see [CONTRIBUTING.md](./CONTRIBUTING.md).
````
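Step 1 of the README's workflow (one Snakemake invocation per dataset) can be sketched as a command builder. This is a guess at what `run_snakemake.sh` effectively does, not its actual contents: the dataset names, `--cores` value, and directory layout here are illustrative assumptions.

```python
from pathlib import PurePosixPath

def snakemake_commands(dataset_dirs, cores=1):
    """Build one `snakemake` argv per dataset Snakefile.
    Sketch of the orchestration run_snakemake.sh performs; the real
    script's flags and dataset list may differ."""
    cmds = []
    for d in dataset_dirs:
        snakefile = PurePosixPath("datasets") / d / "Snakefile"
        cmds.append(["snakemake", "--cores", str(cores), "-s", str(snakefile)])
    return cmds

# Hypothetical dataset list; in practice each command would be run
# via subprocess or directly from the shell script.
for cmd in snakemake_commands(["yeast-osmotic-stress", "hiv"]):
    print(" ".join(cmd))
```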

configs/dmmm.yaml

Lines changed: 8 additions & 1 deletion

```diff
@@ -43,14 +43,21 @@ algorithms:
       g: [0]
 
 datasets:
+  # TODO: use old parameters for datasets
+  # HIV: https://github.com/Reed-CompBio/spras-benchmarking/blob/0293ae4dc0be59502fac06b42cfd9796a4b4413e/hiv-benchmarking/spras-config/config.yaml
   - label: dmmmhiv060
     node_files: ["processed_prize_060.txt"]
     edge_files: ["phosphosite-irefindex13.0-uniprot.txt"]
-    # Placeholder
     other_files: []
     data_dir: "datasets/hiv/processed"
   - label: dmmmhiv05
     node_files: ["processed_prize_05.txt"]
     edge_files: ["phosphosite-irefindex13.0-uniprot.txt"]
     other_files: []
     data_dir: "datasets/hiv/processed"
+  # Yeast: https://github.com/tristan-f-r/spras-benchmarking/blob/9477d85871024a5e3a4b0b8b9be7e78c0d0ee961/yeast-osmotic-stress/config.yaml
+  - label: dmmmyeast
+    node_files: ["prizes1_dummies.txt"]
+    edge_files: ["network1.txt"]
+    other_files: []
+    data_dir: "datasets/yeast-osmotic-stress/processed"
```
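Each dataset entry in the config points SPRAS at files relative to its `data_dir`. As a sketch of how those references resolve (illustrative only; SPRAS does its own resolution internally), the new `dmmmyeast` entry expands like this:

```python
from pathlib import PurePosixPath

def dataset_files(entry):
    """Resolve every file a dataset entry references against its data_dir.
    Mirrors the structure of the config above; not SPRAS's actual code."""
    base = PurePosixPath(entry["data_dir"])
    names = entry["node_files"] + entry["edge_files"] + entry["other_files"]
    return [str(base / n) for n in names]

# The dmmmyeast entry added in this commit:
yeast = {
    "label": "dmmmyeast",
    "node_files": ["prizes1_dummies.txt"],
    "edge_files": ["network1.txt"],
    "other_files": [],
    "data_dir": "datasets/yeast-osmotic-stress/processed",
}
print(dataset_files(yeast))
```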

datasets/README.md

Lines changed: 4 additions & 0 deletions

```diff
@@ -0,0 +1,4 @@
+# datasets
+
+This folder contains both the raw data (straight from the study/database), as well as Python scripts and an associated Snakemake file
+which take all of the raw data and produce SPRAS-compatible data.
```

egfr/egfr-param-tuning.yaml

Lines changed: 87 additions & 0 deletions

```diff
@@ -0,0 +1,87 @@
+hash_length: 7
+container_framework: docker
+unpack_singularity: false
+container_registry:
+  base_url: docker.io
+  owner: reedcompbio
+algorithms:
+  - name: omicsintegrator2
+    params:
+      include: true
+      run1:
+        b: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+        g: [2, 3, 4, 5, 6, 7]
+        w: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+  - name: domino
+    params:
+      include: true
+      run1:
+        module_threshold: [0.001, 0.01, 0.02]
+        slice_threshold: [0.001, 0.1, 0.3, 0.9, 1]
+  - name: mincostflow
+    params:
+      include: true
+      run1:
+        capacity: [1, 5, 10, 15]
+        flow: [6, 8, 20, 50, 60, 70, 80, 90, 150]
+  - name: pathlinker
+    params:
+      include: true
+      run1:
+        k: [10, 20, 30, 40, 50, 60, 100, 200, 500]
+  - name: allpairs
+    params:
+      include: true
+  - name: meo
+    params:
+      include: true
+      run1:
+        local_search: ['No']
+        max_path_length: [2]
+        rand_restarts: [10]
+  - name: omicsintegrator1
+    params:
+      include: true
+      run1:
+        b: [0.01, 0.55, 2, 5, 10]
+        d: [10, 20, 30, 40]
+        g: [0.0001, 0.001]
+        mu: [0.001, 0.005, 0.008, 0.02, 0.03]
+        r: [0.01, 0.1, 1]
+        w: [0.001, 0.1, 0.5, 2, 8]
+datasets:
+  - label: tps_egfr
+    node_files:
+      - tps-egfr-prizes.txt
+    edge_files:
+      - phosphosite-irefindex13.0-uniprot.txt
+    other_files: []
+    data_dir: input
+gold_standards:
+  - label: gs_egfr
+    node_files:
+      - gs-egfr.txt
+    data_dir: input
+    dataset_labels:
+      - tps_egfr
+reconstruction_settings:
+  locations:
+    reconstruction_dir: output/tps_egfr
+  run: true
+analysis:
+  summary:
+    include: true
+  graphspace:
+    include: false
+  cytoscape:
+    include: false
+  ml:
+    include: true
+    aggregate_per_algorithm: true
+    components: 4
+    labels: false
+    linkage: ward
+    metric: euclidean
+  evaluation:
+    include: false
+    aggregate_per_algorithm: false
```
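A note on the parameter-tuning config above: each `run1` block lists per-parameter value sweeps. Assuming SPRAS expands those lists as a Cartesian product (a hedge; consult the SPRAS documentation for its exact parameter-combination semantics), the `omicsintegrator2` block alone yields 600 parameter combinations, which is why this file drives a large tuning run:

```python
from itertools import product

# The omicsintegrator2 run1 sweep from the config above:
run1 = {
    "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "g": [2, 3, 4, 5, 6, 7],
    "w": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
}

# One dict per combination, assuming a full Cartesian product:
combos = [dict(zip(run1, values)) for values in product(*run1.values())]
print(len(combos))  # 10 * 6 * 10 = 600
```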

spras

Submodule spras updated 173 files

web/README.md

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,26 @@
1+
# web
2+
3+
This module is an [Astro](https://astro.build/) project which wraps the output from SPRAS
4+
into a presentable webpage. See the output: https://reed-compbio.github.io/spras-benchmarking/
5+
6+
## Building
7+
8+
To build this, you need [`pnpm`](https://pnpm.io/). It is recommended to use a node version manager
9+
([nvm](https://github.com/nvm-sh/nvm) for mac/linux, [nvm-windows](https://github.com/coreybutler/nvm-windows) for windows),
10+
to install `nodejs` and `npm` (at the time of writing, this would be node `v22`), and use `npm` to install `pnpm`:
11+
12+
```sh
13+
npm install --global pnpm
14+
```
15+
16+
After this, you can install the dependencies (make sure your current working directory is `web`):
17+
18+
```sh
19+
pnpm install
20+
```
21+
22+
Then, assuming your data is in `public/data`, build the website:
23+
24+
```sh
25+
pnpm run build
26+
```
