Commit 310db00

add pathways.txt.gz
we also introduce the inner Snakefile
1 parent db5a09e commit 310db00

7 files changed: +30 -11 lines changed

cache/directory.py

Lines changed: 5 additions & 0 deletions
@@ -358,6 +358,11 @@ def download(self, output: str | PathLike):
             name="PathwayCommons Universal BioPAX file",
             cached="https://drive.google.com/uc?id=1R7uE2ky7fGlZThIWCOblu7iqbpC-aRr0",
             pinned="https://download.baderlab.org/PathwayCommons/PC2/v14/pc-biopax.owl.gz"
+        ),
+        "pathways.txt.gz": CacheItem(
+            name="PathwayCommons Pathway Identifiers",
+            cached="https://drive.google.com/uc?id=1SMwuuohuZuNFnTev4zRNJrBnBsLlCHcK",
+            pinned="https://download.baderlab.org/PathwayCommons/PC2/v14/pathways.txt.gz",
         )
     }
 }
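
The new entry can be read as one more leaf in a nested lookup table. The following is a minimal sketch of that idea, not the project's actual code: the `CacheItem` dataclass, the `DIRECTORY` dict, and the `lookup` helper are hypothetical stand-ins, and the meaning given to `cached` (a project-controlled mirror) and `pinned` (the versioned upstream URL) is an assumption based only on the URLs in the diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheItem:
    """Hypothetical stand-in for the project's CacheItem; fields mirror the diff."""
    name: str
    cached: str  # assumed: project-controlled mirror of the file
    pinned: str  # assumed: versioned upstream URL


# Assumed layout: provider name first, then file name, as the closing
# braces in the diff and the FetchConfig key paths suggest.
DIRECTORY = {
    "PathwayCommons": {
        "pathways.txt.gz": CacheItem(
            name="PathwayCommons Pathway Identifiers",
            cached="https://drive.google.com/uc?id=1SMwuuohuZuNFnTev4zRNJrBnBsLlCHcK",
            pinned="https://download.baderlab.org/PathwayCommons/PC2/v14/pathways.txt.gz",
        ),
    },
}


def lookup(path: list[str]) -> CacheItem:
    """Walk the nested directory with a key path like the ones given to FetchConfig."""
    node = DIRECTORY
    for key in path:
        node = node[key]
    return node


item = lookup(["PathwayCommons", "pathways.txt.gz"])
print(item.name)
```

Under these assumptions, the key path `["PathwayCommons", "pathways.txt.gz"]` used in the inner Snakefile resolves to the entry added here.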

datasets/synthetic-data/.gitignore

Lines changed: 3 additions & 3 deletions
@@ -1,3 +1,3 @@
-intermediate
-processed
-raw
+/intermediate
+/processed
+/raw
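
A leading slash anchors a gitignore pattern to the directory that contains the `.gitignore`, while a bare name matches at any depth below it. The sketch below illustrates that difference for plain names only; `is_ignored` is a hypothetical helper that ignores wildcards and negation, and the example paths (including `panther_pathways/raw/...`) are illustrative guesses at the layout.

```python
from pathlib import PurePosixPath


def is_ignored(pattern: str, path: str) -> bool:
    """Simplified gitignore check for plain names (no wildcards, no negation)."""
    parts = PurePosixPath(path).parts
    if pattern.startswith("/"):
        # Anchored: only matches an entry directly beside the .gitignore.
        return len(parts) > 0 and parts[0] == pattern[1:]
    # Unanchored: matches the name at any depth below the .gitignore.
    return pattern in parts


print(is_ignored("raw", "panther_pathways/raw/pathways.txt"))   # True
print(is_ignored("/raw", "panther_pathways/raw/pathways.txt"))  # False
print(is_ignored("/raw", "raw/pc-biopax.owl"))                  # True
```

With the anchored patterns, a nested `raw` directory is no longer covered by this file, which is consistent with the commit adding a separate one-line `.gitignore` containing `/raw`.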

datasets/synthetic-data/README.md

Lines changed: 5 additions & 7 deletions
@@ -1,13 +1,11 @@
 # Synthetic Data
 
-## Download STRING Human Interactome
-1. Download the STRING *Homo sapiens* `9606.protein.links.full.v12.0.txt.gz` database file from [STRING](https://string-db.org/cgi/download?sessionId=bL9sRTdIaUEt&species_text=Homo+sapiens&settings_expanded=0&min_download_score=0&filter_redundant_pairs=0&delimiter_type=txt).
-2. Move the downloaded file into the `raw/human-interactome/` folder.
-3. From the `raw/synthetic-data/` directory, extract the file using:
+## PANTHER Pathway Fetching
 
-```sh
-gunzip human-interactome/9606.protein.links.full.v12.0.txt.gz
-```
+This dataset has a kind of 'sub'-dataset, which is a separate Snakemake rule
+used for generating the pathway files and their associated metadata to be used inside this one.
+
+Located under `./panther_pathways`, it provides TODO.
 
 ## Download New PANTHER Pathways
 1. Visit [Pathway Commons](https://www.pathwaycommons.org/).

datasets/synthetic-data/Snakefile

Lines changed: 0 additions & 1 deletion
@@ -38,7 +38,6 @@ produce_fetch_rules({
     "raw/human-interactome/table_S3_surfaceome.xlsx": ["Surfaceome", "table_S3_surfaceome.xlsx"],
     "raw/human-interactome/Homo_sapiens_TF.tsv": ["TranscriptionFactors", "Homo_sapiens_TF.tsv"],
     "raw/human-interactome/HUMAN_9606_idmapping_selected.tsv": FetchConfig(["UniProt", "9606", "HUMAN_9606_idmapping_selected.tab.gz"], uncompress=True),
-    "raw/pc-biopax.owl": FetchConfig(["PathwayCommons", "pc-biopax.owl.gz"], uncompress=True)
 })
 
 
 rule interactome:

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+/raw

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+include: "../../../cache/Snakefile"
+
+rule all:
+    input:
+        # TODO: pass to script
+        "raw/pathways.txt"
+
+produce_fetch_rules({
+    "raw/pc-biopax.owl": FetchConfig(["PathwayCommons", "pc-biopax.owl.gz"], uncompress=True),
+    "raw/pathways.txt": FetchConfig(["PathwayCommons", "pathways.txt.gz"], uncompress=True)
+})
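
Both fetch entries set `uncompress=True`, so a `.gz` download ends up as a plain file such as `raw/pathways.txt`. How `FetchConfig` does this is not shown in the commit; the sketch below shows one plausible way to stream-decompress a gzip file with the standard library. The names `uncompress_gz` and `demo` are hypothetical, and the file content is a placeholder rather than real Pathway Commons data.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def uncompress_gz(source: Path, target: Path) -> None:
    """Stream-decompress a gzip file to target without reading it all into memory."""
    target.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(source, "rb") as compressed, open(target, "wb") as out:
        shutil.copyfileobj(compressed, out)


def demo() -> bytes:
    """Round-trip a placeholder file through gzip and uncompress_gz."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "pathways.txt.gz"
        target = Path(tmp) / "raw" / "pathways.txt"
        with gzip.open(source, "wb") as handle:
            handle.write(b"placeholder line\n")
        uncompress_gz(source, target)
        return target.read_bytes()


print(demo())
```

Streaming with `shutil.copyfileobj` keeps memory use flat, which matters for large inputs such as the BioPAX OWL archive.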

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+def main():
+    pass
+
+if __name__ == "__main__":
+    main()
