Commit 7138dfd

feat: more refactors
1 parent a7ed939 commit 7138dfd

2 files changed: +14 -4 lines changed


datasets/synthetic_data/README.md

Lines changed: 13 additions & 4 deletions
@@ -22,10 +22,19 @@ We have a `./util/parse_pc_pathways.py`, which takes a `pathways.txt` provided b
 human-readable pathway names into [identifiers.org](https://identifiers.org/) identifiers, which we later trim down
 with our provided list of pathway names in `pathways.jsonc` using `list_curated_pathways.py`.
 
-## Sources and Targets
+## SIF Pathway Processing
 
-[Sources](http://wlab.ethz.ch/surfaceome/), or `table_S3_surfaceome.xlsx`, (see [original paper](https://doi.org/10.1073/pnas.1808790115))
+The scripts `process_panther_pathway.py` and `panther_spras_formatting` convert pathways from the fetching step into ones usable by SPRAS, using
+external data:
+- [Sources](http://wlab.ethz.ch/surfaceome/), or `table_S3_surfaceome.xlsx`, (see [original paper](https://doi.org/10.1073/pnas.1808790115))
 are silico human surfaceomes receptors.
+- [Targets]( https://guolab.wchscu.cn/AnimalTFDB4//#/), or `Homo_sapiens_TF.tsv`, (see [original paper](https://doi.org/10.1093/nar/gkac907))
+are human transcription factors. We map these to UniProt in `map_transcription_factors.py`.
 
-[Targets]( https://guolab.wchscu.cn/AnimalTFDB4//#/), or `Homo_sapiens_TF.tsv`, (see [original paper](https://doi.org/10.1093/nar/gkac907))
-are human transcription factors.
+## Interactome Generation
+
+`interactome.py` uses STRING and UniProt data to produce a UniProt-based interactome.
+
+## Thresholding
+
+Using the interactome and processed pathway files, we threshold pathways. TODO write more about this.
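
Reviewer note: the new "SIF Pathway Processing" section describes labelling pathway nodes with external source (receptor) and target (transcription factor) lists. A minimal sketch of that idea follows. It may not match what `process_panther_pathway.py` actually does: the function names `read_sif` and `label_nodes`, the three-column SIF layout, and the toy node IDs are all assumptions made for illustration.

```python
from io import StringIO


def read_sif(handle):
    """Yield (node_a, interaction, node_b) triples from a three-column SIF stream.

    Assumption: one whitespace-separated edge per line. Lines that do not
    have exactly three fields are skipped in this simplified sketch.
    """
    for line in handle:
        fields = line.split()
        if len(fields) == 3:
            yield tuple(fields)


def label_nodes(edges, receptors, transcription_factors):
    """Return the pathway nodes that appear in the receptor or TF lists."""
    nodes = {node for a, _, b in edges for node in (a, b)}
    return {
        "sources": sorted(nodes & receptors),
        "targets": sorted(nodes & transcription_factors),
    }


# Toy pathway with placeholder IDs (not real data).
example = StringIO(
    "RECEPTOR1\tcontrols-state-change-of\tKINASE1\n"
    "KINASE1\tcontrols-state-change-of\tTF1\n"
)
edges = list(read_sif(example))
labels = label_nodes(
    edges,
    receptors={"RECEPTOR1", "RECEPTOR2"},
    transcription_factors={"TF1"},
)
print(labels)
```

Only receptors and transcription factors that occur in the pathway are kept, so entries from the external lists that the pathway never mentions (here `RECEPTOR2`) are dropped.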

datasets/synthetic_data/scripts/process_panther_pathway.py

Lines changed: 1 addition & 0 deletions
@@ -66,6 +66,7 @@ def process_pathway(file: Path, folder: Path):
     scores["active"] = "true"
     scores.to_csv(folder / "prizes.txt", sep="\t", index=False)
 
+
 if __name__ == "__main__":
     pathway = parser().parse_args().pathway
     pathway_file = data_directory / Path(pathway).with_suffix(".sif")
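
Reviewer note: the context lines build the input path with `Path(pathway).with_suffix(".sif")`. The sketch below shows how that expression behaves for a few argument shapes. The `data_directory` value and the helper name `sif_path` are hypothetical; only the `with_suffix` expression comes from the diff.

```python
from pathlib import Path

# Hypothetical location; the real script defines its own data_directory.
data_directory = Path("data")


def sif_path(pathway: str) -> Path:
    """Mirror the expression from the diff: replace or append a .sif suffix."""
    return data_directory / Path(pathway).with_suffix(".sif")


# A bare pathway name gains the suffix.
print(sif_path("Wnt_signaling"))

# A name that already ends in .sif is left unchanged.
print(sif_path("Wnt_signaling.sif"))

# Caveat: with_suffix replaces the last dotted component, so a name that
# contains a dot loses everything after that dot.
print(sif_path("pathway.v2"))
```

If pathway names may contain dots, appending the suffix to the string instead (for example `data_directory / f"{pathway}.sif"`) would avoid the truncation in the last case.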
