README

Overview

A bioinformatics workflow for processing and analyzing biological datasets with Python, R, and other computational tools. It takes several biological networks, processes them through embedding, and uses the resulting information to prioritize causal genes related to a given disease.

Prerequisites

Ensure the following dependencies are installed before running the daemon.sh script that manages the workflow:

Bash shell (Linux/macOS)
Python (by default, initialized via init_python)
R (by default, initialized via init_R)
AutoFlow
Required scripts and daemons, located in ~soft_bio_267/ by default. You can also obtain them by cloning the biosys scripts repository and adjusting the path in daemon.sh, which is named sys_bio_lab_scripts_path.
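
A quick way to confirm that the command-line prerequisites are reachable is to test each one against PATH. The following is a minimal sketch, not part of daemon.sh: the helper name missing_commands is hypothetical, and the executable names in the usage comment (python3, Rscript, AutoFlow) are assumptions that may differ on your system.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all of them are found.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Hypothetical usage; the executable names are assumptions, adjust as needed:
#   missing_commands bash python3 Rscript AutoFlow
```

An empty result means every listed command was found; any name printed should be installed or added to PATH before running the workflow.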

Usage

Run daemon.sh with the desired execution mode:

./daemon.sh <exec_mode> [additional_options]

Execution Modes

download_layers: Downloads the necessary dataset layers.
download_translators: Fetches and processes translator tables.
process_download: Processes downloaded datasets.
dversion <version>: Selects the data version to use.
whitelist: Filters genes based on a whitelist.
process_control: Prepares control datasets.
get_control <benchmark>: Retrieves control genes for benchmarking (zampieri or menche).
kernels: Computes similarity kernels.
plot_sims: Generates similarity plots.
ranking <benchmark>: Computes non-integrated rankings.
integrate: Integrates kernels or embeddings.
integrated_ranking <benchmark>: Computes rankings from integrated kernels or embeddings.
report <save_option>: Generates HTML reports. Note that, to respect author order, the menche benchmark is named the buphamalai benchmark in the resulting reports.
check <folder>: Checks AutoFlow execution logs.
recover <folder>: Recovers execution logs.
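
Because a mistyped mode name is easy to miss, a small wrapper can check the mode before daemon.sh is called. This is a sketch under stated assumptions: the function names is_valid_mode and is_valid_benchmark are hypothetical, the mode names are copied from the list above, and the two benchmark names are the ones given for get_control (it is assumed here that ranking and integrated_ranking accept the same two).

```shell
#!/bin/sh
# Return 0 if the argument is one of the execution modes listed in this README.
is_valid_mode() {
    case "$1" in
        download_layers|download_translators|process_download) return 0 ;;
        dversion|whitelist|process_control|get_control) return 0 ;;
        kernels|plot_sims|ranking|integrate|integrated_ranking) return 0 ;;
        report|check|recover) return 0 ;;
        *) return 1 ;;
    esac
}

# Return 0 if the argument is one of the two benchmark names in this README.
is_valid_benchmark() {
    case "$1" in
        zampieri|menche) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical usage:
#   is_valid_mode "$mode" && ./daemon.sh "$mode"
```

Note that buphamalai is the name used in the reports only; the command line still takes menche.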

Outputs

daemon.sh generates several output folders:

output_folder/similarity_kernels/ - Stores computed similarity kernels.
output_folder/rankings/ - Contains computed rankings.
output_folder/integrations/ - Stores kernel integrations.
output_folder/integrated_rankings/ - Contains integrated rankings.
report_folder/ - Stores generated reports.
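
To see which stages have already produced results, you can check which of these subfolders exist. The sketch below assumes only the four subfolder names listed above; the helper name missing_outputs is hypothetical, and the actual location of output_folder depends on how daemon.sh is configured.

```shell
#!/bin/sh
# Print each expected subfolder that does not yet exist under the given
# output folder. Prints nothing when all four are present.
missing_outputs() {
    out_dir=$1
    for sub in similarity_kernels rankings integrations integrated_rankings; do
        [ -d "$out_dir/$sub" ] || printf '%s\n' "$sub"
    done
}

# Hypothetical usage; replace the path with your own output folder:
#   missing_outputs /path/to/output_folder
```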

Example

./daemon.sh kernels
./daemon.sh ranking menche
./daemon.sh report save
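
The same three steps can be scripted so that a failure stops the sequence. This is a sketch only: run_steps, DAEMON, and DRY_RUN are hypothetical names and are not options of daemon.sh. It also assumes that each invocation returns only once its work is done, which this README does not state; if a step submits jobs that finish later, confirm completion with check <folder> before starting the next one.

```shell
#!/bin/sh
# Path to the workflow script; override DAEMON if it lives elsewhere.
DAEMON=${DAEMON:-./daemon.sh}

# Run each argument as one daemon.sh invocation, in order, and stop at the
# first one that fails. With DRY_RUN=1 the commands are printed, not run.
run_steps() {
    for step in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s %s\n' "$DAEMON" "$step"
        else
            # $step is left unquoted on purpose so that "ranking menche"
            # splits into the mode and its argument.
            "$DAEMON" $step || return 1
        fi
    done
}

# Hypothetical usage, previewing the example above without running it:
#   DRY_RUN=1 run_steps kernels "ranking menche" "report save"
```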

Notes

The workflow makes use of AutoFlow for various processing steps.
Data sources include STRING, OMIM, DepMap, and other biological databases.
Ensure network connectivity when downloading data.

License

This repository is intended for research and educational purposes.

