feat: added wrapper for MOFA2 #4883
Open
seneschall wants to merge 34 commits into snakemake:master from seneschall:mofa2-wrapper
Changes from 11 commits
Commits
d205b53 initialised workspace (seneschall)
ab3f1c8 added test data, dependencies (seneschall)
609a372 implemented wrapper; added test data
d59ec3c worked on wrapper (seneschall)
5e2defa fixed environment.yaml ; added test data (seneschall)
265399c started adding params (seneschall)
26636a1 added test; appended meta.yaml; cleanup (seneschall)
df7be10 changed flag in test (seneschall)
c850121 removed testing pixi envs (seneschall)
88b7ccc changed wrapper path (seneschall)
86efe3c pinned environment; cleanup (seneschall)
aa73d2f Merge branch 'master' into mofa2-wrapper (johanneskoester)
97a5512 Apply suggestion from @coderabbitai[bot] (johanneskoester)
8f20ee7 deleted requested files (seneschall)
4480162 changed params froms strings to native bools (seneschall)
c788d45 fixed assignment bug; cleaned up comments (seneschall)
fe651a1 changed handling of output (seneschall)
8072bd5 removed log file (seneschall)
762a4ea removed dependencies; readded gitignore (seneschall)
af345bf updated pinned environment (seneschall)
da1b6b0 Merge branch 'master' into mofa2-wrapper (fgvieira)
dc9ac50 removed .gitattributes (seneschall)
2aa6ed3 started working on subwrappers (seneschall)
3bb4dff added functionality for multiple plots (seneschall)
eac269f added test data (seneschall)
eb482c2 adding params (seneschall)
db39e37 cleanup (seneschall)
bd61cb5 started working on meta.yaml (seneschall)
15a040b added notes to meta.yaml ; added test cases
e66771b pinned environment (seneschall)
c92b290 fix to remove undesired output (seneschall)
34a23a6 Merge branch 'mofa2-subwrappers' into mofa2-wrapper (seneschall)
181b022 fixed issues (seneschall)
a679d95 changed notes to params in plotting/meta.yaml (seneschall)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| # SCM syntax highlighting & preventing 3-way merges | ||
| pixi.lock merge=binary linguist-language=YAML linguist-generated=true |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| # SCM syntax highlighting & preventing 3-way merges | ||
| pixi.lock merge=binary linguist-language=YAML linguist-generated=true -diff | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| # pixi environments | ||
| .pixi/* | ||
| !.pixi/config.toml | ||
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| channels: | ||
| - conda-forge | ||
| - bioconda | ||
| - nodefaults | ||
| dependencies: | ||
| - bioconductor-mofa2 =1.16.0 | ||
| - r-base =4.4.3 | ||
| - r-arrow =22.0.0 | ||
| - mofapy2 =0.7.2 | ||
| - python =3.14.2 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| name: mofa2 | ||
| description: | | ||
| Train a model on a multi-omic data set with default options. | ||
| url: https://www.bioconductor.org/packages/release/bioc/html/MOFA2.html | ||
| authors: | ||
| - Simon Sack | ||
| input: | ||
| - | | ||
| A parquet file in tidy format containing data with the headers: `sample, feature, view, group (optional), value` | ||
| | ||
| `sample`: The name of the sample | ||
| | ||
| `feature`: The name of the observed feature | ||
| | ||
| `group` (optional, advanced): Discouraged for beginners. The aim of the multi-group framework is not to capture differential changes in mean levels between the groups (as for example when doing differential RNA expression). The goal is to compare the sources of variability that drive each group. | ||
| | ||
| `value`: The observed value | ||
| | ||
| `view`: The view the observed feature is grouped into | ||
| output: | ||
| - An HDF5-file with the trained model. | ||
| notes: | | ||
| In the params, set `scale_groups` and/or `scale_views` to `TRUE` if your groups/views | ||
| have different ranges/variances; this scales each group/view to unit variance. | ||
| Both default to `FALSE` if no params are given. | ||
| For all other data, model, and training options, this wrapper uses the default values. |
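The input contract described in this meta.yaml can be checked before training. The following is a minimal sketch in Python, assuming the parquet rows have already been read into dictionaries by some reader; the helper name `check_tidy_rows` and the sample rows are hypothetical and not part of the wrapper.

```python
# Columns named in meta.yaml; `group` is the only optional one.
REQUIRED = ("sample", "feature", "view", "value")


def check_tidy_rows(rows):
    """Return a list of problems found in tidy-format rows (empty if none)."""
    problems = []
    for i, row in enumerate(rows):
        missing = [col for col in REQUIRED if col not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
    return problems


# Hypothetical example rows; the third one lacks `value`.
rows = [
    {"sample": "s1", "feature": "geneA", "view": "rna", "value": 1.5},
    {"sample": "s1", "feature": "protB", "view": "protein", "group": "g1", "value": 0.2},
    {"sample": "s2", "feature": "geneA", "view": "rna"},
]
print(check_tidy_rows(rows))
```

Reporting all problems at once, rather than stopping at the first, makes it easier to fix a malformed table in one pass.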
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| rule mofa2: | ||
| input: | ||
| "{data}.parquet", | ||
| output: | ||
| "{data}.hdf5", | ||
| log: | ||
| "log/{data}.log", | ||
| params: | ||
| scale_groups="FALSE", # set to TRUE if groups have different ranges/variances | ||
| scale_views="FALSE", # set to TRUE if views have different ranges/variances | ||
| wrapper: | ||
| "master/bio/mofa2" | ||
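To illustrate how the `{data}` wildcard in this rule ties the three paths together, here is a small Python sketch. It only imitates the idea of wildcard matching and is not Snakemake's actual implementation; the function `resolve` and the target name `test_data.hdf5` are hypothetical.

```python
import re

# Patterns copied from the rule above.
INPUT = "{data}.parquet"
OUTPUT = "{data}.hdf5"
LOG = "log/{data}.log"


def resolve(target):
    """Match a requested output against OUTPUT and fill in input and log paths."""
    # Turn "{data}.hdf5" into a regex with a named group for the wildcard.
    pattern = re.escape(OUTPUT).replace(re.escape("{data}"), "(?P<data>.+)")
    match = re.fullmatch(pattern, target)
    if match is None:
        return None  # the rule cannot produce this target
    wildcards = match.groupdict()
    return {"input": INPUT.format(**wildcards), "log": LOG.format(**wildcards)}


print(resolve("test_data.hdf5"))
```

Requesting `test_data.hdf5` therefore implies that `test_data.parquet` must exist and that the log is written to `log/test_data.log`.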
Binary file not shown.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| Creating MOFA object from a data.frame... | ||
| | ||
| # Multi-group mode requested. | ||
| | ||
| This is an advanced option, if this is the first time that you are running MOFA, we suggest that you try do some exploration first without specifying groups. Two important remarks: | ||
| | ||
| - The aim of the multi-group framework is to identify the sources of variability *within* the groups. If your aim is to find a factor that 'separates' the groups, you DO NOT want to use the multi-group framework. Please see the FAQ on the MOFA2 webpage. | ||
| | ||
| - It is important to account for the group effect before selecting highly variable features (HVFs). We suggest that either you calculate HVFs per group and then take the union, or regress out the group effect before HVF selection | ||
| Checking data options... | ||
| Checking training options... | ||
| Checking model options... | ||
| Connecting to the mofapy2 python package using reticulate (use_basilisk = FALSE)... | ||
| Please make sure to manually specify the right python binary when loading R with reticulate::use_python(..., force=TRUE) or the right conda environment with reticulate::use_condaenv(..., force=TRUE) | ||
| If you prefer to let us automatically install a conda environment with 'mofapy2' installed using the 'basilisk' package, please use the argument 'use_basilisk = TRUE' | ||
| | ||
| 10 factors were found to explain no variance and they were removed for downstream analysis. You can disable this option by setting load_model(..., remove_inactive_factors = FALSE) | ||
| Trained MOFA with the following characteristics: | ||
| Number of views: 2 | ||
| Views names: view_0 view_1 | ||
| Number of features (per view): 1000 1000 | ||
| Number of groups: 2 | ||
| Groups names: group_0 group_1 | ||
| Number of samples (per group): 100 100 | ||
| Number of factors: 5 | ||
| | ||
| Warning message: | ||
| In run_mofa(mofa_object, outfile, ) : | ||
| The latest mofapy2 version is 0.7.0, you are using 0.7.2. Please upgrade with 'pip install mofapy2' | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| Creating MOFA object from a data.frame... | ||
| Checking data options... | ||
| Checking training options... | ||
| Checking model options... | ||
| Warning message: | ||
| In prepare_mofa(object = mofa_object, data_options = data_opts, : | ||
| The total number of samples is very small for learning 15 factors. | ||
| Try to reduce the number of factors to obtain meaningful results. It should not exceed ~14. | ||
| Connecting to the mofapy2 python package using reticulate (use_basilisk = FALSE)... | ||
| Please make sure to manually specify the right python binary when loading R with reticulate::use_python(..., force=TRUE) or the right conda environment with reticulate::use_condaenv(..., force=TRUE) | ||
| If you prefer to let us automatically install a conda environment with 'mofapy2' installed using the 'basilisk' package, please use the argument 'use_basilisk = TRUE' | ||
| | ||
| Trained MOFA with the following characteristics: | ||
| Number of views: 3 | ||
| Views names: Bacteria Fungi Viruses | ||
| Number of features (per view): 180 18 42 | ||
| Number of groups: 1 | ||
| Groups names: single_group | ||
| Number of samples (per group): 59 | ||
| Number of factors: 15 | ||
| | ||
| Warning message: | ||
| In run_mofa(mofa_object, outfile, ) : | ||
| The latest mofapy2 version is 0.7.0, you are using 0.7.2. Please upgrade with 'pip install mofapy2' |
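The first warning in this log suggests at most about 14 factors for 59 samples. That figure is consistent with a rule of thumb of roughly four samples per factor, since floor(59 / 4) = 14. Whether MOFA2 computes the bound exactly this way is an assumption here, and the helper `suggested_max_factors` below is hypothetical.

```python
def suggested_max_factors(n_samples, samples_per_factor=4):
    """Upper bound on factors under an ASSUMED samples-per-factor rule of thumb."""
    return n_samples // samples_per_factor


# 59 samples, as reported in the log above.
print(suggested_max_factors(59))
```

Under the same assumption, the 200 samples in the multi-group test log would allow up to 50 factors, which fits the absence of this warning there.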
Binary file not shown.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| #!/bin/R | ||
| | ||
| # load libraries | ||
| library(MOFA2) | ||
| library(arrow) | ||
| | ||
| # connect to conda environment | ||
| conda_prefix <- Sys.getenv("CONDA_PREFIX") | ||
| reticulate::use_condaenv(conda_prefix) | ||
| | ||
| # if log file is provided, write log to that file | ||
| if (length(snakemake@log) > 0) { | ||
| log <- file(snakemake@log[[1]], open = "wt") | ||
| sink(log) | ||
| sink(log, type = "message") | ||
| } | ||
| | ||
| # load a long data frame from a parquet file with the following columns: | ||
| # `sample, feature, view, group (optional), value` | ||
| | ||
| # cast input path as character to avoid errors | ||
| path <- as.character(snakemake@input[[1]]) | ||
| | ||
| df <- read_parquet(path) | ||
| | ||
| mofa_object <- create_mofa(df) | ||
| | ||
| data_opts <- get_default_data_options(mofa_object) | ||
| model_opts <- get_default_model_options(mofa_object) | ||
| train_opts <- get_default_training_options(mofa_object) | ||
| | ||
| # add params: | ||
| # data params: scale_groups, scale_views | ||
| | ||
| if ("scale_groups" %in% names(snakemake@params)) { | ||
| if (snakemake@params[["scale_groups"]] == "FALSE") { | ||
| data_opts$scale_groups <- FALSE | ||
| } | ||
| if (snakemake@params[["scale_groups"]] == "TRUE") { | ||
| data_opts$scale_groups <- TRUE | ||
| } | ||
| } | ||
| | ||
| if ("scale_views" %in% names(snakemake@params)) { | ||
| if (snakemake@params[["scale_views"]] == "FALSE") { | ||
| data_opts$scale_views <- FALSE | ||
| } | ||
| if (snakemake@params[["scale_views"]] == "TRUE") { | ||
| data_opts$scale_views <- TRUE | ||
| } | ||
| } | ||
| | ||
| # training params (not set here; defaults are used): maxiter (int), convergence_mode, gpu_mode, verbose | ||
| | ||
| mofa_object <- prepare_mofa( | ||
| object = mofa_object, | ||
| data_options = data_opts, | ||
| model_options = model_opts, | ||
| training_options = train_opts | ||
| ) | ||
| | ||
| outfile <- file.path(getwd(), snakemake@output[[1]]) | ||
| | ||
| # train the MOFA model and write the result to `outfile` | ||
| run_mofa( | ||
| mofa_object, | ||
| outfile, | ||
| ) | ||
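The wrapper in this diff compares `scale_groups` and `scale_views` against the literal strings `"TRUE"` and `"FALSE"`; any other value, including a native boolean, leaves the default untouched without a message. The sketch below shows one stricter way to handle such flags. It is written in Python purely for illustration (the wrapper itself is R), and the helper `parse_flag` is hypothetical.

```python
def parse_flag(value, default=False):
    """Coerce a params entry to bool; None means the param was not given."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.upper() in ("TRUE", "FALSE"):
        return value.upper() == "TRUE"
    # Fail loudly instead of silently keeping the default.
    raise ValueError(f"expected TRUE/FALSE or a bool, got {value!r}")


# Hypothetical params: one string flag, one native bool.
params = {"scale_groups": "FALSE", "scale_views": True}
data_opts = {
    name: parse_flag(params.get(name))
    for name in ("scale_groups", "scale_views")
}
print(data_opts)
```

Accepting both spellings keeps existing rules working while allowing native booleans, and raising on anything else surfaces typos such as `"True "` early.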