pachterlab/biowomp

biowomp

Make alluvial plots with node order and colors optimized to minimize edge crossings with biowomp! Builds on top of the sorting algorithm in wompwomp.

biowomp functions/commands

Installation:

R - Requires system R to be installed

Bioconductor (not yet released on Bioconductor - please install from GitHub)

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("biowomp")

GitHub

if (!require("remotes", quietly = TRUE))
    install.packages("remotes")
remotes::install_github("pachterlab/biowomp")

Command line - Does not require system R to be installed if using conda.

git clone https://github.com/pachterlab/biowomp
cd biowomp
conda env create -f environment.yml  # or to avoid conda: Rscript inst/install.R
conda activate wompwomp_env  # skip if used install.R above
Rscript -e 'remotes::install_local(".")' # or use --dev flag in commands

The first time any command is run on the command line, a prompt will appear asking to install any missing R dependencies.

Docker

We provide a Docker image for running biowomp, built on rocker/tidyverse:

docker run -it -p 8787:8787 -e PASSWORD=<YOUR_PASS> josephrich98/biowomp:latest

Then visit "http://localhost:8787" in a browser and log in with username: rstudio, password: <YOUR_PASS> (set the password to whatever you like after "-e PASSWORD=").

Usage

The I/O for each of biowomp's functions is as follows:

  1. plot_alluvial: dataframe, csv, or tibble (grouped or ungrouped) --> plot

The input table can have one of two formats:

  1. Ungrouped: columns specified by graphing_columns, where each row corresponds to a separate entity
  2. Grouped: columns specified by graphing_columns and column_weights, where each row corresponds to one combination of graphing_columns values, and column_weights specifies the number of items in that combination
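
The two formats carry the same information, so one can be converted into the other. The following is a minimal sketch using only base R, not biowomp itself; the column names `method1`, `method2`, and `weight` are illustrative assumptions, not names that biowomp requires.

```r
# Ungrouped: one row per entity.
ungrouped <- data.frame(
    method1 = c("A", "A", "B", "B", "B"),
    method2 = c("X", "Y", "X", "X", "Y")
)

# Grouped: one row per observed combination, plus a count column.
# Each entity contributes a weight of 1, and the weights are summed per combination.
grouped <- aggregate(
    weight ~ method1 + method2,
    data = transform(ungrouped, weight = 1L),
    FUN = sum
)

# Repeating each grouped row by its weight recovers one row per entity.
expanded <- grouped[
    rep(seq_len(nrow(grouped)), grouped$weight),
    c("method1", "method2")
]
rownames(expanded) <- NULL

print(grouped)
```

The weights in the grouped table always sum to the number of rows in the ungrouped table, which is a quick sanity check before plotting.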

Examples in R

Ungrouped input

library("biowomp")
df <- data.frame(method1 = sample(1:3, 100, TRUE), method2 = sample(1:3, 100, TRUE))
head(df)
#>   method1 method2
#> 1       1       1
#> 2       1       3
#> 3       1       2
#> 4       1       1
#> 5       2       1
#> 6       2       2

p <- plot_alluvial(df)
p

Grouped input

set.seed(42)
raw_df <- data.frame(
    method1 = sample(1:3, 100, TRUE),
    method2 = sample(1:3, 100, TRUE)
)

# Aggregate by combination
df <- as.data.frame(dplyr::count(raw_df, method1, method2, name = "weight"))
head(df)

#>   method1 method2 weight
#> 1       1       1     13
#> 2       1       2     15
#> 3       1       3     12
#> 4       2       1     12
#> 5       2       2     17
#> 6       2       3     10

p <- plot_alluvial(df, column_weights = "weight")
p

Examples in Command Line:

./exec/biowomp plot_alluvial --df mydata.csv --graphing_columns column1 column2

For help on any command, run ./exec/biowomp COMMAND --help

Notes about command line usage:

  • All parameter values should be space-separated, e.g. ./exec/biowomp plot_alluvial --df data.csv, NOT --df=data.csv
  • All parameters that take a single argument have identical names in R and on the command line, with the value immediately following the argument, e.g. plot_alluvial(df="data.csv"), ./exec/biowomp plot_alluvial --df data.csv
  • All parameters that take a vector/list of arguments have identical names in R and on the command line, with the values immediately following the argument, separated by spaces, e.g. plot_alluvial(graphing_columns=c("tissue", "cluster")), ./exec/biowomp plot_alluvial --graphing_columns tissue cluster
  • All parameters that take a named vector/list as argument are passed in KEY=VALUE format on the command line (all interpreted as character), e.g. plot_alluvial(color_band_list=c("A"="blue", "B"="green")), ./exec/biowomp plot_alluvial --color_band_list A=blue B=green
  • All boolean parameters are passed as a bare flag with no following argument. Boolean parameters that default to FALSE have identical names in R and on the command line, while boolean parameters that default to TRUE have "disable_" prepended to the name on the command line. For example, given the defaults include_group_sizes=FALSE and include_axis_titles=TRUE: plot_alluvial(include_group_sizes=TRUE, include_axis_titles=FALSE), ./exec/biowomp plot_alluvial --include_group_sizes --disable_include_axis_titles
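
The mapping rules above can be sketched as a small R function that turns a list of R-style arguments into command-line tokens. The helper `to_cli_args` and its `true_by_default` parameter are hypothetical and written only for illustration; they are not part of biowomp. The set of TRUE-by-default booleans has to be supplied by the caller.

```r
# Hypothetical helper (not part of biowomp): convert R-style arguments into
# command-line tokens following the rules listed above.
to_cli_args <- function(args, true_by_default = character()) {
    tokens <- character()
    for (name in names(args)) {
        value <- args[[name]]
        if (is.logical(value) && length(value) == 1) {
            if (name %in% true_by_default) {
                # Defaults to TRUE: emit the disable_ flag only when set to FALSE.
                if (!value) tokens <- c(tokens, paste0("--disable_", name))
            } else if (value) {
                # Defaults to FALSE: emit the bare flag only when set to TRUE.
                tokens <- c(tokens, paste0("--", name))
            }
        } else if (!is.null(names(value))) {
            # Named vector: pass as KEY=VALUE pairs.
            tokens <- c(tokens, paste0("--", name),
                        paste0(names(value), "=", value))
        } else {
            # Single value or unnamed vector: space-separated values.
            tokens <- c(tokens, paste0("--", name), as.character(value))
        }
    }
    tokens
}

args <- list(
    df = "data.csv",
    graphing_columns = c("tissue", "cluster"),
    color_band_list = c(A = "blue", B = "green"),
    include_group_sizes = TRUE,
    include_axis_titles = FALSE
)

cmd <- paste(
    c("./exec/biowomp", "plot_alluvial",
      to_cli_args(args, true_by_default = "include_axis_titles")),
    collapse = " "
)
cat(cmd, "\n")
```

Booleans left at their default value produce no token at all, which matches the convention that a flag appears only when it changes the default.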

See a full tutorial in our introductory vignette biowomp_intro.Rmd

Read our preprint on arXiv here.