Displaying large amounts of data often requires first turning it into not-so-large amounts of data. Clodius is a program and library designed to aggregate large datasets to make them easy to display at different resolutions.
Install the clodius package:
pip install clodius
And use it to aggregate a BED file:
curl https://raw.githubusercontent.com/hms-dbmi/clodius/develop/test/sample_data/geneAnnotationsExonsUnions.short.bed \
> /tmp/sample.short.bed
clodius aggregate bedfile /tmp/sample.short.bed
The output files can then be displayed using higlass-manage. For more information about viewing these types of files, take a look at the higlass docs.
More examples are available.
- Non-genomic Rasters
- Genomic Data
The recommended way to develop clodius is to use a conda environment and
install clodius in develop (editable) mode:
pip install -e ".[dev]"
Test data files in data/ are stored in Git LFS. They are downloaded automatically when you clone the repository with LFS enabled:
git lfs install # once per machine
git clone <repo> # LFS files downloaded automatically
# or, in an existing clone:
git lfs pull
To add a new test data file:
- Check if the file type is already tracked: open .gitattributes and look for a matching pattern (e.g. data/*.gz, *.bam). If not, add a new tracking rule:
    git lfs track "data/*.ext" # adds a line to .gitattributes
    git add .gitattributes
- Allow the file through .gitignore: data/* is ignored by default. Add a negation line for your file:
    !data/your_new_file.ext
- Stage and commit as normal:
    git add data/your_new_file.ext
    git commit -m "Add test fixture: your_new_file.ext"
    git push # LFS objects are uploaded automatically
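Before committing, it can help to confirm that the tracking rule and the .gitignore negation do what you expect. The following is a minimal sketch, not part of the clodius tooling: it builds a throwaway repository so it can be run anywhere, and it assumes a recent Git version. The path data/your_new_file.ext is the same placeholder used above, and the .gitattributes line is written by hand in the form that git lfs track produces.

```shell
# Throwaway repository so the checks can be tried without touching a real clone.
scratch="$(mktemp -d)"
cd "$scratch"
git init -q .

# The rule that `git lfs track "data/*.ext"` writes to .gitattributes.
echo 'data/*.ext filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Ignore everything in data/ except the placeholder fixture.
printf 'data/*\n!data/your_new_file.ext\n' > .gitignore

# Report which filter attribute applies to the path.
attr_out="$(git check-attr filter -- data/your_new_file.ext)"
echo "$attr_out"

# check-ignore exits 0 when the path is ignored and 1 when it is not.
if git check-ignore -q data/your_new_file.ext; then
    ignore_out="ignored"
else
    ignore_out="not ignored"
fi
echo "$ignore_out"
```

In a real clone, only the git check-attr and git check-ignore commands are needed, run from the repository root against your actual file. After committing, git lfs ls-files lists the files that are stored in LFS.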
The unit tests for clodius can be run using pytest:
pytest
Individual unit tests can be specified by indicating the file and function they are defined in:
pytest test/cli_test.py::test_clodius_aggregate_bedgraph