Commit 5ff807f

overhauling the wrapper CLI and implementing a collected variant table script
1 parent 224b892

31 files changed: +2257 additions, −1856 deletions

.github/workflows/test.yml

Lines changed: 0 additions & 26 deletions

```diff
@@ -37,32 +37,6 @@ jobs:
         run: |
           uv run pytest
 
-  python-tests-tox:
-    runs-on: ubuntu-latest
-
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.12'
-
-      - name: Install UV
-        uses: astral-sh/setup-uv@v4
-        with:
-          enable-cache: true
-          cache-dependency-glob: "pyproject.toml"
-
-      - name: Install and run tox
-        run: |
-          uvx --from tox-uv tox -p auto
-
-      - name: Run linting with tox
-        run: |
-          uvx --from tox-uv tox -e lint
-
   nextflow-tests:
     runs-on: ubuntu-latest
     strategy:
```

.gitignore

Lines changed: 6 additions & 0 deletions

```diff
@@ -79,6 +79,12 @@
 !/bin/*.awk
 !/bin/README.md
 
+# oneroof CLI package (Typer-based CLI wrapper)
+!/bin/oneroof_cli/
+!/bin/oneroof_cli/*.py
+!/bin/oneroof_cli/commands/
+!/bin/oneroof_cli/commands/*.py
+
 # Rust development files for IDE support
 !/Cargo.toml
 !/Cargo.lock
```

README.md

Lines changed: 62 additions & 5 deletions
```diff
@@ -7,6 +7,7 @@
 - [Detailed Setup Instructions](#detailed-setup-instructions)
 - [Configuration](#configuration)
 - [Developer Setup](#developer-setup)
+- [OneRoof CLI (Power Users)](#oneroof-cli-power-users)
 - [Testing](#testing)
 - [Running Tests](#running-tests)
 - [Test Structure](#test-structure)
```
```diff
@@ -56,7 +57,7 @@ nextflow run nrminor/oneroof \
 
 If you want to use Apptainer containers instead of Docker, just add `-profile apptainer` to either of the above `nextflow run` commands. And if you don’t want to use containers at all, simply run `pixi shell --frozen` to bring all the pipeline’s dependencies into scope and then add `-profile containerless` to your `nextflow run` command.
 
-Nextflow pipelines like this one have a ton of configuration, which can be overwhelming for beginners and new users. To make this process easier, we’re developing a Terminal User Interface (TUI) to guide you through setup. Please stay tuned!
+For power users working with a local clone of the repository, we also provide the `oneroof` CLI—a friendly command-line interface with organized help, input validation, and easy resume functionality. See the [OneRoof CLI](#oneroof-cli-power-users) section below for details.
 
 ## Quick Start
 
```
```diff
@@ -98,7 +99,7 @@ Most users should configure `oneroof` through the command line via the following
 | `--min_variant_frequency` | 0.05 (illumina) or 0.10 (nanopore) | Minimum variant frequency to call a variant. |
 | `--meta_ref` | None | Dataset, either a local FASTA file or a pre-built dataset built by Sylph, to use for metagenomic profiling. Can download prebuilt ones here: [Pre-built Sylph Databases](https://github.com/bluenote-1577/sylph/wiki/Pre%E2%80%90built-databases). |
 | `--sylph_tax_db` | None | The taxonomic annotation for the sylph database specified with `--meta_ref`. The pipeline automaticially downloads the databases so only the identifier is needed here. |
-|`--meta_ref_link` | None | The link to download the sylph dataset needed to run metagenomics, would be used instead of specifying an already downloaded data set in `--meta_ref`. |
+| `--sylph_db_link` | None | The link to download the sylph dataset needed to run metagenomics, would be used instead of specifying an already downloaded data set in `--meta_ref`. |
 | `--nextclade_dataset` | None | The name of the dataset to run nextclade with. To see all dataset options run `nextclade dataset list --only-names`. |
 | `--results` | results/ | Where to place results. |
 | `--cleanup` | false | Whether to cleanup work directory after a successful run. |
```
````diff
@@ -147,9 +148,66 @@ Especially on Apple Silicon Macs, this will reduce the overhead of using the Doc
 
 Note also that more information on the repo’s files is available in our [developer guide](developer.qmd).
 
+### OneRoof CLI (Power Users)
+
+For users who have cloned the repository and are working locally, we provide the `oneroof` command-line interface. This Typer-based CLI wraps the Nextflow pipeline with several quality-of-life improvements:
+
+- **Organized help**: Parameters are grouped by category with Rich-formatted output
+- **Input validation**: File and directory paths are validated before the pipeline runs
+- **Kebab-case flags**: More natural CLI conventions (`--primer-bed` instead of `--primer_bed`)
+- **Dry-run mode**: Preview the generated Nextflow command without executing
+- **Easy resume**: Resume interrupted runs with `oneroof resume`
+
+The CLI is available automatically when you enter the pixi environment:
+
+``` bash
+pixi shell --frozen
+oneroof --help
+```
+
+Or install it directly with pip or uv:
+
+``` bash
+# With pip
+pip install -e .
+
+# With uv
+uv pip install -e .
+```
+
+**Example usage:**
+
+``` bash
+# See all available commands
+oneroof --help
+
+# See all run options, organized by category
+oneroof run --help
+
+# Run with Illumina data
+oneroof run \
+  --refseq my_ref.fasta \
+  --illumina-fastq-dir my_reads/ \
+  --profile docker
+
+# Preview the command without running
+oneroof run \
+  --refseq my_ref.fasta \
+  --illumina-fastq-dir my_reads/ \
+  --dry-run
+
+# Resume an interrupted run (uses cached parameters)
+oneroof resume
+
+# Run test profiles (no --refseq needed)
+oneroof run --profile illumina_test_with_primers
+```
+
+The CLI generates and executes standard Nextflow commands under the hood, so all Nextflow features (caching, resume, etc.) work as expected.
+
 ## Testing
 
-OneRoof includes a comprehensive test suite built with [nf-test](https://www.nf-test.com/), the official testing framework for Nextflow pipelines. The test suite validates pipeline functionality through module, workflow, and end-to-end tests.
+OneRoof includes a limited but growing test suite, which validates pipeline functionality through module, workflow, and end-to-end tests.
 
 ### Running Tests
 
````
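The kebab-case flag handling and `--dry-run` behavior described in this diff can be illustrated with a small stdlib-only sketch. The real CLI is Typer-based; here the hypothetical helper `build_nextflow_command` shows only the command-construction step (the flag names mirror the README, everything else is an assumption):

```python
import shlex
from typing import Optional


def build_nextflow_command(
    params: dict[str, Optional[str]], profile: str = "docker"
) -> list[str]:
    """Translate kebab-case CLI options into a `nextflow run` invocation."""
    cmd = ["nextflow", "run", ".", "-profile", profile]
    for flag, value in params.items():
        if value is None:
            continue  # options the user did not set are simply omitted
        # kebab-case on the CLI side maps to snake_case Nextflow params:
        # --illumina-fastq-dir  ->  --illumina_fastq_dir
        cmd += [f"--{flag.replace('-', '_')}", value]
    return cmd


# A --dry-run mode then just prints the command instead of executing it:
cmd = build_nextflow_command(
    {"refseq": "my_ref.fasta", "illumina-fastq-dir": "my_reads/", "primer-bed": None}
)
print(shlex.join(cmd))
# nextflow run . -profile docker --refseq my_ref.fasta --illumina_fastq_dir my_reads/
```

Without `--dry-run`, the same list could be handed to `subprocess.run`, which is consistent with the README's note that standard Nextflow commands are generated under the hood.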

````diff
@@ -165,12 +223,11 @@ just test-nanopore
 
 # To run specific tests, run `just` and then under `testing` will be all the specific tests
 just
-
 ```
 
 ### Test Structure
 
-Tests are organized as profiles in `nextflow.config` and each specific test is in its own config file under `conf/illumina_tests` or `conf/nanopore_tests`. The data used for these tests can be found under `tests/data`. Tests can also be run with `nextflow run . -profile illumina_test_with_primers` or any other test specified as a profile in `nextflow.config` and this will show a verbose output to see the pipeline running.
+Tests are organized as profiles in `nextflow.config` and each specific test is in its own config file under `conf/illumina_tests` or `conf/nanopore_tests`. The data used for these tests can be found under `tests/data`. Tests can also be run with `nextflow run . -profile illumina_test_with_primers` or any other test specified as a profile in `nextflow.config` and this will show a verbose output to see the pipeline running.
 
 For more details on the testing framework and how to write new tests, see the [test suite documentation](../tests/README.md).
````
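The `resume` command's reuse of cached parameters could plausibly work along these lines. This is a speculative sketch: the cache location `.oneroof/last_run.json` and both function names are invented for illustration, not taken from the repository:

```python
import json
from pathlib import Path

# Hypothetical cache location; the real CLI may store its state elsewhere.
CACHE = Path(".oneroof/last_run.json")


def save_run(cmd: list[str]) -> None:
    """Record the fully resolved Nextflow command when a run is launched."""
    CACHE.parent.mkdir(parents=True, exist_ok=True)
    CACHE.write_text(json.dumps(cmd))


def resume() -> list[str]:
    """Rebuild the last command and append Nextflow's own -resume flag."""
    cmd = json.loads(CACHE.read_text())
    if "-resume" not in cmd:
        cmd.append("-resume")
    return cmd
```

Because Nextflow's `-resume` reuses its work-directory cache, replaying the saved command with that flag appended is all a wrapper needs to do to pick up an interrupted run.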

bin/README.md

Lines changed: 10 additions & 7 deletions

````diff
@@ -33,17 +33,20 @@ uv run concat_consensus.py
 uv run file_watcher.py --config credentials.yaml
 ```
 
-### generate_variant_pivot.py
-**Purpose**: Processes variant data from annotated VCF files into a pivot table format.
+### collect_full_variant_table.py
+**Purpose**: Collects and enriches variant data from all samples into a final, queryable table.
 
 **Key Features**:
-- Parses VCF fields extracted with SnpSift
-- Creates pivot tables for variant analysis
-- Uses Polars for efficient data processing
+- Combines per-sample SnpSift variant effect files into a single table
+- Adds derived columns: variant_id, aa_change (one-letter), mutation_type, is_consensus, sample_count, is_shared
+- Converts HGVS three-letter amino acid notation to one-letter (e.g., p.Asp614Gly → D614G)
+- Outputs both TSV (human-readable) and Parquet (fast queries) formats
+- Uses Polars lazy API for efficient processing of large datasets
+- Sorted by chromosome, position, and sample for easy exploration
 
 **Usage Example**:
 ```bash
-uv run generate_variant_pivot.py --input_table variants.tsv
+uv run collect_full_variant_table.py --input-dir variant_tsvs/ --output full_variants --consensus-threshold 0.8
 ```
 
 ### ivar_variants_to_vcf.py
````
```diff
@@ -162,9 +165,9 @@ uv run validate_primer_bed.py -i primers.bed -o validated_primers
 ## Test Files
 
 The directory includes test files for several scripts:
+- `test_collect_full_variant_table.py`
 - `test_concat_consensus.py`
 - `test_file_watcher.py`
-- `test_generate_variant_pivot.py`
 - `test_ivar_variants_to_vcf.py`
 - `test_slack_alerts.py`
```
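The three-letter-to-one-letter HGVS conversion mentioned in the feature list (p.Asp614Gly → D614G) boils down to a lookup over the standard amino acid codes. A minimal sketch, assuming the simple substitution case only (the real script presumably also handles frameshifts, deletions, and other HGVS forms):

```python
import re

# Standard three-letter -> one-letter amino acid codes, plus Ter for stop.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Ter": "*",
}


def hgvs_to_one_letter(hgvs_p: str) -> str:
    """Convert an HGVS protein substitution like 'p.Asp614Gly' to 'D614G'."""
    match = re.fullmatch(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", hgvs_p)
    if match is None:
        return hgvs_p  # leave anything unrecognized untouched
    ref, pos, alt = match.groups()
    return f"{AA3_TO_1[ref]}{pos}{AA3_TO_1[alt]}"


print(hgvs_to_one_letter("p.Asp614Gly"))  # D614G
```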
