diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml index 0086358..d3e657e 100644 --- a/.github/ISSUE_TEMPLATE/config.yml +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -1 +1,5 @@ blank_issues_enabled: true +contact_links: + - name: Contributing Guidelines + url: https://github.com/Clinical-Genomics/oncorefiner?tab=contributing-ov-file#readme + about: If you would like to contribute to this pipeline, please follow the contributing guidelines. diff --git a/.nf-core.yml b/.nf-core.yml index 07f5253..c88c876 100644 --- a/.nf-core.yml +++ b/.nf-core.yml @@ -14,15 +14,15 @@ lint: - .prettierignore - assets/sendmail_template.txt - docs/README.md + - .github/CONTRIBUTING.md multiqc_config: false nf_core_version: 3.5.2 repository_type: pipeline template: author: Clinical Genomics Stockholm - description: Customizable post-processing and extension layer built on top of Oncoanalyser - that adapts its outputs to clinical and operational needs, adds missing analyses, - and ensures flexibility for evolving standards while retaining Oncoanalyser robust - core + description: Customizable post-processing and extension layer for Oncoanalyser that adapts + its outputs according to clinical and operational needs, adds missing analyses, and + ensures flexibility for evolving standards while retaining Oncoanalyser's robust core. 
force: false is_nfcore: false name: oncorefiner diff --git a/CHANGELOG.md b/CHANGELOG.md index 843b5eb..f51595e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,43 +9,45 @@ Initial release of Clinical-Genomics/oncorefiner, created with the [nf-core](htt ### `Added` -- Added Ensembl VEP annotation for SNV vcf file [#1](https://github.com/Clinical-Genomics/oncorefiner/pull/1) -- Added VCFANNO annotation for SNV vcf file [#2](https://github.com/Clinical-Genomics/oncorefiner/pull/2) -- Added filtering for SNV vcf file [#3](https://github.com/Clinical-Genomics/oncorefiner/pull/3) -- Added annotation for SV vcf file [#4](https://github.com/Clinical-Genomics/oncorefiner/pull/4) -- Added filtering for SV vcf file [#5](https://github.com/Clinical-Genomics/oncorefiner/pull/5) -- Added small test profile. The related test dataset have been added as a branch called oncorefiner under [Clinical-Genomics/test-datasets](https://github.com/Clinical-Genomics/test-datasets/tree/oncorefiner) [#8](https://github.com/Clinical-Genomics/oncorefiner/pull/8) -- Added CI checks for `Conventional PR title`, `Updated changelog` and `Add PR checklist comment` [#18](https://github.com/Clinical-Genomics/oncorefiner/pull/18) -- Added parameters documentation [#25](https://github.com/Clinical-Genomics/oncorefiner/pull/25) -- Added pre-commit hook for automatic generation of parameters documentation [#25](https://github.com/Clinical-Genomics/oncorefiner/pull/25) -- Added Nextflow strict syntax compatibility [#30](https://github.com/Clinical-Genomics/oncorefiner/pull/30) - -### Changed - -- Updated PR template, PR checklist, feature request template, bug report template and issue template chooser [#24](https://github.com/Clinical-Genomics/oncorefiner/pull/24) -- Updated nf-schema to 2.6.1 [#30](https://github.com/Clinical-Genomics/oncorefiner/pull/30) -- Updated minimum Nextflow version to 25.10.0 [#30](https://github.com/Clinical-Genomics/oncorefiner/pull/30) -- Added wgs-cancer-pipeline projects 
list in the issue templates [#37](https://github.com/Clinical-Genomics/oncorefiner/pull/37) +- [#1](https://github.com/Clinical-Genomics/oncorefiner/pull/1) Added Ensembl VEP annotation for SNV vcf file. +- [#2](https://github.com/Clinical-Genomics/oncorefiner/pull/2) Added VCFANNO annotation for SNV vcf file. +- [#3](https://github.com/Clinical-Genomics/oncorefiner/pull/3) Added filtering for SNV vcf file. +- [#4](https://github.com/Clinical-Genomics/oncorefiner/pull/4) Added annotation for SV vcf file. +- [#5](https://github.com/Clinical-Genomics/oncorefiner/pull/5) Added filtering for SV vcf file. +- [#8](https://github.com/Clinical-Genomics/oncorefiner/pull/8) Added small test profile. The related test dataset have been added as a branch called oncorefiner under [Clinical-Genomics/test-datasets](https://github.com/Clinical-Genomics/test-datasets/tree/oncorefiner). +- [#18](https://github.com/Clinical-Genomics/oncorefiner/pull/18) Added CI checks for `Conventional PR title`, `Updated changelog` and `Add PR checklist comment`. +- [#25](https://github.com/Clinical-Genomics/oncorefiner/pull/25) Added parameters documentation. +- [#25](https://github.com/Clinical-Genomics/oncorefiner/pull/25) Added pre-commit hook for automatic generation of parameters documentation. +- [#30](https://github.com/Clinical-Genomics/oncorefiner/pull/30) Added Nextflow strict syntax compatibility. + +### `Changed` + +- [#24](https://github.com/Clinical-Genomics/oncorefiner/pull/24) Updated PR template, PR checklist, feature request template, bug report template and issue template chooser. +- [#30](https://github.com/Clinical-Genomics/oncorefiner/pull/30) Updated nf-schema to 2.6.1. +- [#30](https://github.com/Clinical-Genomics/oncorefiner/pull/30) Updated minimum Nextflow version to 25.10.0. +- [#37](https://github.com/Clinical-Genomics/oncorefiner/pull/37) Added wgs-cancer-pipeline projects list in the issue templates. 
+- [#48](https://github.com/Clinical-Genomics/oncorefiner/pull/48) Updated documentation. ### `Fixed` -- Removed snv_vcf_tbi and sv_vcf_tbi parameter. VCF indexes are now automatically detected [#9](https://github.com/Clinical-Genomics/oncorefiner/pull/9) -- Renamed pipeline from postprocessing to oncorefiner []() -- Fixed linting issues [#20](https://github.com/Clinical-Genomics/oncorefiner/pull/20) -- Fixed nf-test to run a functional default test, and generated a snapshot [#26](https://github.com/Clinical-Genomics/oncorefiner/pull/26) -- Added missing description to bug_report.yml [32](https://github.com/Clinical-Genomics/oncorefiner/pull/32) -- Updated template settings to set organisation to `Clinical-Genomics` and skip unused features `igenomes` and `fastqc` [#35](https://github.com/Clinical-Genomics/oncorefiner/pull/35) -- Refactored `genome` parameter to have default value 'GRCh38' and no longer refer to igenomes [#35](https://github.com/Clinical-Genomics/oncorefiner/pull/35) -- Updated linting config to fix linting issues and re-added/removed checks for files where nf-core file structure is no longer required [#35](https://github.com/Clinical-Genomics/oncorefiner/pull/35) -- Updated template for nf-core/tools version 3.5.2 to apply updated settings and changes missed in previous template update ([14](https://github.com/Clinical-Genomics/oncorefiner/pull/14)) [#35](https://github.com/Clinical-Genomics/oncorefiner/pull/35) -- Fixed prepare_references config that was defined but not used [36](https://github.com/Clinical-Genomics/oncorefiner/pull/36) -- Fixed bug and formatting in feature request template [#39](https://github.com/Clinical-Genomics/oncorefiner/pull/39) -- Fixed merge mistake introduced in [#25](https://github.com/Clinical-Genomics/oncorefiner/pull/25) [#41](https://github.com/Clinical-Genomics/oncorefiner/pull/41) -- Added necessary GITHUB_TOKEN permissions for action add_pr_checklist_comment 
[#42](https://github.com/Clinical-Genomics/oncorefiner/pull/42) -- Updated all modules and removed deprecated `ch_versions` to implement latest nf-core changes that use the `versions` topic channel to collect software versions [#34](https://github.com/Clinical-Genomics/oncorefiner/pull/34) -- Fixed settings for `add_pr_checklist_comment` to allow action to run on a PR originated from a fork [#45](https://github.com/Clinical-Genomics/oncorefiner/pull/45) -- Added `species` parameter to provide information for annotation which was previously hardcoded [#49](https://github.com/Clinical-Genomics/oncorefiner/pull/49) -- Added settings and moved ungrouped parameters to relevant groups [#50](https://github.com/Clinical-Genomics/oncorefiner/pull/50) +- [#9](https://github.com/Clinical-Genomics/oncorefiner/pull/9) Removed snv_vcf_tbi and sv_vcf_tbi parameter. VCF indexes are now automatically detected. +- [#10](https://github.com/Clinical-Genomics/oncorefiner/pull/10) Renamed pipeline from postprocessing to oncorefiner. +- [#20](https://github.com/Clinical-Genomics/oncorefiner/pull/20) Fixed linting issues. +- [#26](https://github.com/Clinical-Genomics/oncorefiner/pull/26) Fixed nf-test to run a functional default test, and generated a snapshot. +- [#32](https://github.com/Clinical-Genomics/oncorefiner/pull/32) Added missing description in bug report template. +- [#35](https://github.com/Clinical-Genomics/oncorefiner/pull/35) Updated template settings to set organisation to `Clinical-Genomics` and skip unused features `igenomes` and `fastqc`. +- [#35](https://github.com/Clinical-Genomics/oncorefiner/pull/35) Refactored `genome` parameter to have default value 'GRCh38' and no longer refer to igenomes. +- [#35](https://github.com/Clinical-Genomics/oncorefiner/pull/35) Updated linting config to fix linting issues and re-added/removed checks for files where nf-core file structure is no longer required. 
+- [#35](https://github.com/Clinical-Genomics/oncorefiner/pull/35) Updated template for nf-core/tools version 3.5.2 to apply updated settings and changes missed in previous template update ([#14](https://github.com/Clinical-Genomics/oncorefiner/pull/14)). +- [36](https://github.com/Clinical-Genomics/oncorefiner/pull/36) Fixed prepare_references config that was defined but not used. +- [#39](https://github.com/Clinical-Genomics/oncorefiner/pull/39) Fixed bug and formatting in feature request template. +- [#41](https://github.com/Clinical-Genomics/oncorefiner/pull/41) Fixed merge mistake in `.nf-core.yml` introduced in previous PR ([#25](https://github.com/Clinical-Genomics/oncorefiner/pull/25)). +- [#42](https://github.com/Clinical-Genomics/oncorefiner/pull/42) Added necessary GITHUB_TOKEN permissions for action add_pr_checklist_comment. +- [#34](https://github.com/Clinical-Genomics/oncorefiner/pull/34) Updated all modules and removed deprecated `ch_versions` to implement latest nf-core changes that use the `versions` topic channel to collect software versions. +- [#45](https://github.com/Clinical-Genomics/oncorefiner/pull/45) Fixed settings for `add_pr_checklist_comment` to allow action to run on a PR originated from a fork. +- [#49](https://github.com/Clinical-Genomics/oncorefiner/pull/49) Added `species` parameter to provide information for annotation which was previously hardcoded. +- [#50](https://github.com/Clinical-Genomics/oncorefiner/pull/50) Added settings and moved ungrouped parameters to relevant groups. +- [#48](https://github.com/Clinical-Genomics/oncorefiner/pull/48) Updated documentation. 
### `Dependencies` @@ -53,5 +55,5 @@ Initial release of Clinical-Genomics/oncorefiner, created with the [nf-core](htt ### `Removed` -- Removed CI checks `awstest` and `awsfulltest` [#18](https://github.com/Clinical-Genomics/oncorefiner/pull/18) -- Removed unused parameter `custom_extra_files` [#51](https://github.com/Clinical-Genomics/oncorefiner/pull/51) +- [#18](https://github.com/Clinical-Genomics/oncorefiner/pull/18) Removed CI checks `awstest` and `awsfulltest`. +- [#51](https://github.com/Clinical-Genomics/oncorefiner/pull/51) Removed unused parameter `custom_extra_files`. diff --git a/CITATIONS.md b/CITATIONS.md index 2b87728..4676b71 100644 --- a/CITATIONS.md +++ b/CITATIONS.md @@ -1,18 +1,32 @@ # Clinical-Genomics/oncorefiner: Citations -## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/) +## Nextflow & nf-core -> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031. +- [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/) -## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/) + > Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311. -> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311. +- [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/) + + > Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031. 
## Pipeline tools +- [`bcftools`](https://github.com/samtools/bcftools) + + > Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. Twelve years of SAMtools and BCFtools. Gigascience. 2021 Jan 29;10(2):giab008. doi: 10.1093/gigascience/giab008. PubMed PMID: 33590861; PubMed Central PMCID: PMC7931819. + +- [`Ensembl VEP`](https://www.ensembl.org/info/docs/tools/vep/index.html) + + > McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P, Cunningham F. The Ensembl Variant Effect Predictor. Genome Biology. Jun 6;17(1):122. doi:10.1186/s13059-016-0974-4. PubMed PMID: 27268795; PMCID: PMC4893825. + - [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/) -> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924. + > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924. + +- [`Vcfanno`](https://github.com/brentp/vcfanno) + + > Pedersen BS, Layer RM, Quinlan AR. Vcfanno: fast, flexible annotation of genetic variants. Genome Biol. 2016 Jun 1;17(1):118. doi: 10.1186/s13059-016-0973-5. PMID: 27250555; PMCID: PMC4888505. 
## Software packaging/containerisation tools diff --git a/.github/CONTRIBUTING.md b/CONTRIBUTING.md similarity index 69% rename from .github/CONTRIBUTING.md rename to CONTRIBUTING.md index 2cf6f3c..06fbb2a 100644 --- a/.github/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -21,6 +21,46 @@ If you'd like to write some code for Clinical-Genomics/oncorefiner, the standard If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/). +### Pull Requests + +When opening a pull request to suggest changes to the code, please make sure to follow the [Pipeline contribution conventions](#pipeline-contribution-conventions) for the code and to fill in the necessary information in the pull request template as well as address all points in the `PR checklist`. + +#### PR title conventions + +We have implemented a standardised PR title format to make it easier to understand the type of change being proposed at a glance. 
+Additionally, there is an automated check for every PR that will only allow merging if the title adheres to one of the following formats: + +- feat: A new feature +- fix: A bug fix +- docs: Documentation only changes +- style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc) +- refactor: A code change that neither fixes a bug nor adds a feature +- perf: A code change that improves performance +- test: Adding missing tests or correcting existing tests +- build: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm) +- ci: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs) +- chore: Other changes that don't modify src or test files +- revert: Reverts a previous commit + +#### Review + +When reviewing a PR, make sure to check that: + +- The code follows the [Pipeline contribution conventions](#pipeline-contribution-conventions). +- The information in the PR (and related issue) is clear and sufficient to understand the change and the motivation for it - title, description and entry in `CHANGELOG.md`, if applicable. +- All the items in the `PR checklist` have been addressed, the changes are well documented and the tests are passing. + +Be positive and constructive in your review, and whenever possible offer suggestions for improvement rather than just pointing out issues. + +## Installation and dependencies for development + +In order to run the pipeline, develop and test your changes locally, we recommend that you set up: + +- A conda environment with `nextflow` and `nf-core` tools. - For this, follow the instructions from the [nf-core documentation](https://nf-co.re/docs/nf-core-tools/installation). Additional information about [Installation of nf-core dependencies](https://nf-co.re/docs/usage/getting_started/installation/) is also available, if needed. 
+- Install Docker (https://www.docker.com/products/docker-desktop/) and make sure the daemon is running when you want to run the tests locally. + +Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data. + ## Tests You have the option to test your changes locally by running the pipeline. For receiving warnings about process selectors and other `debug` information, it is recommended to use the debug profile. Execute all the tests with the following command: @@ -98,6 +138,13 @@ Please use the following naming schemes, to make it easy to understand what is g If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core pipelines bump-version --nextflow . [min-nf-version]` +### Update nf-core template + +Since this is not an nf-core pipeline, the nf-core template is not automatically updated in the `TEMPLATE` branch. Follow these steps to update the template: + +1. Update the `TEMPLATE` branch by running `nf-core pipelines sync`. Fix any merge conflicts and open a PR to then merge the changes. +1. Open a PR to merge the `TEMPLATE` branch into `dev` to update the template files in the main codebase. + ### Images and figures For overview images and other documents we follow the nf-core [style guidelines and examples](https://nf-co.re/developers/design_guidelines). diff --git a/README.md b/README.md index 8245dd3..4891199 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,12 @@ # Clinical-Genomics/oncorefiner +

+ + + nf-core/oncorefiner + +

+ [![Open in GitHub Codespaces](https://img.shields.io/badge/Open_In_GitHub_Codespaces-black?labelColor=grey&logo=github)](https://github.com/codespaces/new/Clinical-Genomics/oncorefiner) [![GitHub Actions CI Status](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/nf-test.yml/badge.svg)](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/nf-test.yml) [![GitHub Actions Linting Status](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/linting.yml/badge.svg)](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/linting.yml)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX) @@ -14,43 +21,65 @@ ## Introduction -**Clinical-Genomics/oncorefiner** is a bioinformatics pipeline that ... +**Clinical-Genomics/oncorefiner** is a customizable post-processing and extension layer for Oncoanalyser that adapts its outputs according to clinical and operational needs, adds missing analyses, and ensures flexibility for evolving standards while retaining Oncoanalyser's robust core. - +### Workflow diagram -2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/)) + +1. Process SNV VCF files + 1. Annotate with [`Vcfanno`](https://github.com/brentp/vcfanno) + 1. Filter according to call quality with [`bcftools`](https://github.com/samtools/bcftools) + 1. Filter according to user provided list of research relevant variant with [`bcftools`](https://github.com/samtools/bcftools) + 1. Annotate with [`Ensembl VEP`](https://www.ensembl.org/info/docs/tools/vep/index.html) + 1. Filter according to user provided list of clinically relevant variants with [`bcftools`](https://github.com/samtools/bcftools) + +1. Process SV VCF files + 1. SVDB QUERY ??? + 1. Filter according to call quality with ??? + 1. Filter according to user provided list of research relevant variant with [`bcftools`](https://github.com/samtools/bcftools) + 1. 
Annotate with [`Ensembl VEP`](https://www.ensembl.org/info/docs/tools/vep/index.html) + 1. Filter according to user provided list of clinically relevant variants with [`bcftools`](https://github.com/samtools/bcftools) + +1. Present QC for raw reads ([`MultiQC`](http://multiqc.info/)) + +### Summary of tools and version used in the pipeline + +| Step | Tool | Version | +| --------------------- | ------------- | ------- | +| Clinical Filtering | bcftools | 1.22 | +| Clinical Filtering SV | bcftools | 1.22 | +| EnsemblVEP SNV | ensemblvep | 115.2 | +| EnsemblVEP SNV | perl-math-cdf | 0.1 | +| EnsemblVEP SNV | tabix | 1.21 | +| EnsemblVEP SV | ensemblvep | 115.2 | +| EnsemblVEP SV | perl-math-cdf | 0.1 | +| EnsemblVEP SV | tabix | 1.21 | +| Research Filtering | bcftools | 1.22 | +| Research Filtering SV | bcftools | 1.22 | +| SVDB Query DB | svdb | 2.8.4 | +| Untar VEP Cache | untar | 1.34 | +| Vcfanno | vcfanno | 0.3.7 | ## Usage > [!NOTE] > If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data. - +Each row represents a fastq file (single-end with only `fastq_1`) or a pair of fastq files (paired end with `fastq_1` and `fastq_2`). Now, you can run the pipeline using: - - ```bash nextflow run Clinical-Genomics/oncorefiner \ -profile \ @@ -61,6 +90,8 @@ nextflow run Clinical-Genomics/oncorefiner \ > [!WARNING] > Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files). 
+For more details and further functionality, please refer to the [usage documentation](./docs/usage.md) and the [parameter documentation](./docs/parameters.md). + ## Pipeline output For more details about the output files and reports, please refer to the [output documentation](.github/docs/output.md). @@ -69,7 +100,13 @@ For more details about the output files and reports, please refer to the [output Clinical-Genomics/oncorefiner was originally written by Clinical Genomics Stockholm. -We thank the following people for their extensive assistance in the development of this pipeline: [Eva Caceres](https://github.com/fevac), [Kristine Bilgrav Sæther](https://github.com/kristinebilgrav) +We thank the following people for their extensive assistance in the development of this pipeline: + +- [Eva Caceres](https://github.com/fevac) +- [Kristine Bilgrav Sæther](https://github.com/kristinebilgrav) +- [Beatriz Sá Vinhas](https://github.com/beatrizsavinhas) +- [Mathias Johansson](https://github.com/mathiasbio) +- [Felix Lenner](https://github.com/fellen31) ## Contributions and Support @@ -80,8 +117,6 @@ If you would like to contribute to this pipeline, please see the [contributing g - - An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file. This pipeline uses code and infrastructure developed and maintained by the [nf-core](https://nf-co.re) community, reused here under the [MIT license](https://github.com/nf-core/tools/blob/main/LICENSE). diff --git a/docs/output.md b/docs/output.md index 2447844..0003a76 100644 --- a/docs/output.md +++ b/docs/output.md @@ -6,15 +6,30 @@ This document describes the output produced by the pipeline. Most of the plots a The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory. 
- - ## Pipeline overview The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps: + + - [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline - [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution + + +### Vcfanno + +
+Output files + +- `vcfanno/` + - `SNV_vcfanno.vcf.gz`: a gzipped VCF file containing annotated SNVs. + - `SNV_vcfanno.vcf.gz.tbi`: index file for the gzipped VCF file. + +
+ +[`Vcfanno`](https://github.com/brentp/vcfanno) annotates VCF files with a number of INFO fields from the VCFs or BED files provided. + ### MultiQC
diff --git a/docs/usage.md b/docs/usage.md index 4e866cb..227386a 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -4,8 +4,6 @@ ## Introduction - - ## Samplesheet input You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row as shown in the examples below. @@ -14,17 +12,6 @@ You will need to create a samplesheet with information about the samples you wou --input '[path to samplesheet file]' ``` -### Multiple runs of the same sample - -The `sample` identifiers have to be the same when you have re-sequenced the same sample more than once e.g. to increase sequencing depth. The pipeline will concatenate the raw reads before performing any downstream analysis. Below is an example for the same sample sequenced across 3 lanes: - -```csv title="samplesheet.csv" -sample,fastq_1,fastq_2 -CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz -CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz -CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz -``` - ### Full samplesheet The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you desire, however, there is a strict requirement for the first 3 columns to match those defined in the table below. @@ -69,6 +56,8 @@ work # Directory containing the nextflow working files # Other nextflow hidden files, eg. history of pipeline runs and old logs. ``` +Additionally, there are several parameters that can be used to customize the pipeline run. See [parameters documentation](../docs/parameters.md) for a full list of available parameters, their descriptions and formats. 
+ If you wish to repeatedly use the same parameters for multiple runs, rather than specifying each flag in the command, you can specify these in a params file. Pipeline settings can be provided in a `yaml` or `json` file via `-params-file `. diff --git a/ro-crate-metadata.json b/ro-crate-metadata.json index ed73984..8acbd11 100644 --- a/ro-crate-metadata.json +++ b/ro-crate-metadata.json @@ -23,7 +23,7 @@ "@type": "Dataset", "creativeWorkStatus": "InProgress", "datePublished": "2026-03-05T15:49:14+00:00", - "description": "# Clinical-Genomics/oncorefiner\n\n[![Open in GitHub Codespaces](https://img.shields.io/badge/Open_In_GitHub_Codespaces-black?labelColor=grey&logo=github)](https://github.com/codespaces/new/Clinical-Genomics/oncorefiner)\n[![GitHub Actions CI Status](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/nf-test.yml/badge.svg)](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/nf-test.yml)\n[![GitHub Actions Linting Status](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/linting.yml/badge.svg)](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/linting.yml)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)\n[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)\n\n[![Nextflow](https://img.shields.io/badge/version-%E2%89%A525.10.0-green?style=flat&logo=nextflow&logoColor=white&color=%230DC09D&link=https%3A%2F%2Fnextflow.io)](https://www.nextflow.io/)\n[![nf-core template version](https://img.shields.io/badge/nf--core_template-3.5.2-green?style=flat&logo=nfcore&logoColor=white&color=%2324B064&link=https%3A%2F%2Fnf-co.re)](https://github.com/nf-core/tools/releases/tag/3.5.2)\n[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)\n[![run with 
docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)\n[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)\n[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/Clinical-Genomics/oncorefiner)\n\n## Introduction\n\n**Clinical-Genomics/oncorefiner** is a bioinformatics pipeline that ...\n\n\n\n\n2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))\n\n## Usage\n\n> [!NOTE]\n> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.\n\n\n\nNow, you can run the pipeline using:\n\n\n\n```bash\nnextflow run Clinical-Genomics/oncorefiner \\\n -profile \\\n --input samplesheet.csv \\\n --outdir \n```\n\n> [!WARNING]\n> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. 
Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).\n\n## Pipeline output\n\nFor more details about the output files and reports, please refer to the [output documentation](.github/docs/output.md).\n\n## Credits\n\nClinical-Genomics/oncorefiner was originally written by Clinical Genomics Stockholm.\n\nWe thank the following people for their extensive assistance in the development of this pipeline: [Eva Caceres](https://github.com/fevac), [Kristine Bilgrav S\u00e6ther](https://github.com/kristinebilgrav)\n\n## Contributions and Support\n\nIf you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).\n\n## Citations\n\n\n\n\n\n\nAn extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.\n\nThis pipeline uses code and infrastructure developed and maintained by the [nf-core](https://nf-co.re) community, reused here under the [MIT license](https://github.com/nf-core/tools/blob/main/LICENSE).\n\n> **The nf-core framework for community-curated bioinformatics pipelines.**\n>\n> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.\n>\n> _Nat Biotechnol._ 2020 Feb 13. 
doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).\n", + "description": "# Clinical-Genomics/oncorefiner\n\n[![Open in GitHub Codespaces](https://img.shields.io/badge/Open_In_GitHub_Codespaces-black?labelColor=grey&logo=github)](https://github.com/codespaces/new/Clinical-Genomics/oncorefiner)\n[![GitHub Actions CI Status](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/nf-test.yml/badge.svg)](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/nf-test.yml)\n[![GitHub Actions Linting Status](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/linting.yml/badge.svg)](https://github.com/Clinical-Genomics/oncorefiner/actions/workflows/linting.yml)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)\n[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)\n\n[![Nextflow](https://img.shields.io/badge/version-%E2%89%A525.10.0-green?style=flat&logo=nextflow&logoColor=white&color=%230DC09D&link=https%3A%2F%2Fnextflow.io)](https://www.nextflow.io/)\n[![nf-core template version](https://img.shields.io/badge/nf--core_template-3.5.2-green?style=flat&logo=nfcore&logoColor=white&color=%2324B064&link=https%3A%2F%2Fnf-co.re)](https://github.com/nf-core/tools/releases/tag/3.5.2)\n[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)\n[![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)\n[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)\n[![Launch on Seqera 
Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/Clinical-Genomics/oncorefiner)\n\n## Introduction\n\n**Clinical-Genomics/oncorefiner** is a bioinformatics pipeline that ...\n\n\n\n\n2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))\n\n## Usage\n\n> [!NOTE]\n> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.\n\n\n\nNow, you can run the pipeline using:\n\n\n\n```bash\nnextflow run Clinical-Genomics/oncorefiner \\\n -profile \\\n --input samplesheet.csv \\\n --outdir \n```\n\n> [!WARNING]\n> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).\n\n## Pipeline output\n\nFor more details about the output files and reports, please refer to the [output documentation](.github/docs/output.md).\n\n## Credits\n\nClinical-Genomics/oncorefiner was originally written by Clinical Genomics Stockholm.\n\nWe thank the following people for their extensive assistance in the development of this pipeline: [Eva Caceres](https://github.com/fevac), [Kristine Bilgrav S\u00e6ther](https://github.com/kristinebilgrav), [Beatriz S\u00e1 Vinhas](https://github.com/beatrizsavinhas) and [Mathias Johansson](https://github.com/mathiasbio).\n\n## Contributions and Support\n\nIf you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).\n\n## Citations\n\n\n\n\n\n\nAn extensive list of references for the tools used by the 
pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.\n\nThis pipeline uses code and infrastructure developed and maintained by the [nf-core](https://nf-co.re) community, reused here under the [MIT license](https://github.com/nf-core/tools/blob/main/LICENSE).\n\n> **The nf-core framework for community-curated bioinformatics pipelines.**\n>\n> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.\n>\n> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).\n", "hasPart": [ { "@id": "main.nf"