All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Support for Long-Read (LR) sequencing data (e.g., Oxford Nanopore) using minimap2 for alignment and indexing.
- Configurable READ_TYPE parameter (NGS vs LR) and SCORE_RATIO_CUTOFF in config.yaml.
- Refined co-infection detection logic using clade information from NextStrain.tsv to improve accuracy for long-read data.
- Updated Mapping.py and RSV_functions.py to handle single-end reads gracefully during QC and read binning.
- Modernized PDF and HTML report formats with a professional card-based layout and improved CSS.
- Enhanced visual hierarchy in reports using slate-blue color schemes and responsive design.
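
As an illustration of the READ_TYPE and SCORE_RATIO_CUTOFF settings listed above, the following is a minimal sketch of how such values might be checked once config.yaml has been loaded into a dictionary. The function name `validate_config` is hypothetical, and the rule that the cutoff must be a positive number is an assumption; the pipeline's actual validation and accepted range are not described in this changelog.

```python
# Hypothetical helper: not part of the pipeline's documented API.
VALID_READ_TYPES = ("NGS", "LR")  # the two values named in this changelog


def validate_config(cfg):
    """Return a list of problems found in a config mapping (empty if none)."""
    problems = []

    read_type = cfg.get("READ_TYPE")
    if read_type not in VALID_READ_TYPES:
        problems.append(f"READ_TYPE must be one of {VALID_READ_TYPES}, got {read_type!r}")

    cutoff = cfg.get("SCORE_RATIO_CUTOFF")
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_number = isinstance(cutoff, (int, float)) and not isinstance(cutoff, bool)
    # Assumption: the cutoff is a positive number; the real range may differ.
    if not is_number or cutoff <= 0:
        problems.append(f"SCORE_RATIO_CUTOFF must be a positive number, got {cutoff!r}")

    return problems
```

An empty list means both settings passed these illustrative checks.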
- Co-infection detection logic to identify multiple RSV strains (different subtypes or clades) in a single sample.
- Automated read binning using BWA and Samtools to separate co-infection components into distinct read sets.
- Component-specific assembly and genotyping pipelines, creating separate output directories (e.g., -comp1, -comp2).
- Real-time console notifications in the main pipeline when co-infections are identified.
- Highlighted and styled co-infection notifications in individual sample report sections.
- Enhanced visual hierarchy in reports using slate-blue color schemes and responsive design.
- Refactored Mapping.py to support multi-component processing and standardized read naming.
- Updated Report_functions.py to correctly identify references and subtypes for co-infection components by inspecting the reference directory.
- Silenced IQ-TREE 3 STDOUT by redirecting output to log files to keep the terminal output clean.
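
The output redirection mentioned above can be sketched in general terms as follows. This is an illustrative pattern using Python's standard `subprocess` module, not the pipeline's actual code; the helper name `run_quietly` and the choice to merge STDERR into the same log file are assumptions.

```python
import subprocess
from pathlib import Path


def run_quietly(cmd, log_path):
    """Run a command, writing its STDOUT and STDERR to a log file.

    Hypothetical helper: returns the command's exit code so the caller
    can decide how to react to a failure.
    """
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w") as log:
        result = subprocess.run(
            cmd,
            stdout=log,                # keep the terminal clean
            stderr=subprocess.STDOUT,  # assumption: errors go to the same log
            check=False,
        )
    return result.returncode
```

The command is passed as a list of arguments, so any external tool can be wrapped the same way, with its messages kept in the log file for later inspection.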
- Updated tree builder from FastTree to IQ-TREE 3 for more accurate phylogenetic analysis.
- Updated the QC calling function (qc_call) for more nuanced quality assessment.
- Improved PDF and HTML reports with more detailed QC status in summary tables.
- Refined QC calling logic to provide more concise "Good" status messages.
- Optimized PDF summary table column widths to prevent clipping and ensure all data fits within page margins.
- Resolved NameError in Mapping.py related to KMA logging variables.
- Resolved FileNotFoundError in reporting when processing co-infection components that utilized different reference genomes.
- Resolved ValueError in percentage formatting for the QC rate column in reports.
- Fixed logic to hide co-infection notifications for single-infection samples in the detailed report sections.
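
For context on the percentage-formatting fix above: a `ValueError` commonly arises in Python when a percent format is applied to a value that is not numeric, such as a string read from a table. The sketch below shows one defensive way to format such a column. The function name `format_qc_rate` and the `"NA"` placeholder are hypothetical, and the changelog does not state the actual cause of the error or how it was resolved.

```python
import math


def format_qc_rate(value, digits=1, missing="NA"):
    """Format a fraction (e.g. 0.95) as a percentage string.

    Hypothetical helper: non-numeric or missing values yield a
    placeholder instead of raising ValueError.
    """
    try:
        number = float(value)
    except (TypeError, ValueError):
        return missing
    if math.isnan(number):
        return missing
    return f"{number:.{digits}%}"
```

With this sketch, a numeric string such as `"0.5"` is converted before formatting, while text such as `"n/a"` falls back to the placeholder.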
- Updated the reference database for genotyping.
- Removed G-protein based genotyping to remain consistent with current research standards. More details can be found in the Nextclade data repository changelog.
- Implemented a version control function to track pipeline and database versions.
- Added two new mutations to the RSV-B F-protein mutation list: I64T+K65E and N208D.
- Initial release of the RSVrecon pipeline.