Add Nextclade workflow for Norovirus genotyping #6

@j23414

Context

There's a push to replace the Genome Detective hardcoded files with a Nextclade dataset, as suggested in this comment and this comment. The goal is to make genotyping more automated and more consistent.

Creating a Nextclade dataset could require:

  1. investigating clade-defining mutations or
  2. creating a guide tree based on the Genome Detective results (ORF1_type and ORF2_type) and augur traits.
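A minimal sketch of option 2, assuming a guide tree in Newick format and a metadata TSV that already carries the Genome Detective calls in columns named ORF1_type and ORF2_type. The file names and the helper function are hypothetical; the helper only assembles an `augur traits` command line and does not run it.

```python
import shlex


def augur_traits_command(tree, metadata,
                         columns=("ORF1_type", "ORF2_type"),
                         output="traits.json"):
    """Build the argument list for an `augur traits` call.

    Assumption: `metadata` has one row per tip of `tree` and contains
    the columns listed in `columns` (the Genome Detective results).
    """
    return [
        "augur", "traits",
        "--tree", str(tree),
        "--metadata", str(metadata),
        "--columns", *columns,
        "--output-node-data", str(output),
        "--confidence",  # also record per-node confidence of each inferred trait
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(augur_traits_command("tree.nwk", "metadata.tsv")))
```

The resulting node-data JSON would hold an inferred ORF1_type and ORF2_type for each internal node, which could then be used to label clades on the guide tree.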

Current situation:

A Nextclade dataset would greatly facilitate automated genotyping.

Potential next steps:

  • Build a guide tree based on CDC reference sequences and identified clades
    * [ ] Scraped and cleaned into CDC_references (3).xls (estimated labor: 3+ hours)
      * Note: due to recombination, genotype follows VP1 (capsid)
    * [ ] Pull capsid sequences and annotate headers into norovirus_cdc_reference.fasta.txt (estimated labor: 15 mins)
    * [ ] Optional: Pull reference sequences from Chhabra et al., 2019 Table 1 and dual nomenclature from Table 2

  • Compare results against other tools

  • If results are consistent across tools, identify clade-defining mutations

    • Check for homoplasies (mutation similarities that arise from convergent or parallel evolution rather than from common ancestry), which can result in inconsistent tree topologies
    • Mask any homoplasies that disagree with the collection date information
    • Check for important indels compared to the chosen reference (evaluate the reference)
    • Check if the root needs to be different than the reference in the guide tree
  • Develop Nextclade dataset files (reference sequence, pathogen.json, etc.)

  • Implement the Nextclade workflow

  • Implement rules that test the Nextclade dataset using example data
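As a rough illustration of the homoplasy check in the steps above, the sketch below flags any mutation that appears on more than one branch. It assumes per-branch mutations are available as a mapping from node name to a list of mutation strings, similar to the `muts` lists in the node-data JSON written by `augur ancestral`. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def find_homoplasy_candidates(branch_mutations):
    """Return mutations that occur on more than one branch.

    branch_mutations: dict mapping a node name to the list of mutations
    (e.g. "C250T") inferred on the branch leading to that node.
    The same mutation arising on several branches is a homoplasy
    candidate and may need masking before clades are defined.
    """
    branches_by_mutation = defaultdict(set)
    for branch, mutations in branch_mutations.items():
        for mutation in mutations:
            branches_by_mutation[mutation].add(branch)
    return {
        mutation: sorted(branches)
        for mutation, branches in sorted(branches_by_mutation.items())
        if len(branches) > 1
    }


# Toy data: C250T arises independently on two separate lineages.
toy = {
    "NODE_1": ["A100G"],
    "NODE_2": ["C250T", "G300A"],
    "NODE_5": ["C250T"],
    "NODE_6": ["T410C"],
}

if __name__ == "__main__":
    print(find_homoplasy_candidates(toy))
```

This only catches exact repeats of the same mutation; sites that mutate to different states on different branches, or that revert, would need a position-level count instead.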

Labels: enhancement
