72 changes: 48 additions & 24 deletions data/nextstrain/collection.json
@@ -24,16 +24,12 @@
},
"dataset_order": [
"nextstrain/sars-cov-2/wuhan-hu-1/orfs",
"nextstrain/sars-cov-2/wuhan-hu-1/proteins",
"nextstrain/sars-cov-2/BA.2",
"nextstrain/sars-cov-2/XBB",
"nextstrain/sars-cov-2/BA.2.86",
"nextstrain/flu/h1n1pdm/ha/CY121680",
"nextstrain/flu/h1n1pdm/ha/MW626062",
"nextstrain/flu/h1n1pdm/na/MW626056",
"nextstrain/flu/h3n2/ha/CY163680",
"nextstrain/flu/h3n2/ha/EPI1857216",
"nextstrain/flu/h3n2/na/EPI1857215",
"nextstrain/flu/b/ha/KX058884",
"nextstrain/flu/b/na/CY073894",
"nextstrain/flu/vic/ha/KX058884",
"nextstrain/flu/vic/na/CY073894",
"nextstrain/flu/yam/ha/JN993010",
@@ -45,33 +41,61 @@
"nextstrain/mpox/lineage-b.1",
"nextstrain/orthoebolavirus/ebov",
"nextstrain/orthoebolavirus/sudv",
"nextstrain/measles/genome/WHO-2012",
"nextstrain/measles/N450/WHO-2012",
"nextstrain/dengue/all",
"nextstrain/yellow-fever/prM-E",
"nextstrain/hmpv/all-clades/NC_039199",
"nextstrain/herpes/vzv/NC_001348",
"nextstrain/rubella/E1",
"nextstrain/mumps/sh",
"nextstrain/mumps/genome",
"nextstrain/sars-cov-2/wuhan-hu-1/proteins",
"nextstrain/sars-cov-2/BA.2",
"nextstrain/sars-cov-2/XBB",
"nextstrain/sars-cov-2/BA.2.86",
"nextstrain/rubella/genome",
"nextstrain/flu/h3n2/pb2",
"nextstrain/flu/h3n2/pb1",
"nextstrain/flu/h3n2/pa",
"nextstrain/flu/h3n2/ha/CY163680",
"nextstrain/flu/h3n2/np",
"nextstrain/flu/h1n1pdm/pa",
"nextstrain/flu/h3n2/mp",
"nextstrain/flu/h3n2/ns",
"nextstrain/flu/h1n1pdm/pb2",
"nextstrain/flu/h1n1pdm/pb1",
"nextstrain/flu/h1n1pdm/pa",
"nextstrain/flu/h1n1pdm/ha/CY121680",
"nextstrain/flu/h1n1pdm/mp",
"nextstrain/flu/h1n1pdm/np",
"nextstrain/flu/h1n1pdm/ns",
"nextstrain/flu/h3n2/mp",
"nextstrain/flu/h3n2/pa",
"nextstrain/flu/h1n1pdm/pb2",
"nextstrain/flu/h1n1pdm/pb1",
"nextstrain/flu/h3n2/pb2",
"nextstrain/measles/genome/WHO-2012",
"nextstrain/measles/N450/WHO-2012",
"nextstrain/dengue/all",
"nextstrain/yellow-fever/prM-E",
"nextstrain/hmpv/all-clades/NC_039199",
"nextstrain/flu/vic/pa",
"nextstrain/flu/vic/pb1",
"nextstrain/flu/vic/pb2",
"nextstrain/flu/vic/pa",
"nextstrain/flu/vic/np",
"nextstrain/flu/vic/mp",
"nextstrain/flu/vic/pb2",
"nextstrain/flu/vic/ns",
"nextstrain/rubella/E1",
"nextstrain/herpes/vzv/NC_001348",
"nextstrain/mumps/sh",
"nextstrain/mumps/genome",
"nextstrain/rubella/genome"
"nextstrain/flu/b/pb1",
"nextstrain/flu/b/pb2",
"nextstrain/flu/b/pa",
"nextstrain/flu/b/np",
"nextstrain/flu/b/mp",
"nextstrain/flu/b/ns",
"nextstrain/flu/h1n1/pb2",
"nextstrain/flu/h1n1/pb1",
"nextstrain/flu/h1n1/pa",
"nextstrain/flu/h1n1/ha",
"nextstrain/flu/h1n1/np",
"nextstrain/flu/h1n1/na",
"nextstrain/flu/h1n1/mp",
"nextstrain/flu/h1n1/ns",
"nextstrain/flu/h2n2/pb2",
"nextstrain/flu/h2n2/pb1",
"nextstrain/flu/h2n2/pa",
"nextstrain/flu/h2n2/np",
"nextstrain/flu/h2n2/ha",
"nextstrain/flu/h2n2/na",
"nextstrain/flu/h2n2/mp",
"nextstrain/flu/h2n2/ns"
]
}
3 changes: 3 additions & 0 deletions data/nextstrain/flu/b/ha/KX058884/CHANGELOG.md
@@ -0,0 +1,3 @@
## Unreleased

Initial release
36 changes: 36 additions & 0 deletions data/nextstrain/flu/b/ha/KX058884/README.md
@@ -0,0 +1,36 @@
# Influenza B HA based on reference "B/Brisbane/60/2008"

| Key | Value |
| -------------------- | -------------------- |
| authors | [Richard Neher](https://neherlab.org), [Nextstrain](https://nextstrain.org) |
| name | Influenza B HA |
| reference | B/Brisbane/60/2008 |
| dataset path | flu/b/ha/KX058884 |
| reference accession | KX058884 |
| clade definitions | [github.com/influenza-clade-nomenclature/seasonal_B-Vic_HA/](https://github.com/influenza-clade-nomenclature/seasonal_B-Vic_HA/) |

This dataset encompasses all Influenza B viruses in humans and is based on the B/Brisbane/60/2008 reference sequence.

## Scope of this dataset
The reference sequence for this dataset predates the deletions at positions 162 and following in the HA1 protein, so amino acid positions follow the canonical HA1 numbering.
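
As an illustration of how this numbering relates to nucleotide coordinates, the sketch below converts between reference positions and codon numbers. It assumes the 1-based, inclusive gene coordinates listed in this dataset's `genome_annotation.gff3`; the helper names `codon_number` and `codon_range` are hypothetical and not part of Nextclade.

```python
# Gene coordinates copied from genome_annotation.gff3 (1-based, inclusive).
GENES = {"SigPep": (34, 78), "HA1": (79, 1119), "HA2": (1120, 1791)}


def codon_number(gene: str, nuc_pos: int) -> int:
    """Return the 1-based codon number within `gene` that contains `nuc_pos`."""
    start, end = GENES[gene]
    if not start <= nuc_pos <= end:
        raise ValueError(f"position {nuc_pos} is outside {gene} ({start}-{end})")
    return (nuc_pos - start) // 3 + 1


def codon_range(gene: str, codon: int) -> tuple:
    """Return the (first, last) reference positions of a 1-based codon in `gene`."""
    start, end = GENES[gene]
    first = start + (codon - 1) * 3
    if codon < 1 or first + 2 > end:
        raise ValueError(f"codon {codon} is outside {gene}")
    return first, first + 2


# HA1 codon 162 spans reference positions 562-564.
print(codon_range("HA1", 162))
```

Under these coordinates, HA1 covers 347 codons, and HA1 position 162 maps to nucleotides 562 to 564 of the reference.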

## Features
This dataset supports

* Assignment of sequences to Victoria and Yamagata lineages
* Assignment of legacy clade definitions for both lineages
* Assignment to clades and subclades based on the nomenclature defined in [github.com/influenza-clade-nomenclature/seasonal_B-Vic_HA/](https://github.com/influenza-clade-nomenclature/seasonal_B-Vic_HA/)
* Identification of glycosylation motifs
* Sequence QC
* Phylogenetic placement
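
To make the glycosylation feature concrete, here is a minimal sketch of a motif scan. It uses the pattern `N[^P][ST]` from the `aaMotifs` entry in this dataset's `pathogen.json`; the function name and the choice to report overlapping matches are assumptions for illustration and may differ from what Nextclade does internally.

```python
import re

# Same pattern as the "motifs" entry in pathogen.json, wrapped in a lookahead
# so that overlapping motifs (for example in "NNTS") are all reported.
GLYCOSYLATION = re.compile(r"(?=(N[^P][ST]))")


def find_glycosylation_motifs(protein: str) -> list:
    """Return (0-based start index, matched triplet) for every motif found."""
    return [(m.start(), m.group(1)) for m in GLYCOSYLATION.finditer(protein)]


# "NPS" is skipped because proline is not allowed in the middle position.
print(find_glycosylation_motifs("ANGTNPSNNTS"))
# → [(1, 'NGT'), (7, 'NNT'), (8, 'NTS')]
```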

## Clades of seasonal influenza viruses

In addition to clades, "subclades" are defined to break down diversity at higher resolution and to track the spread of different viral groups.
These follow a Pango-like nomenclature consisting of a letter followed by numbers separated by periods as in `A.3.2`.
The leading letter is an alias of a previous name.
Details of the nomenclature system can be found at [github.com/influenza-clade-nomenclature/seasonal_B-Vic_HA/](https://github.com/influenza-clade-nomenclature/seasonal_B-Vic_HA/).
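
The naming pattern can be sketched as a small parser. This assumes names consist of an uppercase alias followed by zero or more period-separated integers, as in `A.3.2`; the function names are hypothetical, and resolving an alias to its previous name needs the alias table from the nomenclature repository, which is not reproduced here.

```python
import re
from typing import Optional

# Uppercase alias, then zero or more ".<number>" components (assumed format).
SUBCLADE = re.compile(r"([A-Z]+)((?:\.[0-9]+)*)")


def parse_subclade(name: str) -> tuple:
    """Split a subclade name into (alias, tuple of integer components)."""
    match = SUBCLADE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid subclade name: {name!r}")
    numbers = tuple(int(part) for part in match.group(2).split(".") if part)
    return match.group(1), numbers


def parent_subclade(name: str) -> Optional[str]:
    """Return the name one level up, or None for a bare alias such as "A"."""
    alias, numbers = parse_subclade(name)
    if not numbers:
        return None  # the parent of a bare alias depends on the alias table
    return ".".join([alias, *map(str, numbers[:-1])])


print(parse_subclade("A.3.2"))   # → ('A', (3, 2))
print(parent_subclade("A.3.2"))  # → A.3
```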

## What is a Nextclade dataset?

Read more about Nextclade datasets in the Nextclade documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html
5 changes: 5 additions & 0 deletions data/nextstrain/flu/b/ha/KX058884/genome_annotation.gff3
@@ -0,0 +1,5 @@
##gff-version 3
##sequence-region KX058884.1 1 1885
KX058884.1 feature gene 34 78 . + . gene_name="SigPep"
KX058884.1 feature gene 79 1119 . + . gene_name="HA1"
KX058884.1 feature gene 1120 1791 . + . gene_name="HA2"
122 changes: 122 additions & 0 deletions data/nextstrain/flu/b/ha/KX058884/pathogen.json
@@ -0,0 +1,122 @@
{
"$schema": "https://raw.githubusercontent.com/nextstrain/nextclade/refs/heads/release/packages/nextclade-schemas/input-pathogen-json.schema.json",
"schemaVersion": "3.0.0",
"alignmentParams": {
"excessBandwidth": 9,
"terminalBandwidth": 100,
"allowedMismatches": 4,
"gapAlignmentSide": "right",
"minSeedCover": 0.1
},
"compatibility": {
"cli": "3.0.0-alpha.0",
"web": "3.0.0-alpha.0"
},
"defaultCds": "HA1",
"files": {
"changelog": "CHANGELOG.md",
"examples": "sequences.fasta",
"genomeAnnotation": "genome_annotation.gff3",
"pathogenJson": "pathogen.json",
"readme": "README.md",
"reference": "reference.fasta",
"treeJson": "tree.json"
},
"qc": {
"privateMutations": {
"enabled": true,
"typical": 5,
"cutoff": 15,
"weightLabeledSubstitutions": 2,
"weightReversionSubstitutions": 1,
"weightUnlabeledSubstitutions": 1
},
"missingData": {
"enabled": false,
"missingDataThreshold": 100,
"scoreBias": 10
},
"snpClusters": {
"enabled": false,
"windowSize": 100,
"clusterCutOff": 5,
"scoreWeight": 50
},
"mixedSites": {
"enabled": true,
"mixedSitesThreshold": 4
},
"frameShifts": {
"enabled": true
},
"stopCodons": {
"enabled": true,
"ignoredStopCodons": []
}
},
"cdsOrderPreference": [
"HA1",
"HA2"
],
"maintenance": {
"website": [
"https://nextstrain.org",
"https://clades.nextstrain.org"
],
"documentation": [
"https://github.com/nextstrain/seasonal-flu"
],
"source code": [
"https://github.com/nextstrain/seasonal-flu"
],
"issues": [
"https://github.com/nextstrain/seasonal-flu/issues"
],
"organizations": [
"Nextstrain"
],
"authors": [
"Nextstrain team <https://nextstrain.org>"
]
},
"nucMutLabelMap": {},
"nucMutLabelMapReverse": {},
"shortcuts": [
"flu_b_ha",
"nextstrain/flu/b",
"nextstrain/flu/b/ha",
"nextstrain/flu/b/ha/brisbane-60-2008"
],
"aaMotifs": [
{
"name": "glycosylation",
"nameShort": "Glyc.",
"nameFriendly": "Glycosylation",
"description": "N-linked glycosylation motifs (N-X-S/T with X any amino acid other than P)",
"includeCdses": [
{
"cds": "HA1",
"ranges": []
},
{
"cds": "HA2",
"ranges": [
{
"begin": 0,
"end": 186
}
]
}
],
"motifs": [
"N[^P][ST]"
]
}
],
"attributes": {
"name": "Influenza B (all) HA",
"segment": "ha",
"reference accession": "KX058884",
"reference name": "B/Brisbane/60/2008-egg"
}
}
28 changes: 28 additions & 0 deletions data/nextstrain/flu/b/ha/KX058884/reference.fasta
@@ -0,0 +1,28 @@
>KX058884.1 Influenza B virus (B/Brisbane/60/2008) segment 4 hemagglutinin (HA) gene, complete cds
AGCAGAAGCAGAGCATTTTCTAATATCCACAAAATGAAGGCAATAATTGTACTACTCATGGTAGTAACAT
CCAATGCAGATCGAATCTGCACTGGGATAACATCGTCAAACTCACCACATGTCGTCAAAACTGCTACTCA
AGGGGAGGTCAATGTGACTGGTGTAATACCACTGACAACAACACCCACCAAATCTCATTTTGCAAATCTC
AAAGGAACAGAAACCAGGGGGAAACTATGCCCAAAATGCCTCAACTGCACAGATCTGGACGTAGCCTTGG
GCAGACCAAAATGCACGGGGAAAATACCCTCGGCAAGAGTTTCAATACTCCATGAAGTCAGACCTGTTAC
ATCTGGGTGCTTTCCTATAATGCACGACAGAACAAAAATTAGACAGCTGCCTAACCTTCTCCGAGGATAC
GAACATATCAGGTTATCAACCCATAACGTTATCAATGCAGAAAATGCACCAGGAGGACCCTACAAAATTG
GAACCTCAGGGTCTTGCCCTAACATTACCAATGGAAACGGATTTTTCGCAACAATGGCTTGGGCCGTCCC
AAAAAACGACAAAAACAAAACAGCAACAAATCCATTAACAATAGAAGTACCATACATTTGTACAGAAGGA
GAAGACCAAATTACCGTTTGGGGGTTCCACTCTGACAACGAGGCCCAAATGGCAAAGCTCTATGGGGACT
CAAAGCCCCAGAAGTTCACCTCATCTGCCAACGGAGTGACCACACATTACGTTTCACAGATTGGTGGCTT
CCCAAATCAAACAGAAGACGGAGGACTACCACAAAGTGGTAGAATTGTTGTTGATTACATGGTGCAAAAA
TCTGGGAAAACAGGAACAATTACCTATCAAAGGGGTATTTTATTGCCTCAAAAGGTGTGGTGCGCAAGTG
GCAGGAGCAAGGTAATAAAAGGATCCTTGCCTTTAATTGGAGAAGCAGATTGCCTCCACGAAAAATACGG
TGGATTAAACAAAAGCAAGCCTTACTACACAGGGGAACATGCAAAGGCCATAGGAAATTGCCCAATATGG
GTGAAAACACCCTTGAAGCTGGCCAATGGAACCAAATATAGACCTCCTGCAAAACTATTAAAGGAAAGGG
GTTTCTTCGGAGCTATTGCTGGTTTCTTAGAAGGAGGATGGGAAGGAATGATTGCAGGTTGGCACGGATA
CACATCCCATGGGGCACATGGAGTAGCGGTGGCAGCAGACCTTAAGAGCACTCAAGAGGCCATAAACAAG
ATAACAAAAAATCTCAACTCTTTGAGTGAGCTGGAAGTAAAGAATCTTCAAAGACTAAGCGGTGCCATGG
ATGAACTCCACAACGAAATACTAGAACTAGATGAGAAAGTGGATGATCTCAGAGCTGATACAATAAGCTC
ACAAATAGAACTCGCAGTCCTGCTTTCCAATGAAGGAATAATAAACAGTGAAGATGAACATCTCTTGGCG
CTTGAAAGAAAGCTGAAGAAAATGCTGGGCCCCTCTGCTGTAGAGATAGGGAATGGATGCTTTGAAACCA
AACACAAGTGCAACCAGACCTGTCTCGACAGAATAGCTGCTGGTACCTTTGATGCAGGAGAATTTTCTCT
CCCCACCTTTGATTCACTGAATATTACTGCTGCATCTTTAAATGACGATGGATTGGATAATCATACTATA
CTGCTTTACTACTCAACTGCTGCCTCCAGTTTGGCTGTAACACTGATGATAGCTATCTTTGTTGTTTATA
TGGTCTCCAGAGACAATGTTTCTTGCTCCATCTGTCTATAAGGGAAGTTAAGCCCTGTATTTTCCTTTAT
TGTAGTGCTTGTTTACTTGTTGTCATTACAAAGAAACGTTATTGAAAAATGCTCTTGTTACTACT