Using covSpectrum's advanced queries, I've noticed that the Pango assignments that come from GISAID are quite often wrong. I think GISAID still uses pangoLEARN rather than UShER. They say they are using designation version 1.12.
In Poland, as many as 30% of sequences are misclassified as BA.2 even though they are actually BA.5. In Germany, around 5% are misclassified.
Is this due to Scorpio or pangoLEARN?
Something I noticed when looking at a sample of misassigned sequences is that many of them are missing the RBD, but that shouldn't stop pangoLEARN/Scorpio from being confident that (most of) these are true BA.5.
Here's the full list of sequences that GISAID calls BA.2* but that are BA.5* by Nextclade: https://lapis.cov-spectrum.org/gisaid/v1/sample/gisaid-epi-isl?region=Europe&dateFrom=2022-05-09&variantQuery=nextcladePangoLineage%3ABA.5*++%26+BA.2*&host=Human&accessKey=9Cb3CqmrFnVjO3XCxQLO6gUnKPd&orderBy=random
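For anyone who wants to vary the region or start date, here is a minimal Python sketch that rebuilds the same kind of LAPIS URL. The endpoint and parameter names are copied from the link above. The function name `build_mismatch_query` and the `<your-access-key>` placeholder are my own, purely for illustration. The sketch only constructs the URL and does not send a request. Note that `urlencode` escapes `*` as `%2A`, so the string differs slightly from the link above but decodes to the same query, apart from using a single space before the `&`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the link above.
LAPIS_BASE = "https://lapis.cov-spectrum.org/gisaid/v1/sample/gisaid-epi-isl"


def build_mismatch_query(
    region="Europe",
    date_from="2022-05-09",
    nextclade_lineage="BA.5*",
    gisaid_lineage="BA.2*",
    access_key="<your-access-key>",  # placeholder, supply your own key
):
    """Build a URL listing sequences where Nextclade and GISAID disagree.

    Hypothetical helper: it mirrors the parameters of the link above.
    """
    # Same shape as the variantQuery in the link: the Nextclade lineage
    # AND the (GISAID-provided) Pango lineage.
    variant_query = f"nextcladePangoLineage:{nextclade_lineage} & {gisaid_lineage}"
    params = {
        "region": region,
        "dateFrom": date_from,
        "variantQuery": variant_query,
        "host": "Human",
        "accessKey": access_key,
        "orderBy": "random",
    }
    return f"{LAPIS_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    url = build_mismatch_query()
    print(url)
    # Decode the query string again to check what will be sent.
    print(parse_qs(urlsplit(url).query)["variantQuery"][0])
```

Swapping `nextclade_lineage` and `gisaid_lineage` should give the opposite direction of disagreement, assuming the query syntax works the same way for other lineages.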
Here's a sample screenshot from Nextclade showing the RBD region:

Query: (https://cov-spectrum.org/explore/Europe/AllSamples/Past3M/variants?variantQuery=nextcladePangoLineage%3ABA.5*++%26+BA.2*&aaMutations1=S%3A346&pangoLineage1=BA.5*&)

