Pangolin v1.12 as used by GISAID misclassifies many true BA.5* sequences as BA.2* #48

@corneliusroemer

Using CoV-Spectrum's advanced queries, I've noticed that the Pango assignments that come from GISAID are quite often wrong. I think GISAID still uses pangoLEARN as opposed to UShER. They say they are using designation version 1.12.

In Poland, as many as 30% of sequences are misclassified as BA.2 even though they are true BA.5. In Germany, around 5% are misclassified.

Is this due to Scorpio or pangoLEARN?

Something I noticed when looking at a sample of misassigned sequences is that many of them are missing the RBD, but that shouldn't stop pangoLEARN/Scorpio from being confident that (most of) these are true BA.5.

Here's the full list of sequences that GISAID calls BA.2* but that are BA.5* by Nextclade: https://lapis.cov-spectrum.org/gisaid/v1/sample/gisaid-epi-isl?region=Europe&dateFrom=2022-05-09&variantQuery=nextcladePangoLineage%3ABA.5*++%26+BA.2*&host=Human&accessKey=9Cb3CqmrFnVjO3XCxQLO6gUnKPd&orderBy=random
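For anyone who wants to reproduce this list for another region or date range, here is a minimal Python sketch that rebuilds the link above. The endpoint and parameter names are copied from that link. The function names, the defaults, and the `YOUR_ACCESS_KEY` placeholder are hypothetical. That a bare lineage term in `variantQuery` refers to the GISAID-provided Pango lineage is an assumption inferred from the link, not from the LAPIS documentation. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the link above.
LAPIS_ENDPOINT = "https://lapis.cov-spectrum.org/gisaid/v1/sample/gisaid-epi-isl"


def build_mismatch_url(nextclade_lineage, gisaid_lineage, access_key,
                       region="Europe", date_from="2022-05-09"):
    """Build a URL listing sequences where Nextclade and GISAID disagree.

    Assumption: a bare lineage term (e.g. "BA.2*") in variantQuery is matched
    against the GISAID-provided Pango lineage, as the link above suggests.
    """
    # The original link has two spaces before "&"; one space is used here.
    variant_query = f"nextcladePangoLineage:{nextclade_lineage} & {gisaid_lineage}"
    params = {
        "region": region,
        "dateFrom": date_from,
        "variantQuery": variant_query,
        "host": "Human",
        "accessKey": access_key,
        "orderBy": "random",
    }
    # urlencode percent-encodes ":", "&" and "*" and turns spaces into "+".
    return f"{LAPIS_ENDPOINT}?{urlencode(params)}"


def decode_params(url):
    """Decode a query URL back into a flat dict, handy for checking it."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


if __name__ == "__main__":
    # Placeholder key: supply your own access key.
    print(build_mismatch_url("BA.5*", "BA.2*", access_key="YOUR_ACCESS_KEY"))
```

Swapping the two lineage arguments would give the opposite direction (Nextclade BA.2* but GISAID BA.5*), which could serve as a rough check on whether the disagreement is one-sided.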

Here's a sample screenshot from Nextclade showing the RBD region:
image

Query: (https://cov-spectrum.org/explore/Europe/AllSamples/Past3M/variants?variantQuery=nextcladePangoLineage%3ABA.5*++%26+BA.2*&aaMutations1=S%3A346&pangoLineage1=BA.5*&)
image

image
