Gene Fusions #1506
Description
Some labs report on somatic gene fusions. We are currently ignoring these.
The problem is, classifications are very "variant centric" - or "allele centric" at least. VariantGrid is about variants, and it would be a large amount of work to decouple classifications from variants. It is probably least disruptive to keep fusions VCF/variant-like, so we can run them through the VCF import pipelines, export as VCF etc.
A fusion is inherently a two-locus event, while we make a lot of assumptions about single-locus. It may be possible to represent a gene fusion using VCF breakend (BND) notation, which exists specifically for structural variants, including fusions. A fusion is modelled as a pair of breakends, one on each partner gene:
# Chromosome 9, ABL1 breakend
9 133729451 bnd_1 N ]22:23632600]N . PASS SVTYPE=BND;MATEID=bnd_2
# Chromosome 22, BCR breakend
22 23632600 bnd_2 N N[9:133729451[ . PASS SVTYPE=BND;MATEID=bnd_1
The bracket notation encodes the orientation/strand of each side of the join. So a fusion is essentially two linked variant records.
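To make the bracket notation concrete, here is a minimal Python sketch that pulls the mate locus and orientation out of a breakend ALT string. It follows the four ALT forms in the VCF spec (`t[p[`, `t]p]`, `]p]t`, `[p[t`). The names `Breakend` and `parse_bnd_alt` are made up for illustration and are not existing VariantGrid code, and the regex is a simplification that assumes contig names contain no `:` or brackets.

```python
import re
from typing import NamedTuple

# Four breakend ALT forms from the VCF spec: t[p[, t]p], ]p]t, [p[t
# Simplification: contig names are assumed not to contain ':' or brackets.
BND_ALT = re.compile(
    r"^(?P<lead>[ACGTNacgtn.]*)"
    r"(?P<bracket>[\[\]])(?P<chrom>[^:\[\]]+):(?P<pos>\d+)(?P=bracket)"
    r"(?P<trail>[ACGTNacgtn.]*)$"
)


class Breakend(NamedTuple):
    mate_chrom: str
    mate_pos: int
    mate_extends_right: bool  # '[' means the mate piece extends right of mate_pos
    joined_after_ref: bool    # True for t[p[ and t]p], False for ]p]t and [p[t


def parse_bnd_alt(alt: str) -> Breakend:
    """Parse a breakend ALT such as 'N[9:133729451[' (hypothetical helper)."""
    m = BND_ALT.match(alt)
    # Exactly one of the leading / trailing base strings must be present.
    if m is None or bool(m["lead"]) == bool(m["trail"]):
        raise ValueError(f"Not a breakend ALT: {alt!r}")
    return Breakend(
        mate_chrom=m["chrom"],
        mate_pos=int(m["pos"]),
        mate_extends_right=(m["bracket"] == "["),
        joined_after_ref=bool(m["lead"]),
    )
```

Each of the two linked records would parse to the other record's locus, which is what lets them be paired up (alongside MATEID).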
At the moment we don't support breakends, and a classification has to link to exactly 1 variant / allele (a fusion as a breakend pair would need to link 2).
The other alternative is a bit of a hack where we just insert against 1 end and keep track of the fusion etc via INFO fields or whatever. In some ways, if we can resolve 2 fusions to the same "variant" that's all that matters for discordance. Any range/overlap code would need to take this into account and ignore these.
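A rough sketch of the single-anchor hack, assuming both breakpoints are known: sort the two ends into a canonical order, insert against the first, and record the mate in INFO. Everything here is hypothetical: the `<FUSION>` symbolic ALT and the `FUSION` / `FUSION_MATE` INFO keys are invented for illustration and are not part of the VCF spec or VariantGrid.

```python
from typing import Tuple

End = Tuple[str, int]  # (chrom, pos)


def _chrom_sort_key(chrom: str):
    # Numeric contigs first in numeric order, then everything else alphabetically.
    c = chrom[3:] if chrom.startswith("chr") else chrom
    return (0, int(c), "") if c.isdigit() else (1, 0, c)


def canonical_ends(a: End, b: End) -> Tuple[End, End]:
    """Order the two ends so the same fusion always gets the same anchor."""
    first, second = sorted((a, b), key=lambda e: (_chrom_sort_key(e[0]), e[1]))
    return first, second


def anchor_record(a: End, b: End, genes: str) -> str:
    """Build one VCF-like line anchored on the canonical first end.

    FUSION / FUSION_MATE and the <FUSION> ALT are made-up names.
    """
    (chrom, pos), (mate_chrom, mate_pos) = canonical_ends(a, b)
    info = f"SVTYPE=BND;FUSION={genes};FUSION_MATE={mate_chrom}:{mate_pos}"
    return "\t".join([chrom, str(pos), ".", "N", "<FUSION>", ".", "PASS", info])
```

With this, two labs reporting the same pair of breakpoints in either order would resolve to the same record. Note that the sort throws away which end is the 5' partner, so that only survives in the gene string.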
The gene fusions listed are like GENEA-GENEB - do they even have base-level coordinates? If some are base-level and some aren't, how do we check if they are discordant? We need to define what "are 2 gene fusions the same" even means.
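One possible definition, purely as a starting point for discussion: compare the ordered gene partners first, and only compare breakpoints when both sides actually have them. The `Fusion` class and the result labels below are hypothetical, not existing VariantGrid code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

End = Tuple[str, int]  # (chrom, pos)


@dataclass(frozen=True)
class Fusion:
    gene5: str  # 5' partner, e.g. GENEA in GENEA-GENEB
    gene3: str  # 3' partner
    breakpoints: Optional[Tuple[End, End]] = None  # None = gene-level only


def compare_fusions(a: Fusion, b: Fusion) -> str:
    """Return a coarse match level between two reported fusions."""
    # Partner order matters: GENEA-GENEB is not the same as GENEB-GENEA.
    if (a.gene5.upper(), a.gene3.upper()) != (b.gene5.upper(), b.gene3.upper()):
        return "different"
    # If either side is gene-level only, we can only say the genes match.
    if a.breakpoints is None or b.breakpoints is None:
        return "same_genes"
    if a.breakpoints == b.breakpoints:
        return "same_breakpoints"
    return "same_genes_different_breakpoints"
```

Under this definition, discordance could be evaluated at the "same_genes" level or stricter, depending on what we decide counts as the same fusion. Parsing names like GENEA-GENEB is deliberately left out, since some gene symbols contain hyphens themselves.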