Mapping Output for Reference Identical WT Variants #53

@bencap

Description

Investigators sometimes report wild-type variation as multiple synonymous variants (p.Asp1=, p.Met2=, …, p.Glu=) and sometimes as p.=, which represents no change from the underlying reference across the whole sequence.

Currently, the mapping job ignores these p.= descriptions (strings with len == 3), but it handles the per-position synonymous descriptors fine. Ideally, we’d support both.
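For illustration, here is a minimal sketch of how the whole-sequence wild-type case could be recognized explicitly instead of being dropped by a length check. The function name `is_whole_sequence_wt` and the choice to also accept an accession prefix (as in NP_XXXX:p.=) are assumptions made for this example, not the mapper's current code.

```python
import re

# Hypothetical helper: matches "p.=" with an optional "<accession>:" prefix.
# The prefix handling is an assumption; the mapper may only ever see bare "p.=".
_WT_PATTERN = re.compile(r"^(?:[^:\s]+:)?p\.=$")


def is_whole_sequence_wt(hgvs_pro: str) -> bool:
    """Return True if the string describes no change across the whole protein."""
    return _WT_PATTERN.match(hgvs_pro.strip()) is not None
```

A check like this would let the job route p.= strings to dedicated handling while per-position descriptors such as p.Asp1= keep going through the existing path.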

It seems to me that there are two ways to represent this p.= syntax in VRS:

  • The first would be to convert the p.= string into a string that describes the variation at each position, such as p.[AA1=; AA2=; …; AAn=]. This has the benefit of being able to go through the vrs-python HGVS translator we are using in the mapping job, but it generates a CisPhasedBlock VRS representation that probably isn’t best practice.
  • The second is to use the ReferenceLengthExpression VRS object to describe the sequence state as having a length equal to the length of the SequenceReference and no repeatSubunits. This would be a much more compact object, but as far as I know we’d have to craft it by hand because the translator can’t handle the p.= string. I realize that supporting this syntax in the translator would be difficult, even if it were given as NP_XXXX:p.=, because the sequence length and the start/stop values for the SequenceLocation object would have to be inferred from the reference.
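To make the two options concrete, here is a rough sketch of each, written with plain strings and dicts rather than vrs-python classes. Both function names are hypothetical. The dict layout (field names such as `sequenceReference`, `refgetAccession`, and `length`) follows my reading of the VRS 2.x schema and should be checked against the version the mapper targets. The reference sequence and refget accession are assumed to be supplied by the caller, and the location is assumed to use inter-residue coordinates (start 0, end equal to the sequence length).

```python
# Three-letter codes for the 20 standard amino acids.
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def expand_wt_to_cis_hgvs(ref_seq: str) -> str:
    """Option 1 (hypothetical helper): expand p.= into one synonymous
    variant per residue, joined in a single bracketed cis expression."""
    parts = [f"{AA3[aa]}{pos}=" for pos, aa in enumerate(ref_seq, start=1)]
    return "p.[" + ";".join(parts) + "]"


def wt_as_reference_length_expression(refget_accession: str, ref_seq: str) -> dict:
    """Option 2 (hypothetical helper): hand-build an Allele-shaped dict whose
    state is a ReferenceLengthExpression spanning the whole reference."""
    n = len(ref_seq)
    return {
        "type": "Allele",
        "location": {
            "type": "SequenceLocation",
            "sequenceReference": {
                "type": "SequenceReference",
                "refgetAccession": refget_accession,
            },
            # Assumption: inter-residue coordinates covering the full sequence.
            "start": 0,
            "end": n,
        },
        "state": {
            "type": "ReferenceLengthExpression",
            # Length equals the reference length; no repeat subunit is set,
            # as described in the second bullet.
            "length": n,
        },
    }
```

For a three-residue reference "DME", the first helper yields p.[Asp1=;Met2=;Glu3=], which grows linearly with the protein length, while the second yields a fixed-size object regardless of length. That size difference is the main argument for the second option, at the cost of bypassing the translator.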

Labels: app: mapper (task implementation touches the mapper), type: enhancement (new feature or request)