How were the anchor points in the IMGT library rules assigned? #1930
Hello, I’m trying to understand how the anchor points in https://github.com/repseqio/library-imgt/tree/master/rules were defined. Using the same IMGT FASTA files, I tried to build custom libraries with mixcr buildLibrary and noticed that some genes have different anchor coordinates compared with the rule-based library. Were the anchors in the rule set derived directly from IMGT numbering/annotations and manually encoded in the JSON rule files, or were they computed by an alignment procedure? Thank you for your time! Best regards,
Hi, the rules come from the gapped reference sequences introduced by IMGT. We have a script that parses these sequences, which is why we have those rules files. But when you create a standard MiXCR reference (using IMGT FASTA sequences, for example), we don't use the gapped data; instead we infer the points using alignments, so they will differ. Only the numbers differ, though: the actual position in the sequence will be the same, of course.
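To illustrate why the same anchor can carry different numbers, here is a minimal Python sketch, assuming that gaps are written as `.` (or `-`) and that positions are 0-based offsets. The function name `gapped_to_ungapped` and the toy sequence are hypothetical; this is not the script used to generate the rules files, nor MiXCR's alignment-based inference.

```python
def gapped_to_ungapped(gapped_seq: str, gapped_pos: int, gap_chars: str = ".-") -> int:
    """Map a 0-based offset in a gapped sequence to the offset in the
    same sequence with all gap characters removed."""
    if not 0 <= gapped_pos <= len(gapped_seq):
        raise ValueError("position outside the gapped sequence")
    # Count the real (non-gap) characters that precede the position.
    return sum(1 for c in gapped_seq[:gapped_pos] if c not in gap_chars)


# Toy sequence for illustration only, not a real IMGT record.
gapped = "CAG...GTG.CAGCTG"
ungapped = gapped.replace(".", "")

gapped_anchor = 10  # offset in gapped coordinates
ungapped_anchor = gapped_to_ungapped(gapped, gapped_anchor)

# The two numbers differ, but they point at the same nucleotide.
print(gapped_anchor, ungapped_anchor)
print(gapped[gapped_anchor], ungapped[ungapped_anchor])
```

In this toy case the anchor is offset 10 in the gapped sequence and offset 6 in the ungapped one, and both refer to the same base.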
Hello @mizraelson, thank you for clarifying! Just to confirm that I’ve understood correctly: To build the MiXCR references in repseqio/library-imgt:
Is that an accurate summary? Please let me know if I’ve misunderstood anything. Best,
Yes, but you don't need to compile the IMGT library, as we already have it in the repseqio repository.