How were anchor points in IMGT library rules assigned? #1930

moosa-r · 2025-04-16T22:02:59Z

Hello,

I’m trying to understand how the anchor points in https://github.com/repseqio/library-imgt/tree/master/rules were defined.

Using the same IMGT FASTA files, I tried to build custom libraries with mixcr buildLibrary and noticed that some genes have different anchor coordinates compared with the rule‑based library.

I was wondering whether the anchors in the rule set were derived directly from IMGT numbering/annotations and manually encoded in the JSON rule files, or whether they were computed by an alignment procedure.

Thank you for your time!

Best regards,
Moosa

mizraelson · 2025-04-24T13:56:27Z

Collaborator

Hi, the rules come from the gapped reference sequences introduced by IMGT. We have a script that parses these sequences, which is why we have those rules files. But when you create a standard MiXCR reference (using IMGT FASTA sequences, for example), we don't use gapped data; instead we infer the points using alignments, so they will differ. It's only the numbers that differ, though, as the actual position in the sequence will of course be the same.
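
As a rough illustration of what "inferring points using alignments" can mean, here is a minimal Python sketch that transfers one anchor from an annotated reference onto a new sequence through a global alignment. It is not MiXCR's actual procedure, which this thread does not describe: the Needleman-Wunsch scoring values, the function names `global_align` and `transfer_anchor`, and the toy sequences are all assumptions made for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns both strings with '-' as gap."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def transfer_anchor(ref, ref_anchor, query):
    """Map a 0-based boundary position on `ref` to the matching one on `query`."""
    aligned_ref, aligned_query = global_align(ref, query)
    ref_pos = query_pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        if ref_pos == ref_anchor:
            return query_pos
        if r != "-":
            ref_pos += 1
        if q != "-":
            query_pos += 1
    return query_pos  # anchor sits at the very end of the reference


# Toy sequences (not real IMGT data): the query lacks the first reference base.
ref, query = "ATGCCGTTA", "TGCCGTTA"
anchor_on_query = transfer_anchor(ref, 3, query)
print(anchor_on_query, query[anchor_on_query:anchor_on_query + 3])
```

In this toy case the anchor number changes from 3 to 2, but it still points at the same `CCG` triplet, which is the sense in which only the numbers differ.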

moosa-r · 2025-04-25T11:37:25Z

Author

Hello @mizraelson,

Thank you for clarifying! Just to confirm that I’ve understood correctly:

To build the MiXCR references in repseqio/library-imgt:

1. Start with IMGT’s gapped FASTA files. IMGT provides sequences padded with gap characters so that every FR/CDR anchor falls at the same IMGT-numbered position across species for a given gene type (V, D, or J). The rules_*.json files hard-code those anchor indexes relative to the padded sequences.
2. Infer anchors on ungapped sequences. You strip out the gaps and compute where each hard-coded anchor maps onto the gapless sequences.
3. The resulting ungapped FASTA entries plus the newly inferred anchorPoints are compiled into the MiXCR IMGT reference library.
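
The gap-stripping step in this summary can be sketched in Python as below. This is only an illustration of the coordinate arithmetic, not the repseqio script itself: the function name `remap_anchors`, the toy sequence, and the anchor indexes are made up, and it assumes `.` is the padding character and that anchors are 0-based boundary positions.

```python
def remap_anchors(gapped_seq, gapped_anchors, gap_chars="."):
    """Strip gaps and translate anchor indexes to ungapped coordinates.

    An anchor at index i is a boundary sitting immediately before
    gapped_seq[i]; its ungapped index is the number of non-gap
    characters that precede it.
    """
    ungapped = []
    # prefix[i] = count of non-gap characters in gapped_seq[:i]
    prefix = [0]
    for ch in gapped_seq:
        if ch not in gap_chars:
            ungapped.append(ch)
        prefix.append(len(ungapped))
    remapped = {name: prefix[pos] for name, pos in gapped_anchors.items()}
    return "".join(ungapped), remapped


# Toy data only: not a real IMGT entry and not values from a real rules file.
gapped = "ATG...CCG..TTA"
anchors = {"FR1Begin": 0, "CDR1Begin": 6, "FR2Begin": 11}

seq, points = remap_anchors(gapped, anchors)
print(seq)     # ATGCCGTTA
print(points)  # {'FR1Begin': 0, 'CDR1Begin': 3, 'FR2Begin': 6}
```

The same padded index (6 for `CDR1Begin` here) would be shared by every padded sequence, while the ungapped index depends on how many gap characters precede it in each individual sequence.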

Is that an accurate summary? Please let me know if I’ve misunderstood anything.

Best,
Moosa

mizraelson Apr 28, 2025
Collaborator

Yes, but you don't need to compile the IMGT library, as we already have it in the repseqio repository.

Answer selected by moosa-r

moosa-r May 26, 2025
Author

I see. I just wanted to understand how that reference library was produced. Thank you for your time and help.

NathaneilKnight Oct 28, 2025

> Yes, but you don't need to compile the IMGT library, as we already have it in the repseqio repository.

Thanks for clarifying that the IMGT library is already available in the repseqio repository.

My understanding is that the IMGT snapshot in the repseqio repo is from 2022, and IMGT has added/updated a number of species and alleles since then. Could you share the exact commands or a tutorial for rebuilding a MiXCR reference from the latest IMGT release?

When I try repseqio fromPaddedFasta, the resulting JSON doesn’t seem to include anchor points.

It also looks like MiXCR buildLibrary can’t read padded FASTA directly. Because chicken doesn’t have a very close relative in the default MiXCR references, I’m concerned that inferring anchors from ungapped FASTA could be inaccurate. What is the correct workflow to avoid anchor‑point errors for Gallus gallus?

Any pointers, examples, or an updated repository snapshot would be greatly appreciated.

mizraelson Oct 28, 2025
Collaborator

MiXCR has a built-in reference for chicken IGH genes. Do you work with TCR or BCR?

NathaneilKnight Oct 29, 2025

> MiXCR has a built-in reference for chicken IGH genes. Do you work with TCR or BCR?

I'm working with BCR.
