This repository was archived by the owner on Oct 13, 2025. It is now read-only.

Requirements for a General Nomenclature Validation Service

Martin Maiers edited this page Jul 16, 2018 · 8 revisions

Background

In order to provide validation of inputs and validation of "resolution" of inputs, we want PHYCuS to be able to reference a general nomenclature validation service.

Requirements

Here is a list of functional requirements for such a service:

  • ability to validate any HLA input string and determine whether it is “valid”

    • accept WHO nomenclature (HLA-A*01:01:01:01)
    • accept shortening of names (HLA-A*01:01:01 or HLA-A*01:01)
    • accept shortening of loci (A*01:01:01:01)
    • accept MAC designations (HLA-A*01:AB)
    • accept GL strings (HLA-A*01:01:01:01/HLA-A*01:01:01:02+HLA-A*03:01:01:01)
    • accept G and P codes
    • accept WMDA codes (XXXX, NNNN, POS, NEG)
  • ability to validate with version (e.g. 3.33.0) or “current” version (result may change over time)

  • once validity is established, ability to confirm that any HLA input string is at the stated resolution

    • 1,2,3,4-field
    • G, g-NMDP, g-DKMS
    • P (only with version)
    • Serology

g-NMDP is nucleotide sequence based: a 2-field version of the corresponding G-group.

g-DKMS is amino acid sequence based: a P-group with nulls put back in.

In most cases these are the same.

HLA-C*02:10 is an example where the encoding is different. NMDP uses g-NMDP for frequency analysis but then converts to g-DKMS for the purpose of matching (C*02:10 is an ARD match to C*02:02).
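
The purely syntactic part of the requirements above (recognising the accepted input forms and counting fields) could be sketched roughly as follows. This is an illustrative sketch only: the function names `classify` and `field_count` and the regular expressions are assumptions made for this example, not part of PHYCuS or pyARD. It checks the shape of a string only; it does not confirm that an allele exists in a given nomenclature version, and it does not cover serology or the g-NMDP/g-DKMS groupings, all of which would need reference data.

```python
import re

# Hypothetical patterns for illustration; they test shape, not database membership.
_LOCUS = r"(?:HLA-)?(?P<locus>[A-Z]+[0-9]*)\*"          # optional "HLA-" prefix
_ALLELE = re.compile(_LOCUS + r"(?P<fields>\d{2,3}(?::\d{2,3}){0,3})[NLSCAQ]?")
_GROUP = re.compile(_LOCUS + r"\d{2,3}:\d{2,3}(?::\d{2,3})?(?P<kind>[GP])")
_MAC = re.compile(_LOCUS + r"\d{2,3}:[A-Z]{2,}")
_WMDA = {"XXXX", "NNNN", "POS", "NEG"}
_GL_DELIMS = re.compile(r"[\^|+~/]")                     # GL string operators


def classify(text):
    """Return 'WMDA', 'GL', 'G', 'P', 'allele', 'MAC', or None if unrecognised."""
    s = text.strip()
    if s in _WMDA:
        return "WMDA"
    if _GL_DELIMS.search(s):
        # A GL string is accepted only if every component is itself recognised.
        parts = _GL_DELIMS.split(s)
        return "GL" if all(classify(p) is not None for p in parts) else None
    group = _GROUP.fullmatch(s)
    if group:
        return group.group("kind")
    if _ALLELE.fullmatch(s):
        return "allele"
    if _MAC.fullmatch(s):
        return "MAC"
    return None


def field_count(text):
    """Return the number of colon-separated fields (1-4) of an allele name, else None."""
    m = _ALLELE.fullmatch(text.strip())
    return len(m.group("fields").split(":")) if m else None
```

For example, `classify("HLA-A*01:AB")` returns `"MAC"` and `field_count("A*01:01:01")` returns `3`. A real service would follow this syntactic pass with a lookup against the requested nomenclature version.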

Discussion

This could possibly be built quickly by making a swagger-ized version of pyARD (which calls MAC).
