Requirements for a General Nomenclature Validation Service
To validate inputs, and to validate the stated "resolution" of those inputs, we want PHYCuS to be able to reference a general nomenclature validation service.
Here is a list of functional requirements for such a service:
- ability to validate any HLA input string and determine whether it is “valid”
  - accept WHO nomenclature (HLA-A*01:01:01:01)
  - accept shortened names (HLA-A*01:01:01 or HLA-A*01:01)
  - accept shortened loci, i.e. names without the HLA- prefix (A*01:01:01:01)
  - accept MAC designations (HLA-A*01:AB)
  - accept GL strings (HLA-A*01:01:01:01/HLA-A*01:01:01:02+HLA-A*03:01:01:01)
  - accept G and P codes
  - accept WMDA codes (XXXX, NNNN, POS, NEG)
- ability to validate against a specific nomenclature version (e.g. 3.33.0) or against the “current” version (results for “current” may change over time)
- after establishing validity, the service should be able to validate that any HLA input string is at the stated resolution
  - 1,2,3,4-field
  - G, g-NMDP, g-DKMS
  - P (only with version)
g-NMDP is nucleotide sequence based: a 2-field version of the corresponding G-group.
g-DKMS is amino acid sequence based: a P-group with nulls put back in.
In most cases these are the same.
HLA-C*02:10 is an example where the two encodings differ. NMDP uses g-NMDP for frequency analysis but then converts to g-DKMS for the purpose of matching (C*02:10 is an ARD match to C*02:02).
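To make the input forms and the field-count resolution check above more concrete, here is a minimal Python sketch. It is purely syntactic: it does not consult any nomenclature database, so it cannot tell whether a well-formed name actually exists in a given version, and it does not attempt g-NMDP or g-DKMS. The function names (`classify`, `field_count`, `is_at_resolution`) and the returned labels are hypothetical, and treating the WMDA codes as bare tokens is an assumption.

```python
import re

# Optional "HLA-" prefix, then a locus such as A, B, C, DRB1, DQB1.
_LOCUS = r"(?:HLA-)?[A-Z][A-Z0-9]*"
_FIELD = r"\d{2,4}"

# Order matters: the more specific forms are tried before the plain allele form.
_PATTERNS = [
    ("g_group", re.compile(rf"{_LOCUS}\*{_FIELD}(?::{_FIELD}){{2}}G")),
    ("p_group", re.compile(rf"{_LOCUS}\*{_FIELD}:{_FIELD}P")),
    ("mac", re.compile(rf"{_LOCUS}\*{_FIELD}:[A-Z]{{2,}}")),
    # 1 to 4 numeric fields with an optional expression suffix (N, L, S, C, A, Q).
    ("allele", re.compile(rf"{_LOCUS}\*{_FIELD}(?::{_FIELD}){{0,3}}[NLSCAQ]?")),
]

# Assumption: WMDA codes arrive as bare tokens, as listed in the requirements.
_WMDA_CODES = {"XXXX", "NNNN", "POS", "NEG"}

# GL string delimiters: ^ (locus), | (genotype), + (gene copy),
# ~ (haplotype), / (allele ambiguity).
_GL_DELIMITERS = re.compile(r"[\^|+~/]")


def _classify_single(name):
    """Return the form label for one delimiter-free name, or None."""
    for label, pattern in _PATTERNS:
        if pattern.fullmatch(name):
            return label
    return None


def classify(text):
    """Return a label for the syntactic form of `text`, or None if malformed."""
    text = text.strip()
    if text in _WMDA_CODES:
        return "wmda_code"
    if _GL_DELIMITERS.search(text):
        parts = _GL_DELIMITERS.split(text)
        # Every component of a GL string must itself be well formed.
        if all(_classify_single(part) for part in parts):
            return "gl_string"
        return None
    return _classify_single(text)


def field_count(name):
    """Number of colon-separated fields after the '*' in a single name."""
    return name.split("*", 1)[1].count(":") + 1


def is_at_resolution(name, fields):
    """True if `name` is a plain allele name with exactly `fields` fields."""
    name = name.strip()
    return classify(name) == "allele" and field_count(name) == fields
```

With these definitions, `classify("HLA-A*01:AB")` returns `"mac"`, the GL string example from the list returns `"gl_string"`, and `is_at_resolution("HLA-A*01:01", 2)` is true. A real service would still need a versioned lookup behind each branch, since syntax alone cannot confirm that a name is valid in version 3.33.0 or in the current release.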
This could possibly be built quickly by making a swagger-ized version of pyARD (which calls MAC).
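As a rough illustration of what the HTTP surface of such a service might look like, here is a small sketch using only the Python standard library. Everything about the interface is an assumption: the `/validate` route, the `name` and `version` query parameters, and the JSON response shape are hypothetical, and the `validator` callable is injected so that a pyARD-backed function could be plugged in later without this sketch assuming anything about pyARD's API.

```python
import json
from urllib.parse import parse_qs


def make_app(validator):
    """Build a WSGI app for GET /validate?name=...&version=... (hypothetical route).

    `validator` is any callable taking (name, version) and returning a truthy
    value when the name is valid for that version.
    """

    def app(environ, start_response):
        # Note: parse_qs decodes "+" as a space, so GL strings (which use "+")
        # must be percent-encoded by the client ("+" sent as "%2B").
        params = parse_qs(environ.get("QUERY_STRING", ""))
        name = params.get("name", [""])[0]
        # Default to "current" when no version is pinned by the caller.
        version = params.get("version", ["current"])[0]

        if environ.get("PATH_INFO") != "/validate" or not name:
            status = "400 Bad Request"
            body = {"error": "expected /validate?name=<HLA string>"}
        else:
            status = "200 OK"
            body = {
                "name": name,
                "version": version,
                "valid": bool(validator(name, version)),
            }

        payload = json.dumps(body).encode("utf-8")
        headers = [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(payload))),
        ]
        start_response(status, headers)
        return [payload]

    return app
```

For local experiments the app can be served with `wsgiref.simple_server.make_server("", 8000, make_app(my_validator)).serve_forever()`, where `my_validator` is whatever validation function is available. An OpenAPI (Swagger) description of the route would sit alongside this; it is omitted here because the real parameter names have not been decided.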