This repository was archived by the owner on Oct 13, 2025. It is now read-only.

Specifications

Martin Maiers edited this page Jun 21, 2022 · 5 revisions

File format

Genotype format

CSV

  • ID: any identifier that is unique within the file
  • GL-STRING: a GL-String per the GL String Manuscript
  • POP_i: one or more population identifiers. ',' separates (up to) two population lists; ';' separates the identifiers within a list

ID,GL-STRING,POP1;POP2;POP3,POP4

This is an individual with two population identity strings:

  • one is a mix of POP1, POP2 and POP3
  • the other is POP4

Example:

355310045,A*33:03+A*68:01^B*50:01+B*57:04^C*06:02+C*18:01^DRB1*01:02+DRB1*13:02,AAFA,AAFA
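The row layout above can be read with a few lines of Python. This is a minimal sketch, not part of the specification: the names `GenotypeRecord` and `parse_genotype_line` are hypothetical, and it assumes that neither the ID nor the GL String contains a comma and that the file has no header row.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GenotypeRecord:
    """One row of the genotype CSV (hypothetical container, not from the spec)."""
    id: str
    gl_string: str
    populations: List[List[str]]  # up to two lists of population identifiers


def parse_genotype_line(line: str) -> GenotypeRecord:
    # Split off ID and GL-STRING; everything after the second comma is the
    # population field. Assumes neither ID nor GL String contains a comma.
    parts = line.strip().split(",", 2)
    if len(parts) != 3:
        raise ValueError(f"expected ID,GL-STRING,POP...: {line!r}")
    record_id, gl_string, pop_field = parts

    # ',' separates the (up to two) lists; ';' separates identifiers in a list.
    populations = [group.split(";") for group in pop_field.split(",")]
    if len(populations) > 2:
        raise ValueError(f"more than two population lists: {line!r}")

    return GenotypeRecord(record_id, gl_string, populations)
```

For the template row `ID,GL-STRING,POP1;POP2;POP3,POP4`, the `populations` field comes back as `[["POP1", "POP2", "POP3"], ["POP4"]]`, matching the two population identity strings described above.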

HPF format

CSV

  • haplotype: an HLA haplotype in GL-String format, with loci delimited by "~"
  • population: a population identifier
  • frequency: a floating point number

H,P,F

Example:

A*24:07~C*04:01~B*35:05~DRB3*03:01~DRB1*12:02~DQB1*03:01,FILII,0.04079961611346017
A*24:02~C*07:02~B*38:02~DRB5*01:01~DRB1*15:02~DQB1*05:02,FILII,0.034274561754028245
A*34:01~C*15:02~B*40:02~DRB5*01:01~DRB1*15:02~DQB1*05:02,FILII,0.027284337715666348
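An HPF file can be loaded with the standard `csv` module. The sketch below is illustrative only: `parse_hpf` is a hypothetical helper, and it assumes one record per line, no header row, and exactly three columns.

```python
import csv
import io
from typing import List, Tuple


def parse_hpf(text: str) -> List[Tuple[List[str], str, float]]:
    """Parse HPF rows into (alleles, population, frequency) tuples."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        haplotype, population, frequency = row
        # "~" delimits the alleles that make up the haplotype.
        rows.append((haplotype.split("~"), population, float(frequency)))
    return rows
```

Called on the first example record, this yields a six-allele list starting with `A*24:07`, the population `FILII`, and the frequency as a Python `float`.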

Metadata

TODO
