Open
Labels: enhancement (New feature or request)
Description
Note: This idea originated from the bottom of #73 (comment)
Once PGEN files can store multiallelic dosages, we should consider creating a command called `dosage` that can append dosage info computed using haptools' Genotypes classes to an existing PGEN file:
- It can use `GenotypesPLINKTR` to compute the genotypes, sum them to compute the dosages (and possibly divide by 2 so that they remain in the range [0, 2]?) and then write them to the existing file using `PgenWriter.append_dosages_batch()`. It should only change the PGEN file. The PVAR and PSAM files can remain the same.
- If `--out` is specified, we should create a new PGEN file instead of overwriting the existing one. In that case, we should also copy (or symlink?) the PVAR and PSAM files.
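As a rough illustration of the "sum them to compute the dosages" step, here is a minimal NumPy sketch. It assumes the genotypes arrive as an integer array of shape (samples, variants, 2), and that the writer would want a variant-major float32 array; both layouts are assumptions made for the example and are not confirmed here. The function name `compute_dosages` and the `halve` flag are hypothetical, and missing genotypes and the actual call to `PgenWriter.append_dosages_batch()` are deliberately left out.

```python
import numpy as np


def compute_dosages(genotypes: np.ndarray, halve: bool = False) -> np.ndarray:
    """Sum the two allele values of each sample at each variant.

    genotypes: integer array, assumed shape (samples, variants, 2).
    halve: if True, divide each sum by 2 (the open question in this issue).
    Returns a float32 array of shape (variants, samples).
    """
    if genotypes.ndim != 3 or genotypes.shape[2] != 2:
        raise ValueError("expected an array of shape (samples, variants, 2)")
    # Accumulate in float32 so that small integer dtypes cannot overflow.
    dosages = genotypes.sum(axis=2, dtype=np.float32)
    if halve:
        dosages /= 2
    # Transpose to variant-major order (assumed to be what the writer expects).
    return np.ascontiguousarray(dosages.T)


# Toy data: 3 samples, 2 variants, 2 allele values each.
gts = np.array(
    [
        [[1, 2], [0, 0]],
        [[3, 3], [1, 0]],
        [[0, 2], [2, 2]],
    ],
    dtype=np.uint8,
)
dosages = compute_dosages(gts)
print(dosages)
```

With TR genotypes the summed values can exceed 2, so whether to rescale them (by 2 or otherwise) before writing is still the open question raised above.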
Once we do all of this, we should be able to use PLINK2 to analyze TRs! For example, we could do the following:
- Convert a TR VCF into a PGEN file:
  `plink2 --vcf-half-call m --make-pgen 'pvar-cols=vcfheader,qual,filter,info' --vcf input.vcf --out output`
- Run `haptools dosage` to add dosages to the PGEN file:
  `haptools dosage output.pgen`
- Use the PGEN file as input to `plink2`. For example, we could use it with `--r[2]-[un]phased`, `--ld-*`, `--pca`, `--glm`, `--freq`, `--clump`, or `--maf`/`--mac`. Quoting from the documentation: "Dosages are always used when present"
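The three steps could also be strung together from Python. The sketch below only builds and prints the command lines; it does not run anything. The `haptools dosage` subcommand is the one proposed in this issue and does not exist yet, the helper names are hypothetical, and `--freq` is just one arbitrary choice from the list of downstream options.

```python
import shlex


def plink2_convert_cmd(vcf: str, out_prefix: str) -> list[str]:
    """Step 1: convert a TR VCF into PGEN/PVAR/PSAM files."""
    return [
        "plink2", "--vcf-half-call", "m",
        "--make-pgen", "pvar-cols=vcfheader,qual,filter,info",
        "--vcf", vcf, "--out", out_prefix,
    ]


def haptools_dosage_cmd(pgen: str) -> list[str]:
    """Step 2: the proposed (not yet implemented) dosage command."""
    return ["haptools", "dosage", pgen]


def plink2_freq_cmd(prefix: str) -> list[str]:
    """Step 3: one example downstream analysis on the updated fileset."""
    return ["plink2", "--pfile", prefix, "--freq", "--out", prefix]


steps = [
    plink2_convert_cmd("input.vcf", "output"),
    haptools_dosage_cmd("output.pgen"),
    plink2_freq_cmd("output"),
]
for step in steps:
    # shlex.join quotes arguments only where the shell requires it.
    print(shlex.join(step))
```

Each list could be passed to `subprocess.run(step, check=True)` once the `dosage` command exists.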