Description
Is your feature request related to a problem? Please describe.
snp_cor, used in the calculateLD.R script, can calculate correlations between SNPs using either physical position or genetic distance. Currently the script is built around genetic distance in centimorgans, which requires genetic maps.
containers/scripts/pgs/LDpred2/calculateLD.R
Lines 84 to 85 in df104cc
GD <- snp_asGeneticPos(CHR, POS, dir=dirGeneticMaps, type=argGeneticMapsType, ncores=NCORES)
if (is.null(GD)) stop('Genetic distance is not available')
containers/scripts/pgs/LDpred2/calculateLD.R
Lines 132 to 133 in df104cc
corr0 <- snp_cor(G, ind.col=indices.G, ind.row=individualSample, size=argWindowSize/1000,
                 infos.pos=GD[indices.G], ncores=NCORES, thr_r2=argThresholdR2)
A separate issue: I've seen warnings when running this script. They can be caused by missing files, as in these lines:
containers/scripts/pgs/LDpred2/calculateLD.R
Lines 128 to 131 in df104cc
if (nDataPoints == 0) {
  warning('\nSkipping chromosome ', chr,'. Reason: 0 SNPs available\n')
  next
}
This could be fixed by appending warnings() to the end of the script, but it would be better if the warnings were printed immediately. I guess there is an option that can be set at the start of the script to make that happen.
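One candidate for such an option is base R's `options(warn = 1)`, which prints each warning when it is raised instead of deferring it until the top-level call (here, the whole chromosome loop) returns. A minimal sketch, where the SNP counts are hypothetical stand-ins for the real data:

```r
# Print each warning as soon as it is raised (the default, warn = 0, defers them)
options(warn = 1)

processed <- integer(0)
for (chr in 1:3) {
  # Hypothetical SNP counts for illustration: chromosome 2 has none
  nDataPoints <- if (chr == 2) 0 else 10
  if (nDataPoints == 0) {
    warning('Skipping chromosome ', chr, '. Reason: 0 SNPs available')
    next
  }
  processed <- c(processed, chr)
}
```

An alternative that leaves the global option untouched is passing `immediate. = TRUE` to the individual `warning()` call.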
Describe the solution you'd like
Make it possible to calculate LD using physical position instead. It's probably best to replace --arg-window-size with --mode-distance-physical and --mode-distance-genetic, each taking an optional argument (the distance in base pairs or centimorgans, respectively). Specifying exactly one of them should be required.
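A rough sketch of how the two proposed flags could be validated, using only base R. The function name `parseDistanceMode` and the default window values are hypothetical placeholders, and the script's actual argument parser may work differently:

```r
# Hypothetical parser for the two proposed flags.
# `args` would normally come from commandArgs(trailingOnly = TRUE).
parseDistanceMode <- function(args, defaults = c(physical = 3e6, genetic = 3)) {
  flags <- c(physical = '--mode-distance-physical',
             genetic  = '--mode-distance-genetic')
  present <- flags[flags %in% args]
  # Exactly one of the two flags must be given
  if (length(present) != 1) {
    stop('Specify exactly one of ', paste(flags, collapse = ' or '))
  }
  mode <- names(present)
  i <- match(present, args)
  # The value is optional: use the next token only if it exists and is numeric
  value <- if (i < length(args)) suppressWarnings(as.numeric(args[i + 1])) else NA
  if (is.na(value)) value <- defaults[[mode]]  # placeholder default
  list(mode = mode, window = value)
}

parsed <- parseDistanceMode(c('--mode-distance-genetic', '3'))
```

Here `parsed$mode` is `'genetic'` and `parsed$window` is `3`; passing both flags, or neither, stops with an error.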
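The size argument to snp_cor is interpreted as a number of SNPs when infos.pos is not given, and as a window in kilobases (that is, thousands of the infos.pos unit, hence argWindowSize/1000 for centimorgans) when it is. It is probably better to let the user give the window in plain base pairs or centimorgans in both modes and convert the value prior to passing it to snp_cor.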
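The conversion could look roughly like the sketch below. It assumes infos.pos is supplied in both modes, so that size is always given in thousands of the position unit (as the existing argWindowSize/1000 suggests). The helper name `ldWindowArgs` and its arguments are hypothetical:

```r
# Hypothetical helper: choose infos.pos and convert the user-facing window
# (bp for physical mode, cM for genetic mode) into the `size` value for snp_cor.
ldWindowArgs <- function(mode, window, POS, GD = NULL) {
  mode <- match.arg(mode, c('physical', 'genetic'))
  if (mode == 'genetic') {
    if (is.null(GD)) stop('Genetic distance is not available')
    infos.pos <- GD   # positions in cM
  } else {
    infos.pos <- POS  # positions in bp
  }
  # Assumption: size is expressed in thousands of the infos.pos unit
  list(infos.pos = infos.pos, size = window / 1000)
}

# A 3 Mb physical window over three made-up positions
ldArgs <- ldWindowArgs('physical', 3e6, POS = c(100, 2000, 5000))
```

The resulting `ldArgs$infos.pos[indices.G]` and `ldArgs$size` would then take the place of `GD[indices.G]` and `argWindowSize/1000` in the snp_cor call.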