Replies: 1 comment
-
Your GWAS data seems a bit disorganized. Could you send me the full summary data?
-
Hi, thank you for developing and maintaining gsMap!
I'm trying to format a GWAS summary statistics file using gsmap format_sumstats, but encountered an issue: all ~13 million SNPs were removed due to "missing values", resulting in an output file with 0 SNPs.
Here is a summary of what I observed:
gsmap format_sumstats \
    --sumstats "/path/to/gsmap.max.Rating.glm.linear" \
    --out "/path/to/output/CPI" \
    --snp "ID" \
    --a1 "A1" \
    --a2 "OMITTED" \
    --beta "BETA" \
    --se "SE" \
    --n "OBS_CT" \
    --p "P" \
    --frq "A1_FREQ" \
    --z "T_STAT" \
    --chr "CHROM" \
    --pos "POS" \
    --format "gsMap"
A sample of the input file:
CHROM POS ID REF ALT PROVISIONAL_REF? A1 OMITTED A1_FREQ TEST OBS_CT BETA SE T_STAT P ERRCODE
1 10616 1:10616_CCGCCGTTGCAAAGGCGCGCCG_C CCGCCGTTGCAAAGGCGCGCCG C Y CCGCCGTTGCAAAGGCGCGCCG C 0.00140553 ADD 131979 -0.386364 0.150018 -2.57545 0.010012 .
Log output shows:
[2025-04-09 12:33:49,991] INFO | gsMap.format_sumstats - ------Formating gwas data for /home/ml/wangxiuzhi/UKBB/GWAS_pain_ratings_allsub/New2_Data/gsmap.max.Rating.glm.linear...
/home/ml/wangxiuzhi/anaconda3/envs/gsmap/lib/python3.11/site-packages/gsMap/format_sumstats.py:405: FutureWarning: The 'delim_whitespace' keyword in pd.read_csv is deprecated and will be removed in a future version. Use ``sep='\s+'`` instead
  gwas = pd.read_csv(
[2025-04-09 12:34:16,137] INFO | gsMap.format_sumstats - Read 13818744 SNPs from /home/ml/wangxiuzhi/UKBB/GWAS_pain_ratings_allsub/New2_Data/gsmap.max.Rating.glm.linear.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - Iterpreting column names as follows:
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - ID: Variant ID (e.g., rs number).
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - A1: Allele 1, interpreted as the effect allele for signed sumstat.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - OMITTED: Allele 2, interpreted as the non-effect allele for signed sumstat.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - BETA: [linear/logistic] regression coefficient (0 → no effect; above 0 → A1 is trait/risk increasing).
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - SE: Standard error of the regression coefficient.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - P: P-Value.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - T_STAT: Z-Value.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - OBS_CT: Sample size.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - A1_FREQ: Allele frequency of A1.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - CHROM: Chromsome.
[2025-04-09 12:34:16,140] INFO | gsMap.format_sumstats - POS: SNP positions.
[2025-04-09 12:34:16,141] INFO | gsMap.format_sumstats - Filtering SNPs as follows:
[2025-04-09 12:34:21,765] INFO | gsMap.format_sumstats - Removed 13818744 SNPs with missing values.
[2025-04-09 12:34:21,767] INFO | gsMap.format_sumstats - Removed 0 SNPs with MAF <= 0.01.
[2025-04-09 12:34:21,768] INFO | gsMap.format_sumstats - Removed 0 SNPs with out-of-bounds p-values.
[2025-04-09 12:34:21,769] INFO | gsMap.format_sumstats - Removed 0 variants that were not SNPs or were strand-ambiguous.
[2025-04-09 12:34:21,769] INFO | gsMap.format_sumstats - Removed 0 SNPs with duplicated rs numbers.
[2025-04-09 12:34:21,770] INFO | gsMap.format_sumstats - Removed 0 SNPs with N < nan.
[2025-04-09 12:34:22,287] INFO | gsMap.format_sumstats - Summary of GWAS data:
[2025-04-09 12:34:22,287] INFO | gsMap.format_sumstats - Mean chi^2 = nan
[2025-04-09 12:34:22,288] INFO | gsMap.format_sumstats - Lambda GC = nan
[2025-04-09 12:34:22,288] INFO | gsMap.format_sumstats - Max chi^2 = nan
[2025-04-09 12:34:22,288] INFO | gsMap.format_sumstats - 0 Genome-wide significant SNPs (some may have been removed by filtering).
[2025-04-09 12:34:22,289] INFO | gsMap.format_sumstats - Writing summary statistics for 0 SNPs to /home/ml/wangxiuzhi/UKBB/gsMAP/Data/CPI.sumstats.gz.
[2025-04-09 12:34:22,315] INFO | gsMap - Finished running format_sumstats at: 2025-04-09 12:34:22.
[2025-04-09 12:34:22,788] INFO | gsMap - Resource usage summary:
[2025-04-09 12:34:22,788] INFO | gsMap - • Wall clock time: 33.88 seconds
[2025-04-09 12:34:22,788] INFO | gsMap - • CPU time: 40.02 seconds
[2025-04-09 12:34:22,788] INFO | gsMap - • Average CPU utilization: 124.6%
[2025-04-09 12:34:22,789] INFO | gsMap - • Peak memory usage: 6.33 GB
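Since every SNP was dropped at the "missing values" step, one way to narrow this down is to count missing values per column in the input file. The snippet below is a minimal diagnostic sketch in pandas, not gsMap's own code. The helper name `missing_per_column` is hypothetical, and treating `.` and `NA` as missing is an assumption about how the file is parsed that has not been checked against gsMap's source. It runs on the sample row quoted above; for the real file, pass its path and a modest `nrows` instead.

```python
import io

import pandas as pd

# Header and the single sample row quoted in the post.
SAMPLE = (
    "CHROM POS ID REF ALT PROVISIONAL_REF? A1 OMITTED A1_FREQ TEST "
    "OBS_CT BETA SE T_STAT P ERRCODE\n"
    "1 10616 1:10616_CCGCCGTTGCAAAGGCGCGCCG_C CCGCCGTTGCAAAGGCGCGCCG C Y "
    "CCGCCGTTGCAAAGGCGCGCCG C 0.00140553 ADD 131979 -0.386364 0.150018 "
    "-2.57545 0.010012 .\n"
)


def missing_per_column(source, na_values=(".", "NA"), nrows=None):
    """Count missing values in each column of a whitespace-delimited table.

    `na_values` is an assumption: it lists the strings treated as missing.
    `nrows` limits how many rows are read, which helps with large files.
    """
    df = pd.read_csv(source, sep=r"\s+", na_values=list(na_values), nrows=nrows)
    return df.isna().sum()


counts = missing_per_column(io.StringIO(SAMPLE))
# Show only the columns that contain at least one missing value.
print(counts[counts > 0])
```

If a column that is not needed (for example one that holds `.` on every row) turns out to be entirely missing, a possible workaround is to write a copy of the file containing only the columns passed to `gsmap format_sumstats` and rerun on that copy. Whether this resolves the problem depends on how gsMap handles unused columns, which is not confirmed here.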
Could you please advise how to resolve this?
Best regards