Commit 1644b80

Added test options and documentation; cleaned up code
1 parent e27c83b commit 1644b80

10 files changed (+243, -215 lines)

Makefile

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 VERSION=0.3

 CPPFLAGS= -Ofast -flto -pipe -I$(cdir)/src/htslib -I$(cdir)/src/cdflib -I$(cdir)/src/tabixpp -L$(cdir)/src/htslib -L$(cdir)/src/cdflib -L$(cdir)/src/tabixpp
-CXXFLAGS= -std=c++11 -DNDEBUG
+CXXFLAGS= -std=c++11
 FFLAGS=
 LDFLAGS= -lz -lm

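
Dropping `-DNDEBUG` from `CXXFLAGS` means that standard `assert` checks are compiled in again. A minimal sketch of that effect, assuming nothing about GAMBIT's own assertions; `checked_sqrt` and `assertions_enabled` are hypothetical helpers that do not appear in this commit:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from GAMBIT): the assert below is compiled out
// when the file is built with -DNDEBUG and is active otherwise.
double checked_sqrt(double x) {
    assert(x >= 0.0 && "checked_sqrt expects a non-negative input");
    return std::sqrt(x);
}

// Reports whether assert() is active in the current build.
bool assertions_enabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}
```

With the new flags, `assertions_enabled()` would return `true`, and a call such as `checked_sqrt(-1.0)` would abort instead of silently returning NaN.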

README.md

Lines changed: 6 additions & 7 deletions
@@ -2,9 +2,8 @@

 A C++ tool for Gene-based Analysis with oMniBus, Integrative Tests

-- Implements SKAT, burden, and ACAT gene-based test methods using variant- or region-based functional annotations
-- Calculates annotation-stratified gene-based tests (e.g., TWAS/PrediXcan tests using eSNPs, gene-based tests using only coding variants, and gene-based tests using enhancer-to-target-gene maps)
-- Calculates omnibus gene-based tests by aggregating across annotation classes
+- Implements several gene-based test forms (quadratic: weighted sum of Zsq, linear: weighted sum of Z, and maximum Zsq) to aggregate GWAS single-variant summary statistics cross-referenced with variant- or region-based functional annotations
+- Calculates annotation-stratified gene-based tests (e.g., TWAS/PrediXcan tests using eSNPs, gene-based tests using only coding variants, and gene-based tests using enhancer-to-target-gene maps), and omnibus tests by combining p-values for each gene
 - Inputs: GWAS association summary statistics file (chromosome, position, ref/alt allele, and z-score or beta-hat + se), annotation files, and LD reference panel


@@ -38,7 +37,7 @@ chr1 769200 769400 Enhancer chr1:769200:769400 C1orf170:3.36|PERM1:3.36

 - Association tests for individual regulatory elements is reported in `*.stratified_out.txt` files, and gene-based p-values (aggregating across regulatory elements for each gene) in `*.summary_out.txt` files.

-- **Aggregation Methods for Regulatory Elements.** By default, GAMBIT aggregates test statistics across variants in regulatory elements using a weighted sum of single-variant chi-squared statistics (SKAT gene-based test). To instead use weighted ACAT to combine single-variant p-values, specify `--acat`.
+- **Aggregation Methods for Regulatory Elements.** By default, GAMBIT aggregates test statistics across variants in regulatory elements using a weighted sum of single-variant chi-squared statistics (SKAT gene-based test). To instead use weighted ACAT or HMP to combine single-variant p-values, specify `--no-skat` and a p-value combination method via `--pcomb`.

 #### Gene-Based Analysis with Coding and Other Annotated Variants

@@ -57,7 +56,7 @@ UTR UTR5 Utr5

 - **Gene-Based Test Output.** Test statistics stratified by gene and annotation subclass are provided in `*.stratified_out.txt` files, and gene-based p-values (aggregating across annotation classes for each gene) in `*.summary_out.txt` files.

-- **Variant Aggregation Methods.** By default, GAMBIT aggregates test statistics across variants using a weighted sum of single-variant chi-squared statistics (SKAT gene-based test). To instead use weighted ACAT to combine single-variant p-values, specify `--acat`.
+- **Variant Aggregation Methods.** By default, GAMBIT aggregates test statistics across variants using a weighted sum of single-variant chi-squared statistics (SKAT gene-based test). To instead use weighted ACAT or HMP to combine single-variant p-values, specify `--no-skat` and a p-value combination method via `--pcomb`.

 #### TWAS Analysis
 - To compute TWAS/PrediXcan gene-based tests using GAMBIT, specify an eWeight file via `--eweights my_eWeights.txt.gz`, formatted
@@ -73,11 +72,11 @@ UTR UTR5 Utr5

 - The `BETAS` field format is `eGene_A=Weight_A1@Tissue_A1;Weight_A2@Tissue_A2|eGene_B=Weight_B1@Tissue_B1`, and labels for tissue IDs can be specified in the header.
 - **Subsetting tissues.** To restrict analysis to a subset of tissues/cell-types, specify a comma-separated list of tissues following the `--tissues` flag. By default, GAMBIT includes all tissues/cell-types present in the eWeight file.
-- **Tissue Aggregation for Omnibus tests.** GAMBIT reports both single-tissue TWAS/PrediXcan analysis results, and omnibus tests results aggregating across all specified tissues/cell-types for each eGene. Omnibus p-values for multi-tissue TWAS/PrediXcan analysis can be calculated in GAMBIT using either 1) the maximum single-tissue test statistic based on the joint distribution of single-tissue statistics, 2) the sum of squared single-tissue z-scores (analogous to SKAT), or 3) ACAT [default]. Omnibus test method for multi-tissue analysis can be specified via `--tissue-aggreg` (`ACAT`, `MinP`, `SKAT`, or `ALL`).
+- **Tissue Aggregation for Omnibus tests.** GAMBIT reports both single-tissue TWAS/PrediXcan analysis results, and omnibus tests results aggregating across all specified tissues/cell-types for each eGene. Omnibus p-values for multi-tissue TWAS/PrediXcan analysis can be calculated in GAMBIT using either 1) the maximum single-tissue test statistic based on the joint distribution of single-tissue statistics, 2) the sum of squared single-tissue z-scores (analogous to SKAT), or 3) PCOMB for ACAT or HMP [default]. Omnibus test method for multi-tissue analysis can be specified via `--tissue-aggreg` (`PCOMB`, `MinP`, `SKAT`, or `ALL`). P-value combination method can be specified via `--pcomb` (`ACAT` or `HMP`).
 - **Single-tissue and omnibus test output.** Gene-based tests and p-values for each eGene-tissue pair are reported in `*.stratified_out.txt` files, and omnibus p-values (aggregating across all tissues for each eGene) in `*.summary_out.txt` files.

 #### dTSS-Weighted Gene-Based Tests
-- To incorporate un-annotated regulatory variants in gene-based analysis, GAMBIT implements a dTSS (distance to Transcription Start Site) weighted gene-based test, which aggregates all single-variant p-values within a specified window from each gene's TSS using ACAT and assigns higher weight to variants nearer the TSS using an exponential decay function.
+- To incorporate un-annotated regulatory variants in gene-based analysis, GAMBIT implements a dTSS (distance to Transcription Start Site) weighted gene-based test, which aggregates all single-variant p-values within a specified window from each gene's TSS using weighted ACAT or HMP and assigns higher weight to variants nearer the TSS using an exponential decay function.
 - To compute dTSS-weighted gene-based tests, specify a TSS bed file via `--tss-bed my_tss_bed.bed.gz`, fomatted

 ```

bin/GAMBIT

66.7 KB
Binary file not shown.

src/Main.cpp

Lines changed: 15 additions & 7 deletions
@@ -20,9 +20,10 @@ void print_usage() {
 cerr << " --ldref-only : only retain variants with complete LD information\n";
 cerr << " --tissues STR : only use eSNPs from tissues listed in file\n";
 cerr << " test statistics\n";
+cerr << " --pcomb : p-value combination method (\"ACAT\" or \"HMP\") \n";
 cerr << " --tss-alpha : alpha values (comma-separated) for dTSS weights \n";
-cerr << " --acat : use ACAT rather than SKAT for regulatory elements & exons\n";
-cerr << " --tissue-aggreg [ACAT] : method to aggregate eSNP tests across tissues (\"ACAT\", \"MinP\", \"SKAT\", or \"All\") \n";
+cerr << " --no-skat : use ACAT or HMP rather than SKAT for regulatory elements & exons\n";
+cerr << " --tissue-aggreg: method to aggregate eSNP tests across tissues (\"PCOMB\" for ACAT/HMP (default), \"MaxZsq\", \"SKAT\", or \"All\") \n";
 cerr << " --tissues STR : only use eSNPs from listed tissues (comma-separated list or file)\n";
 cerr << " output\n";
 cerr << " --region STR : restrict analysis to specified region \n";
@@ -74,7 +75,9 @@ int main (int argc, char *argv[]) {

 char default_test = 'Q'; // default is 'Q' (SKAT)

-int cauchy_no_skat = 0;
+int pcomb_no_skat = 0;
+
+string PCOMB_METHOD = "HMP";

 string TSS_VERBOSITY_PVAL_STR = "";
 double TSS_VERBOSITY_PVAL = 1.00;
@@ -93,6 +96,7 @@ int main (int argc, char *argv[]) {
 {"no-memo", no_argument, &no_memo_LD, 1},
 {"preload", no_argument, &preload_LD, 1},
 {"ldref", required_argument, 0, 'l'},
+{"pcomb", required_argument, 0, 'z'},
 {"gwas", required_argument, 0, 'g'},
 {"anno-defs", required_argument, 0, 'a'},
 {"defs", required_argument, 0, 'a'},
@@ -107,7 +111,7 @@ int main (int argc, char *argv[]) {
 {"merge-tissues", no_argument, &tmerge, 1},
 {"tissue-aggreg", required_argument, 0, 'm'},
 {"debug", no_argument, &debug_mode, 1},
-{"acat", no_argument, &cauchy_no_skat, 1},
+{"no-skat", no_argument, &pcomb_no_skat, 1},
 {"stdout", no_argument, &print_screen, 1},
 {"bayes", no_argument, &bayes, 1},
 {"region", required_argument, 0, 'r' },
@@ -118,7 +122,7 @@ int main (int argc, char *argv[]) {
 {0, 0, 0, 0}
 };
 int long_index =0;
-while ((opt = getopt_long(argc,argv,"l:g:a:f:s:x:y:e:j:t:b:m:r:p:v:",long_options,&long_index)) != -1) {
+while ((opt = getopt_long(argc,argv,"l:g:a:f:s:x:y:z:e:j:t:b:m:r:p:v:",long_options,&long_index)) != -1) {
 switch (opt) {
 case 'f' : afile = optarg;
 break;
@@ -142,6 +146,8 @@ int main (int argc, char *argv[]) {
 break;
 case 'y' : TSS_WINDOW_STR = optarg;
 break;
+case 'z' : PCOMB_METHOD = optarg;
+break;
 case 'e' : TSS_VERBOSITY_PVAL_STR = optarg;
 break;
 case 'b' : bfile = optarg;
@@ -226,6 +232,8 @@ int main (int argc, char *argv[]) {
 setPreload(false);
 }

+setCombPvalMethod(PCOMB_METHOD);
+
 setMultiForm(multi_test_type);

 if( JUMP_DIST_STR != "" ){
@@ -244,8 +252,8 @@ int main (int argc, char *argv[]) {

 initCHR();

-if( cauchy_no_skat ){
-default_test = 'C'; // rather than SKAT, using Cauchy/ACAT test (no LD)
+if( pcomb_no_skat ){
+default_test = 'C'; // rather than SKAT, using ACAT/HMP test (no LD)
 }

 if( TSS_VERBOSITY_PVAL_STR != "" ){
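
The option wiring in `src/Main.cpp` follows the usual `getopt_long` pattern: `--no-skat` is a flag-style entry that stores `1` through a pointer, while `--pcomb` takes an argument and is dispatched as case `'z'`. The sketch below isolates just those two entries so the behavior can be checked on its own. `CombOptions` and `parse_comb_options` are hypothetical names; the option names and the `"HMP"` default are taken from the diff.

```cpp
#include <getopt.h>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical holder for the two settings this commit touches.
struct CombOptions {
    int no_skat = 0;            // becomes 1 when --no-skat is present
    std::string pcomb = "HMP";  // same default as PCOMB_METHOD in the diff
};

// Parses args (args[0] is the program name) using the same two table
// entries as the diff: a flag entry for --no-skat and 'z' for --pcomb.
CombOptions parse_comb_options(std::vector<std::string> args) {
    assert(!args.empty());
    CombOptions out;

    // getopt_long wants a mutable, null-terminated argv array.
    std::vector<char*> argv;
    for (std::string& a : args) argv.push_back(&a[0]);
    argv.push_back(nullptr);

    const struct option long_options[] = {
        {"no-skat", no_argument,       &out.no_skat, 1},
        {"pcomb",   required_argument, nullptr,      'z'},
        {nullptr,   0,                 nullptr,      0}
    };

    optind = 1;  // restart scanning so the function can be called repeatedly
    int opt;
    while ((opt = getopt_long(static_cast<int>(args.size()), argv.data(),
                              "z:", long_options, nullptr)) != -1) {
        switch (opt) {
            case 'z': out.pcomb = optarg; break;  // --pcomb VALUE or -z VALUE
            default:  break;  // 0: a flag entry was stored; '?' is ignored here
        }
    }
    return out;
}
```

For example, `parse_comb_options({"GAMBIT", "--no-skat", "--pcomb", "ACAT"})` yields `no_skat == 1` and `pcomb == "ACAT"`. The sketch does not validate the method name; whether GAMBIT rejects values other than `ACAT` or `HMP` is not visible in this diff.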

src/distributionFunctions.cpp

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+#include "distributionFunctions.hpp"
+
+using namespace std;
+
+double pchisq( double stat, double df) {
+    double cdf, ccdf;
+    cumchi ( &stat, &df, &cdf, &ccdf );
+    return ccdf;
+}
+
+double pchisq( double stat, double df, double ncp) {
+    double cdf, ccdf;
+    cumchn( &stat, &df, &ncp, &cdf, &ccdf);
+    return ccdf;
+}
+
+double qchisq( double cdf, double df) {
+    double stat, ccdf, bd;
+    int which = 2;
+    ccdf = 1 - cdf;
+    int status;
+    cdfchi ( &which, &cdf, &ccdf, &stat, &df, &status, &bd );
+    return stat;
+}
+
+double pcauchy(double x){
+    return 0.5 + atan( x )/M_PI;
+}
+
+double qcauchy(double q){
+    return tan( M_PI*(q - 0.5) );
+}
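
The `pcauchy` and `qcauchy` helpers are the standard Cauchy CDF and quantile function, which are the building blocks of an ACAT-style p-value combination. The following is a minimal sketch of a weighted ACAT combiner built on the same two closed forms, assuming the usual Cauchy-combination formula. `acat_combine` and `kPi` are hypothetical names, and the diff does not show how GAMBIT itself calls these helpers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

const double kPi = 3.14159265358979323846;

// Same closed forms as pcauchy/qcauchy in the diff.
double pcauchy(double x) { return 0.5 + std::atan(x) / kPi; }
double qcauchy(double q) { return std::tan(kPi * (q - 0.5)); }

// Weighted ACAT sketch: map each p-value to a Cauchy variate, take the
// weighted average, and convert it back to an upper-tail probability.
double acat_combine(const std::vector<double>& pvals,
                    const std::vector<double>& weights) {
    assert(pvals.size() == weights.size() && !pvals.empty());
    double wsum = 0.0, stat = 0.0;
    for (std::size_t i = 0; i < pvals.size(); ++i) {
        wsum += weights[i];
        stat += weights[i] * qcauchy(1.0 - pvals[i]);
    }
    return 1.0 - pcauchy(stat / wsum);
}
```

Combining identical p-values returns that same value, which is a quick sanity check. Because `1.0 - p` loses precision for very small p-values, a production version would likely need special handling in the extreme tails; this sketch omits it.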

src/distributionFunctions.hpp

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+#ifndef DISTRIBUTIONFUNCTIONS_HPP
+#define DISTRIBUTIONFUNCTIONS_HPP
+
+#include "cdflib/cdflib.hpp"
+#include "eigenmvn/eigenmvn.hpp"
+#include "ROOT_Math/Landau.hpp"
+
+#include <iostream>
+#include <sstream>
+#include <fstream>
+#include <stdio.h>
+#include <stdlib.h>
+#include <math.h>
+#include <string>
+#include <algorithm>
+#include <vector>
+#include <unordered_map>
+#include <unordered_set>
+#include <set>
+#include <map>
+
+using namespace std;
+
+double pcauchy(double);
+double qcauchy(double);
+
+double pchisq(double, double);
+double pchisq(double, double, double);
+double qchisq(double, double);
+
+#endif
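
The other `--pcomb` choice, HMP, is based on the weighted harmonic mean of the p-values. A rough sketch of that raw statistic follows. `harmonic_mean_p` is a hypothetical name, and the function returns only the uncalibrated harmonic mean. Turning it into a proper p-value is usually done with a Landau distribution, which may be why this header includes `ROOT_Math/Landau.hpp`, but that step is not shown in the diff and is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Raw weighted harmonic mean of p-values: sum(w) / sum(w / p).
// This is the uncalibrated statistic, not a final combined p-value.
double harmonic_mean_p(const std::vector<double>& pvals,
                       const std::vector<double>& weights) {
    assert(pvals.size() == weights.size() && !pvals.empty());
    double wsum = 0.0, inv_sum = 0.0;
    for (std::size_t i = 0; i < pvals.size(); ++i) {
        assert(pvals[i] > 0.0 && pvals[i] <= 1.0);
        wsum += weights[i];
        inv_sum += weights[i] / pvals[i];
    }
    return wsum / inv_sum;
}
```

With equal weights, `harmonic_mean_p({0.01, 1.0}, {1.0, 1.0})` evaluates to 2/101, which shows how strongly the smallest p-value dominates the statistic.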
