Commit afe4d85

Author: Max Leiserson
Merge pull request #6 from raphael-group/eccb2016-experiments
Add experiments reproducing the ECCB 2016 submission
2 parents ce73069 + ab79644 commit afe4d85

13 files changed, with 1,100 additions and 0 deletions.

experiments/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
tables/
figures/

experiments/eccb2016/README.md

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
# Experiments for ECCB 2016 submission #

### Description ###

This directory contains the scripts for reproducing the experiments, tables, and figures for the 2016 ECCB submission. The commands for reproducing them are in `commands.sh`, which performs the following steps:

1. Download and unpack the raw mutation data used in the experiments into `data/`.
2. Preprocess the mutation data, generate permuted matrices, and compute weights, writing the results to `data/mutations`, `data/permuted`, and `data/weights`.
3. Reproduce the tables and figures from the paper in `tables/` and `figures/`, with intermediate output stored in `output/`.

### Set up ###

Our experiments require additional Python modules to be installed. You can install them with:

    pip install -r requirements.txt
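
As a quick sanity check after installation, the sketch below reports which of the modules listed in `requirements.txt` (matplotlib, seaborn, pandas) cannot be found in the current environment. It assumes Python 3, and the helper name `missing_modules` is hypothetical and not part of this repository.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that cannot be located.

    `missing_modules` is an illustrative helper, not part of the repository.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from requirements.txt in this directory.
    required = ["matplotlib", "seaborn", "pandas"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

The check only looks the modules up; it does not import them, so it stays fast even when the plotting libraries are installed.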

### Data ###

To reproduce the experiments, you will need to download the COADREAD, THCA, and UCEC MAFs and the lists of hypermutator samples, all of which are in the tarball we've posted on our website. We've included the relevant commands in `commands.sh`, but you can also find the data at the URL below:

> http://compbio-research.cs.brown.edu/projects/weighted-exclusivity-test/data/eccb2016.tar
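
For environments without `wget` or `tar`, the download step in `commands.sh` could be approximated in Python as sketched below. This is only an illustration under the assumption of Python 3; the function names `download` and `unpack` are hypothetical and do not exist in this repository.

```python
import os
import tarfile
import urllib.request

# URL from the README; the function names below are illustrative only.
DATA_URL = ("http://compbio-research.cs.brown.edu/projects/"
            "weighted-exclusivity-test/data/eccb2016.tar")


def download(url, dest_path):
    """Fetch `url` and save it to `dest_path` (like `wget`)."""
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


def unpack(tar_path, dest_dir, remove=True):
    """Extract `tar_path` into `dest_dir` (like `tar -xf`), then optionally delete it."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest_dir)
    if remove:
        os.remove(tar_path)
    return sorted(os.listdir(dest_dir))


if __name__ == "__main__":
    # Mirrors the download step of commands.sh, run from experiments/eccb2016.
    unpack(download(DATA_URL, "eccb2016.tar"), ".")
```

Deleting the tarball after extraction matches the `tar -xvf eccb2016.tar && rm eccb2016.tar` line in the script; pass `remove=False` to keep it.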

### Differences ###

There are only a few, relatively small differences between the experiments here and those in the actual submission:

* The mutation probability matrices in `figures/Figure2.pdf` include "pseudocounts" of 1/(2N), where N is the number of permutations, for cells with zeros. These were missing from the ECCB 2016 submission, where such cells were shown as zeros.
* The permutations are randomly generated, so the weighted and permutational results may differ between runs, though we wouldn't expect the difference to be large. For example, when we ran this code to reproduce the experiments, we found 49, 5305, and 6804 triples with FDR < 0.001 for THCA, COADREAD, and UCEC, respectively (compared to 48, 5286, and 6790 in the paper). Relatedly, the top triples (ranked 4-10) in THCA all had very similar weighted p-values, so they were reordered when we regenerated the weight matrices.
* The `commands.sh` script currently generates only 10,000 permutations for comparing the permutational test with the saddlepoint approximation (see Figure 5 in the submission), while we used 1,000,000 in the paper. This is because generating 1,000,000 permutations is computationally intensive (it will take days without parallelization) and storage intensive (terabytes). Thus, the correlations between the weighted and permutational tests reported in Section 3.4 and Figure 5 were not reproduced here. We believe the code to be correct, however, so those results should be reproducible by setting the `TOTAL_PERMUTATIONS` variable in `commands.sh` to 1,000,000.
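
To make the pseudocount in the first item concrete, the sketch below estimates a mutation probability matrix as the fraction of permuted matrices in which each cell is mutated, and replaces zero estimates with 1/(2N). It is a minimal illustration in plain Python 3 under that assumed definition of the probabilities; it is not the code in `compute_mutation_probabilities.py`, and the function name is hypothetical.

```python
def mutation_probabilities(permuted_matrices):
    """Estimate per-cell mutation probabilities from N permuted 0/1 matrices.

    Assumption: the probability of a cell is the fraction of permutations in
    which it is mutated; cells never mutated get the pseudocount 1/(2N).
    """
    n = len(permuted_matrices)
    if n == 0:
        raise ValueError("need at least one permuted matrix")
    rows = len(permuted_matrices[0])
    cols = len(permuted_matrices[0][0])
    pseudocount = 1.0 / (2 * n)
    probabilities = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Number of permutations in which cell (i, j) is mutated.
            count = sum(matrix[i][j] for matrix in permuted_matrices)
            row.append(count / n if count > 0 else pseudocount)
        probabilities.append(row)
    return probabilities


if __name__ == "__main__":
    # Two toy 2x2 permutations (genes x samples); N = 2, so the pseudocount is 0.25.
    toy = [
        [[1, 0], [0, 1]],
        [[1, 0], [1, 0]],
    ]
    print(mutation_probabilities(toy))
```

If N is taken to be the `WEIGHTS_PERMUTATIONS` value of 1000 in `commands.sh`, the pseudocount would be 1/2000 = 0.0005.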

experiments/eccb2016/commands.sh

Lines changed: 318 additions & 0 deletions
@@ -0,0 +1,318 @@
#!/bin/sh

################################################################################
# SETTINGS #
################################################################################
NUM_CORES=25
COADREAD_MIN_FREQ=20
THCA_MIN_FREQ=5
UCEC_MIN_FREQ=30
LENGTH_THRESHOLD=600
FDR_CUTOFF=0.001
TOTAL_PERMUTATIONS=10000 # set to 1000000 for ECCB 2016 submission
PAIRS_PERMUTATIONS=10000
WEIGHTS_PERMUTATIONS=1000

################################################################################
# SET UP DIRECTORIES #
################################################################################
CODE_DIR=../../
EXPERIMENT_DIR=experiments/eccb2016
DATA_DIR=$EXPERIMENT_DIR/data
SCRIPTS_DIR=$EXPERIMENT_DIR/scripts
FIGURES_DIR=$EXPERIMENT_DIR/figures
TABLES_DIR=$EXPERIMENT_DIR/tables
MUTATIONS_DIR=$DATA_DIR/mutations
PERMUTED_DIR=$DATA_DIR/permuted
WEIGHTS_DIR=$DATA_DIR/weights

OUTPUT_DIR=$EXPERIMENT_DIR/output
PAIRS_DIR=$OUTPUT_DIR/pairs
TRIPLES_DIR=$OUTPUT_DIR/triples
GENE_LENGTH_FILE=$DATA_DIR/gene-lengths.tsv

cd $CODE_DIR
mkdir -p $MUTATIONS_DIR $WEIGHTS_DIR $PERMUTED_DIR $OUTPUT_DIR $PAIRS_DIR
mkdir -p $TRIPLES_DIR $TABLES_DIR $FIGURES_DIR

################################################################################
# DOWNLOAD AND UNPACK DATA #
################################################################################
cd $EXPERIMENT_DIR
wget http://compbio-research.cs.brown.edu/projects/weighted-exclusivity-test/data/eccb2016.tar
tar -xvf eccb2016.tar && rm eccb2016.tar

################################################################################
# PRE-PROCESS MUTATIONS #
################################################################################
cd $CODE_DIR

# Colorectal
COADREAD_MUTATIONS_WOUT_LENGTH_FILTERING=$MUTATIONS_DIR/coadread-mutations-wout-length-filtering.json
python process_mutations.py -mf $DATA_DIR/mafs/COADREAD.maf \
    -hf $DATA_DIR/sample_lists/coadread-hypermutators.txt \
    -o $COADREAD_MUTATIONS_WOUT_LENGTH_FILTERING

COADREAD_MUTATIONS=$MUTATIONS_DIR/coadread-mutations.json
python $SCRIPTS_DIR/remove_genes_with_no_length.py -lf $GENE_LENGTH_FILE \
    -mf $COADREAD_MUTATIONS_WOUT_LENGTH_FILTERING \
    -o $COADREAD_MUTATIONS

# Thyroid
THCA_MUTATIONS_WOUT_LENGTH_FILTERING=$MUTATIONS_DIR/thca-mutations-wout-length-filtering.json
python process_mutations.py -mf $DATA_DIR/mafs/THCA.maf \
    -o $THCA_MUTATIONS_WOUT_LENGTH_FILTERING

THCA_MUTATIONS=$MUTATIONS_DIR/thca-mutations.json
python $SCRIPTS_DIR/remove_genes_with_no_length.py -lf $GENE_LENGTH_FILE \
    -mf $THCA_MUTATIONS_WOUT_LENGTH_FILTERING \
    -o $THCA_MUTATIONS

# Endometrial
UCEC_MUTATIONS_WOUT_LENGTH_FILTERING=$MUTATIONS_DIR/ucec-mutations-wout-length-filtering.json
python process_mutations.py -mf $DATA_DIR/mafs/UCEC.maf \
    -hf $DATA_DIR/sample_lists/ucec-hypermutators.txt \
    -o $UCEC_MUTATIONS_WOUT_LENGTH_FILTERING

UCEC_MUTATIONS=$MUTATIONS_DIR/ucec-mutations.json
python $SCRIPTS_DIR/remove_genes_with_no_length.py -lf $GENE_LENGTH_FILE \
    -mf $UCEC_MUTATIONS_WOUT_LENGTH_FILTERING \
    -o $UCEC_MUTATIONS

################################################################################
# GENERATE PERMUTED MATRICES AND COMPUTE WEIGHTS #
################################################################################
# Suffixes (e.g. 1E+04) used to tag output file names with permutation counts
WEIGHTS_SUFFIX=`printf %.E $WEIGHTS_PERMUTATIONS`
PAIRS_PERMUTATION_SUFFIX=`printf %.E $PAIRS_PERMUTATIONS`
TOTAL_PERMUTATION_SUFFIX=`printf %.E $TOTAL_PERMUTATIONS`

# Set up directories
COADREAD_PERMUTATIONS=$PERMUTED_DIR/coadread
THCA_PERMUTATIONS=$PERMUTED_DIR/thca
UCEC_PERMUTATIONS=$PERMUTED_DIR/ucec
mkdir -p $COADREAD_PERMUTATIONS $THCA_PERMUTATIONS $UCEC_PERMUTATIONS

# Colorectal
COADREAD_WEIGHTS=$WEIGHTS_DIR/coadread-np${WEIGHTS_SUFFIX}.npy
python permute_matrix.py -mf $COADREAD_MUTATIONS -np $TOTAL_PERMUTATIONS \
    -o $COADREAD_PERMUTATIONS

python compute_mutation_probabilities.py -pf $COADREAD_PERMUTATIONS/*.json \
    -mf $COADREAD_MUTATIONS -np $WEIGHTS_PERMUTATIONS -nc $NUM_CORES \
    -o $COADREAD_WEIGHTS

# Thyroid
THCA_WEIGHTS=$WEIGHTS_DIR/thca-np${WEIGHTS_SUFFIX}.npy
python permute_matrix.py -mf $THCA_MUTATIONS -np $TOTAL_PERMUTATIONS \
    -o $THCA_PERMUTATIONS

python compute_mutation_probabilities.py -pf $THCA_PERMUTATIONS/*.json \
    -mf $THCA_MUTATIONS -np $WEIGHTS_PERMUTATIONS -nc $NUM_CORES \
    -o $THCA_WEIGHTS

# Endometrial
UCEC_WEIGHTS=$WEIGHTS_DIR/ucec-np${WEIGHTS_SUFFIX}.npy
python permute_matrix.py -mf $UCEC_MUTATIONS -np $TOTAL_PERMUTATIONS \
    -o $UCEC_PERMUTATIONS

python compute_mutation_probabilities.py -pf $UCEC_PERMUTATIONS/*.json \
    -mf $UCEC_MUTATIONS -np $WEIGHTS_PERMUTATIONS -nc $NUM_CORES \
    -o $UCEC_WEIGHTS

################################################################################
# ENUMERATE PAIRS #
################################################################################
# Colorectal
COADREAD_PERMUTATIONAL_PAIRS=$PAIRS_DIR/coadread-pairs-permutational-np${PAIRS_PERMUTATION_SUFFIX}.json
python compute_exclusivity.py -mf $COADREAD_MUTATIONS -k 2 -nc $NUM_CORES \
    -o $COADREAD_PERMUTATIONAL_PAIRS -f $COADREAD_MIN_FREQ Permutational \
    -np $PAIRS_PERMUTATIONS -pf $COADREAD_PERMUTATIONS/*.json

COADREAD_WEIGHTED_EXACT_PAIRS=$PAIRS_DIR/coadread-pairs-weighted-exact-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $COADREAD_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $COADREAD_MIN_FREQ -o $COADREAD_WEIGHTED_EXACT_PAIRS \
    Weighted -m Exact -wf $COADREAD_WEIGHTS

COADREAD_WEIGHTED_SADDLEPOINT_PAIRS=$PAIRS_DIR/coadread-pairs-weighted-saddlepoint-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $COADREAD_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $COADREAD_MIN_FREQ -o $COADREAD_WEIGHTED_SADDLEPOINT_PAIRS \
    Weighted -m Saddlepoint -wf $COADREAD_WEIGHTS

COADREAD_UNWEIGHTED_EXACT_PAIRS=$PAIRS_DIR/coadread-pairs-unweighted-exact.json
python compute_exclusivity.py -mf $COADREAD_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $COADREAD_MIN_FREQ -o $COADREAD_UNWEIGHTED_EXACT_PAIRS Unweighted -m Exact

# Thyroid
THCA_PERMUTATIONAL_PAIRS=$PAIRS_DIR/thca-pairs-permutational-np${PAIRS_PERMUTATION_SUFFIX}.json
python compute_exclusivity.py -mf $THCA_MUTATIONS -k 2 -nc $NUM_CORES \
    -o $THCA_PERMUTATIONAL_PAIRS -f $THCA_MIN_FREQ Permutational \
    -np $PAIRS_PERMUTATIONS -pf $THCA_PERMUTATIONS/*.json

THCA_WEIGHTED_EXACT_PAIRS=$PAIRS_DIR/thca-pairs-weighted-exact-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $THCA_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $THCA_MIN_FREQ -o $THCA_WEIGHTED_EXACT_PAIRS \
    Weighted -m Exact -wf $THCA_WEIGHTS

THCA_WEIGHTED_SADDLEPOINT_PAIRS=$PAIRS_DIR/thca-pairs-weighted-saddlepoint-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $THCA_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $THCA_MIN_FREQ -o $THCA_WEIGHTED_SADDLEPOINT_PAIRS \
    Weighted -m Saddlepoint -wf $THCA_WEIGHTS

THCA_UNWEIGHTED_EXACT_PAIRS=$PAIRS_DIR/thca-pairs-unweighted-exact.json
python compute_exclusivity.py -mf $THCA_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $THCA_MIN_FREQ -o $THCA_UNWEIGHTED_EXACT_PAIRS Unweighted -m Exact

# Endometrial
UCEC_PERMUTATIONAL_PAIRS=$PAIRS_DIR/ucec-pairs-permutational-np${PAIRS_PERMUTATION_SUFFIX}.json
python compute_exclusivity.py -mf $UCEC_MUTATIONS -k 2 -nc $NUM_CORES \
    -o $UCEC_PERMUTATIONAL_PAIRS -f $UCEC_MIN_FREQ Permutational \
    -np $PAIRS_PERMUTATIONS -pf $UCEC_PERMUTATIONS/*.json

UCEC_WEIGHTED_EXACT_PAIRS=$PAIRS_DIR/ucec-pairs-weighted-exact-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $UCEC_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $UCEC_MIN_FREQ -o $UCEC_WEIGHTED_EXACT_PAIRS \
    Weighted -m Exact -wf $UCEC_WEIGHTS

UCEC_WEIGHTED_SADDLEPOINT_PAIRS=$PAIRS_DIR/ucec-pairs-weighted-saddlepoint-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $UCEC_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $UCEC_MIN_FREQ -o $UCEC_WEIGHTED_SADDLEPOINT_PAIRS \
    Weighted -m Saddlepoint -wf $UCEC_WEIGHTS

UCEC_UNWEIGHTED_EXACT_PAIRS=$PAIRS_DIR/ucec-pairs-unweighted-exact.json
python compute_exclusivity.py -mf $UCEC_MUTATIONS -nc $NUM_CORES -k 2 \
    -f $UCEC_MIN_FREQ -o $UCEC_UNWEIGHTED_EXACT_PAIRS Unweighted -m Exact

################################################################################
# ENUMERATE TRIPLES #
################################################################################

# Colorectal
COADREAD_PERMUTATIONAL_TRIPLES=$TRIPLES_DIR/coadread-triples-permutational-np${TOTAL_PERMUTATION_SUFFIX}.json
python compute_exclusivity.py -mf $COADREAD_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $COADREAD_MIN_FREQ -o $COADREAD_PERMUTATIONAL_TRIPLES \
    Permutational -pf $COADREAD_PERMUTATIONS/*.json -np $TOTAL_PERMUTATIONS

COADREAD_UNWEIGHTED_EXACT_TRIPLES=$TRIPLES_DIR/coadread-triples-unweighted-exact.json
python compute_exclusivity.py -mf $COADREAD_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $COADREAD_MIN_FREQ -o $COADREAD_UNWEIGHTED_EXACT_TRIPLES \
    Unweighted -m Exact

COADREAD_UNWEIGHTED_SADDLEPOINT_TRIPLES=$TRIPLES_DIR/coadread-triples-unweighted-saddlepoint.json
python compute_exclusivity.py -mf $COADREAD_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $COADREAD_MIN_FREQ -o $COADREAD_UNWEIGHTED_SADDLEPOINT_TRIPLES \
    Unweighted -m Saddlepoint

COADREAD_WEIGHTED_SADDLEPOINT_TRIPLES=$TRIPLES_DIR/coadread-triples-weighted-saddlepoint-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $COADREAD_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $COADREAD_MIN_FREQ -o $COADREAD_WEIGHTED_SADDLEPOINT_TRIPLES \
    Weighted -m Saddlepoint -wf $COADREAD_WEIGHTS

# Thyroid
THCA_PERMUTATIONAL_TRIPLES=$TRIPLES_DIR/thca-triples-permutational-np${TOTAL_PERMUTATION_SUFFIX}.json
python compute_exclusivity.py -mf $THCA_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $THCA_MIN_FREQ -o $THCA_PERMUTATIONAL_TRIPLES \
    Permutational -pf $THCA_PERMUTATIONS/*.json -np $TOTAL_PERMUTATIONS

THCA_UNWEIGHTED_EXACT_TRIPLES=$TRIPLES_DIR/thca-triples-unweighted-exact.json
python compute_exclusivity.py -mf $THCA_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $THCA_MIN_FREQ -o $THCA_UNWEIGHTED_EXACT_TRIPLES \
    Unweighted -m Exact

THCA_UNWEIGHTED_SADDLEPOINT_TRIPLES=$TRIPLES_DIR/thca-triples-unweighted-saddlepoint.json
python compute_exclusivity.py -mf $THCA_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $THCA_MIN_FREQ -o $THCA_UNWEIGHTED_SADDLEPOINT_TRIPLES \
    Unweighted -m Saddlepoint

THCA_WEIGHTED_SADDLEPOINT_TRIPLES=$TRIPLES_DIR/thca-triples-weighted-saddlepoint-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $THCA_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $THCA_MIN_FREQ -o $THCA_WEIGHTED_SADDLEPOINT_TRIPLES \
    Weighted -m Saddlepoint -wf $THCA_WEIGHTS

# Endometrial
UCEC_PERMUTATIONAL_TRIPLES=$TRIPLES_DIR/ucec-triples-permutational-np${TOTAL_PERMUTATION_SUFFIX}.json
python compute_exclusivity.py -mf $UCEC_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $UCEC_MIN_FREQ -o $UCEC_PERMUTATIONAL_TRIPLES \
    Permutational -pf $UCEC_PERMUTATIONS/*.json -np $TOTAL_PERMUTATIONS

UCEC_UNWEIGHTED_EXACT_TRIPLES=$TRIPLES_DIR/ucec-triples-unweighted-exact.json
python compute_exclusivity.py -mf $UCEC_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $UCEC_MIN_FREQ -o $UCEC_UNWEIGHTED_EXACT_TRIPLES \
    Unweighted -m Exact

UCEC_UNWEIGHTED_SADDLEPOINT_TRIPLES=$TRIPLES_DIR/ucec-triples-unweighted-saddlepoint.json
python compute_exclusivity.py -mf $UCEC_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $UCEC_MIN_FREQ -o $UCEC_UNWEIGHTED_SADDLEPOINT_TRIPLES \
    Unweighted -m Saddlepoint

UCEC_WEIGHTED_SADDLEPOINT_TRIPLES=$TRIPLES_DIR/ucec-triples-weighted-saddlepoint-nw${WEIGHTS_SUFFIX}.json
python compute_exclusivity.py -mf $UCEC_MUTATIONS -nc $NUM_CORES -k 3 \
    -f $UCEC_MIN_FREQ -o $UCEC_WEIGHTED_SADDLEPOINT_TRIPLES \
    Weighted -m Saddlepoint -wf $UCEC_WEIGHTS

################################################################################
# CREATE FIGURES AND TABLES #
################################################################################

# FIGURE 1: Distribution of number of mutations per sample
python $SCRIPTS_DIR/sample_mutation_frequency_plot.py -mf $THCA_MUTATIONS \
    $COADREAD_MUTATIONS $UCEC_MUTATIONS -c THCA COADREAD UCEC \
    -o $FIGURES_DIR/Figure1.pdf

# FIGURE 2: Mutation probability matrices
python $SCRIPTS_DIR/weights_matrix.py -mf $THCA_MUTATIONS $COADREAD_MUTATIONS \
    $UCEC_MUTATIONS -wf $THCA_WEIGHTS $COADREAD_WEIGHTS $UCEC_WEIGHTS \
    -c THCA COADREAD UCEC -o $FIGURES_DIR/Figure2.pdf

# FIGURE 3 and TABLE 1: Comparison of exact and saddlepoint for unweighted test
python $SCRIPTS_DIR/unweighted_comparison.py -c THCA COADREAD UCEC \
    -sf $THCA_UNWEIGHTED_SADDLEPOINT_TRIPLES \
    $COADREAD_UNWEIGHTED_SADDLEPOINT_TRIPLES \
    $UCEC_UNWEIGHTED_SADDLEPOINT_TRIPLES \
    -ef $THCA_UNWEIGHTED_EXACT_TRIPLES $COADREAD_UNWEIGHTED_EXACT_TRIPLES \
    $UCEC_UNWEIGHTED_EXACT_TRIPLES \
    -ff $FIGURES_DIR/Figure3.png -tf $TABLES_DIR/Table1.tsv

# FIGURE 4 and TABLES 2 and 3: Pairs
python $SCRIPTS_DIR/pairs_summary.py -c THCA THCA THCA THCA COADREAD \
    COADREAD COADREAD COADREAD UCEC UCEC UCEC UCEC -np $PAIRS_PERMUTATIONS \
    -pf $THCA_PERMUTATIONAL_PAIRS $THCA_WEIGHTED_EXACT_PAIRS \
    $THCA_WEIGHTED_SADDLEPOINT_PAIRS $THCA_UNWEIGHTED_EXACT_PAIRS \
    $COADREAD_PERMUTATIONAL_PAIRS $COADREAD_WEIGHTED_EXACT_PAIRS \
    $COADREAD_WEIGHTED_SADDLEPOINT_PAIRS $COADREAD_UNWEIGHTED_EXACT_PAIRS \
    $UCEC_PERMUTATIONAL_PAIRS $UCEC_WEIGHTED_EXACT_PAIRS \
    $UCEC_WEIGHTED_SADDLEPOINT_PAIRS $UCEC_UNWEIGHTED_EXACT_PAIRS \
    -ff $FIGURES_DIR/Figure4.pdf -tf $TABLES_DIR/Table2.tsv $TABLES_DIR/Table3.tsv

# FIGURE 5: Unweighted vs. Permutational (N=10^6) and Weighted (N=10^3) vs.
# Permutational

python $SCRIPTS_DIR/triple_pval_scatter.py -np $TOTAL_PERMUTATIONS \
    -o $FIGURES_DIR/Figure5.png -uwf $COADREAD_UNWEIGHTED_EXACT_TRIPLES \
    $THCA_UNWEIGHTED_EXACT_TRIPLES $UCEC_UNWEIGHTED_EXACT_TRIPLES \
    -wf $COADREAD_WEIGHTED_SADDLEPOINT_TRIPLES \
    $THCA_WEIGHTED_SADDLEPOINT_TRIPLES $UCEC_WEIGHTED_SADDLEPOINT_TRIPLES \
    -pf $COADREAD_PERMUTATIONAL_TRIPLES $THCA_PERMUTATIONAL_TRIPLES \
    $UCEC_PERMUTATIONAL_TRIPLES

# TABLE 4: THCA results
echo "THCA"
python $SCRIPTS_DIR/results_table.py -lf $GENE_LENGTH_FILE \
    -mf $THCA_MUTATIONS -wf $THCA_WEIGHTED_SADDLEPOINT_TRIPLES \
    -uf $THCA_UNWEIGHTED_EXACT_TRIPLES -nt 5 -lt $LENGTH_THRESHOLD \
    -f $FDR_CUTOFF -o $TABLES_DIR/Table4

# TABLE 5: COADREAD results
echo "COADREAD"
python $SCRIPTS_DIR/results_table.py -lf $GENE_LENGTH_FILE \
    -mf $COADREAD_MUTATIONS -wf $COADREAD_WEIGHTED_SADDLEPOINT_TRIPLES \
    -uf $COADREAD_UNWEIGHTED_EXACT_TRIPLES -nt 5 -lt $LENGTH_THRESHOLD \
    -f $FDR_CUTOFF -o $TABLES_DIR/Table5

# TABLE 6: UCEC results
echo "UCEC"
python $SCRIPTS_DIR/results_table.py -lf $GENE_LENGTH_FILE \
    -mf $UCEC_MUTATIONS -wf $UCEC_WEIGHTED_SADDLEPOINT_TRIPLES \
    -uf $UCEC_UNWEIGHTED_EXACT_TRIPLES -nt 5 -lt $LENGTH_THRESHOLD \
    -f $FDR_CUTOFF -o $TABLES_DIR/Table6
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
matplotlib
seaborn
pandas
