Commit 3594257

resolved conflicts after merge
2 parents 8175c05 + 05e861f commit 3594257

417 files changed: 2488 additions and 3522 deletions

.circleci/config.yml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ version: 2
 jobs:
   build:
     docker:
-      - image: circleci/python:3.6.8-jessie
+      - image: cimg/python:3.9.8-node
     steps:
       - checkout
       - run:

Makefile

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ docs: ## generate Sphinx HTML documentation, including API docs
 	sphinx-apidoc -o docs/ spladder
 	$(MAKE) -C docs clean
 	$(MAKE) -C docs html
-	$(BROWSER) docs/_build/html/index.html
+	$(BROWSER) docs/build/html/index.html
 
 servedocs: docs ## compile the docs watching for changes
 	watchmedo shell-command -p '*.rst' -c '$(MAKE) -C docs html' -R -D .

docs/source/conf.py

Lines changed: 2 additions & 2 deletions
@@ -24,9 +24,9 @@
 author = u'Andre Kahles'
 
 # The short X.Y version
-version = u'2.4'
+version = u'2.5'
 # The full version, including alpha/beta/rc tags
-release = u'2.4.2'
+release = u'2.5.0'
 
 
 # -- General configuration ---------------------------------------------------
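
The short version and the full release string are edited by hand in this hunk and have to be kept in sync. A minimal sketch, assuming one wanted to derive the short X.Y value from the release string instead; this is only an illustration of how the two settings relate, not what the commit does:

```python
# Hypothetical alternative for docs/source/conf.py: a single source of truth.
release = u'2.5.0'                              # full version, as set in this commit
version = u'.'.join(release.split(u'.')[:2])    # short X.Y version derived from it
print(version)
```

With this layout only `release` would need to change on a version bump.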

docs/source/file_formats.rst

Lines changed: 216 additions & 62 deletions
Large diffs are not rendered by default.

docs/source/img/splice_events.pdf

350 KB
Binary file not shown.

docs/source/img/splice_events.png

63.5 KB

docs/source/spladder_modes.rst

Lines changed: 98 additions & 18 deletions
@@ -121,7 +121,7 @@ In its latest version, SplAdder also supports (on an experimental level) CRAM co
 files as input. If you are using such files, in addition to the input filenames of the
 alignment files, also the path to the indexed reference sequence used for compression is required::
 
-    spladder build --bams alignment1.cram,alignment2.cram,... --cram-reference path/to/cram_ref.fa
+    spladder build --bams alignment1.cram,alignment2.cram,... --reference path/to/cram_ref.fa
 
 **Alignment**
 By default, SplAdder only uses primary alignments (in SAM/BAM the ones not carrying the 256
@@ -319,27 +319,107 @@ entirely (for instance to carry it out at a later point in time). This is done v
 
     spladder build ... --no-extract-ase ...
 
-SplAdder can currently extract 6 different types of alternative splicing events:
+**Event extraction**
+SplAdder can currently extract 6 different types of alternative splicing events:
 
-- exon skips (`exon_skip`)
-- intron retentions (`intron_retention`)
-- alternative 3' splice sites (`alt_3prime`)
-- alternative 5' splice sites (`alt_5prime`)
-- mutually exclusive exons (`mutex_exons`)
-- multiple (coordinated) exons skips (`mult_exon_skip`)
+- exon skips (`exon_skip`)
+- intron retentions (`intron_retention`)
+- alternative 3' splice sites (`alt_3prime`)
+- alternative 5' splice sites (`alt_5prime`)
+- mutually exclusive exons (`mutex_exons`)
+- multiple (coordinated) exons skips (`mult_exon_skip`)
 
-Per default all events of all types are extracted from the graph. To specify a single type or a
-subset of types (e.g., exon skips and mutually exclusive exons only), the user can specify the short
-names of the event types (as shown in parentheses above) as follows::
+Per default all events of all types are extracted from the graph. To specify a single type or a
+subset of types (e.g., exon skips and mutually exclusive exons only), the user can specify the short
+names of the event types (as shown in parentheses above) as follows::
 
-    spladder build ... --event-types exon_skip,mutex_exons ...
+    spladder build ... --event-types exon_skip,mutex_exons ...
+
+In some cases (for instance when integrating hundreds of alignment samples), the splicing graphs can
+grow very complex. To limit the running time, an upper bound for the maximum number of edges in the
+splicing graph of a gene to be used for event extraction is set. This threshold is 500 per default.
+To adapt this threshold, e.g., to 250, the user can specify::
+
+    spladder build ... --ase-edge-limit 250 ...
+
+**Event verification**
+Similar to graph validation, SplAdder also performs a step of splice event verification. Only
+verified events are reported as confident to the user. There are two ways in which the validity of
+an event can be established.
+
+The classical way for event verification is to use heuristic criteria based on the RNA-Seq
+evidence provided to SplAdder. Depending on the alternative event type, a different set of
+criteria is used. The tables below summarize the criteria currently in use for the different
+event types. The order and numbering of criteria is the same as used in the output files of
+SplAdder.
+
++----------------------------------------------------------------------------------------+
+| Multiple Exon Skip |
++====+===================================================================================+
+| 0 | exon coordinates are valid (>= 0 && start < stop && non-overlapping) & |
+| | skipped exon coverage >= FACTOR * mean(pre, after) |
++----+-----------------------------------------------------------------------------------+
+| 1 | inclusion count first intron >= threshold |
++----+-----------------------------------------------------------------------------------+
+| 2 | inclusion count last intron >= threshold |
++----+-----------------------------------------------------------------------------------+
+| 3 | avg inclusion count inner exons >= threshold |
++----+-----------------------------------------------------------------------------------+
+| 4 | skip count >= threshold |
++----+-----------------------------------------------------------------------------------+
+
++----+-----------------------------------------------------------------------------------+
+| Intron Retention |
++====+===================================================================================+
+| 0 | counts meet criteria for min_retention_cov, min_retention_region and |
+| | min_retention_rel_cov |
++----+-----------------------------------------------------------------------------------+
+| 1 | min_non_retention_count >= threshold |
++----+-----------------------------------------------------------------------------------+
+
++----+-----------------------------------------------------------------------------------+
+| Exon Skip |
++====+===================================================================================+
+| 0 | coverage of skipped exon is >= than FACTOR * mean(pre, after) |
++----+-----------------------------------------------------------------------------------+
+| 1 | inclusion count of first intron >= threshold |
++----+-----------------------------------------------------------------------------------+
+| 2 | inclusion count of second intron >= threshold |
++----+-----------------------------------------------------------------------------------+
+| 3 | skip count of exon >= threshold |
++----+-----------------------------------------------------------------------------------+
+
++----+-----------------------------------------------------------------------------------+
+| Alternative 3/5 Prime |
++====+===================================================================================+
+| 0 | coverage of diff region is at least FACTOR * coverage constant region |
++----+-----------------------------------------------------------------------------------+
+| 1 | both alternative introns are >= threshold |
++----+-----------------------------------------------------------------------------------+
+
++----+-----------------------------------------------------------------------------------+
+| Mutually Exclusive Exons |
++====+===================================================================================+
+| 0 | coverage of first alt exon is >= than FACTOR times average of pre and after |
++----+-----------------------------------------------------------------------------------+
+| 1 | coverage of second alt exon is >= than FACTOR times average of pre and after |
++----+-----------------------------------------------------------------------------------+
+| 2 | both introns neighboring first alt exon are confirmed >= threshold |
++----+-----------------------------------------------------------------------------------+
+| 3 | both introns neighboring second alt exon are confirmed >= threshold |
++----+-----------------------------------------------------------------------------------+
+
+In addition to the classical, RNA-Seq evidence based mode, since version 2.5 it is also possible
+to use the provided annotation to verify an existing event. In this mode each one of the
+criteria listed above is replaced with a lookup in the provided annotation. That is, if an
+intron is already annotated, it will be used for event verification irrespective of any RNA-Seq
+expression support. This mode is especially useful for single sample analysis, where a complete
+isoform switch might have occurred and only the alternative event path is supported by reads but
+not the annotated one. In this case, the event is still reported. This mode is switched off by
+default and can be activated via::
+
+    spladder build ... --use-anno-support ...
 
-In some cases (for instance when integrating hundreds of alignment samples), the splicing graphs can
-grow very complex. To limit the running time, an upper bound for the maximum number of edges in the
-splicing graph of a gene to be used for event extraction is set. This threshold is 500 per default.
-To adapt this threshold, e.g., to 250, the user can specify::
-
-    spladder build ... --ase-edge-limit 250 ...
 
 The ``test`` mode
 -----------------
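
The exon skip criteria added to the documentation in this hunk can be read as four independent checks. The following is a minimal sketch of that reading, not SplAdder's implementation: the class and function names, the default values for `factor` and `threshold`, and the rule that an event is confident only if all criteria hold are assumptions made for illustration. The annotation-based variant is likewise simplified to a membership test per intron.

```python
from dataclasses import dataclass


@dataclass
class ExonSkipEvidence:
    """Hypothetical container for the read evidence of one exon skip event."""
    cov_skipped: float   # mean coverage of the skipped exon
    cov_pre: float       # mean coverage of the exon before it
    cov_after: float     # mean coverage of the exon after it
    incl_first: int      # reads confirming the first inclusion intron
    incl_second: int     # reads confirming the second inclusion intron
    skip: int            # reads confirming the skipping intron


def verify_exon_skip(ev, factor=0.05, threshold=2):
    """Return criteria 0-3 in the order of the Exon Skip table.

    factor and threshold are placeholder defaults, not SplAdder's settings.
    """
    flank_mean = (ev.cov_pre + ev.cov_after) / 2.0
    return [
        ev.cov_skipped >= factor * flank_mean,  # 0: skipped exon coverage
        ev.incl_first >= threshold,             # 1: first inclusion intron
        ev.incl_second >= threshold,            # 2: second inclusion intron
        ev.skip >= threshold,                   # 3: skip count
    ]


def verify_by_annotation(introns, annotated_introns):
    """Simplified annotation mode: one lookup per intron of the event."""
    return [intron in annotated_introns for intron in introns]


def is_confident(criteria):
    """Assumed rule: every criterion has to be met."""
    return all(criteria)
```

Under these assumptions, an event with no reads on the skipping intron fails criterion 3 in the read-based mode but can still pass the annotation lookup if that intron is annotated, which matches the isoform-switch case described in the added text.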

requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 numpy>=1.14.6
+numba>=0.52.0
 matplotlib>=2.2
 scipy>=1.3
 intervaltree>=3.0.0

scripts/generate_test_results_events.sh

Lines changed: 4 additions & 2 deletions
@@ -15,17 +15,19 @@ for i in $(seq 1 20)
 do
     bams="$bams,${datadir}/align/testcase_${testname}_1_sample${i}.bam"
 done
-python -m spladder.spladder build -v -o ${outdir} -a ${datadir}/testcase_${testname}_spladder.gtf -b ${bams#,} --event-types exon_skip,intron_retention,alt_3prime,alt_5prime,mutex_exons,mult_exon_skip --readlen 50 --output-conf-icgc --output-txt --output-txt-conf --output-gff3 --output-struc --output-struc-conf --output-bed --output-conf-bed --output-conf-tcga
+python -m spladder.spladder build -v -o ${outdir} -a ${datadir}/testcase_${testname}_spladder.gtf -b ${bams#,} --event-types exon_skip,intron_retention,alt_3prime,alt_5prime,mutex_exons,mult_exon_skip --readlen 50 --output-conf-icgc --output-txt --output-txt-conf --output-gff3 --output-struc --output-struc-conf --output-bed --output-conf-bed --output-conf-tcga #--use-anno-support
 
 bamsA=bamlistA.txt
+rm -f $bamsA
 for i in $(seq 1 10)
 do
     echo "${datadir}/align/testcase_${testname}_1_sample${i}.bam" >> $bamsA
 done
 bamsB=bamlistB.txt
+rm -f $bamsB
 for i in $(seq 11 20)
 do
     echo "align/testcase_${testname}_1_sample${i}.bam" >> $bamsB
 done
-python -m spladder.spladder test -o ${outdir} -v --diagnose-plots -f ps --readlen 50 --merge-strat merge_graphs --event-types exon_skip -a $bamsA -b $bamsB
+python -m spladder.spladder test -o ${outdir} -v --diagnose-plots -f pdf --readlen 50 --merge-strat merge_graphs --event-types exon_skip -a $bamsA -b $bamsB --dpsi 0
 rm bamlistA.txt bamlistB.txt
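
The added `rm -f` lines matter because the loops build the list files with `>>`, which appends. If a list file is left over, for example from a run that aborted before the final `rm`, the next run would list every sample twice. A small sketch of that behaviour in Python, with hypothetical helper names that are not part of the repository:

```python
import os
import tempfile


def append_sample_list(path, samples):
    """Mirror `echo ... >> file`: append one alignment path per line."""
    with open(path, "a") as handle:
        for sample in samples:
            handle.write(sample + "\n")


def write_sample_list(path, samples):
    """Mirror `rm -f file` followed by the append loop: start from a clean file."""
    if os.path.exists(path):
        os.remove(path)
    append_sample_list(path, samples)


def count_lines(path):
    """Count the entries currently stored in a list file."""
    with open(path) as handle:
        return sum(1 for _ in handle)


if __name__ == "__main__":
    samples = ["align/sample%d.bam" % i for i in range(1, 11)]
    with tempfile.TemporaryDirectory() as tmp:
        listfile = os.path.join(tmp, "bamlistA.txt")
        append_sample_list(listfile, samples)
        append_sample_list(listfile, samples)  # second run without cleanup
        print(count_lines(listfile))           # 20: every sample listed twice
        write_sample_list(listfile, samples)   # cleanup first, then append
        print(count_lines(listfile))           # 10
```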

scripts/generate_test_results_events_cram.sh

Lines changed: 6 additions & 4 deletions
@@ -19,17 +19,19 @@ do
     bams="$bams,${datadir}/align/testcase_${testname}_1_sample${i}.bam"
 done
 #export REF_PATH=$genome
-python -m spladder.spladder build -v -o ${outdir} -a ${datadir}/testcase_${testname}_spladder.gtf -b ${crams#,} --event-types exon_skip,intron_retention,alt_3prime,alt_5prime,mutex_exons,mult_exon_skip --readlen 50 --output-conf-icgc --output-txt --output-txt-conf --output-gff3 --output-struc --output-struc-conf --output-bed --output-conf-bed --output-conf-tcga --cram-ref ${genome}
+python -m spladder.spladder build -v -o ${outdir} -a ${datadir}/testcase_${testname}_spladder.gtf -b ${crams#,} --event-types exon_skip,intron_retention,alt_3prime,alt_5prime,mutex_exons,mult_exon_skip --readlen 50 --output-conf-icgc --output-txt --output-txt-conf --output-gff3 --output-struc --output-struc-conf --output-bed --output-conf-bed --output-conf-tcga --reference ${genome}
 
 cramsA=cramlistA.txt
-for i in $(seq 1 10)
+rm -f $cramsA
+for i in 10 8 2 1 7 6 5 3 9 4
 do
     echo "${datadir}/align/testcase_${testname}_1_sample${i}.cram" >> $cramsA
 done
 cramsB=cramlistB.txt
-for i in $(seq 11 20)
+rm -f $cramsB
+for i in 20 13 17 11 12 19 15 14 16 18
 do
     echo "align/testcase_${testname}_1_sample${i}.cram" >> $cramsB
 done
-python -m spladder.spladder test -o ${outdir} -v --diagnose-plots -f ps --readlen 50 --merge-strat merge_graphs --event-types exon_skip -a $cramsA -b $cramsB
+python -m spladder.spladder test -o ${outdir} -v --diagnose-plots -f pdf --readlen 50 --merge-strat merge_graphs --event-types exon_skip -a $cramsA -b $cramsB --dpsi 0
 rm cramlistA.txt cramlistB.txt
