Commit 8a88dad

Merge branch 'master' into gene_id
2 parents 66e5eff + 431d02f commit 8a88dad

58 files changed: +2856 -364 lines changed

.DS_Store

6 KB
Binary file not shown.

CMakeLists.txt

Lines changed: 3 additions & 2 deletions
@@ -11,8 +11,9 @@ include(TestHelper)
 
 #versioning stuff
 set (regtools_VERSION_MAJOR 0)
-set (regtools_VERSION_MINOR 4)
-set (regtools_VERSION_PATCH 0)
+set (regtools_VERSION_MINOR 5)
+set (regtools_VERSION_PATCH 2)
+
 configure_file (
 "${PROJECT_SOURCE_DIR}/src/version.h.in"
 "${PROJECT_BINARY_DIR}/version.h"

Dockerfile

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+################################################################################
+##################### Set Inital Image to work from ############################
+
+# work from latest LTS ubuntu release
+FROM ubuntu:18.04
+
+# set variables
+ENV r_version 3.6.0
+
+# run update
+RUN apt-get update -y && apt-get install -y \
+gfortran \
+libreadline-dev \
+libpcre3-dev \
+libcurl4-openssl-dev \
+build-essential \
+zlib1g-dev \
+libbz2-dev \
+liblzma-dev \
+openjdk-8-jdk \
+wget \
+libssl-dev \
+libxml2-dev \
+libnss-sss \
+git \
+build-essential \
+cmake \
+python3
+
+################################################################################
+##################### Add Container Labels #####################################
+LABEL "Regtools_License"="MIT"
+LABEL "Description"="Software package which integrate DNA-seq and RNA-seq data\
+to help interpret mutations in a regulatory and splicing\
+context."
+
+################################################################################
+####################### Install R ##############################################
+
+# change working dir
+WORKDIR /usr/local/bin
+
+# install R
+RUN wget https://cran.r-project.org/src/base/R-3/R-${r_version}.tar.gz
+RUN tar -zxvf R-${r_version}.tar.gz
+WORKDIR /usr/local/bin/R-${r_version}
+RUN ./configure --prefix=/usr/local/ --with-x=no
+RUN make
+RUN make install
+
+# install R packages
+RUN R --vanilla -e 'install.packages(c("data.table", "plyr", "tidyverse"), repos = "http://cran.us.r-project.org")'
+
+################################################################################
+##################### Install Regtools #########################################
+
+# clone git repository
+RUN cd / && git clone https://github.com/griffithlab/regtools.git
+
+# make a build directory for regtools
+WORKDIR /regtools/
+
+
+# compile from source
+RUN mkdir build && cd build && cmake .. && make
+
+################################################################################
+###################### set environment path #################################
+
+# make a build directory for regtools
+WORKDIR /regtools/scripts/
+
+# add regtools executable to path
+ENV PATH="/regtools/build:/usr/local/bin/R-${r_version}:${PATH}"
+

README.md

Lines changed: 10 additions & 1 deletion
@@ -16,7 +16,7 @@ in a regulatory and splicing context.
 
 ## Installation
 
-Clone and install regtools by running:
+Clone and install regtools by running the following:
 ```
 git clone https://github.com/griffithlab/regtools
 cd regtools/
@@ -26,6 +26,8 @@ Clone and install regtools by running:
 make
 ```
 
+For convienience we also maintain a docker image available at [https://hub.docker.com/r/griffithlab/regtools/](https://hub.docker.com/r/griffithlab/regtools/)
+
 ## Usage:
 
 ```
@@ -55,6 +57,7 @@ If you would like to build the documentation locally, please install
 work on most machines. Then run `mkdocs serve` from within the `regtools`
 base directory.
 
+
 ## Acknowledgements
 
 Regtools uses several open-source libraries. We would like to thank the
@@ -64,3 +67,9 @@ useful comments and code.
 ## License
 
 The project is licensed under the [MIT license](https://opensource.org/licenses/MIT).
+
+## Stable release with DOI
+
+[![DOI](https://zenodo.org/badge/35841695.svg)](https://zenodo.org/badge/latestdoi/35841695)
+
+

docs/about.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
-# Regtools
-Regtools is a project by The Griffith Lab at the McDonnell Genome Institute.
+# RegTools
+RegTools is a project by The Griffith Lab at the McDonnell Genome Institute.
 The source for the project is on [Github.](https://github.com/griffithlab/regtools)
 
 ##License

docs/commands/cis-ase-identify.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ The `cis-ase identify` command is used to identify allele-specific expression ev
 | somatic-variants.vcf | Somatic variant calls in VCF format. The tool looks for allele specific expression at polymorphic loci near the somatic variants|
 | polymorphisms.vcf | List of polymorphic loci in the VCF format. RNA expression is checked at these sites to identify evidence of allele speciific expression|
 | dna-alignments.bam | Aligned DNA reads in the BAM format that has been indexed for example with `samtools index`. We have tested this command with alignments from BWA.|
-| dna-alignments.bam | Aligned RNAseq BAM produced with a splice aware aligner, that has been indexed for example with `samtools index`. We have tested this command with alignments from TopHat.|
+| rna-alignments.bam | Aligned RNAseq BAM produced with a splice aware aligner, that has been indexed for example with `samtools index`. We have tested this command with alignments from TopHat.|
 | ref.fa | The reference FASTA file. The donor and acceptor sequences used in the "splice-site" column of the annotated junctions are extracted from the FASTA file. |
 | annotations.gtf | The GTF file specifies the transcriptome that is used to annotate the junctions and variants. For examples, the Ensembl GTFs for release78 are [here](ftp://ftp.ensembl.org/pub/release-78/gtf/).|
 

docs/commands/cis-splice-effects-associate.md

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+[csei]: ../images/csei_examples.png
+
+###Synopsis
+The `cis-splice-effects associate` command is used to identify splicing misregulation events. This command is similar to `cis-splice-effects identify`, but takes the BED output of `junctions extract` in lieu of a BAM file with RNA alignments. The tool then proceeds to associate non-canonical splicing junctions near the variant sites.
+
+###Usage
+`regtools cis-splice-effects associate [options] variants.vcf junctions.bed ref.fa annotations.gtf`
+
+###Input
+| Input | Description |
+| ------ | ----------- |
+| variants.vcf | Variant call in VCF format from which to look for cis-splice-effects.|
+| junctions.bed | BED file of junctions to look through for evidence of splice events. The file is expected to be in the [BED12 format](junctions-extract.md#output) of the `junctions extract` output. |
+| ref.fa | The reference FASTA file. The donor and acceptor sequences used in the "splice-site" column of the annotated junctions are extracted from the FASTA file. |
+| annotations.gtf | The GTF file specifies the transcriptome that is used to annotate the junctions and variants. For examples, the Ensembl GTFs for release78 are [here](ftp://ftp.ensembl.org/pub/release-78/gtf/).|
+
+**Note** - Please make sure that the version of the annotation GTF that you use corresponds with the version of the assembly build (ref.fa) and that the co-ordinates in the VCF file are also from the same build.
+
+###Options
+| Option | Description |
+| ------ | ----------- |
+| -o STR | Output file containing the aberrant splice junctions with annotations. [STDOUT] |
+| -v STR | Output file containing variants annotated as splice relevant (VCF format). |
+| -j STR | Output file containing the aberrant junctions in BED12 format. |
+| -w INT | Window size in b.p to associate splicing events in. The tool identifies events in variant.start +/- w basepairs. Default behaviour is to look at the window between previous and next exons. |
+| -e INT | Maximum distance from the start/end of an exon to annotate a variant as relevant to splicing, the variant is in exonic space, i.e a coding variant. [3] |
+| -i INT | Maximum distance from the start/end of an exon to annotate a variant as relevant to splicing, the variant is in intronic space. [2] |
+| -I | Annotate variants in intronic space within a transcript(not to be used with -i). |
+| -E | Annotate variants in exonic space within a transcript(not to be used with -e). |
+| -S | Don't skip single exon transcripts. |
+
+###Output
+For an explanation of the annotated junctions that are identified by this command please refer to the output of the `junctions annotate` command [here](junctions-annotate.md#output)
+For an explanation of the annotated variants that are identified by this command when using the -v option, please refer to the output of the `variants annotate` command [here](variants-annotate.md#output)
+
+###Examples
+![cis-splice-effects identify example][csei]
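
Reviewer sketch (not part of the commit): to illustrate how the options documented for `cis-splice-effects associate` might be combined, the Python helper below assembles an argument list for the command. The function name `build_associate_command` and its keyword arguments are hypothetical; only the flags (`-o`, `-v`, `-j`, `-w`, `-e`, `-i`, `-I`, `-E`, `-S`), the "not to be used with" restrictions, and the positional order are taken from the new documentation. The sketch only builds the list and does not invoke regtools.

```python
from typing import List, Optional


def build_associate_command(
    variants_vcf: str,
    junctions_bed: str,
    ref_fa: str,
    annotations_gtf: str,
    output: Optional[str] = None,             # -o STR (regtools writes to STDOUT if omitted)
    annotated_variants: Optional[str] = None, # -v STR
    junctions_out: Optional[str] = None,      # -j STR
    window: Optional[int] = None,             # -w INT
    exonic_distance: Optional[int] = None,    # -e INT
    intronic_distance: Optional[int] = None,  # -i INT
    all_intronic: bool = False,               # -I
    all_exonic: bool = False,                 # -E
    keep_single_exon: bool = False,           # -S
) -> List[str]:
    """Assemble an argv list for `regtools cis-splice-effects associate`.

    Hypothetical helper: the names are illustrative, the flags are the
    ones listed in the documentation added by this commit.
    """
    # The documentation says -I must not be combined with -i, nor -E with -e.
    if all_intronic and intronic_distance is not None:
        raise ValueError("-I is not to be used with -i")
    if all_exonic and exonic_distance is not None:
        raise ValueError("-E is not to be used with -e")

    cmd = ["regtools", "cis-splice-effects", "associate"]
    # Options that take a value; skipped when left unset.
    for flag, value in (
        ("-o", output),
        ("-v", annotated_variants),
        ("-j", junctions_out),
        ("-w", window),
        ("-e", exonic_distance),
        ("-i", intronic_distance),
    ):
        if value is not None:
            cmd += [flag, str(value)]
    # Boolean switches.
    for flag, enabled in (("-I", all_intronic), ("-E", all_exonic), ("-S", keep_single_exon)):
        if enabled:
            cmd.append(flag)
    # Positional arguments, in the order given by the usage line.
    cmd += [variants_vcf, junctions_bed, ref_fa, annotations_gtf]
    return cmd
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where regtools is installed; the file names used by a caller are placeholders.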

docs/commands/cis-splice-effects-identify.md

Lines changed: 10 additions & 9 deletions
@@ -19,15 +19,16 @@ The `cis-splice-effects identify` command is used to identify splicing misregula
 ###Options
 | Option | Description |
 | ------ | ----------- |
-| -o | Output file containing the aberrant splice junctions. [STDOUT] |
-| -v | Output file containing variants annotated as splice relevant (VCF format). |
-| -w | Window around the variant file (in basepairs) to identify splicing events in. If specified the tool looks at +/- n b.p around the variant start position. For example -w 500 will look at a 1kb window around the variant. If this option is not specified, the default option is to look at a window that ranges from the start co-ordinate of the previous exon and ends at the end co-ordinate of the next exon i.e by treating the current exon as a cassette exon. |
-| -j | Optional file containing the aberrant junctions in BED12 format. |
-| -e | Maximum distance from the start/end of an exon to annotate a variant as relevant to splicing, the variant is in exonic space, i.e a coding variant. [default = 3] |
-| -i | Maximum distance from the start/end of an exon to annotate a variant as relevant to splicing, the variant is in intronic space. [default = 2] |
-| -I | Annotate variants in intronic space within a transcript (not to be used with -i).
-| -E | Annotate variants in exonic space within a transcript (not to be used with -e).
-| -S | Dont skip single exon transcripts. The default is to skip the single exon transcripts. |
+| -o STR | Output file containing the aberrant splice junctions with annotations. [STDOUT] |
+| -v STR | Output file containing variants annotated as splice relevant (VCF format). |
+| -j STR | Output file containing the aberrant junctions in BED12 format. |
+| -s INT | Strand specificity of RNA library preparation, where 0 = unstranded/XS, 1 = first-strand/RF, 2 = second-strand/FR. This option is required. If your alignments contain XS tags, these will be used in the "unstranded" mode. |
+| -w INT | Window size in b.p to identify splicing events in. The tool identifies events in variant.start +/- w basepairs. Default behaviour is to look at the window between previous and next exons. |
+| -e INT | Maximum distance from the start/end of an exon to annotate a variant as relevant to splicing, the variant is in exonic space, i.e a coding variant. [3] |
+| -i INT | Maximum distance from the start/end of an exon to annotate a variant as relevant to splicing, the variant is in intronic space. [2] |
+| -I | Annotate variants in intronic space within a transcript(not to be used with -i). |
+| -E | Annotate variants in exonic space within a transcript(not to be used with -e). |
+| -S | Don't skip single exon transcripts. |
 
 ###Output
 For an explanation of the annotated junctions that are identified by this command please refer to the output of the `junctions annotate` command [here](junctions-annotate.md#output)
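
Reviewer sketch (not part of the commit): the `-w` and `-s` descriptions above can be made concrete with a small Python example. It computes the `variant.start +/- w` interval and maps the three strand codes to the labels given in the table. The names `variant_window`, `describe_strand`, and `STRAND_MODES` are hypothetical, and clamping the left edge at 0 is an assumption of this sketch rather than documented regtools behaviour.

```python
from typing import Dict, Tuple

# Strand codes as listed for the -s option in the documentation.
STRAND_MODES: Dict[int, str] = {
    0: "unstranded/XS",
    1: "first-strand/RF",
    2: "second-strand/FR",
}


def variant_window(variant_start: int, w: int) -> Tuple[int, int]:
    """Return (start, end) of the region variant_start +/- w.

    Assumption of this sketch: the left edge is clamped at 0 so the
    interval never has a negative coordinate.
    """
    if w < 0:
        raise ValueError("window size must be non-negative")
    return max(0, variant_start - w), variant_start + w


def describe_strand(code: int) -> str:
    """Translate a -s value into the label used in the options table."""
    if code not in STRAND_MODES:
        raise ValueError("strand must be 0, 1 or 2")
    return STRAND_MODES[code]
```

With `w = 500`, a variant at position 10000 gives the interval 9500 to 10500, which matches the removed wording that `-w 500` looks at a 1kb window around the variant.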

docs/commands/commands.md

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ This set of tools helps identify and work with aberrant splicing events near var
 Below are links to detailed explanations of the `cis-splice-effects` sub-commands:
 
 - [identify](cis-splice-effects-identify.md)
+- [associate](cis-splice-effects-associate.md)
 
 ##cis-ase
 This set of tools helps identify and work with allele-specific-expression near variants, these could be somatic variants or germline polymorphisms/mutations. These variants are hypothesized to act in cis and affect how the gene is transcribed.

docs/commands/junctions-annotate.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ Gene Annotation databases such as Ensembl/RefSeq/UCSC etc. The goal of the annot
 ###Options
 | Option | Description |
 | ------ | ----------- |
-| -E | Do not skip single exon genes. The default is to skip the single exon genes while annotating junctions.|
+| -S | Do not skip single exon genes. The default is to skip the single exon genes while annotating junctions.|
 | -o | File to write output to. STDOUT by default. The output format is described [here](#output)|
 | -h | Display help message for this command.|
 
