## Synopsis
``` shell
- $ dnmtools roi [OPTIONS] <intervals.bed> <input.meth>
+ $ dnmtools roi [OPTIONS] <intervals.bed> <input.counts>
```

## Description
@@ -17,15 +17,18 @@ found in the documentation for the `levels` command.

The `roi` command requires two input files. The first is a
sorted [counts output file](../counts),
- i.e. `input.meth` in the example above. This file provides data for
+ i.e. `input.counts` in the example above. This file provides data for
every site, either a cytosine or CpG, that is of interest. The second
input file (`intervals.bed`) specifies the genomic intervals in which
methylation statistics should be summarized. If either file is not
sorted by (chrom,end,start,strand) it can be sorted using the
following command:
``` shell
- $ LC_ALL=C sort -k 1,1 -k 3,3n -k 2,2n -k 6,6 -o input-sorted.meth input.meth
+ $ LC_ALL=C sort -k 1,1 -k 3,3n -k 2,2n -k 6,6 -o input-sorted.counts input.counts
```
+ Note: As of v1.4.0, the sorted order of chromosomes/targets within these
+ files is not important, but the sites within each chromosome must
+ still be sorted.

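Before running `roi`, it can help to confirm that a file already follows the expected order. The sketch below wraps `sort -c` (the standard check-only mode of `sort`) with the same keys as the command above. The function name `is_roi_sorted` is hypothetical and not part of dnmtools. Because it also compares the first key, it is stricter than the v1.4.0 requirement: a file whose chromosomes appear in a different order, but whose sites are sorted within each chromosome, would be reported as unsorted here even though `roi` may accept it.

```shell
# Hypothetical helper (not part of dnmtools): succeeds if the file is
# already in the order produced by the sort command above.
is_roi_sorted() {
    # -c only checks the order; nothing is written or modified.
    LC_ALL=C sort -c -k 1,1 -k 3,3n -k 2,2n -k 6,6 "$1" 2>/dev/null
}
```

For example, `is_roi_sorted intervals.bed || echo "intervals.bed needs sorting"` prints the message only when the check fails.
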
The intervals must be specified as a BED format file, and these can be
sorted using [bedtools
@@ -35,9 +38,19 @@ formats: (1) 6-column BED format, which may have more than 6 columns,
but requires the first 6 columns to match the specification, or (2)
3-column BED format.

+ *An important note about the input files:* several aspects of the
+ output for `roi` depend on the number of sites within each region of
+ interest. If the `.counts` file provided as input does not have all
+ the sites you might expect, for example if it is missing sites that
+ have been excluded from some earlier step in your pipeline, then the
+ results will be affected. We hope to make `roi` more robust to this
+ issue in the future, for example by accepting some information about
+ the reference genome to ensure that the numbers of sites are as
+ expected by the user.
+
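One way to notice missing sites before running `roi` is to tally how many sites the `.counts` file holds per chromosome and compare that against what you expect from your reference. This is a minimal sketch, assuming the chromosome name is the first whitespace-separated column and that any header lines begin with `#`. The function name `sites_per_chrom` is hypothetical and not part of dnmtools.

```shell
# Hypothetical helper (not part of dnmtools): print "chrom<TAB>count"
# for each chromosome named in the first column of the given file.
sites_per_chrom() {
    # Skip header lines (assumed to start with '#') and empty lines,
    # then count one site per remaining line, keyed by column 1.
    awk '!/^#/ && NF > 0 { n[$1]++ }
         END { for (c in n) printf "%s\t%d\n", c, n[c] }' "$1" |
        LC_ALL=C sort -k 1,1
}
```

Running `sites_per_chrom input-sorted.counts` and comparing the totals with the number of cytosines or CpGs you expect per chromosome gives a rough indication of whether an earlier filtering step dropped sites.
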
From there, the `roi` command can be run as follows:
``` shell
- $ dnmtools roi -o output.bed regions.bed input-sorted.meth
+ $ dnmtools roi -o output.bed regions.bed input-sorted.counts
```

The default output format is a 6-column BED format file, with the