Commit aa13f86

Adding things left out before and formatting
1 parent c63d908 commit aa13f86

1 file changed

+15
-14
lines changed
README.md

Lines changed: 15 additions & 14 deletions
@@ -14,7 +14,7 @@ follows procedures already used in other publications. We first map
 the SMURF-seq reads using BWA:
 ```
 bwa mem -x ont2d -k 12 -W 12 \
-    -A 4 -B 10 -O 6 -E 3 -T 120 bwa-mem/index/hg19.fa \
+    -A 4 -B 10 -O 6 -E 3 -T 120 bwa-mem/index/hg19.fa \
 smurf_reads.fa > mapped_smurf_reads.sam
 ```
 The parameters for the Smith-Waterman scoring ('A', 'B', 'O' and 'E')
@@ -32,7 +32,7 @@ mapped fragments:
 -o unambig_smurf_frags.sam -s 120 -q 1
 ```
 
-Then the remaining fragments are given to a script that obtains
+Then the remaining fragments are given to a script that obtains
 the counts of reads in bins:
 ```
 ./getBinCounts.py -i unambig_smurf_frags.sam -c hg19.chrom.sizes \
@@ -52,21 +52,22 @@ counts divided by the average reads per bin. This information was
 determined based on what is required in the next script.
 
 In the next step we use an adaptation of a script originally due to
-ASDF.
+Timour et al. (Nat. Protocols, 2014). The script is run
+as follows:
 ```
 ./cnvAnalysis.R bin_counts.bed SampleName bins_5k_hg19_gc.txt bins_5k_hg19_exclude.txt
 ```
-The input file `bin_counts.bed` is the same as described above. The input file
-`bins_5k_hg19_gc.txt` is the GC content of each bin. The input `bins_5k_hg19_exclude.txt`
-is used to exclude certain parts of the genome that attract an unusual amount of reads.
-The format is simply the line numbers, in the corresponding bed file, of the bins
-to exclude from the CNV analysis. The first output is a PDF file
-`{SampleName}.5k.wg.nobad.pdf` for the CNV profile. In addition,
-two tables are saved: one table
-`{SampleName}.hg19.5k.nobad.varbin.data.txt` with the information
-(chromosome, genome position, GC content, bin count, segmented value) for each bin, and
-the other table `{SampleName}.hg19.5k.nobad.varbin.short.txt`
-summerizing the breakpoints in the CNV profile.
+The input file `bin_counts.bed` is the same as described above. The
+input file `bins_5k_hg19_gc.txt` is the GC content of each bin. The
+input `bins_5k_hg19_exclude.txt` is used to exclude certain parts of
+the genome that attract an unusual amount of reads. The format is
+simply the line numbers, in the corresponding bed file, of the bins to
+exclude from the CNV analysis. The first output is a PDF file
+`SampleName.pdf` for the CNV profile. In addition, two tables are
+saved: one table `SampleName.data.txt` with the information
+(chromosome, genome position, GC content, bin count, segmented value)
+for each bin, and the other table `SampleName.short.txt` summarizing
+the breakpoints in the CNV profile.
 
 ## Simulating SMURF-seq reads for evaluating mappers

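The exclude-file format described in the last hunk (line numbers, in the corresponding bed file, of the bins to leave out of the CNV analysis) can be illustrated with a minimal Python sketch. This is not taken from `cnvAnalysis.R`. The 1-based numbering, the absence of a header line, the four-column bed layout, the toy records, and the function names `load_excluded_lines` and `filter_bins` are all assumptions made for illustration.

```python
def load_excluded_lines(lines):
    """Parse exclude entries, assuming one 1-based line number per line."""
    stripped = (line.strip() for line in lines)
    return {int(token) for token in stripped if token}


def filter_bins(bed_lines, excluded):
    """Keep the bed records whose 1-based line number is not in `excluded`."""
    return [
        record
        for number, record in enumerate(bed_lines, start=1)
        if number not in excluded
    ]


# Hypothetical toy data: chrom, start, end, count (assumed layout, no header).
bed = [
    "chr1\t0\t5000\t12",
    "chr1\t5000\t10000\t250",  # line 2: listed in the exclude file below
    "chr1\t10000\t15000\t9",
    "chr1\t15000\t20000\t11",
]
exclude_file = ["2\n"]  # stands in for the contents of an exclude file

kept = filter_bins(bed, load_excluded_lines(exclude_file))
for record in kept:
    print(record)
```

Under these assumptions the second bin is dropped and the other three pass through unchanged. If the real files use 0-based indices or carry a header line, the `start=1` offset would need adjusting.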
