Minor cleanup.

pfh · pfh · commit 9600bd347f04 · 2017-03-13T15:08:07.000+11:00
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -10,7 +10,7 @@ Description: Transform RNA-Seq count data so that variance due to biological
 Authors@R: person("Paul", "Harrison", email = "paul.harrison@monash.edu", role = c("aut", "cre"))
 Maintainer: Paul Harrison <paul.harrison@monash.edu>
 URL: https://github.com/MonashBioinformaticsPlatform/varistran
-Version: 1.0.0
+Version: 1.0.1
 License: LGPL-2.1 | file LICENSE
 Depends:
     grid
diff --git a/README.md b/README.md
@@ -2,12 +2,12 @@
 
 Varistran is an R package providing a Variance Stabilizing Transformation appropriate for RNA-Seq data, and a variety of diagnostic plots based on such transformation.
 
+* [Function reference](http://logarithmic.net/varistran/reference/index.html)
+
 * [Online demo](http://rnasystems.erc.monash.edu:3838/pfh/2015/demo-varistran)
 
 * [A slideshow describing Varistran](http://rnasystems.erc.monash.edu:3838/pfh/2016/varistran/)
 
-* [Function reference](http://logarithmic.net/varistran/reference/index.html)
-
 * [Poster for ABACBS 2015](doc/varistran-poster-abacbs-2015.pdf) [(on F1000, doi: 10.7490/f1000research.1110757.1)](http://f1000research.com/posters/4-1041)
 
 Varistran is developed by Paul Harrison (paul.harrison@monash.edu, [@paulfharrison](https://twitter.com/paulfharrison)) for the [Monash Bioinformatics platform](https://platforms.monash.edu/bioinformatics/).
@@ -44,7 +44,7 @@ y <- varistran::vst(counts, design=design)
 
 By default, Anscombe's variance stabilizing transformation for the negative binomial distribution is used. This behaves like log2 for large counts (log2 Counts-Per-Million if `cpm=T` is given).
 
-An appropriate dispersion is estimated with the aid of the design matrix. (If omitted, this defaults to a column of ones, for blind estimation of the dispersion. This might slightly over-estimate the dispersion. A third possibility is to estimate the dispersion with edgeR.)
+An appropriate dispersion is estimated with the aid of the design matrix. If omitted, this defaults to a column of ones, for blind estimation of the dispersion. This might slightly over-estimate the dispersion. A third possibility is to estimate the dispersion with edgeR.
 
 ### Diagnostic plots
 
@@ -84,7 +84,7 @@ varistran::shiny_report(counts=counts)
 * [Online demo](http://rnasystems.erc.monash.edu:3838/pfh/2015/demo-varistran)
 
 
-### Tests
+## Test suite
 
 After downloading the source code, a suite of tests can be run with:
 
@@ -100,3 +100,5 @@ Outputs are placed in a directory called `test_output`.
 * [Monash Bioinformatics Platform, Monash University](https://platforms.monash.edu/bioinformatics)
 
 * [RNA Systems Laboratory, Monash University](http://rnasystems.erc.monash.edu)
+
+
diff --git a/paper.md b/paper.md
@@ -1,5 +1,5 @@
 ---
-title: "Varistran: Anscombe's variance stabilizing transformation for RNA-Seq gene expression data"
+title: "Varistran: Anscombe's variance stabilizing transformation for RNA-seq gene expression data"
 tags:
   - RNA-Seq
   - gene expression
@@ -18,7 +18,7 @@ bibliography: paper.bib
 
 # Summary
 
-RNA-Seq measures RNA expression levels in a biological sample using high-throughput cDNA sequencing, producing counts of the number of reads aligning to each gene. Noise in RNA-Seq read count data is commonly modelled as following a negative binomial distribution, where the variance is a quadratic function of the expression level. However many statistical, machine learning, and visualization methods work best when the noise in a data set has equal variance. Varistran is an R package that uses Anscombe's [-@Anscombe1948] variance stabilizing transformation for the negative binomial distribution to transform RNA-Seq count data, so that the noise has equal variance across all measured gene expression levels. The transformed data may be treated as log~2~ transformed gene expression levels, but with variability reduced at low read counts. Varistran also includes a function to open a Shiny report with simple diagnostic visualizations, including a plot to assess how effective the variance stabilization was, a biplot of samples and genes, and a heatmap. This allows defective samples, sample mislabelling, and batch effects to be easily identified.
+RNA-seq measures RNA expression levels in a biological sample using high-throughput cDNA sequencing, producing counts of the number of reads aligning to each gene. Noise in RNA-seq read count data is commonly modelled as following a negative binomial distribution, where the variance is a quadratic function of the expression level. However many statistical, machine learning, and visualization methods work best when the noise in a data set has equal variance. Varistran is an R package that uses Anscombe's [-@Anscombe1948] variance stabilizing transformation for the negative binomial distribution to transform RNA-seq count data, so that the noise has equal variance across all measured gene expression levels. The transformed data may be treated as log~2~ transformed gene expression levels, but with variability reduced at low read counts. Varistran also includes a function to open a Shiny report with simple diagnostic visualizations, including a plot to assess how effective the variance stabilization was, a biplot of samples and genes, and a heatmap. This allows defective samples, sample mislabelling, and batch effects to be easily identified.
 
 # References