Add support/contribute section and references to README.md

pfh · pfh · commit 91105e0014e7 · 2017-08-08T10:50:23.000+10:00
diff --git a/README.md b/README.md
@@ -23,7 +23,7 @@ devtools::install_github("MonashBioinformaticsPlatform/varistran")
 
 ### Dependencies
 
-By default, library size estimation is by TMM, implemented in edgeR from BioConductor. You will need to install this manually if you haven't already:
+By default, library size estimation is by TMM (Robinson and Oshlack, 2010), implemented in edgeR from BioConductor. You will need to install this manually if you haven't already:
 
 ```
 source("http://bioconductor.org/biocLite.R")
@@ -59,7 +59,7 @@ Say you have a count matrix `counts` and a design matrix `design`. To perform a
 y <- varistran::vst(counts, design=design)
 ```
 
-By default, Anscombe's variance stabilizing transformation for the negative binomial distribution is used. This behaves like log2 for large counts (log2 Counts-Per-Million if `cpm=T` is given).
+By default, Anscombe's (1948) variance stabilizing transformation for the negative binomial distribution is used. This behaves like log2 for large counts (log2 Counts-Per-Million if `cpm=T` is given).
 
 An appropriate dispersion is estimated with the aid of the design matrix. If omitted, this defaults to a column of ones, for blind estimation of the dispersion. This might slightly over-estimate the dispersion. A third possibility is to estimate the dispersion with edgeR.
 
@@ -115,6 +115,25 @@ make test
 
 Outputs are placed in a directory called `test_output`.
 
+Sources of data used in these tests are:
+
+* The [Bottomly dataset](http://bowtie-bio.sourceforge.net/recount/ExpressionSets/bottomly_eset.RData) from [ReCount](http://bowtie-bio.sourceforge.net/recount/).
+
+* The "arab" dataset provided in the [NBPSeq package](https://cran.rstudio.com/web/packages/NBPSeq/index.html).
+
+* Simulated data following negative binomial distributions.
+
+Dispersion estimates are compared to those calculated by the [edgeR Bioconductor package's](https://bioconductor.org/packages/release/bioc/html/edgeR.html) `estimateGLMCommonDisp` function (McCarthy, Chen and Smyth, 2012) and by the [DESeq2 Bioconductor package's](https://bioconductor.org/packages/release/bioc/html/DESeq2.html) `DESeq` function (Love, Huber and Anders, 2014).
+
+
+## Supporting/contributing
+
+Please email questions about using this software to the author at [paul.harrison@monash.edu](mailto:paul.harrison@monash.edu).
+
+Please submit bug reports and feature requests through the [GitHub issue tracker](https://github.com/MonashBioinformaticsPlatform/varistran/issues), or by [contacting the author](mailto:paul.harrison@monash.edu).
+
+Pull requests are gratefully considered.
+
 
 ## Links
 
@@ -123,3 +142,16 @@ Outputs are placed in a directory called `test_output`.
 * [RNA Systems Laboratory, Monash University](http://rnasystems.erc.monash.edu)
 
 
+## References
+
+Anscombe, Francis J. 1948. "The Transformation of Poisson, Binomial and Negative-Binomial Data." *Biometrika* 35 (3/4): 246–54.
+
+Love, Michael I., Wolfgang Huber and Simon Anders. 2014. "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2." *Genome Biology* 15 (12): 550. doi:10.1186/s13059-014-0550-8
+
+McCarthy, Davis J., Yunshun Chen and Gordon K. Smyth. 2012. "Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation." *Nucleic Acids Research* 40 (10): 4288–97. doi:10.1093/nar/gks042
+
+Robinson, Mark D. and Alicia Oshlack. 2010. "A scaling normalization method for differential expression analysis of RNA-seq data." *Genome Biology* 11 (3): R25. doi:10.1186/gb-2010-11-3-r25
+
+
+
+