Commit 8a93f05

Work on Documentation

1 parent b34ae66 commit 8a93f05

File tree

116 files changed: +107 −2647 lines


.gitignore

Lines changed: 3 additions & 0 deletions

@@ -3,8 +3,11 @@
 *.dat
 project/*
 target/*
+tut/*
 src/main/scala/target/*
 .ensime_cache/
+R/.Rhistory
+R/.RData
 *.qsub
 *.jar
 *.zip

Makefile

Lines changed: 6 additions & 2 deletions

@@ -19,8 +19,8 @@ poisson_dglm_gibbs:
 	ssh topsy -t qstat
 
 seasonal_irregular_gibbs:
-	# sbt assembly
-	# cp target/scala-2.11/bayesian_dlms-assembly-0.2-SNAPSHOT.jar seasonal_dlm_irregular.jar
+	sbt assembly
+	cp target/scala-2.11/bayesian_dlms-assembly-0.2-SNAPSHOT.jar seasonal_dlm_irregular.jar
 	ssh topsy -t mkdir -p /share/nobackup/a9169110/seasonal_dlm_irregular/data
 	scp data/seasonal_dlm_irregular.csv topsy:/share/nobackup/a9169110/seasonal_dlm_irregular/data/.
 	scp seasonal_dlm_irregular.jar seasonal_dlm_irregular.qsub topsy:/share/nobackup/a9169110/seasonal_dlm_irregular/.
@@ -30,6 +30,10 @@ seasonal_irregular_gibbs:
 
 site: correlated second_order first_order seasonal
 
+knit_site:
+	sbt "tut"
+	RScript -e 'setwd(\"tut\"); rmarkdown::render_site()'
+
 correlated: simulate_correlated simulate_correlated_trend filter_correlated gibbs_correlated
 
 simulate_correlated_trend:

R/.Rhistory

Lines changed: 0 additions & 196 deletions
This file was deleted.

R/forecasting.Rmd

Lines changed: 1 addition & 3 deletions

@@ -1,8 +1,5 @@
 ---
 title: "Forecasting"
-author: "Jonathan Law"
-date: "20 November 2017"
-output: html_document
 ---
 
 # Forecasting using a DLM
@@ -78,6 +75,7 @@ The results of the forecasting and 95% prediction intervals are below:
 
 ```{r forecast_seasonal_dlm, echo=FALSE, message=FALSE, warning=FALSE}
 forecast = read_csv("../data/seasonal_model_forecast.csv")
+sims = read_csv("../data/seasonal_dlm.csv")
 
 forecast %>%
   rename(time = Time, forecast = Observation) %>%
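The forecasting chunk in this file plots forecasts with 95% prediction intervals. As an aside on what such an interval is, here is an illustrative standard-library Python sketch with made-up numbers (not the package's code): for a Gaussian forecast distribution the interval is the forecast mean plus or minus 1.96 forecast standard deviations.

```python
import math

def prediction_interval(mean, variance, z=1.96):
    """95% prediction interval for a Gaussian forecast distribution."""
    sd = math.sqrt(variance)
    return mean - z * sd, mean + z * sd

# Hypothetical one-step forecast mean and variance
lower, upper = prediction_interval(mean=2.5, variance=4.0)
print(lower, upper)
```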

R/gibbs_sampling.Rmd

Lines changed: 0 additions & 3 deletions

@@ -1,8 +1,5 @@
 ---
 title: "Gibbs Sampling"
-author: "Jonathan Law"
-date: "20 November 2017"
-output: html_document
 ---
 
 The DLM is simple to specify and the distribution of the latent-state can be learned on-line using the exact filtering recursions given in the [Kalman Filter](kalman_filter.html). It remains to determine the values of the static parameters, $V$ and $W$. A Gibbs sampler can be used for this purpose. Consider a general DLM given by the following equation:
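To give a flavour of one Gibbs step, here is a rough standard-library Python sketch (not the package's Scala implementation). It assumes a scalar observation equation $y_t = x_t + v_t$ and an InverseGamma(3, 3) prior on $V$, matching the prior used elsewhere in these docs; the full conditional for $V$ given the latent states is then inverse gamma.

```python
import random

def sample_v(ys, xs, prior_shape=3.0, prior_scale=3.0, seed=0):
    """One Gibbs step for the observation variance V of a scalar DLM
    y_t = x_t + v_t, v_t ~ N(0, V), with an InverseGamma(a, b) prior.
    The full conditional is InverseGamma(a + n/2, b + sum(residuals^2)/2)."""
    rng = random.Random(seed)
    residuals = [(y - x) ** 2 for y, x in zip(ys, xs)]
    shape = prior_shape + len(ys) / 2
    scale = prior_scale + sum(residuals) / 2
    # Draw the precision 1/V from a Gamma(shape, rate=scale), then invert
    precision = rng.gammavariate(shape, 1.0 / scale)
    return 1.0 / precision

# Hypothetical observations and latent states
v = sample_v(ys=[1.0, 2.1, 0.4], xs=[0.8, 1.9, 0.6])
```

Alternating this draw with a draw of $W$ from its full conditional and a draw of the states (e.g. by forward-filtering backward-sampling) gives the full Gibbs sampler.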

R/kalman_filter.Rmd

Lines changed: 0 additions & 3 deletions

@@ -1,8 +1,5 @@
 ---
 title: "Kalman Filtering"
-author: "Jonathan Law"
-date: "20 November 2017"
-output: html_document
 ---
 
 Dynamic Linear Models have a linear Gaussian latent-state and observation model which is amenable to exact filtering because of special properties of the Gaussian distribution. This means the distribution of the latent-state ($p(\textbf{x}_{0:T}|y_{1:T}, \theta)$) can be learned about exactly, this distribution is commonly called the filtering distribution. Suppose a time-dependent process is observed discretely at times $t = 1,\dots,T$, then a general DLM for this process can be written as:
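The filtering recursions referred to here can be sketched for the simplest case, a scalar local-level DLM with $F = G = 1$ (an illustrative Python translation, not the package's Scala code):

```python
def kalman_filter(ys, v, w, m0=0.0, c0=1.0):
    """Kalman filter for the scalar local-level DLM:
    y_t = x_t + N(0, v),  x_t = x_{t-1} + N(0, w)."""
    m, c = m0, c0
    filtered = []
    for y in ys:
        a, r = m, c + w          # predict: prior for x_t
        f, q = a, r + v          # one-step forecast of y_t
        k = r / q                # Kalman gain
        m = a + k * (y - f)      # posterior mean of x_t
        c = (1 - k) * r          # posterior variance of x_t
        filtered.append((m, c))
    return filtered

# With v = w = 1 and prior N(0, 1), the first filtered mean for y_1 = 1 is 2/3
states = kalman_filter([1.0, 0.5, 1.5], v=1.0, w=1.0)
```

Each iteration conditions the Gaussian prior on one more observation, so the whole filtering distribution is available in a single pass over the data.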

R/pmmh.Rmd

Lines changed: 41 additions & 14 deletions

@@ -1,30 +1,57 @@
 ---
 title: "Particle Marginal Metropolis Hastings"
-author: "Jonathan Law"
-date: "20 November 2017"
-output: html_document
 ---
 
 ```{r setup, include=FALSE}
-knitr::opts_chunk$set(echo = TRUE)
+knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
+library(tidyverse)
 ```
 
-## R Markdown
+The Particle Marginal Metropolis Hastings (PMMH) algorithm can be used to learn the parameters of a Dynamic Generalised Linear Model (DGLM).
 
-This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
+The algorithm uses an unbiased estimate of the marginal likelihood, $p(y_{1:T} | V, W)$, in a [Metropolis-Hastings Algorithm](https://en.wikipedia.org/wiki/Metropolis–Hastings_algorithm). The pseudo-marginal likelihood can be calculated using the Particle Filter.
 
-When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
+Firstly define a DGLM:
 
-```{r cars}
-summary(cars)
+```tut
+import dlm.model._
+
+val model = Dlm.polynomial(1)
+val dglm = Dglm.poisson(model)
+```
+
+Assume we have some observations of a process which can be modelled using the specified model:
+
+```scala
+val observations = // some observations
+```
+
+Then we need to define the proposal function, `def proposal: Parameters => Rand[Parameters]`. A symmetric proposal distribution is implemented in the `Metropolis` object. This proposal distribution is simply a Normal centred at the previous value of the parameters. In the case of the non-negative parameters, $V$, $W$ and $C_0$, the parameter is multiplied by the exponential of a Normal random number.
+
+The function accepts a parameter `delta`, a single `Double` value which controls the variance of the Gaussian proposal distribution:
+
+```tut
+def proposal = Metropolis.symmetricProposal(0.05) _
+```
+
+Then we need to specify a prior distribution for the parameters, which encapsulates our prior beliefs about them.
+
+```scala
+def prior(p: Dlm.Parameters): Double = ???
 ```
 
-## Including Plots
+Then we must specify the initial value of the parameters used to start the MCMC.
 
-You can also embed plots, for example:
+The PMMH algorithm is implemented in the package using the bootstrap particle filter:
 
-```{r pressure, echo=FALSE}
-plot(pressure)
+```scala
+val iters = MetropolisHastings.pmmh(
+  dglm,
+  observations,
+  proposal,
+  prior,
+  initP,
+  n = 500)
 ```
 
-Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
+With this specification there are two tuning parameters in the PMMH algorithm: the variance of the Gaussian proposal distribution and the number of particles in the Particle Filter. As the number of particles increases, so does the accuracy of the likelihood estimate; however, this also increases the computational time. The variance of the proposal distribution should be chosen so that the proportion of accepted moves in the PMMH algorithm is approximately one third, which helps the MCMC explore the full parameter posterior.
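To make the structure of the algorithm concrete, here is a minimal pseudo-marginal sketch in standard-library Python. This is not the package's `MetropolisHastings.pmmh`; it assumes a Poisson DGLM with a scalar random-walk state, a flat prior on $\log W$, and a multiplicative log-normal proposal, under which the acceptance ratio reduces to the ratio of particle-filter likelihood estimates.

```python
import math
import random

def poisson_logpdf(y, lam):
    """Log density of a Poisson(lam) observation y."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def particle_filter_ll(ys, w, n_particles=200, rng=None):
    """Bootstrap particle filter estimate of the log marginal likelihood for
    x_t = x_{t-1} + N(0, w),  y_t ~ Poisson(exp(x_t))."""
    rng = rng or random.Random(0)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    ll = 0.0
    for y in ys:
        # Propagate particles through the state equation
        xs = [x + rng.gauss(0.0, math.sqrt(w)) for x in xs]
        # Weight by the observation likelihood (log-sum-exp for stability)
        log_ws = [poisson_logpdf(y, math.exp(x)) for x in xs]
        mx = max(log_ws)
        ws = [math.exp(lw - mx) for lw in log_ws]
        ll += mx + math.log(sum(ws) / n_particles)
        # Multinomial resampling
        xs = rng.choices(xs, weights=ws, k=n_particles)
    return ll

def pmmh(ys, w0, n_iters=100, delta=0.05, rng=None):
    """Pseudo-marginal MH for the system variance W. The proposal multiplies
    W by exp(N(0, delta)), keeping it positive; with a flat prior on log W
    the accept ratio is just the ratio of likelihood estimates."""
    rng = rng or random.Random(1)
    w, ll = w0, particle_filter_ll(ys, w0, rng=rng)
    draws = []
    for _ in range(n_iters):
        w_prop = w * math.exp(rng.gauss(0.0, delta))
        ll_prop = particle_filter_ll(ys, w_prop, rng=rng)
        if math.log(rng.random()) < ll_prop - ll:
            w, ll = w_prop, ll_prop
        draws.append(w)
    return draws

# Hypothetical Poisson observations
draws = pmmh([1, 0, 2, 1, 3], w0=0.5, n_iters=20)
```

Because the likelihood estimate is unbiased, the chain still targets the exact parameter posterior despite the Monte Carlo noise in each estimate.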

build.sbt

Lines changed: 1 addition & 1 deletion

@@ -4,7 +4,7 @@ name := "bayesian_dlms"
 
 organization := "com.github.jonnylaw"
 
-version := "0.2-SNAPSHOT"
+version := "0.2"
 
 libraryDependencies ++= Seq(
   "org.scalanlp" %% "breeze" % "0.13.2",

docs/CorrelatedModel.html

Lines changed: 2 additions & 2 deletions

@@ -359,7 +359,7 @@ <h2>Example</h2>
 | steps.
 | <span class="fu">take</span>(<span class="dv">1000</span>).
 | toArray
-sims: Array[(dlm.<span class="fu">model</span>.<span class="fu">Data</span>, breeze.<span class="fu">linalg</span>.<span class="fu">DenseVector</span>[Double])] = Array((<span class="fu">Data</span>(<span class="fl">1.0</span>,Some(<span class="fu">DenseVector</span>(<span class="fl">0.18780608371492558</span>, -<span class="fl">1.2811069517113705</span>))),<span class="fu">DenseVector</span>(-<span class="fl">1.6333598038175137</span>, -<span class="fl">1.2098642314446573</span>, <span class="fl">1.0611206088835845</span>)), (<span class="fu">Data</span>(<span class="fl">2.0</span>,Some(<span class="fu">DenseVector</span>(-<span class="fl">1.4079351946947054</span>, -<span class="fl">0.39579091796482335</span>))),<span class="fu">DenseVector</span>(-<span class="fl">1.280046324087481</span>, <span class="fl">1.3561555779052163</span>, <span class="fl">1.379094457757263</span>)), (<span class="fu">Data</span>(<span class="fl">3.0</span>,Some(<span class="fu">DenseVector</span>(-<span class="fl">2.0570321758361265</span>, -<span class="fl">0.48806716925707305</span>))),<span class="fu">DenseVector</span>(-<span class="fl">1.6696489916019377</span>, <span class="fl">1.2028039975126408</span>, <span class="fl">2.043780656767615</span>)), (<span class="fu">Data</span>(<span class="fl">4.0</span>,Some(<span class="fu">DenseVector</span>(-<span class="fl">0.5837527923101483</span>, <span class="fl">1.577550247930801</span>))),<span class="fu">DenseVector</span>(-<span class="fl">0.703575343271954</span>, <span class="fl">2.8795846326668597</span>, <span class="fl">2.821835578624862</span>)), (<span class="fu">Data</span>(<span class="fl">5.0</span>,Some(<span class="fu">DenseVector</span>(<span class="fl">1.4691063075964887</span>, <span class="fl">9.036001089172327</span>))),<span class="fu">DenseVector</span>(-<span class="fl">0.5682837648723814</span>, <span class="fl">6.305715528029837</span>, <span class="fl">3.337370040495271</span>)), (Da...</code></pre></div>
+sims: Array[(dlm.<span class="fu">model</span>.<span class="fu">Data</span>, breeze.<span class="fu">linalg</span>.<span class="fu">DenseVector</span>[Double])] = Array((<span class="fu">Data</span>(<span class="fl">1.0</span>,Some(<span class="fu">DenseVector</span>(<span class="fl">0.2795169095332988</span>, <span class="fl">0.4228914998737442</span>))),<span class="fu">DenseVector</span>(-<span class="fl">1.021997453107498</span>, <span class="fl">0.41788587205046346</span>, -<span class="fl">0.7676453289139624</span>)), (<span class="fu">Data</span>(<span class="fl">2.0</span>,Some(<span class="fu">DenseVector</span>(-<span class="fl">0.1467090536515817</span>, -<span class="fl">0.5514870663003566</span>))),<span class="fu">DenseVector</span>(<span class="fl">0.2976011394213929</span>, <span class="fl">0.2732836274171595</span>, <span class="fl">0.2581851139281812</span>)), (<span class="fu">Data</span>(<span class="fl">3.0</span>,Some(<span class="fu">DenseVector</span>(<span class="fl">0.5245299154943004</span>, -<span class="fl">0.31008920182005384</span>))),<span class="fu">DenseVector</span>(-<span class="fl">0.41720512294420886</span>, <span class="fl">0.7144327391964618</span>, -<span class="fl">1.119935516050294</span>)), (<span class="fu">Data</span>(<span class="fl">4.0</span>,Some(<span class="fu">DenseVector</span>(<span class="fl">1.261729015781143</span>, -<span class="fl">0.491160151330296</span>))),<span class="fu">DenseVector</span>(<span class="fl">0.562191402689851</span>, <span class="fl">0.16436818272845155</span>, -<span class="fl">0.42949904353276547</span>)), (<span class="fu">Data</span>(<span class="fl">5.0</span>,Some(<span class="fu">DenseVector</span>(<span class="fl">0.9794141811393307</span>, -<span class="fl">5.586163261365406</span>))),<span class="fu">DenseVector</span>(<span class="fl">2.045265636322294</span>, -<span class="fl">4.029750617159588</span>, -<span class="fl">0.9496851827116551</span>)), ...</code></pre></div>
 <p>The figure below shows a simulation from this composed model:</p>
 <p><img src="CorrelatedModel_files/figure-html/correlated-simulated-1.png" width="672" /></p>
 </div>
@@ -392,7 +392,7 @@ <h2>Gibbs Sampling Correlated Model</h2>
 | <span class="fu">InverseGamma</span>(<span class="fl">3.0</span>, <span class="fl">3.0</span>),
 | p,
 | sims.<span class="fu">map</span>(_._<span class="dv">1</span>))
-iters: breeze.<span class="fu">stats</span>.<span class="fu">distributions</span>.<span class="fu">Process</span>[dlm.<span class="fu">model</span>.<span class="fu">GibbsSampling</span>.<span class="fu">State</span>] = breeze.<span class="fu">stats</span>.<span class="fu">distributions</span>.<span class="fu">MarkovChain</span>$$anon$<span class="dv">1</span>@1e11c919</code></pre></div>
+iters: breeze.<span class="fu">stats</span>.<span class="fu">distributions</span>.<span class="fu">Process</span>[dlm.<span class="fu">model</span>.<span class="fu">GibbsSampling</span>.<span class="fu">State</span>] = breeze.<span class="fu">stats</span>.<span class="fu">distributions</span>.<span class="fu">MarkovChain</span>$$anon$<span class="dv">1</span>@478cb7de</code></pre></div>
 <div class="figure">
 <img src="CorrelatedModel_files/figure-html/correlated-v-diagnostics-1.png" alt="Diagnostic plots for the MCMC draws from the posterior distribution of the non-zero diagonal elements of the Observation noise covariance matrix for the simulated correlated model" width="672" />
 <p class="caption">