Commit 23c0240

Added Pages
- MESN - ESGP

1 parent aef0f81

File tree

5 files changed (+120 −14 lines)


docs/core/core_esgp.md

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@

!!! summary
    The _Extended Skew Gaussian Process_ (ESGP) uses the [MESN](/core/core_prob_dist/#mesn) distribution to define its finite dimensional probability distribution. It can be viewed as a generalization of the _Gaussian Process_: as its skewness parameter approaches zero, the calculated probabilities become very close to Gaussian probabilities.

The ESGP model uses the conditioning property of the MESN distribution: just like the multivariate normal distribution, the MESN retains its form when conditioned on a subset of its dimensions.

Creating an ESGP model is very similar to creating a GP model in DynaML. The class [`#!scala ESGPModel[T, I]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.models.sgp.ESGPModel) can be instantiated much like the [`#!scala AbstractGPRegressionModel[T, I]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/v1.4.2/dynaml-core/#io.github.mandar2812.dynaml.models.gp.AbstractGPRegressionModel), using the `apply` method.

```scala
//Obtain the data, some generic type
val trainingData: DataType = ...

//Covariance kernel, noise kernel and mean function
val kernel: LocalScalarKernel[I] = _
val noiseKernel: LocalScalarKernel[I] = _
val meanFunc: DataPipe[I, Double] = _

//Skewness and cut-off parameters
val lambda = 1.5
val tau = 0.5

//Define how the data is converted to a compatible type
implicit val transform: DataPipe[DataType, Seq[(I, Double)]] = _

val model = ESGPModel(
  kernel, noiseKernel,
  meanFunc, lambda, tau,
  trainingData)
```
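The summary's claim that the ESGP collapses to a _Gaussian Process_ as the skewness parameter goes to zero can be checked on the one dimensional extended skew-normal density. Below is a plain-Scala sketch, independent of the DynaML API; the density formula matches the _Extended Skew Gaussian_ section of `core_prob_dist.md` (specialized to $\mu = 0$, $\sigma = 1$), and the normal CDF uses the standard Abramowitz and Stegun 26.2.17 approximation.

```scala
// Standard normal pdf
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

// Standard normal cdf (Abramowitz-Stegun 26.2.17 approximation, |error| < 7.5e-8)
def Phi(x: Double): Double = {
  val t = 1.0 / (1.0 + 0.2316419 * math.abs(x))
  val poly = t * (0.319381530 + t * (-0.356563782 +
    t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))))
  val p = 1.0 - phi(math.abs(x)) * poly
  if (x >= 0) p else 1.0 - p
}

// Extended skew-normal density (mu = 0, sigma = 1), normalized by Z = Phi(tau):
// f(x) = phi(x) * Phi(alpha*x + tau*sqrt(1 + alpha^2)) / Phi(tau)
def esn(x: Double, alpha: Double, tau: Double): Double =
  phi(x) * Phi(alpha * x + tau * math.sqrt(1 + alpha * alpha)) / Phi(tau)

// At alpha = 0 the density is exactly Gaussian; for small alpha it stays close.
val maxGap = (-500 to 500).map(_ / 50.0)
  .map(x => math.abs(esn(x, 1e-4, 0.7) - phi(x))).max
```

With `alpha = 0` the skewing factor cancels against the normalizer, leaving exactly the Gaussian density; `maxGap` stays tiny for small skewness.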

docs/core/core_gp.md

Lines changed: 0 additions & 6 deletions

```diff
@@ -11,12 +11,6 @@ _Gaussian Processes_ are powerful non-parametric predictive models, which repres
 
 _Gaussian Process_ models are well supported in DynaML: the [`#!scala AbstractGPRegressionModel[T, I]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.gp.AbstractGPRegressionModel) and [`#!scala AbstractGPClassification[T, I]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.gp.AbstractGPClassification) classes, which extend the [`#!scala StochasticProcessModel[T, I, Y, W]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.StochasticProcessModel) base trait, are the starting point for all GP implementations.
 
-The `#!scala StochasticProcessModel[T, I, Y, W]` trait contains the `#!scala predictiveDistribution[U <: Seq[I]](test: U): W` method, which returns the posterior predictive distribution (represented by the generic type `#!scala W`).
-
-The base trait is extended by [`#!scala SecondOrderProcessModel[T, I, Y, K, M, W]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.SecondOrderProcessModel), which defines a skeleton for processes that are defined by their first and second order statistics (_mean functions_ and _covariance functions_).
-
-Since for most applications it is assumed that the training data is standardized, the mean function is often chosen to be zero, $\mu(\mathbf{x}) = 0$; the covariance function, or kernel, then defines all the interesting behavior of _second order processes_. For more in-depth information on the types of covariance functions available, visit the [kernels](core/core_kernels.html) page.
-
 ## Gaussian Process Regression
```
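The zero-mean GP posterior described in the (removed) paragraphs above can be made concrete with a tiny worked example. This is a plain-Scala sketch, not the DynaML API: two training points, an RBF covariance $k(x, x') = e^{-(x - x')^2/2}$ (an illustrative choice), and the 2x2 linear algebra written out in closed form.

```scala
// A zero-mean GP regression posterior on two training points,
// worked out with closed-form 2x2 linear algebra.

// RBF covariance: k(x, x') = exp(-(x - x')^2 / 2)
def k(x1: Double, x2: Double): Double =
  math.exp(-0.5 * (x1 - x2) * (x1 - x2))

val xs = Array(0.0, 1.0)  // training inputs
val ys = Array(0.0, 1.0)  // training targets (assumed standardized)
val noise = 0.1           // observational noise variance

// Entries of K + noise * I; its 2x2 inverse is applied via the adjugate below
val a = k(xs(0), xs(0)) + noise
val b = k(xs(0), xs(1))
val c = k(xs(1), xs(1)) + noise
val det = a * c - b * b

// alpha = (K + noise * I)^{-1} y
val alpha0 = (c * ys(0) - b * ys(1)) / det
val alpha1 = (-b * ys(0) + a * ys(1)) / det

// Posterior predictive mean and variance at a test input
def posterior(xStar: Double): (Double, Double) = {
  val k0 = k(xStar, xs(0))
  val k1 = k(xStar, xs(1))
  val mean = k0 * alpha0 + k1 * alpha1
  val v0 = (c * k0 - b * k1) / det
  val v1 = (-b * k0 + a * k1) / det
  (mean, k(xStar, xStar) - (k0 * v0 + k1 * v1))
}

val (postMean, postVar) = posterior(0.5)
```

Far from the training data the predictive mean reverts to the prior mean (zero) and the variance to the prior variance, which is the behavior `predictiveDistribution` encapsulates for general kernels and index sets.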

docs/core/core_model_hierarchy.md

Lines changed: 3 additions & 1 deletion

@@ -44,7 +44,9 @@ Stochastic processes (or random functions) are general probabilistic models whic

### Continuous Processes

By continuous processes, we mean processes whose values lie in a continuous domain (such as $\mathbb{R}^d$). The [`#!scala ContinuousProcessModel[T, I, Y, W]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.ContinuousProcessModel) abstract class provides a template which can be extended to implement continuous random process models.

The [`#!scala ContinuousProcessModel`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.ContinuousProcessModel) class contains the method `predictionWithErrorBars()`, which takes as inputs the test data and a number of standard deviations, and generates predictions with upper and lower error bars around them.

### Second Order Processes
docs/core/core_prob_dist.md

Lines changed: 89 additions & 7 deletions

@@ -1,6 +1,6 @@

!!! summary
    The DynaML [`#! dynaml.probability.distributions`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.probability.distributions.package) package leverages and extends the `#! breeze.stats.distributions` package. Below is a list of the distributions implemented.

## Specifying Distributions

@@ -11,7 +11,7 @@ Every probability density function $\rho(x)$ defined over some domain $x \in \ma

An important analytical way to create skewed distributions was described by [Azzalini et al.](http://azzalini.stat.unipd.it/SN/skew-prop-aism.pdf) It consists of four components:

* A symmetric probability density $\varphi(.)$
* An odd function $w(.)$
* A cumulative distribution function $G(.)$ of some symmetric density
* A cut-off parameter $\tau$

@@ -21,13 +21,13 @@ $$

## Distributions API

The `#!scala Density[T]` and `#!scala Rand[T]` traits form the API entry points for implementing probability distributions in breeze. In the `#!scala dynaml.probability.distributions` package, these two traits are inherited by [`#!scala GenericDistribution[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/io/github/mandar2812/dynaml/probability/distributions/GenericDistribution), which is extended by the [`#!scala AbstractContinuousDistr[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.probability.distributions.AbstractContinuousDistr) and [`#!scala AbstractDiscreteDistr[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/io/github/mandar2812/dynaml/probability/distributions/AbstractDiscreteDistr) classes.

!!! tip "Distributions which can produce confidence intervals"
    The trait [`#!scala HasErrorBars[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.probability.distributions.HasErrorBars) can be mixed in to give a distribution the ability to produce error bars. To extend it, one has to implement the `#!scala confidenceInterval(s: Double): (T, T)` method.

!!! tip "Skewness"
    The [`#!scala SkewSymmDistribution[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/io/github/mandar2812/dynaml/probability/distributions/SkewSymmDistribution) class is the generic base implementation for the skew symmetric family of distributions in DynaML.

## Distributions Library
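A useful feature of the Azzalini construction is that it yields a unit-mass density for any admissible choice of the components. The plain-Scala check below assumes the usual form of the construction with $\tau = 0$, $f(x) = 2\,\varphi(x)\,G(w(x))$, and picks illustrative components: $\varphi$ the standard normal density, $G$ the logistic CDF (the CDF of a symmetric density), and $w(x) = x^3$ (an odd function).

```scala
// Azzalini skew construction, tau = 0 case: f(x) = 2 * phi(x) * G(w(x))
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)
def logisticCdf(x: Double): Double = 1.0 / (1.0 + math.exp(-x)) // CDF of a symmetric density
def w(x: Double): Double = x * x * x                             // any odd function works

def f(x: Double): Double = 2.0 * phi(x) * logisticCdf(w(x))

// Trapezoidal integration over [-10, 10]: total mass should be ~1
// regardless of the particular odd w and symmetric G chosen.
val h = 1e-3
val mass = h * (-10000 until 10000).map(i => (f(i * h) + f((i + 1) * h)) / 2).sum
```

The identity holds because $G(w(x)) + G(w(-x)) = 1$ for odd $w$ and symmetric $G$, so the skewing factor averages out against the symmetric $\varphi$.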
@@ -111,39 +111,121 @@ val d = TruncatedGaussian(mean, sigma, a, b)

### Skew Gaussian

#### Univariate

$\mathcal{X} \equiv \mathbb{R}$

$f(x) = \phi(\frac{x - \mu}{\sigma}) \Phi(\alpha (\frac{x-\mu}{\sigma}))$

$Z = \frac{1}{2}$

$\phi(.)$ and $\Phi(.)$ are the standard Gaussian density function and cumulative distribution function respectively.

#### Multivariate

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu}, {\Sigma}) \Phi(\mathbf{\alpha}^{\intercal} L^{-1}(\mathbf{x} - \mathbf{\mu}))$

$Z = \frac{1}{2}$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi(.)$ are the multivariate Gaussian density function and the standard univariate Gaussian cumulative distribution function respectively, and $L$ is the lower triangular Cholesky factor of $\Sigma$.

!!! note "Skewness parameter $\alpha$"
    The parameter $\alpha$ determines the skewness of the distribution, and its sign tells us in which direction the distribution has a fatter tail. In the univariate case $\alpha$ is a scalar, while in the multivariate case $\alpha \in \mathbb{R}^d$, so for the multivariate skew Gaussian distribution there is a skewness value for each dimension.

*Usage*:
```scala
//Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5
val d = SkewGaussian(a, mean, sigma)

//Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4)*1.5
val md = MultivariateSkewNormal(alpha, mu, cov)
```

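A standard property of the skew Gaussian (not stated above, but useful for sanity checks) is its closed-form mean, $\mu + \sigma\delta\sqrt{2/\pi}$ with $\delta = \alpha/\sqrt{1 + \alpha^2}$. The plain-Scala sketch below verifies it numerically for the normalized univariate density $2\,\phi(x)\,\Phi(\alpha x)$ with $\mu = 0$, $\sigma = 1$; the normal CDF uses the Abramowitz and Stegun 26.2.17 approximation.

```scala
// Normalized univariate skew-normal density: f(x) = 2 * phi(x) * Phi(alpha * x)
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

// Standard normal cdf (Abramowitz-Stegun 26.2.17 approximation)
def Phi(x: Double): Double = {
  val t = 1.0 / (1.0 + 0.2316419 * math.abs(x))
  val poly = t * (0.319381530 + t * (-0.356563782 +
    t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))))
  val p = 1.0 - phi(math.abs(x)) * poly
  if (x >= 0) p else 1.0 - p
}

val alpha = 2.0
def f(x: Double): Double = 2.0 * phi(x) * Phi(alpha * x)

// E[X] by trapezoidal integration over [-10, 10]
val h = 1e-3
val numericMean = h * (-10000 until 10000)
  .map(i => (i * h * f(i * h) + (i + 1) * h * f((i + 1) * h)) / 2).sum

// Closed form: delta * sqrt(2 / Pi), with delta = alpha / sqrt(1 + alpha^2)
val delta = alpha / math.sqrt(1 + alpha * alpha)
val closedForm = delta * math.sqrt(2 / math.Pi)
```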
### Extended Skew Gaussian

#### Univariate

The generalization of the univariate skew _Gaussian_ distribution.

$\mathcal{X} \equiv \mathbb{R}$

$f(x) = \phi(\frac{x - \mu}{\sigma}) \Phi(\alpha (\frac{x-\mu}{\sigma}) + \tau\sqrt{1 + \alpha^{2}})$

$Z = \Phi(\tau)$

$\phi(.)$ and $\Phi(.)$ are the standard Gaussian density function and cumulative distribution function respectively.

#### Multivariate

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu}, {\Sigma}) \Phi(\mathbf{\alpha}^{\intercal} L^{-1}(\mathbf{x} - \mathbf{\mu}) + \tau\sqrt{1 + \mathbf{\alpha}^{\intercal}\mathbf{\alpha}})$

$Z = \Phi(\tau)$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi(.)$ are the multivariate Gaussian density function and the standard univariate Gaussian cumulative distribution function respectively, and $L$ is the lower triangular Cholesky factor of $\Sigma$.

*Usage*:
```scala
//Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5
val c = 0.5
val d = ExtendedSkewGaussian(c, a, mean, sigma)

//Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4)*1.5
val tau = 0.2
val md = ExtendedMultivariateSkewNormal(tau, alpha, mu, cov)
```

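At $\tau = 0$ the normalizer is $\Phi(0) = 1/2$ and the extended skew Gaussian collapses to the plain skew Gaussian. This plain-Scala sketch confirms the reduction pointwise for the univariate densities (with $\mu = 0$, $\sigma = 1$, and the Abramowitz and Stegun normal-CDF approximation).

```scala
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

// Standard normal cdf (Abramowitz-Stegun 26.2.17 approximation)
def Phi(x: Double): Double = {
  val t = 1.0 / (1.0 + 0.2316419 * math.abs(x))
  val poly = t * (0.319381530 + t * (-0.356563782 +
    t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))))
  val p = 1.0 - phi(math.abs(x)) * poly
  if (x >= 0) p else 1.0 - p
}

// Normalized extended skew-normal:
// f(x) = phi(x) * Phi(alpha*x + tau*sqrt(1 + alpha^2)) / Phi(tau)
def esn(x: Double, alpha: Double, tau: Double): Double =
  phi(x) * Phi(alpha * x + tau * math.sqrt(1 + alpha * alpha)) / Phi(tau)

// Normalized skew-normal: f(x) = 2 * phi(x) * Phi(alpha * x)
def sn(x: Double, alpha: Double): Double = 2.0 * phi(x) * Phi(alpha * x)

// tau = 0  =>  the two densities agree pointwise
val gap = (-500 to 500).map(_ / 50.0)
  .map(x => math.abs(esn(x, 1.2, 0.0) - sn(x, 1.2))).max
```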
!!! warning "Confusing Nomenclature"
    The following distribution has a very similar form and name to the _extended skew Gaussian_ distribution shown above. But despite its deceptively similar formula, it is a very different object.

    We use the name MESN to denote the variant below, instead of its expanded form.

### MESN

The _Multivariate Extended Skew Normal_ (MESN) distribution was formulated by [Adcock and Shutes](https://www.sheffield.ac.uk/polopoly_fs/1.137010!/file/Adcock-Skew-normal-exponential-.pdf). It is given by

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu} + \mathbf{\alpha}\tau, {\Sigma} + \mathbf{\alpha}\mathbf{\alpha}^\intercal) \Phi\left(\frac{\mathbf{\alpha}^{\intercal} \Sigma^{-1}(\mathbf{x} - \mathbf{\mu}) + \tau}{\sqrt{1 + \mathbf{\alpha}^{\intercal}\Sigma^{-1}\mathbf{\alpha}}}\right)$

$Z = \Phi(\tau)$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi(.)$ are the multivariate Gaussian density function and the standard univariate Gaussian cumulative distribution function respectively.

*Usage*:
```scala
//Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5
val c = 0.5
val d = UESN(c, a, mean, sigma)

//Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4)*1.5
val tau = 0.2
val md = MESN(tau, alpha, mu, cov)
```

!!! seealso "_Extended Skew Gaussian Process_ (ESGP)"
    The MESN distribution is used to define the finite dimensional probabilities for the [ESGP](/core/core_esgp.md) process.
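In one dimension (what the usage snippet above calls `UESN`), the MESN formula can be checked numerically: integrating the unnormalized $f$ should recover exactly $Z = \Phi(\tau)$. Plain-Scala sketch with $\mu = 0$, $\Sigma = \sigma^2 = 1$, illustrative $\alpha$ and $\tau$, and the Abramowitz and Stegun normal-CDF approximation.

```scala
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

// Gaussian pdf with mean m and variance v
def phiScaled(x: Double, m: Double, v: Double): Double =
  math.exp(-0.5 * (x - m) * (x - m) / v) / math.sqrt(2 * math.Pi * v)

// Standard normal cdf (Abramowitz-Stegun 26.2.17 approximation)
def Phi(x: Double): Double = {
  val t = 1.0 / (1.0 + 0.2316419 * math.abs(x))
  val poly = t * (0.319381530 + t * (-0.356563782 +
    t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))))
  val p = 1.0 - phi(math.abs(x)) * poly
  if (x >= 0) p else 1.0 - p
}

// Unnormalized 1-d MESN density with mu = 0, sigma^2 = 1:
// f(x) = phi(x; alpha*tau, 1 + alpha^2) * Phi((alpha*x + tau) / sqrt(1 + alpha^2))
val alpha = 1.2
val tau = 0.5
def f(x: Double): Double =
  phiScaled(x, alpha * tau, 1 + alpha * alpha) *
    Phi((alpha * x + tau) / math.sqrt(1 + alpha * alpha))

// Total mass should equal the normalizer Z = Phi(tau)
val h = 1e-3
val mass = h * (-20000 until 20000).map(i => (f(i * h) + f((i + 1) * h)) / 2).sum
```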

mkdocs.yml

Lines changed: 1 addition & 0 deletions

```diff
@@ -37,6 +37,7 @@ pages:
   - Generalized Least Squares: 'core/core_gls.md'
   - Stochastic Processes:
     - Gaussian Processes: 'core/core_gp.md'
+    - Extended Skew Gaussian Processes: 'core/core_esgp.md'
     - Students T Processes: 'core/core_stp.md'
   - Neural Networks:
     - Feed Forward Networks: 'core/core_ffn_new.md'
```
