Commit 23c0240

Added Pages
- MESN - ESGP

1 parent aef0f81

File tree

5 files changed (+120 −14 lines)


docs/core/core_esgp.md

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@

!!! summary
    The _Extended Skew Gaussian Process_ (ESGP) uses the [MESN](/core/core_prob_dist/#mesn) distribution to define its finite dimensional probability distribution. It can be viewed as a generalization of the _Gaussian Process_: as its skewness parameter approaches zero, the calculated probabilities become very close to Gaussian probabilities.

The ESGP model uses the conditioning property of the MESN distribution: just like the multivariate normal distribution, the MESN retains its form when conditioned on a subset of its dimensions.

Creating an ESGP model is very similar to creating a GP model in DynaML. The class [`#!scala ESGPModel[T, I]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.models.sgp.ESGPModel) can be instantiated much like the [`#!scala AbstractGPRegressionModel[T, I]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/v1.4.2/dynaml-core/#io.github.mandar2812.dynaml.models.gp.AbstractGPRegressionModel), using the `apply` method.

```scala
//Obtain the data, some generic type
val trainingData: DataType = ...

//Covariance kernel, noise kernel and mean function
val kernel: LocalScalarKernel[I] = _
val noiseKernel: LocalScalarKernel[I] = _
val meanFunc: DataPipe[I, Double] = _

//Skewness and cut-off parameters
val lambda = 1.5
val tau = 0.5

//Define how the data is converted to a compatible type
implicit val transform: DataPipe[DataType, Seq[(I, Double)]] = _

val model = ESGPModel(
  kernel, noiseKernel,
  meanFunc, lambda, tau,
  trainingData)
```
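The summary's claim that the ESGP collapses to a _Gaussian Process_ as the skewness parameter goes to zero can be checked on the one dimensional extended skew-normal density. Below is a plain-Scala sketch, independent of the DynaML API; the density formula matches the _Extended Skew Gaussian_ section of `core_prob_dist.md` (specialized to $\mu = 0$, $\sigma = 1$), and the normal CDF uses the standard Abramowitz and Stegun 26.2.17 approximation.

```scala
// Standard normal pdf
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

// Standard normal cdf (Abramowitz-Stegun 26.2.17 approximation, |error| < 7.5e-8)
def Phi(x: Double): Double = {
  val t = 1.0 / (1.0 + 0.2316419 * math.abs(x))
  val poly = t * (0.319381530 + t * (-0.356563782 +
    t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))))
  val p = 1.0 - phi(math.abs(x)) * poly
  if (x >= 0) p else 1.0 - p
}

// Extended skew-normal density (mu = 0, sigma = 1), normalized by Z = Phi(tau):
// f(x) = phi(x) * Phi(alpha*x + tau*sqrt(1 + alpha^2)) / Phi(tau)
def esn(x: Double, alpha: Double, tau: Double): Double =
  phi(x) * Phi(alpha * x + tau * math.sqrt(1 + alpha * alpha)) / Phi(tau)

// At alpha = 0 the density is exactly Gaussian; for small alpha it stays close.
val maxGap = (-500 to 500).map(_ / 50.0)
  .map(x => math.abs(esn(x, 1e-4, 0.7) - phi(x))).max
```

With `alpha = 0` the skewing factor cancels against the normalizer, leaving exactly the Gaussian density; `maxGap` stays tiny for small skewness.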

docs/core/core_gp.md

Lines changed: 0 additions & 6 deletions

```diff
@@ -11,12 +11,6 @@ _Gaussian Processes_ are powerful non-parametric predictive models, which repres
 
 _Gaussian Process_ models are well supported in DynaML: the [`#!scala AbstractGPRegressionModel[T, I]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.gp.AbstractGPRegressionModel) and [`#!scala AbstractGPClassification[T, I]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.gp.AbstractGPClassification) classes, which extend the [`#!scala StochasticProcessModel[T, I, Y, W]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.StochasticProcessModel) base trait, are the starting point for all GP implementations.
 
-The `#!scala StochasticProcessModel[T, I, Y, W]` trait contains the `#!scala predictiveDistribution[U <: Seq[I]](test: U): W` method, which returns the posterior predictive distribution (represented by the generic type `#!scala W`).
-
-The base trait is extended by [`#!scala SecondOrderProcessModel[T, I, Y, K, M, W]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.SecondOrderProcessModel), which defines a skeleton for processes that are defined by their first and second order statistics (_mean functions_ and _covariance functions_).
-
-Since for most applications it is assumed that the training data is standardized, the mean function is often chosen to be zero, $\mu(\mathbf{x}) = 0$; the covariance function, or kernel, then defines all the interesting behavior of _second order processes_. For more in-depth information on the types of covariance functions available, visit the [kernels](core/core_kernels.html) page.
-
 ## Gaussian Process Regression
```
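The zero-mean GP posterior described in the (removed) paragraphs above can be made concrete with a tiny worked example. This is a plain-Scala sketch, not the DynaML API: two training points, an RBF covariance $k(x, x') = e^{-(x - x')^2/2}$ (an illustrative choice), and the 2x2 linear algebra written out in closed form.

```scala
// A zero-mean GP regression posterior on two training points,
// worked out with closed-form 2x2 linear algebra.

// RBF covariance: k(x, x') = exp(-(x - x')^2 / 2)
def k(x1: Double, x2: Double): Double =
  math.exp(-0.5 * (x1 - x2) * (x1 - x2))

val xs = Array(0.0, 1.0)  // training inputs
val ys = Array(0.0, 1.0)  // training targets (assumed standardized)
val noise = 0.1           // observational noise variance

// Entries of K + noise * I; its 2x2 inverse is applied via the adjugate below
val a = k(xs(0), xs(0)) + noise
val b = k(xs(0), xs(1))
val c = k(xs(1), xs(1)) + noise
val det = a * c - b * b

// alpha = (K + noise * I)^{-1} y
val alpha0 = (c * ys(0) - b * ys(1)) / det
val alpha1 = (-b * ys(0) + a * ys(1)) / det

// Posterior predictive mean and variance at a test input
def posterior(xStar: Double): (Double, Double) = {
  val k0 = k(xStar, xs(0))
  val k1 = k(xStar, xs(1))
  val mean = k0 * alpha0 + k1 * alpha1
  val v0 = (c * k0 - b * k1) / det
  val v1 = (-b * k0 + a * k1) / det
  (mean, k(xStar, xStar) - (k0 * v0 + k1 * v1))
}

val (postMean, postVar) = posterior(0.5)
```

Far from the training data the predictive mean reverts to the prior mean (zero) and the variance to the prior variance, which is the behavior `predictiveDistribution` encapsulates for general kernels and index sets.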

docs/core/core_model_hierarchy.md

Lines changed: 3 additions & 1 deletion

@@ -44,7 +44,9 @@ Stochastic processes (or random functions) are general probabilistic models whic

### Continuous Processes

By continuous processes, we mean processes whose values lie in a continuous domain (such as $\mathbb{R}^d$). The [`#!scala ContinuousProcessModel[T, I, Y, W]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.ContinuousProcessModel) abstract class provides a template which can be extended to implement continuous random process models.

The [`#!scala ContinuousProcessModel`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/index.html#io.github.mandar2812.dynaml.models.ContinuousProcessModel) class contains the method `predictionWithErrorBars()`, which takes as inputs the test data and a number of standard deviations, and generates predictions with upper and lower error bars around them.

### Second Order Processes
docs/core/core_prob_dist.md

Lines changed: 89 additions & 7 deletions

@@ -1,6 +1,6 @@

!!! summary
    The DynaML [`#! dynaml.probability.distributions`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.probability.distributions.package) package leverages and extends the `#! breeze.stats.distributions` package. Below is a list of the distributions implemented.

## Specifying Distributions

@@ -11,7 +11,7 @@ Every probability density function $\rho(x)$ defined over some domain $x \in \ma

An important analytical way to create skewed distributions was described by [Azzalini et al.](http://azzalini.stat.unipd.it/SN/skew-prop-aism.pdf) It consists of four components:

* A symmetric probability density $\varphi(.)$
* An odd function $w(.)$
* A cumulative distribution function $G(.)$ of some symmetric density
* A cut-off parameter $\tau$

@@ -21,13 +21,13 @@ $$

## Distributions API

The `#!scala Density[T]` and `#!scala Rand[T]` traits form the API entry points for implementing probability distributions in breeze. In the `#!scala dynaml.probability.distributions` package, these two traits are inherited by [`#!scala GenericDistribution[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/io/github/mandar2812/dynaml/probability/distributions/GenericDistribution), which is extended by the [`#!scala AbstractContinuousDistr[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.probability.distributions.AbstractContinuousDistr) and [`#!scala AbstractDiscreteDistr[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/io/github/mandar2812/dynaml/probability/distributions/AbstractDiscreteDistr) classes.

!!! tip "Distributions which can produce confidence intervals"
    The trait [`#!scala HasErrorBars[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.probability.distributions.HasErrorBars) can be mixed in to give a distribution the ability to produce error bars. To extend it, one has to implement the `#!scala confidenceInterval(s: Double): (T, T)` method.

!!! tip "Skewness"
    The [`#!scala SkewSymmDistribution[T]`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/io/github/mandar2812/dynaml/probability/distributions/SkewSymmDistribution) class is the generic base implementation for the skew symmetric family of distributions in DynaML.

## Distributions Library
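A useful feature of the Azzalini construction is that it yields a unit-mass density for any admissible choice of the components. The plain-Scala check below assumes the usual form of the construction with $\tau = 0$, $f(x) = 2\,\varphi(x)\,G(w(x))$, and picks illustrative components: $\varphi$ the standard normal density, $G$ the logistic CDF (the CDF of a symmetric density), and $w(x) = x^3$ (an odd function).

```scala
// Azzalini skew construction, tau = 0 case: f(x) = 2 * phi(x) * G(w(x))
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)
def logisticCdf(x: Double): Double = 1.0 / (1.0 + math.exp(-x)) // CDF of a symmetric density
def w(x: Double): Double = x * x * x                             // any odd function works

def f(x: Double): Double = 2.0 * phi(x) * logisticCdf(w(x))

// Trapezoidal integration over [-10, 10]: total mass should be ~1
// regardless of the particular odd w and symmetric G chosen.
val h = 1e-3
val mass = h * (-10000 until 10000).map(i => (f(i * h) + f((i + 1) * h)) / 2).sum
```

The identity holds because $G(w(x)) + G(w(-x)) = 1$ for odd $w$ and symmetric $G$, so the skewing factor averages out against the symmetric $\varphi$.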
@@ -111,39 +111,121 @@ val d = TruncatedGaussian(mean, sigma, a, b)

### Skew Gaussian

#### Univariate

$\mathcal{X} \equiv \mathbb{R}$

$f(x) = \phi(\frac{x - \mu}{\sigma}) \Phi(\alpha (\frac{x-\mu}{\sigma}))$

$Z = \frac{1}{2}$

$\phi(.)$ and $\Phi(.)$ are the standard Gaussian density function and cumulative distribution function respectively.

#### Multivariate

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu}, {\Sigma}) \Phi(\mathbf{\alpha}^{\intercal} L^{-1}(\mathbf{x} - \mathbf{\mu}))$

$Z = \frac{1}{2}$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi(.)$ are the multivariate Gaussian density function and the standard univariate Gaussian cumulative distribution function respectively, and $L$ is the lower triangular Cholesky factor of $\Sigma$.

!!! note "Skewness parameter $\alpha$"
    The parameter $\alpha$ determines the skewness of the distribution, and its sign tells us in which direction the distribution has a fatter tail. In the univariate case $\alpha$ is a scalar, while in the multivariate case $\alpha \in \mathbb{R}^d$, so for the multivariate skew Gaussian distribution there is a skewness value for each dimension.

*Usage*:
```scala
//Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5
val d = SkewGaussian(a, mean, sigma)

//Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4)*1.5
val md = MultivariateSkewNormal(alpha, mu, cov)
```

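A standard property of the skew Gaussian (not stated above, but useful for sanity checks) is its closed-form mean, $\mu + \sigma\delta\sqrt{2/\pi}$ with $\delta = \alpha/\sqrt{1 + \alpha^2}$. The plain-Scala sketch below verifies it numerically for the normalized univariate density $2\,\phi(x)\,\Phi(\alpha x)$ with $\mu = 0$, $\sigma = 1$; the normal CDF uses the Abramowitz and Stegun 26.2.17 approximation.

```scala
// Normalized univariate skew-normal density: f(x) = 2 * phi(x) * Phi(alpha * x)
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

// Standard normal cdf (Abramowitz-Stegun 26.2.17 approximation)
def Phi(x: Double): Double = {
  val t = 1.0 / (1.0 + 0.2316419 * math.abs(x))
  val poly = t * (0.319381530 + t * (-0.356563782 +
    t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))))
  val p = 1.0 - phi(math.abs(x)) * poly
  if (x >= 0) p else 1.0 - p
}

val alpha = 2.0
def f(x: Double): Double = 2.0 * phi(x) * Phi(alpha * x)

// E[X] by trapezoidal integration over [-10, 10]
val h = 1e-3
val numericMean = h * (-10000 until 10000)
  .map(i => (i * h * f(i * h) + (i + 1) * h * f((i + 1) * h)) / 2).sum

// Closed form: delta * sqrt(2 / Pi), with delta = alpha / sqrt(1 + alpha^2)
val delta = alpha / math.sqrt(1 + alpha * alpha)
val closedForm = delta * math.sqrt(2 / math.Pi)
```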
### Extended Skew Gaussian

#### Univariate

The generalization of the univariate skew _Gaussian_ distribution.

$\mathcal{X} \equiv \mathbb{R}$

$f(x) = \phi(\frac{x - \mu}{\sigma}) \Phi(\alpha (\frac{x-\mu}{\sigma}) + \tau\sqrt{1 + \alpha^{2}})$

$Z = \Phi(\tau)$

$\phi(.)$ and $\Phi(.)$ are the standard Gaussian density function and cumulative distribution function respectively.

#### Multivariate

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu}, {\Sigma}) \Phi(\mathbf{\alpha}^{\intercal} L^{-1}(\mathbf{x} - \mathbf{\mu}) + \tau\sqrt{1 + \mathbf{\alpha}^{\intercal}\mathbf{\alpha}})$

$Z = \Phi(\tau)$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi(.)$ are the multivariate Gaussian density function and the standard univariate Gaussian cumulative distribution function respectively, and $L$ is the lower triangular Cholesky factor of $\Sigma$.

*Usage*:
```scala
//Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5
val c = 0.5
val d = ExtendedSkewGaussian(c, a, mean, sigma)

//Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4)*1.5
val tau = 0.2
val md = ExtendedMultivariateSkewNormal(tau, alpha, mu, cov)
```

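At $\tau = 0$ the normalizer is $\Phi(0) = 1/2$ and the extended skew Gaussian collapses to the plain skew Gaussian. This plain-Scala sketch confirms the reduction pointwise for the univariate densities (with $\mu = 0$, $\sigma = 1$, and the Abramowitz and Stegun normal-CDF approximation).

```scala
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

// Standard normal cdf (Abramowitz-Stegun 26.2.17 approximation)
def Phi(x: Double): Double = {
  val t = 1.0 / (1.0 + 0.2316419 * math.abs(x))
  val poly = t * (0.319381530 + t * (-0.356563782 +
    t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))))
  val p = 1.0 - phi(math.abs(x)) * poly
  if (x >= 0) p else 1.0 - p
}

// Normalized extended skew-normal:
// f(x) = phi(x) * Phi(alpha*x + tau*sqrt(1 + alpha^2)) / Phi(tau)
def esn(x: Double, alpha: Double, tau: Double): Double =
  phi(x) * Phi(alpha * x + tau * math.sqrt(1 + alpha * alpha)) / Phi(tau)

// Normalized skew-normal: f(x) = 2 * phi(x) * Phi(alpha * x)
def sn(x: Double, alpha: Double): Double = 2.0 * phi(x) * Phi(alpha * x)

// tau = 0  =>  the two densities agree pointwise
val gap = (-500 to 500).map(_ / 50.0)
  .map(x => math.abs(esn(x, 1.2, 0.0) - sn(x, 1.2))).max
```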
!!! warning "Confusing Nomenclature"
    The following distribution has a very similar form and name to the _extended skew Gaussian_ distribution shown above. But despite its deceptively similar formula, it is a very different object.

    We use the name MESN to denote the variant below, instead of its expanded form.

### MESN

The _Multivariate Extended Skew Normal_ (MESN) distribution was formulated by [Adcock and Shutes](https://www.sheffield.ac.uk/polopoly_fs/1.137010!/file/Adcock-Skew-normal-exponential-.pdf). It is given by

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu} + \mathbf{\alpha}\tau, {\Sigma} + \mathbf{\alpha}\mathbf{\alpha}^\intercal) \Phi\left(\frac{\mathbf{\alpha}^{\intercal} \Sigma^{-1}(\mathbf{x} - \mathbf{\mu}) + \tau}{\sqrt{1 + \mathbf{\alpha}^{\intercal}\Sigma^{-1}\mathbf{\alpha}}}\right)$

$Z = \Phi(\tau)$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi(.)$ are the multivariate Gaussian density function and the standard univariate Gaussian cumulative distribution function respectively.

*Usage*:
```scala
//Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5
val c = 0.5
val d = UESN(c, a, mean, sigma)

//Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4)*1.5
val tau = 0.2
val md = MESN(tau, alpha, mu, cov)
```

!!! seealso "_Extended Skew Gaussian Process_ (ESGP)"
    The MESN distribution is used to define the finite dimensional probabilities for the [ESGP](/core/core_esgp.md) process.
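In one dimension (what the usage snippet above calls `UESN`), the MESN formula can be checked numerically: integrating the unnormalized $f$ should recover exactly $Z = \Phi(\tau)$. Plain-Scala sketch with $\mu = 0$, $\Sigma = \sigma^2 = 1$, illustrative $\alpha$ and $\tau$, and the Abramowitz and Stegun normal-CDF approximation.

```scala
def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

// Gaussian pdf with mean m and variance v
def phiScaled(x: Double, m: Double, v: Double): Double =
  math.exp(-0.5 * (x - m) * (x - m) / v) / math.sqrt(2 * math.Pi * v)

// Standard normal cdf (Abramowitz-Stegun 26.2.17 approximation)
def Phi(x: Double): Double = {
  val t = 1.0 / (1.0 + 0.2316419 * math.abs(x))
  val poly = t * (0.319381530 + t * (-0.356563782 +
    t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))))
  val p = 1.0 - phi(math.abs(x)) * poly
  if (x >= 0) p else 1.0 - p
}

// Unnormalized 1-d MESN density with mu = 0, sigma^2 = 1:
// f(x) = phi(x; alpha*tau, 1 + alpha^2) * Phi((alpha*x + tau) / sqrt(1 + alpha^2))
val alpha = 1.2
val tau = 0.5
def f(x: Double): Double =
  phiScaled(x, alpha * tau, 1 + alpha * alpha) *
    Phi((alpha * x + tau) / math.sqrt(1 + alpha * alpha))

// Total mass should equal the normalizer Z = Phi(tau)
val h = 1e-3
val mass = h * (-20000 until 20000).map(i => (f(i * h) + f((i + 1) * h)) / 2).sum
```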

mkdocs.yml

Lines changed: 1 addition & 0 deletions

```diff
@@ -37,6 +37,7 @@ pages:
   - Generalized Least Squares: 'core/core_gls.md'
   - Stochastic Processes:
     - Gaussian Processes: 'core/core_gp.md'
+    - Extended Skew Gaussian Processes: 'core/core_esgp.md'
     - Students T Processes: 'core/core_stp.md'
   - Neural Networks:
     - Feed Forward Networks: 'core/core_ffn_new.md'
```
