Commit 593b6b5

committed
more docs
1 parent 9847a49 commit 593b6b5

1 file changed: +27 -9 lines changed

docs/src/affine.md

Lines changed: 27 additions & 9 deletions
@@ -33,25 +33,43 @@ Then given a (multivariate) standard normal ``z``, the covariance matrix of ``σ
3333
𝕍[σ z + μ] = Σ
3434
```
3535

36-
This is similar to the one dimensional case where
36+
Comparing with the one-dimensional case, where
3737

3838
```math
39-
𝕍[σ z + μ] = σ² ,
39+
𝕍[σ z + μ] = σ²
4040
```
4141

42-
and so the lower Cholesky factor of the covariance generalizes the concept of standard deviation, justifying the notation.
42+
shows that the lower Cholesky factor of the covariance generalizes the concept of standard deviation, justifying the notation.
4343

44-
## `Affine` and `AffineTransform`
44+
## The "Cholesky precision" parameterization
4545

46+
The ``(μ,σ)`` parameterization is especially convenient for random sampling. Any `z ~ Normal()` determines an `x ~ Normal(μ,σ)` through
4647

48+
```math
49+
x = σ z + μ
50+
```
51+
52+
On the other hand, the log-density computation is not quite so simple. Starting with an ``x``, we need to find ``z`` using
53+
54+
```math
55+
z = σ⁻¹ (x - μ)
56+
```
57+
58+
so the log-density is
59+
60+
```julia
61+
logdensity(d::Normal{(:μ,:σ)}, x) = logdensity(Normal(), d.σ \ (x - d.μ)) - logdet(d.σ)
62+
```
63+
64+
Here the `- logdet(σ)` is the "log absolute Jacobian", required to account for the stretching of the space.
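
As a sanity check on the Jacobian term, here is a minimal sketch that uses only the `LinearAlgebra` standard library. The helpers `stdnormal_logpdf` and `logpdf_σ` are hypothetical names introduced for illustration and are not part of the package API; the numbers are arbitrary.

```julia
using LinearAlgebra

# Hypothetical helper: log-density of a standard normal vector z
stdnormal_logpdf(z) = -0.5 * (length(z) * log(2π) + dot(z, z))

# (μ, σ) parameterization: standardize with a triangular solve,
# then subtract the log absolute Jacobian logdet(σ)
logpdf_σ(μ, σ, x) = stdnormal_logpdf(σ \ (x - μ)) - logdet(σ)

Σ = [4.0 2.0; 2.0 3.0]   # arbitrary positive definite covariance
μ = [1.0, -1.0]
x = [0.5, 0.25]
σ = cholesky(Σ).L        # lower Cholesky factor, Σ = σ σᵀ

# Textbook multivariate normal log-density written directly in terms of Σ
reference = -0.5 * (length(x) * log(2π) + logdet(Σ) + dot(x - μ, Σ \ (x - μ)))

@assert logpdf_σ(μ, σ, x) ≈ reference
```

The two agree because `logdet(Σ) == 2 * logdet(σ)` and the quadratic form `(x - μ)ᵀ Σ⁻¹ (x - μ)` equals the squared norm of `σ \ (x - μ)`.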
65+
66+
The above requires solving a linear system, which adds some overhead. Even with the convenience of a lower triangular system, it's still not quite as efficient as a multiplication.
4767

48-
unif = ∫(x -> 0<x<1, Lebesgue(ℝ))
49-
f = AffineTransform((μ=3,σ=2))
50-
g = AffineTransform((μ=3,ω=2))
68+
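In addition to the covariance ``Σ``, it's also common to parameterize a multivariate normal by its _precision matrix_, ``Ω = Σ⁻¹``. Similarly to our use of ``σ``, we'll use ``ω`` for a lower triangular factor of ``Ω``. For the log-density to reduce to the same ``z = σ⁻¹ (x - μ)`` as before, we take ``ω = σ⁻¹``, so that ``Ω = ωᵀ ω``.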
5169

52-
So for example, the implementation of `StudentT(ν=1, μ=3, σ=4)` is equivalent to
70+
This allows a more efficient log-density,
5371

5472
```julia
55-
StudentT(nt::NamedTuple{(:ν,:μ,:σ)}) = Affine((μ=nt.μ, σ=nt.σ), StudentT((ν=1)))
73+
logdensity(d::Normal{(:μ,:ω)}, x) = logdensity(Normal(), d.ω * (x - d.μ)) + logdet(d.ω)
5674
```
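
The following sketch checks numerically that the two parameterizations give the same log-density. It assumes `ω` is the inverse of `σ`, which is the reading consistent with `z = σ⁻¹ (x - μ)` and implies `Ω = ωᵀ ω`. The helper names are hypothetical and only `LinearAlgebra` is used.

```julia
using LinearAlgebra

# Hypothetical helper: log-density of a standard normal vector z
stdnormal_logpdf(z) = -0.5 * (length(z) * log(2π) + dot(z, z))

logpdf_σ(μ, σ, x) = stdnormal_logpdf(σ \ (x - μ)) - logdet(σ)   # triangular solve
logpdf_ω(μ, ω, x) = stdnormal_logpdf(ω * (x - μ)) + logdet(ω)   # multiplication only

Σ = [4.0 2.0; 2.0 3.0]   # arbitrary positive definite covariance
μ = [1.0, -1.0]
x = [0.5, 0.25]

σ = cholesky(Σ).L        # lower Cholesky factor, Σ = σ σᵀ
ω = inv(σ)               # assumption: ω = σ⁻¹, which is again lower triangular

@assert ω' * ω ≈ inv(Σ)                          # ω factors the precision matrix
@assert logpdf_ω(μ, ω, x) ≈ logpdf_σ(μ, σ, x)    # same density, no linear solve
```

The sign flip on the log-determinant follows from `logdet(ω) == -logdet(σ)` under this assumption.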
5775
