@@ -291,13 +291,13 @@ frametitle("Monte-Carlo sampling")
 md """
 This can be approximated using Monte-Carlo given ``L`` samples ``\epsilon_1, \ldots, \epsilon_L`` from the distribution ``\mathcal{N}(0, I)`` as
 ```math
-\mathbb{E}[\log(f_{X|Z}(x|Y))]] \approx \frac{1}{L} \sum_{i=1}^L \log(f_{X|Z}(x|E_\mu(x) + \epsilon_i \odot E_\sigma(x))).
+\mathbb{E}[\log(f_{X|Z}(x|Y))] \approx \frac{1}{L} \sum_{i=1}^L \log(f_{X|Z}(x|E_\mu(x) + \epsilon_i \odot E_\sigma(x))).
 ```
 In the simpler case where ``D_\sigma(z) = \mathbf{1}``, we recognize the classical L2 norm:
 ```math
 \begin{align}
-\mathbb{E}[\log(f_{X|Z}(x|Y))]]
-& \approx -\frac{\log(2\pi)}{2}+\frac{1}{L}\sum_{i=1}^L \|x - D_\mu(E_\mu(x) + \epsilon_i\|_2^2.
+\mathbb{E}[\log(f_{X|Z}(x|Y))]
+& \approx -\frac{\log(2\pi)}{2}+\frac{1}{L}\sum_{i=1}^L \|x - D_\mu(E_\mu(x) + \epsilon_i \odot E_\sigma(x))\|_2^2.
 \end{align}
 ```
 """
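The Monte-Carlo sum in this cell (in the simple case ``D_\sigma(z) = \mathbf{1}``) can be sketched in Julia as follows. The maps `E_mu`, `E_sigma`, and `D_mu` below are illustrative stand-ins, not the notebook's trained networks:

```julia
# Illustrative stand-ins for the encoder/decoder maps (NOT the
# notebook's networks): identity-like functions so the sketch runs.
E_mu(x) = copy(x)                      # encoder mean
E_sigma(x) = fill(0.1, length(x))      # encoder standard deviation
D_mu(z) = z                            # decoder mean

# Monte-Carlo average of ‖x - D_mu(E_mu(x) + ε ⊙ E_sigma(x))‖₂²
# over L samples ε ~ N(0, I), i.e. the sum in the estimator above.
function mc_l2(x; L = 100)
    acc = 0.0
    for _ in 1:L
        eps = randn(length(x))              # ε ~ N(0, I)
        z = E_mu(x) .+ eps .* E_sigma(x)    # reparameterization trick
        acc += sum(abs2, x .- D_mu(z))      # squared L2 residual
    end
    acc / L
end
```

Each sample reuses the same data point `x`; only the Gaussian noise `eps` varies, which is what makes the estimator differentiable with respect to the encoder and decoder parameters.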
@@ -308,8 +308,8 @@ frametitle("Variational AutoEncoders (VAEs)")
 # ╔═╡ 23f3de75-0617-4232-bb71-bd9f3e355a1e
 md """
 * We want to learn the distribution of our data represented by the random variable ``X``.
-* The encoder maps a data point ``x`` to a Gaussian distribution ``Y \sim \mathcal{N}(E_\mu(x), E_{\Sigma}(x))`` over the latent space
-* The decoder maps a latent variable ``z \sim Z`` to a the Gaussian distribution ``\mathcal{N}(D_\mu(z), D_\sigma(Z))``
+* The encoder maps a data point ``x`` to a Gaussian distribution ``Y \sim \mathcal{N}(E_\mu(x), E_{\sigma}(x))`` over the latent space
+* The decoder maps a latent variable ``z \sim Z`` to the Gaussian distribution ``\mathcal{N}(D_\mu(z), D_\sigma(z))``
 
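The encoder described in the bullets above can be sketched as a map from a data point to the parameters of a latent Gaussian, plus a reparameterized sample. The linear parameterization and the names `Encoder`, `encode`, `sample_latent` are hypothetical, chosen only to make the sketch self-contained:

```julia
# Hypothetical linear encoder producing the parameters of the latent
# Gaussian N(E_mu(x), Diag(E_sigma(x))²). A real VAE would use neural
# networks here; a single linear map keeps the sketch minimal.
struct Encoder
    W_mu::Matrix{Float64}        # weights for the mean head
    W_logsigma::Matrix{Float64}  # weights for the log-std head
end

# Return (mean, std); exponentiating the log-std keeps std positive.
encode(e::Encoder, x) = (e.W_mu * x, exp.(e.W_logsigma * x))

# Reparameterized sample y = mu + ε ⊙ sigma with ε ~ N(0, I).
function sample_latent(e::Encoder, x)
    mu, sigma = encode(e, x)
    mu .+ randn(length(mu)) .* sigma
end
```

Predicting the log of the standard deviation is a common design choice: it makes the positivity constraint on ``E_\sigma(x)`` automatic and keeps gradients well behaved.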
 The Maximum Likelihood Estimator (MLE) maximizes the following sum over our datapoints ``x`` with its ELBO:
 ```math
@@ -425,10 +425,10 @@ We have (see $(cite("kingma2013AutoEncoding", "Appendix B")) for a proof):
 For the second part of the ELBO, we have
 ```math
 \begin{align}
-& \mathbb{E}[\log(f_{X|Z}(x|Y))]] \\
-& = \mathbb{E}[\log(f_{X|Z}(x|E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))]] \\
-& = \mathbb{E}[\log(f_{\mathcal{E}_1}(\text{Diag}(D_\sigma(E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))^{-1} (x - D_\mu(E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))))]] \\
-& = -\frac{\log(2\pi)}{2}+\mathbb{E}[\|\text{Diag}(D_\sigma(E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))^{-1} (x - D_\mu(E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))\|_2^2]] .
+& \mathbb{E}[\log(f_{X|Z}(x|Y))] \\
+& = \mathbb{E}[\log(f_{X|Z}(x|E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))] \\
+& = \mathbb{E}[\log(f_{\mathcal{E}_1}(\text{Diag}(D_\sigma(E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))^{-1} (x - D_\mu(E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))))] \\
+& = -\frac{\log(2\pi)}{2}+\mathbb{E}[\|\text{Diag}(D_\sigma(E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))^{-1} (x - D_\mu(E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)))\|_2^2].
 \end{align}
 ```
 """
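The first substitution in the derivation above relies on the reparameterization ``Y = E_\mu(x) + \mathcal{E}_2 \odot E_\sigma(x)`` having distribution ``\mathcal{N}(E_\mu(x), \text{Diag}(E_\sigma(x))^2)``. A quick empirical check of that identity, with illustrative values standing in for ``E_\mu(x)`` and ``E_\sigma(x)``:

```julia
using Statistics

# Illustrative stand-ins for the encoder outputs at some fixed x.
mu = [1.0, -2.0]     # plays the role of E_mu(x)
sigma = [0.5, 2.0]   # plays the role of E_sigma(x)

# Draw n reparameterized samples: each column is mu .+ eps .* sigma
# with eps ~ N(0, I). Broadcasting pairs each column of the noise
# matrix with the shared mean and std vectors.
n = 10_000
S = mu .+ randn(2, n) .* sigma

m = vec(mean(S, dims = 2))   # componentwise sample mean, ≈ mu
s = vec(std(S, dims = 2))    # componentwise sample std, ≈ sigma
```

The sample mean and standard deviation converge to `mu` and `sigma` componentwise, confirming that the noise variable, not the latent sample, can carry the randomness when differentiating through the encoder.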