Commit d092e26

latex debug lab2 part2 solutions
1 parent efcbcc8 commit d092e26

1 file changed (+26, -20)

lab2/solutions/Part2_Debiasing_Solution.ipynb

Lines changed: 26 additions & 20 deletions
@@ -477,21 +477,28 @@
 "In practice, how can we train a VAE? In learning the latent space, we constrain the means and standard deviations to approximately follow a unit Gaussian. Recall that these are learned parameters, and therefore must factor into the loss computation, and that the decoder portion of the VAE is using these parameters to output a reconstruction that should closely match the input image, which also must factor into the loss. What this means is that we'll have two terms in our VAE loss function:\n",
 "\n",
 "1. **Latent loss ($L_{KL}$)**: measures how closely the learned latent variables match a unit Gaussian and is defined by the Kullback-Leibler (KL) divergence.\n",
-"2. **Reconstruction loss ($L_{x}{(x,\\hat{x})}$)**: measures how accurately the reconstructed outputs match the input and is given by the $L^1$ norm of the input image and its reconstructed output. \n",
-"\n",
-"The equations for both of these losses are provided below:\n",
-"\n",
-"$$ L_{KL}(\\mu, \\sigma) = \\frac{1}{2}\\sum\\limits_{j=0}^{k-1}\\small{(\\sigma_j + \\mu_j^2 - 1 - \\log{\\sigma_j})} $$\n",
-"\n",
-"$$ L_{x}{(x,\\hat{x})} = ||x-\\hat{x}||_1 $$ \n",
-"\n",
-"Thus for the VAE loss we have: \n",
-"\n",
-"$$ L_{VAE} = c\\cdot L_{KL} + L_{x}{(x,\\hat{x})} $$\n",
-"\n",
-"where $c$ is a weighting coefficient used for regularization. \n",
-"\n",
-"Now we're ready to define our VAE loss function:"
+"2. **Reconstruction loss ($L_{x}{(x,\\hat{x})}$)**: measures how accurately the reconstructed outputs match the input and is given by the $L^1$ norm of the input image and its reconstructed output."
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {
+"id": "Ux3jK2wc153s"
+},
+"source": [
+"The equation for the latent loss is provided by:\r\n",
+"\r\n",
+"$$L_{KL}(\\mu, \\sigma) = \\frac{1}{2}\\sum_{j=0}^{k-1} (\\sigma_j + \\mu_j^2 - 1 - \\log{\\sigma_j})$$\r\n",
+"\r\n",
+"The equation for the reconstruction loss is provided by:\r\n",
+"\r\n",
+"$$L_{x}{(x,\\hat{x})} = ||x-\\hat{x}||_1$$\r\n",
+"\r\n",
+"Thus for the VAE loss we have:\r\n",
+"\r\n",
+"$$L_{VAE} = c\\cdot L_{KL} + L_{x}{(x,\\hat{x})}$$\r\n",
+"\r\n",
+"where $c$ is a weighting coefficient used for regularization. Now we're ready to define our VAE loss function:"
 ]
 },
 {
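For reference, the VAE loss described in the markdown cells above can be written out roughly as follows. This is a minimal TensorFlow sketch rather than the notebook's actual solution cell; the argument names (`x`, `x_recon`, `mu`, `logsigma`) and the default weighting coefficient `kl_weight` are illustrative assumptions.

```python
import tensorflow as tf

def vae_loss_function(x, x_recon, mu, logsigma, kl_weight=0.0005):
    # Latent loss: L_KL = 1/2 * sum_j (sigma_j + mu_j^2 - 1 - log sigma_j),
    # with sigma parameterized through its log for numerical stability.
    latent_loss = 0.5 * tf.reduce_sum(
        tf.exp(logsigma) + tf.square(mu) - 1.0 - logsigma, axis=1)

    # Reconstruction loss: L_x(x, x_hat) = ||x - x_hat||_1, averaged over pixels.
    reconstruction_loss = tf.reduce_mean(tf.abs(x - x_recon), axis=(1, 2, 3))

    # L_VAE = c * L_KL + L_x, computed per example in the batch.
    return kl_weight * latent_loss + reconstruction_loss
```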
@@ -551,9 +558,9 @@
 "source": [
 "### Understanding VAEs: reparameterization \n",
 "\n",
-"As you may recall from lecture, VAEs use a \"reparameterization trick\" for sampling learned latent variables. Instead of the VAE encoder generating a single vector of real numbers for each latent variable, it generates a vector of means and a vector of standard deviations that are constrained to roughly follow Gaussian distributions. We then sample from the standard deviations and add back the mean to output this as our sampled latent vector. Formalizing this for a latent variable $z$ where we sample $\\epsilon \\sim \\mathcal{N}(0,(I))$ we have: \n",
+"As you may recall from lecture, VAEs use a \"reparameterization trick\" for sampling learned latent variables. Instead of the VAE encoder generating a single vector of real numbers for each latent variable, it generates a vector of means and a vector of standard deviations that are constrained to roughly follow Gaussian distributions. We then sample from the standard deviations and add back the mean to output this as our sampled latent vector. Formalizing this for a latent variable $z$ where we sample $\\epsilon \\sim \\mathcal{N}(0,(I))$ we have:\n",
 "\n",
-"$$ z = \\mathbb{\\mu} + e^{\\left(\\frac{1}{2} \\cdot \\log{\\Sigma}\\right)}\\circ \\epsilon $$\n",
+"$$z = \\mu + e^{\\left(\\frac{1}{2} \\cdot \\log{\\Sigma}\\right)}\\circ \\epsilon$$\n",
 "\n",
 "where $\\mu$ is the mean and $\\Sigma$ is the covariance matrix. This is useful because it will let us neatly define the loss function for the VAE, generate randomly sampled latent variables, achieve improved network generalization, **and** make our complete VAE network differentiable so that it can be trained via backpropagation. Quite powerful!\n",
 "\n",
@@ -640,8 +647,7 @@
 "\n",
 "$$L_{total} = L_y(y,\\hat{y}) + \\mathcal{I}_f(y)\\Big[L_{VAE}\\Big]$$\n",
 "\n",
-"Let's write a function to define the DB-VAE loss function:\n",
-"\n"
+"Let's write a function to define the DB-VAE loss function:\n"
 ]
 },
 {
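The total DB-VAE loss above combines a classification term with the VAE loss, gated by the face indicator. Below is a rough sketch of one way to express it, reusing the `vae_loss_function` sketched earlier and assuming `y` (float labels, 1 for faces) and `y_logit` are 1-D tensors of shape `[batch_size]`; none of these names come from the diff itself.

```python
import tensorflow as tf

def debiasing_loss_function(x, x_pred, y, y_logit, mu, logsigma, kl_weight=0.0005):
    # L_VAE term, shape [batch_size], reusing the VAE loss sketched earlier.
    vae_loss = vae_loss_function(x, x_pred, mu, logsigma, kl_weight)

    # Classification term L_y(y, y_hat): sigmoid cross-entropy on the face logit.
    classification_loss = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=y, logits=y_logit)

    # Indicator I_f(y): 1.0 for face images, 0.0 otherwise, so the VAE loss
    # is applied only to face examples.
    face_indicator = tf.cast(tf.equal(y, 1.0), tf.float32)

    # L_total = L_y + I_f(y) * L_VAE, averaged over the batch.
    return tf.reduce_mean(classification_loss + face_indicator * vae_loss)
```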
@@ -1087,7 +1093,7 @@
 "\n",
 "Hopefully this lab has shed some light on a few concepts, from vision based tasks, to VAEs, to algorithmic bias. We like to think it has, but we're biased ;). \n",
 "\n",
-"![Faces](https://media1.tenor.com/images/44e1f590924eca94fe86067a4cf44c72/tenor.gif?itemid=3394328)"
+"<img src=\"https://i.ibb.co/BjLSRMM/ezgif-2-253dfd3f9097.gif\" />"
 ]
 }
 ]
