Commit d092e26

latex debug lab2 part2 solutions
1 parent efcbcc8 commit d092e26

1 file changed (+26, -20)

lab2/solutions/Part2_Debiasing_Solution.ipynb

Lines changed: 26 additions & 20 deletions
@@ -477,21 +477,28 @@
 "In practice, how can we train a VAE? In learning the latent space, we constrain the means and standard deviations to approximately follow a unit Gaussian. Recall that these are learned parameters, and therefore must factor into the loss computation, and that the decoder portion of the VAE is using these parameters to output a reconstruction that should closely match the input image, which also must factor into the loss. What this means is that we'll have two terms in our VAE loss function:\n",
 "\n",
 "1. **Latent loss ($L_{KL}$)**: measures how closely the learned latent variables match a unit Gaussian and is defined by the Kullback-Leibler (KL) divergence.\n",
-"2. **Reconstruction loss ($L_{x}{(x,\\hat{x})}$)**: measures how accurately the reconstructed outputs match the input and is given by the $L^1$ norm of the input image and its reconstructed output. \n",
-"\n",
-"The equations for both of these losses are provided below:\n",
-"\n",
-"$$ L_{KL}(\\mu, \\sigma) = \\frac{1}{2}\\sum\\limits_{j=0}^{k-1}\\small{(\\sigma_j + \\mu_j^2 - 1 - \\log{\\sigma_j})} $$\n",
-"\n",
-"$$ L_{x}{(x,\\hat{x})} = ||x-\\hat{x}||_1 $$ \n",
-"\n",
-"Thus for the VAE loss we have: \n",
-"\n",
-"$$ L_{VAE} = c\\cdot L_{KL} + L_{x}{(x,\\hat{x})} $$\n",
-"\n",
-"where $c$ is a weighting coefficient used for regularization. \n",
-"\n",
-"Now we're ready to define our VAE loss function:"
+"2. **Reconstruction loss ($L_{x}{(x,\\hat{x})}$)**: measures how accurately the reconstructed outputs match the input and is given by the $L^1$ norm of the input image and its reconstructed output."
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {
+"id": "Ux3jK2wc153s"
+},
+"source": [
+"The equation for the latent loss is provided by:\r\n",
+"\r\n",
+"$$L_{KL}(\\mu, \\sigma) = \\frac{1}{2}\\sum_{j=0}^{k-1} (\\sigma_j + \\mu_j^2 - 1 - \\log{\\sigma_j})$$\r\n",
+"\r\n",
+"The equation for the reconstruction loss is provided by:\r\n",
+"\r\n",
+"$$L_{x}{(x,\\hat{x})} = ||x-\\hat{x}||_1$$\r\n",
+"\r\n",
+"Thus for the VAE loss we have:\r\n",
+"\r\n",
+"$$L_{VAE} = c\\cdot L_{KL} + L_{x}{(x,\\hat{x})}$$\r\n",
+"\r\n",
+"where $c$ is a weighting coefficient used for regularization. Now we're ready to define our VAE loss function:"
 ]
 },
 {
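For reference, the VAE loss described in the markdown cells above can be written out roughly as follows. This is a minimal TensorFlow sketch rather than the notebook's actual solution cell; the argument names (`x`, `x_recon`, `mu`, `logsigma`) and the default weighting coefficient `kl_weight` are illustrative assumptions.

```python
import tensorflow as tf

def vae_loss_function(x, x_recon, mu, logsigma, kl_weight=0.0005):
    # Latent loss: L_KL = 1/2 * sum_j (sigma_j + mu_j^2 - 1 - log sigma_j),
    # with sigma parameterized through its log for numerical stability.
    latent_loss = 0.5 * tf.reduce_sum(
        tf.exp(logsigma) + tf.square(mu) - 1.0 - logsigma, axis=1)

    # Reconstruction loss: L_x(x, x_hat) = ||x - x_hat||_1, averaged over pixels.
    reconstruction_loss = tf.reduce_mean(tf.abs(x - x_recon), axis=(1, 2, 3))

    # L_VAE = c * L_KL + L_x, computed per example in the batch.
    return kl_weight * latent_loss + reconstruction_loss
```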
@@ -551,9 +558,9 @@
 "source": [
 "### Understanding VAEs: reparameterization \n",
 "\n",
-"As you may recall from lecture, VAEs use a \"reparameterization trick\" for sampling learned latent variables. Instead of the VAE encoder generating a single vector of real numbers for each latent variable, it generates a vector of means and a vector of standard deviations that are constrained to roughly follow Gaussian distributions. We then sample from the standard deviations and add back the mean to output this as our sampled latent vector. Formalizing this for a latent variable $z$ where we sample $\\epsilon \\sim \\mathcal{N}(0,(I))$ we have: \n",
+"As you may recall from lecture, VAEs use a \"reparameterization trick\" for sampling learned latent variables. Instead of the VAE encoder generating a single vector of real numbers for each latent variable, it generates a vector of means and a vector of standard deviations that are constrained to roughly follow Gaussian distributions. We then sample from the standard deviations and add back the mean to output this as our sampled latent vector. Formalizing this for a latent variable $z$ where we sample $\\epsilon \\sim \\mathcal{N}(0,(I))$ we have:\n",
 "\n",
-"$$ z = \\mathbb{\\mu} + e^{\\left(\\frac{1}{2} \\cdot \\log{\\Sigma}\\right)}\\circ \\epsilon $$\n",
+"$$z = \\mu + e^{\\left(\\frac{1}{2} \\cdot \\log{\\Sigma}\\right)}\\circ \\epsilon$$\n",
 "\n",
 "where $\\mu$ is the mean and $\\Sigma$ is the covariance matrix. This is useful because it will let us neatly define the loss function for the VAE, generate randomly sampled latent variables, achieve improved network generalization, **and** make our complete VAE network differentiable so that it can be trained via backpropagation. Quite powerful!\n",
 "\n",
@@ -640,8 +647,7 @@
 "\n",
 "$$L_{total} = L_y(y,\\hat{y}) + \\mathcal{I}_f(y)\\Big[L_{VAE}\\Big]$$\n",
 "\n",
-"Let's write a function to define the DB-VAE loss function:\n",
-"\n"
+"Let's write a function to define the DB-VAE loss function:\n"
 ]
 },
 {
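The total DB-VAE loss above combines a classification term with the VAE loss, gated by the face indicator. Below is a rough sketch of one way to express it, reusing the `vae_loss_function` sketched earlier and assuming `y` (float labels, 1 for faces) and `y_logit` are 1-D tensors of shape `[batch_size]`; none of these names come from the diff itself.

```python
import tensorflow as tf

def debiasing_loss_function(x, x_pred, y, y_logit, mu, logsigma, kl_weight=0.0005):
    # L_VAE term, shape [batch_size], reusing the VAE loss sketched earlier.
    vae_loss = vae_loss_function(x, x_pred, mu, logsigma, kl_weight)

    # Classification term L_y(y, y_hat): sigmoid cross-entropy on the face logit.
    classification_loss = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=y, logits=y_logit)

    # Indicator I_f(y): 1.0 for face images, 0.0 otherwise, so the VAE loss
    # is applied only to face examples.
    face_indicator = tf.cast(tf.equal(y, 1.0), tf.float32)

    # L_total = L_y + I_f(y) * L_VAE, averaged over the batch.
    return tf.reduce_mean(classification_loss + face_indicator * vae_loss)
```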
@@ -1087,7 +1093,7 @@
 "\n",
 "Hopefully this lab has shed some light on a few concepts, from vision based tasks, to VAEs, to algorithmic bias. We like to think it has, but we're biased ;). \n",
 "\n",
-"![Faces](https://media1.tenor.com/images/44e1f590924eca94fe86067a4cf44c72/tenor.gif?itemid=3394328)"
+"<img src=\"https://i.ibb.co/BjLSRMM/ezgif-2-253dfd3f9097.gif\" />"
 ]
 }
 ]
