|
478 | 478 | "In practice, how can we train a VAE? In learning the latent space, we constrain the means and standard deviations to approximately follow a unit Gaussian. Recall that these are learned parameters, and therefore must factor into the loss computation, and that the decoder portion of the VAE is using these parameters to output a reconstruction that should closely match the input image, which also must factor into the loss. What this means is that we'll have two terms in our VAE loss function:\n",
|
479 | 479 | "\n",
|
480 | 480 | "1. **Latent loss ($L_{KL}$)**: measures how closely the learned latent variables match a unit Gaussian and is defined by the Kullback-Leibler (KL) divergence.\n",
|
481 |
| - "2. **Reconstruction loss ($L_{x}{(x,\\hat{x})}$)**: measures how accurately the reconstructed outputs match the input and is given by the $L^1$ norm of the input image and its reconstructed output. \n", |
482 |
| - "\n", |
483 |
| - "The equations for both of these losses are provided below:\n", |
484 |
| - "\n", |
485 |
| - "\\begin{equation*}\n", |
486 |
| - "L_{KL}(\\mu, \\sigma) = \\frac{1}{2}\\sum\\limits_{j=0}^{k-1}\\small{(\\sigma_j + \\mu_j^2 - 1 - \\log{\\sigma_j})}\n", |
487 |
| - "\\end{equation*}\n", |
488 |
| - "\n", |
489 |
| - "\\begin{equation*}\n", |
490 |
| - "L_{x}{(x,\\hat{x})} = ||x-\\hat{x}||_1\n", |
491 |
| - "\\end{equation*}\n", |
492 |
| - "\n", |
493 |
| - "Thus for the VAE loss we have: \n", |
494 |
| - "\n", |
495 |
| - "\\begin{equation*}\n", |
496 |
| - "L_{VAE} = c\\cdot L_{KL} + L_{x}{(x,\\hat{x})}\n", |
497 |
| - "\\end{equation*}\n", |
498 |
| - "\n", |
499 |
| - "where $c$ is a weighting coefficient used for regularization. \n", |
500 |
| - "\n", |
501 |
| - "Now we're ready to define our VAE loss function:" |
| 481 | + "2. **Reconstruction loss ($L_{x}{(x,\\hat{x})}$)**: measures how accurately the reconstructed outputs match the input and is given by the $L^1$ norm of the input image and its reconstructed output." |
| 482 | + ] |
| 483 | + }, |
| 484 | + { |
| 485 | + "cell_type": "markdown", |
| 486 | + "metadata": { |
| 487 | + "id": "KV1khmq5wt-K" |
| 488 | + }, |
| 489 | + "source": [ |
| 490 | + "The equation for the latent loss is provided by:\r\n", |
| 491 | + "\r\n", |
| 492 | + "$$L_{KL}(\\mu, \\sigma) = \\frac{1}{2}\\sum_{j=0}^{k-1} (\\sigma_j + \\mu_j^2 - 1 - \\log{\\sigma_j})$$\r\n", |
| 493 | + "\r\n", |
| 494 | + "The equation for the reconstruction loss is provided by:\r\n", |
| 495 | + "\r\n", |
| 496 | + "$$L_{x}{(x,\\hat{x})} = ||x-\\hat{x}||_1$$\r\n", |
| 497 | + "\r\n", |
| 498 | + "Thus for the VAE loss we have:\r\n", |
| 499 | + "\r\n", |
| 500 | + "$$L_{VAE} = c\\cdot L_{KL} + L_{x}{(x,\\hat{x})}$$\r\n", |
| 501 | + "\r\n", |
| 502 | + "where $c$ is a weighting coefficient used for regularization. Now we're ready to define our VAE loss function:" |
502 | 503 | ]
|
503 | 504 | },
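As a concrete illustration of how these two loss terms might be combined, here is a minimal TensorFlow sketch of a VAE loss. This is not the lab's reference implementation: the function name `vae_loss_function`, the argument `logsigma` (assumed to hold the log of the latent variance, matching the reparameterization trick discussed below), and the default `kl_weight` are all illustrative placeholders.

```python
import tensorflow as tf

def vae_loss_function(x, x_recon, mu, logsigma, kl_weight=0.0005):
    # Latent loss L_KL: per-example sum over the k latent dimensions,
    # following the formula above with logsigma = log(sigma).
    latent_loss = 0.5 * tf.reduce_sum(
        tf.exp(logsigma) + tf.square(mu) - 1.0 - logsigma, axis=1)

    # Reconstruction loss L_x: mean absolute (L1) difference between each
    # input image and its reconstruction.
    reconstruction_loss = tf.reduce_mean(tf.abs(x - x_recon), axis=(1, 2, 3))

    # Total VAE loss: weighted sum of the two terms, averaged over the batch.
    return tf.reduce_mean(kl_weight * latent_loss + reconstruction_loss)
```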
|
504 | 505 | {
|
|
557 | 558 | "\n",
|
558 | 559 | "As you may recall from lecture, VAEs use a \"reparameterization trick\" for sampling learned latent variables. Instead of the VAE encoder generating a single vector of real numbers for each latent variable, it generates a vector of means and a vector of standard deviations that are constrained to roughly follow Gaussian distributions. We then sample from the standard deviations and add back the mean to output this as our sampled latent vector. Formalizing this for a latent variable $z$ where we sample $\\epsilon \\sim \\mathcal{N}(0,(I))$ we have: \n",
|
559 | 560 | "\n",
|
560 |
| - "\\begin{equation}\n", |
561 |
| - "z = \\mathbb{\\mu} + e^{\\left(\\frac{1}{2} \\cdot \\log{\\Sigma}\\right)}\\circ \\epsilon\n", |
562 |
| - "\\end{equation}\n", |
| 561 | + "$$z = \\mu + e^{\\left(\\frac{1}{2} \\cdot \\log{\\Sigma}\\right)}\\circ \\epsilon$$\n", |
563 | 562 | "\n",
|
564 | 563 | "where $\\mu$ is the mean and $\\Sigma$ is the covariance matrix. This is useful because it will let us neatly define the loss function for the VAE, generate randomly sampled latent variables, achieve improved network generalization, **and** make our complete VAE network differentiable so that it can be trained via backpropagation. Quite powerful!\n",
|
565 | 564 | "\n",
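As a rough sketch of the reparameterization trick in code, assuming the encoder outputs `z_mean` and `z_logsigma` of shape `(batch_size, latent_dim)` (the names here are placeholders and may differ from the lab's code):

```python
import tensorflow as tf

def sampling(z_mean, z_logsigma):
    # Draw epsilon ~ N(0, I) with the same shape as the latent mean vector.
    epsilon = tf.random.normal(shape=tf.shape(z_mean))
    # Scale by the standard deviation exp(0.5 * log Sigma) and shift by the mean,
    # exactly as in the equation above.
    return z_mean + tf.exp(0.5 * z_logsigma) * epsilon
```

Because the randomness is confined to `epsilon`, gradients can flow through `z_mean` and `z_logsigma`, which is what makes the full network trainable by backpropagation.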
|
|
643 | 642 | "\n",
|
644 | 643 | "We can write a single expression for the loss by defining an indicator variable $\\mathcal{I}_f$which reflects which training data are images of faces ($\\mathcal{I}_f(y) = 1$ ) and which are images of non-faces ($\\mathcal{I}_f(y) = 0$). Using this, we obtain:\n",
|
645 | 644 | "\n",
|
646 |
| - "\\begin{equation}\n", |
647 |
| - "L_{total} = L_y(y,\\hat{y}) + \\mathcal{I}_f(y)\\Big[L_{VAE}\\Big]\n", |
648 |
| - "\\end{equation}\n", |
| 645 | + "$$L_{total} = L_y(y,\\hat{y}) + \\mathcal{I}_f(y)\\Big[L_{VAE}\\Big]$$\n", |
649 | 646 | "\n",
|
650 | 647 | "Let's write a function to define the DB-VAE loss function:\n",
|
651 | 648 | "\n"
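For illustration, a minimal TensorFlow sketch of such a function might look like the following. The argument names, the assumption that `y` and `y_logit` are 1-D tensors of shape `(batch_size,)`, and the default `kl_weight` are placeholders; the lab's actual implementation may differ.

```python
import tensorflow as tf

def debiasing_loss_function(x, x_recon, y, y_logit, mu, logsigma, kl_weight=0.0005):
    # Classification loss L_y(y, y_hat): sigmoid cross-entropy on the face / not-face logit.
    classification_loss = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.cast(y, tf.float32), logits=y_logit)

    # Per-example VAE loss (latent + reconstruction), as defined earlier.
    latent_loss = 0.5 * tf.reduce_sum(
        tf.exp(logsigma) + tf.square(mu) - 1.0 - logsigma, axis=1)
    reconstruction_loss = tf.reduce_mean(tf.abs(x - x_recon), axis=(1, 2, 3))
    vae_loss = kl_weight * latent_loss + reconstruction_loss

    # Indicator I_f(y): 1 for face images, 0 otherwise, so the VAE loss only
    # contributes for training examples that are faces.
    face_indicator = tf.cast(tf.equal(y, 1), tf.float32)

    return tf.reduce_mean(classification_loss + face_indicator * vae_loss)
```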
|
|
1035 | 1032 | "id": "Eo34xC7MbaiQ"
|
1036 | 1033 | },
|
1037 | 1034 | "source": [
|
1038 |
| - "## 2.6 Evaluation of DB-VAE on Test Dataset\n", |
| 1035 | + "## 2.6 Evaluation of DB-VAE on test dataset\n", |
1039 | 1036 | "\n",
|
1040 | 1037 | "Finally let's test our DB-VAE model on the test dataset, looking specifically at its accuracy on each the \"Dark Male\", \"Dark Female\", \"Light Male\", and \"Light Female\" demographics. We will compare the performance of this debiased model against the (potentially biased) standard CNN from earlier in the lab."
|
1041 | 1038 | ]
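To make the comparison concrete, a per-demographic accuracy computation could look roughly like the sketch below. Here `test_faces` (a list of four image batches, one per demographic) and `predict_face_prob` (a call returning the model's face probability for each image) are hypothetical names standing in for the lab's test data and prediction code:

```python
import numpy as np

demographics = ["Dark Male", "Dark Female", "Light Male", "Light Female"]

for group_name, group_images in zip(demographics, test_faces):
    probs = predict_face_prob(group_images)   # face probabilities for this group
    accuracy = np.mean(probs > 0.5)           # fraction correctly detected as faces
    print(f"{group_name}: {accuracy:.3f}")
```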
|
|
1065 | 1062 | "id": "rESoXRPQo_mq"
|
1066 | 1063 | },
|
1067 | 1064 | "source": [
|
1068 |
| - "## 2.7 Conclusion \n", |
| 1065 | + "## 2.7 Conclusion and submission information\n", |
1069 | 1066 | "\n",
|
1070 | 1067 | "We encourage you to think about and maybe even address some questions raised by the approach and results outlined here:\n",
|
1071 | 1068 | "\n",
|
1072 | 1069 | "* How does the accuracy of the DB-VAE across the four demographics compare to that of the standard CNN? Do you find this result surprising in any way?\n",
|
1073 |
| - "* How can the performance of the DB-VAE classifier be improved even further? We purposely did not optimize hyperparameters to leave this up to you! If you want to go further, try to optimize your model to achieve the best performance. **[Email us](mailto:[email protected]) a copy of your notebook with the 2.6 bar plot executed, and we'll give out prizes to the best performers!** \n", |
| 1070 | + "* How can the performance of the DB-VAE classifier be improved even further? We purposely did not optimize hyperparameters to leave this up to you!\n", |
1074 | 1071 | "* In which applications (either related to facial detection or not!) would debiasing in this way be desired? Are there applications where you may not want to debias your model? \n",
|
1075 | 1072 | "* Do you think it should be necessary for companies to demonstrate that their models, particularly in the context of tasks like facial detection, are not biased? If so, do you have thoughts on how this could be standardized and implemented?\n",
|
1076 | 1073 | "* Do you have ideas for other ways to address issues of bias, particularly in terms of the training data?\n",
|
1077 | 1074 | "\n",
|
1078 |
| - "Hopefully this lab has shed some light on a few concepts, from vision based tasks, to VAEs, to algorithmic bias. We like to think it has, but we're biased ;). \n", |
| 1075 | + "Try to optimize your model to achieve improved performance. **MIT students and affiliates will be eligible for prizes during the IAP offering.** To enter the competition, please [email us](mailto:[email protected]) with your name and the following:\n", |
| 1076 | + "\n", |
| 1077 | + "* Jupyter notebook with the code you used to generate your results;\n", |
| 1078 | + "* copy of the bar plot from section 2.6 showing the performance of your model;\n", |
| 1079 | + "* a description and/or diagram of the architecture and hyperparameters you used -- if there are any additional or interesting modifications you made to the template code, please include these in your description;\n", |
| 1080 | + "* discussion of why these modifications helped improve performance.\n", |
| 1081 | + "\n", |
| 1082 | + "Hopefully this lab has shed some light on a few concepts, from vision based tasks, to VAEs, to algorithmic bias. We like to think it has, but we're biased ;).\n", |
1079 | 1083 | "\n",
|
1080 |
| - "<img src=\"https://i.ibb.co/PmCSNXs/tenor.gif\" />" |
| 1084 | + "<img src=\"https://i.ibb.co/BjLSRMM/ezgif-2-253dfd3f9097.gif\" />" |
1081 | 1085 | ]
|
1082 | 1086 | }
|
1083 | 1087 | ]
|
|