Commit 277d042 (parent 7e1e3bd)

going back to debiasing name

1 file changed: +51 -30

lab2/solutions/Part2_Debiasing_Solution.ipynb (51 additions, 30 deletions)
@@ -856,7 +856,15 @@
     "source": [
      "### Linking model performance to uncertainty and bias\n",
      "\n",
-     "We begin by considering the examples in the test dataset with the highest loss. What can you tell about which features seemed harder to learn for the VAE? What might this tell us about where the model struggles, and what predictions it may be more biased or uncertain about?"
+     "We begin by considering the examples in the test dataset with the highest loss. What can you tell about which features seemed harder to learn for the VAE? What might this tell us about where the model struggles, and what predictions it may be more biased or uncertain about?\n",
+     "\n",
+     "#### **TODO: Analysis and reflection**\n",
+     "\n",
+     "Complete the analysis in the code block below. Write short answers to the following questions and include them in your Lab 2 submission to complete the `TODO`s!\n",
+     "\n",
+     "1. What, if any, trends do you observe comparing the samples with the highest and lowest reconstruction loss?\n",
+     "2. Based on these observations, which features seemed harder to learn for the VAE?\n",
+     "3. How does reconstruction loss relate to uncertainty? Think back to our lecture on Robust & Trustworthy Deep Learning! What can you say about examples on which the model may be more or less uncertain?"
     ],
     "metadata": {
      "id": "QfVngr5J6sj3"
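The analysis this added `TODO` refers to, ranking test examples by their per-example reconstruction loss, can be sketched roughly as follows. This is a minimal NumPy illustration, not the lab's actual code; the names `rank_by_reconstruction_loss` and `losses` are hypothetical.

```python
import numpy as np

def rank_by_reconstruction_loss(losses, k=3):
    """Return indices of the k highest- and k lowest-loss examples.

    `losses` is a 1-D array of per-example reconstruction losses,
    e.g. the mean squared error between each input face and its
    VAE reconstruction.
    """
    order = np.argsort(losses)          # indices sorted by ascending loss
    hardest = order[-k:][::-1]          # highest loss first
    easiest = order[:k]                 # lowest loss first
    return hardest, easiest

# Toy example with six synthetic per-example losses.
losses = np.array([0.9, 0.1, 0.5, 2.0, 0.05, 1.2])
hardest, easiest = rank_by_reconstruction_loss(losses, k=2)
print(hardest.tolist())  # [3, 5] -> the two highest-loss examples
print(easiest.tolist())  # [4, 1] -> the two lowest-loss examples
```

Visualizing the faces at the `hardest` indices next to those at the `easiest` indices is what reveals which features the VAE struggled to reconstruct.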
@@ -917,7 +925,16 @@
      "\n",
      "How can we determine the relative frequencies and distributions of different latent features learned by the model? How may these metrics reveal underlying biases?\n",
      "\n",
-     "Let's investigate how well the SS-VAE actually learned the latent features of the faces. To do this, we will inspect individual latent features -- holding all others constant -- and look at the distribution of these features in the data and their corresponding examples. We can examine how the shape and probability density of the learned latent features."
+     "Let's investigate how well the SS-VAE actually learned the latent features of the faces. To do this, we will inspect individual latent features -- holding all others constant -- and look at the distribution of these features in the data and their corresponding examples. We can examine the shape and probability density of the learned latent features. Further, we directly compare different values of individual latent variables to corresponding relative classification accuracies (marginalizing out the effects of the other latent variables).\n",
+     "\n",
+     "#### **TODO: Analysis and reflection**\n",
+     "\n",
+     "Complete the analysis in the code blocks below. Carefully inspect the different latent variables and their corresponding frequency distributions. Write short answers to the following questions and include them in your Lab 2 submission to complete the `TODO`s!\n",
+     "\n",
+     "1. Pick two latent variables and describe what semantic meaning they reflect. Include screenshots of the realizations and probability distribution for the latent variables you select.\n",
+     "2. For the latent variables selected, what can you tell about which features are under- or over-represented in the data? What might this tell us about how the model is biased?\n",
+     "3. For the latent variables selected, how do these feature distribution differences affect classification performance? What, if any, general trends do you observe across the latent variables?\n",
+     "4. Based on these observations, please describe your understanding of the bias of the facial detection classifier."
     ]
    },
    {
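The comparison the added text describes, relating values of one latent variable to relative classification accuracy while marginalizing out the other latent dimensions, can be sketched as follows. This is an illustrative sketch on synthetic data, not the notebook's implementation; `accuracy_per_latent_bin` and the variable names are hypothetical.

```python
import numpy as np

def accuracy_per_latent_bin(z, correct, dim, n_bins=4):
    """Bin one latent dimension and compute classification accuracy per bin.

    z       : (N, latent_dim) array of latent encodings
    correct : (N,) boolean array, True where the classifier was right
    dim     : index of the latent dimension to inspect

    Averaging over all examples that fall in a bin, regardless of their
    other latent coordinates, marginalizes out the remaining dimensions.
    """
    vals = z[:, dim]
    edges = np.linspace(vals.min(), vals.max(), n_bins + 1)
    bins = np.clip(np.digitize(vals, edges) - 1, 0, n_bins - 1)
    return np.array([correct[bins == b].mean() for b in range(n_bins)])

# Synthetic data: accuracy is deliberately lower for large values of
# latent dimension 0, mimicking an under-represented feature.
rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 8))
correct = rng.random(1000) < np.where(z[:, 0] > 1.0, 0.6, 0.95)

acc = accuracy_per_latent_bin(z, correct, dim=0)
# acc[b] is the accuracy in the b-th slice of latent dimension 0;
# a drop at the extremes suggests the feature is rare in training data.
```

Plotting `acc` against the bin centers, one curve per latent dimension, is one way to surface the accuracy trends the questions above ask about.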
@@ -948,24 +965,12 @@
      "ax[1].set_ylabel(\"Visualization\");\n"
     ],
     "metadata": {
-     "id": "8qcR9uvfCJku"
+     "id": "8qcR9uvfCJku",
+     "cellView": "form"
     },
     "execution_count": null,
     "outputs": []
    },
-   {
-    "cell_type": "markdown",
-    "source": [
-     "Carefully inspect the different latent variables and their corresponding frequency distributions. What can you tell about which features are under- or over-represented in the data? What might this tell us about how the model is biased?\n",
-     "\n",
-     "How do these feature distribution differences affect classification performance? In addition to these qualitative inspections, we can directly compare different values of individual latent variables to corresponding relative classification accuracies (marginalizing out the effects of the other latent variables).\n",
-     "\n",
-     "What trends do you observe with this evaluation? How does this affect your understanding of the bias of the facial detection classifier?"
-    ],
-    "metadata": {
-     "id": "y97C5Qsh8GvB"
-    }
-   },
    {
     "cell_type": "code",
     "source": [
@@ -1005,26 +1010,42 @@
     "source": [
      "## 2.8 Conclusion and submission information\n",
      "\n",
-     "We encourage you to think about and maybe even address some questions raised by the approach and results outlined here:\n",
-     "* How do the samples with highest reconstruction loss and samples with highest bias compare? Which features is each one highlighting? Why do you think this is?\n",
-     "* In what ways is the dataset biased so far? Can you imagine other features that the dataset is biased against that we haven't uncovered yet?\n",
-     "* How does the supervised VAE's performance so far compare with the regular classifier on the test dataset? Is this surprising in any way?\n",
-     "* How can the performance of the supervised VAE classifier and the bias score be improved even further? We purposely did not optimize hyperparameters to leave this up to you!\n",
-     "* In which applications (either related to facial detection or not!) would debiasing in this way be desired? Are there applications where you may not want to debias your model? \n",
-     "* Do you think it should be necessary for companies to demonstrate that their models, particularly in the context of tasks like facial detection, are not biased? If so, do you have thoughts on how this could be standardized and implemented?\n",
-     "* Do you have ideas for other ways to address issues of bias, particularly in terms of the training data?\n",
+     "**To be eligible for the Debiasing Faces Lab prize, you must submit a document of your answers to the short-answer `TODO`s with your complete lab submission.** Please see the short-answer `TODO`s replicated again here:\n",
+     "\n",
+     "#### **TODO: Linking model performance to uncertainty and bias**\n",
+     "\n",
+     "1. What, if any, trends do you observe comparing the samples with the highest and lowest reconstruction loss?\n",
+     "2. Based on these observations, which features seemed harder to learn for the VAE?\n",
+     "3. How does reconstruction loss relate to uncertainty? Think back to our lecture on Robust & Trustworthy Deep Learning! What can you say about examples on which the model may be more or less uncertain?\n",
+     "\n",
+     "#### **TODO: Uncovering hidden biases through learned latent features**\n",
      "\n",
-     "Try to optimize your model to achieve improved performance. **MIT students and affiliates will be eligible for prizes during the IAP offering.** To enter the competition, MIT students and affiliates should upload the following to the course Canvas:\n",
+     "1. Pick two latent variables and describe what semantic meaning they reflect. Include screenshots of the realizations and probability distribution for the latent variables you select.\n",
+     "2. For the latent variables selected, what can you tell about which features are under- or over-represented in the data? What might this tell us about how the model is biased?\n",
+     "3. For the latent variables selected, how do these feature distribution differences affect classification performance? What, if any, general trends do you observe across the latent variables?\n",
+     "4. Based on these observations, please describe your understanding of the bias of the facial detection classifier.\n",
+     "\n",
+     "**MIT students, employees, and affiliates will be eligible for prizes during the IAP offering. To enter the competition, MIT students, employees, and affiliates should upload a document write-up as part of their complete lab submission for the Debiasing Faces Lab ([submission upload link](https://www.dropbox.com/request/TTYz3Ikx5wIgOITmm5i2)).**"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "source": [
+     "## 2.9 Thinking ahead\n",
      "\n",
-     "* Jupyter notebook with the code you used to generate your results;\n",
-     "* copy of the bar plot from section 2.6 showing the performance of your model;\n",
-     "* a description and/or diagram of the architecture and hyperparameters you used -- if there are any additional or interesting modifications you made to the template code, please include these in your description;\n",
-     "* discussion of why these modifications helped improve performance.\n",
+     "Beyond this, we encourage you to think about the following questions as you prepare for the next lab, which will focus on mitigating the issues of bias and uncertainty that you just uncovered. Consider:\n",
+     "* How do the samples with highest reconstruction loss and samples with highest bias compare? Which features is each one highlighting? Why do you think this is?\n",
+     "* In what ways is the dataset biased so far? Can you imagine other features that the dataset is biased against that we have not uncovered yet?\n",
+     "* How can the performance of the supervised VAE classifier be improved?\n",
+     "* Do you have ideas for other ways to address issues of bias, particularly in terms of the training data?\n",
      "\n",
      "Hopefully this lab has shed some light on a few concepts, from vision based tasks, to VAEs, to algorithmic bias. We like to think it has, but we're biased ;).\n",
      "\n",
      "<img src=\"https://i.ibb.co/BjLSRMM/ezgif-2-253dfd3f9097.gif\" />"
-    ]
+    ],
+    "metadata": {
+     "id": "mPRZReq4p68k"
+    }
    },
  ],
  "metadata": {
