
Commit 7dac460

comments to last code blocks

1 parent 37f26e2 commit 7dac460


lab2/solutions/Part2_FaceDetection_Solution.ipynb

Lines changed: 35 additions & 30 deletions
@@ -48,14 +48,13 @@
 "\n",
 "# Part 2: Diagnosing Bias in Facial Detection Systems\n",
 "\n",
-"In the second portion of Lab 2, we'll explore a prominent aspect of applied deep learning for computer vision: facial detection. \n",
+"In this lab, we'll explore a prominent aspect of applied deep learning for computer vision: facial detection. \n",
 "\n",
 "Consider the task of facial detection: given an image, is it an image of a face? This seemingly simple -- but extremely important and pervasive -- task is subject to significant amounts of algorithmic bias among select demographics, as [seminal studies](https://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf) have shown.\n",
 "\n",
 "Deploying fair, unbiased AI systems is critical to their long-term acceptance. In this lab, we will build computer vision models for facial detection. We will extend beyond that to build a model to **uncover and diagnose** the biases and issues that exist with standard facial detection models. To do this, we will build a semi-supervised variational autoencoder (SS-VAE) that learns the *latent distribution* of features underlying face image datasets in order to [uncover hidden biases](http://introtodeeplearning.com/AAAI_MitigatingAlgorithmicBias.pdf).\n",
 "\n",
-"Our work here will set the foundation for Lab 3, where we'll build automated tools to mitigate the underlying issues of bias and uncertainty in facial detection.\n",
-"\n"
+"Our work here will set the foundation for the next lab, where we'll build automated tools to mitigate the underlying issues of bias and uncertainty in facial detection."
 ]
 },
 {
6160
{
@@ -190,7 +189,7 @@
 "source": [
 "### Thinking about bias\n",
 "\n",
-"We will be training our facial detection classifiers on the large, well-curated CelebA dataset (and ImageNet), and then evaluating their accuracy by testing them on an independent test dataset. Our goal is to identify any potential issues and biases that may exist with the trained facial detection classifiers, and then diagnose what those issues and biases are.\n",
+"We will be training our facial detection classifiers on the large, well-curated CelebA dataset (and ImageNet), and then evaluating their accuracy as well as inspecting and diagnosing their hidden flaws. Our goal is to identify any potential issues and biases that may exist with the trained facial detection classifiers, and then diagnose what those issues and biases are.\n",
 "\n",
 "What exactly do we mean when we say a classifier is biased? In order to formalize this, we'll need to think about [*latent variables*](https://en.wikipedia.org/wiki/Latent_variable), variables that define a dataset but are not strictly observed. As defined in the generative modeling lecture, we use the term *latent space* to refer to the probability distributions of the aforementioned latent variables. Putting these ideas together, we consider a classifier *biased* if its classification decision changes after it sees some additional latent features or variables. This definition of bias will be helpful to keep in mind throughout the rest of the lab."
 ]
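
To make the bias definition above concrete, here is a minimal, hypothetical Python sketch (not part of this commit) of the kind of check the notebook's later cells build toward: hold a trained classifier's outputs fixed and ask whether they drift across values of a single latent variable. The names `scores` and `latent_values` are stand-ins for illustration, not identifiers from the notebook.

```python
import numpy as np

# Hypothetical probe: `scores` holds per-sample classifier outputs and
# `latent_values` holds each sample's coordinate along one latent variable.
def bias_probe(scores, latent_values, num_bins=10):
    """Mean classifier score within each occupied bin of a latent variable."""
    edges = np.linspace(latent_values.min(), latent_values.max(), num_bins)
    bin_ids = np.digitize(latent_values, edges)
    # Iterate only over occupied bins to avoid taking the mean of an empty slice
    return np.array([scores[bin_ids == j].mean() for j in np.unique(bin_ids)])
```

A roughly flat profile across bins means the decisions are insensitive to that latent feature; large swings are a symptom of bias in the sense defined above. The final code cell in this diff applies essentially this probe across every latent dimension.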
@@ -856,11 +855,11 @@
 "source": [
 "### Linking model performance to uncertainty and bias\n",
 "\n",
-"We begin by considering the examples in the test dataset with the highest loss. What can you tell about which features seemed harder to learn for the VAE? What might this tell us about where the model struggles, and what predictions it may be more biased or uncertain about?\n",
+"We begin by considering the examples in the dataset with the highest loss. What can you tell about which features seemed harder to learn for the VAE? What might this tell us about where the model struggles, and what predictions it may be more biased or uncertain about?\n",
 "\n",
 "#### **TODO: Analysis and reflection**\n",
 "\n",
-"Complete the analysis in the code block below. Write short answers to the following questions and include them in your Lab 2 submission to complete the `TODO`s!\n",
+"Complete the analysis in the code block below. Write short answers to the following questions and include them in your Debiasing Faces Lab submission to complete the `TODO`s!\n",
 "\n",
 "1. What, if any, trends do you observe comparing the samples with the highest and lowest reconstruction loss?\n",
 "2. Based on these observations, which features seemed harder to learn for the VAE?\n",
@@ -887,28 +886,13 @@
 "vae_loss = vae_loss.numpy()\n",
 "ind = np.argsort(vae_loss, axis=None)\n",
 "\n",
-"def create_grid_of_images(xs, size=(5,5)):\n",
-"  grid = []\n",
-"  counter = 0\n",
-"  for i in range(size[0]):\n",
-"    row = []\n",
-"    for j in range(size[1]):\n",
-"      row.append(xs[counter])\n",
-"      counter += 1\n",
-"    row = np.hstack(row)\n",
-"    grid.append(row)\n",
-"  grid = np.vstack(grid)\n",
-"  return grid\n",
-"\n",
-"# mdl.util.create_grid_of_images\n",
-"\n",
 "# Plot the 25 samples with the highest and lowest reconstruction losses\n",
 "fig, ax = plt.subplots(1, 2, figsize=(16, 8))\n",
-"ax[0].imshow(create_grid_of_images(x[ind[:25]]))\n",
+"ax[0].imshow(mdl.util.create_grid_of_images(x[ind[:25]]))\n",
 "ax[0].set_title(\"Samples with the lowest reconstruction loss \\n\" + \n",
 "  f\"Average recon loss: {np.mean(vae_loss[ind[:25]]):.2f}\")\n",
 "\n",
-"ax[1].imshow(create_grid_of_images(x[ind[-25:]]))\n",
+"ax[1].imshow(mdl.util.create_grid_of_images(x[ind[-25:]]))\n",
 "ax[1].set_title(\"Samples with the highest reconstruction loss \\n\" + \n",
 "  f\"Average recon loss: {np.mean(vae_loss[ind[-25:]]):.2f}\");"
 ]
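
For reference, the inline helper deleted above tiles images into a row-major grid with `np.hstack`/`np.vstack`. A compact equivalent, under the assumption (implied but not shown by this diff) that `mdl.util.create_grid_of_images` behaves the same way:

```python
import numpy as np

# Compact sketch equivalent to the removed inline helper; assumes
# mdl.util.create_grid_of_images matches this behavior.
def create_grid_of_images(xs, size=(5, 5)):
    """Tile the first size[0]*size[1] images (H, W, C) into a single image."""
    rows = []
    for i in range(size[0]):
        # Stack one row of images left to right
        rows.append(np.hstack(xs[i * size[1]:(i + 1) * size[1]]))
    # Stack the rows top to bottom
    return np.vstack(rows)
```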
@@ -929,7 +913,7 @@
 "\n",
 "#### **TODO: Analysis and reflection**\n",
 "\n",
-"Complete the analysis in the code blocks below. Carefully inspect the different latent variables and their corresponding frequency distributions. Write short answers to the following questions and include them in your Lab 2 submission to complete the `TODO`s!\n",
+"Complete the analysis in the code blocks below. Carefully inspect the different latent variables and their corresponding frequency distributions. Write short answers to the following questions and include them in your Debiasing Faces Lab submission to complete the `TODO`s!\n",
 "\n",
 "1. Pick two latent variables and describe what semantic meaning they reflect. Include screenshots of the realizations and probability distribution for the latent variables you select.\n",
 "2. For the latent variables selected, what can you tell about which features are under- or over-represented in the data? What might this tell us about how the model is biased?\n",
@@ -941,57 +925,78 @@
 "cell_type": "code",
 "source": [
 "#@title Change the sliders to inspect different latent features! { run: \"auto\" }\n",
-"idx_latent = 25 #@param {type:\"slider\", min:0, max:50, step:1}\n",
+"idx_latent = 25 #@param {type:\"slider\", min:0, max:31, step:1}\n",
 "num_steps = 15\n",
 "\n",
+"# Extract all latent samples from the desired dimension\n",
 "latent_samples = z_mean[:, idx_latent]\n",
-"density, latent_bins = np.histogram(latent_samples, num_steps, density=True)\n",
 "\n",
+"# Compute their density and plot\n",
+"density, latent_bins = np.histogram(latent_samples, num_steps, density=True)\n",
 "fig, ax = plt.subplots(2, 1, figsize=(15, 4))\n",
 "ax[0].bar(latent_bins[1:], density)\n",
 "ax[0].set_ylabel(\"Data density\")\n",
 "\n",
+"# Visualize reconstructions as we walk across the latent space\n",
 "latent_steps = np.linspace(np.min(latent_samples), np.max(latent_samples), num_steps)\n",
-"\n",
 "baseline_latent = tf.reduce_mean(z_mean, 0, keepdims=True)\n",
+"\n",
 "recons = []\n",
 "for step in latent_steps: \n",
+"  # Adjust the latent vector according to our step\n",
 "  latent = baseline_latent.numpy()\n",
 "  latent[0, idx_latent] = step\n",
+"  # Decode the reconstruction and store\n",
 "  recons.append(dbvae.decode(latent)[0])\n",
 "\n",
-"ax[1].imshow(create_grid_of_images(recons, (1, num_steps)))\n",
+"# Visualize all of the reconstructions!\n",
+"ax[1].imshow(mdl.util.create_grid_of_images(recons, (1, num_steps)))\n",
 "ax[1].set_xlabel(\"Latent step\")\n",
 "ax[1].set_ylabel(\"Visualization\");\n"
 ],
 "metadata": {
-"id": "8qcR9uvfCJku",
-"cellView": "form"
+"id": "8qcR9uvfCJku"
 },
 "execution_count": null,
 "outputs": []
 },
+{
+"cell_type": "markdown",
+"source": [
+"\n",
+"### Inspect how the accuracy changes as a function of density in the latent space\n"
+],
+"metadata": {
+"id": "3ExRRPO2z27z"
+}
+},
 {
 "cell_type": "code",
 "source": [
+"# Loop through every latent dimension\n",
 "avg_logit_per_bin = []\n",
 "for idx_latent in range(latent_dim): \n",
 "  latent_samples = z_mean[:, idx_latent]\n",
 "  start = np.percentile(latent_samples, 5)\n",
 "  end = np.percentile(latent_samples, 95)\n",
 "  latent_steps = np.linspace(start, end, num_steps)\n",
 "\n",
+"  # Find which samples fall in which bin of the latent dimension\n",
 "  which_latent_bin = np.digitize(latent_samples, latent_steps)\n",
+"\n",
+"  # For each latent bin, compute the accuracy (average logit score)\n",
 "  avg_logit = []\n",
 "  for j in range(0, num_steps+1): \n",
 "    inds_in_bin = np.where(which_latent_bin == j)\n",
 "    avg_logit.append(y_logit.numpy()[inds_in_bin].mean())\n",
 "\n",
 "  avg_logit_per_bin.append(avg_logit)\n",
 "\n",
+"# Average the results across all latent dimensions and all samples\n",
 "accuracy_per_latent = np.mean(avg_logit_per_bin, 0)\n",
 "accuracy_per_latent = (accuracy_per_latent - accuracy_per_latent.min()) / np.ptp(accuracy_per_latent)\n",
 "\n",
+"# Plot the results\n",
 "plt.plot(np.linspace(start, end, num_steps+1), accuracy_per_latent,'-o')\n",
 "plt.xlabel(\"Latent step\")\n",
 "plt.ylabel(\"Relative accuracy\")"
