diff --git a/colab.ipynb b/colab.ipynb
index e251169..25aad78 100644
--- a/colab.ipynb
+++ b/colab.ipynb
@@ -1,1444 +1,1504 @@
{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "view-in-github"
- },
- "source": [
- "
"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "copyright"
- },
- "source": [
- "#### Copyright 2020 Google LLC."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "pDv-M1JQH0nc"
- },
- "outputs": [],
- "source": [
- "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
- "# you may not use this file except in compliance with the License.\n",
- "# You may obtain a copy of the License at\n",
- "#\n",
- "# https://www.apache.org/licenses/LICENSE-2.0\n",
- "#\n",
- "# Unless required by applicable law or agreed to in writing, software\n",
- "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
- "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
- "# See the License for the specific language governing permissions and\n",
- "# limitations under the License."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "PB5N5exO33YO"
- },
- "source": []
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "CVmV0M74xwm7"
- },
- "source": [
- "# Introduction to Image Classification"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "sdfkasjfskjf"
- },
- "source": [
- "We have learned about binary and multiclass classification, and we've done so using datasets consisting of feature columns that contain numeric and string values. The numbers could be continuous or categorical. The strings we have used so far were all categorical features.\n",
- "\n",
- "In this lab we will perform another type of classification: **image classification**.\n",
- "\n",
- "Image classification can be binary: \"*Is this an image of a dog?*\" \n",
- "It can also be multiclass: \"*Is this an image of a cat, dog, horse, or cow?*\"\n",
- "\n",
- "The questions above assume there is only one item in an image. There is an even more advanced form of multiclass classification that answers the following question: What are all of the classes in an image and where are they located? For example: \"*Where are all of the cats, dogs, horses, and cows in this image?*\".\n",
- "\n",
- "In this introduction to image classification, we'll focus on classification where there is only one item depicted in each image. In future labs we'll learn about the more advanced forms of image classification.\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "DLdCchMdCaWQ"
- },
- "source": [
- "## The Dataset\n",
- "\n",
- "The dataset we'll use for this Colab is the [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset, which contains 70,000 grayscale images labeled with one of ten categories.\n",
- "\n",
- "The categories are:\n",
- "\n",
- "Label\t| Class\n",
- "------|------------\n",
- "0 | T-shirt/top\n",
- "1 | Trouser\n",
- "2 | Pullover\n",
- "3 | Dress\n",
- "4 | Coat\n",
- "5 | Sandal\n",
- "6 | Shirt\n",
- "7 | Sneaker\n",
- "8 | Bag\n",
- "9 | Ankle boot\n",
- "\n",
- "\n",
- "The images show individual articles of clothing at low resolution (28 by 28 pixels), as seen here:\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " |
\n",
- " \n",
- " Figure 1. Fashion-MNIST samples (by Zalando, MIT License). \n",
- " |
\n",
- "
\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "t9FDsUlxCaWW"
- },
- "source": [
- "### Load the Data"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "f-0Gffhfmotj"
- },
- "source": [
- "Now that we have a rough understanding of the data we're going to use in our model, let's load the data into this lab. The Fashion MNIST dataset is conveniently available from the [Keras Datasets repository](https://keras.io/datasets/) along with a [utility function](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/fashion_mnist/load_data) for downloading and loading the data into NumPy arrays.\n",
- "\n",
- "In the code cell below, we import TensorFlow and download the Fashion-MNIST data."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "both",
- "colab": {},
- "colab_type": "code",
- "id": "7MqDQO0KCaWS"
- },
- "outputs": [],
- "source": [
- "import tensorflow as tf\n",
- "\n",
- "(train_images, train_labels), (test_images, test_labels) = \\\n",
- " tf.keras.datasets.fashion_mnist.load_data()\n",
- "\n",
- "print(train_images.shape)\n",
- "print(train_labels.shape)\n",
- "print(test_images.shape)\n",
- "print(test_labels.shape)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "j5gG9S5Wb6cM"
- },
- "source": [
- "`load_data()` returns two tuples, one for the training dataset and the other for the testing dataset. As you can see from the output of the code cell above, we have `60,000` training samples and `10,000` testing samples. This makes for a `14%` holdout of the data.\n",
- "\n",
- "You might be wondering what that `28, 28` is in the image data. That is a two-dimensional representation of the image. This is our feature data. Each pixel of the image is a feature. A `28` by `28` image has `784` pixels.\n",
- "\n",
- "As you can see, even a tiny image generates quite a few features. If we were processing 4k-resolution images, which are often `3840` by `2160` pixels, then we would have `8,294,400` features! Over eight million features is quite a bit. In later labs we'll address some strategies for working with this massive amount of data."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "Brm0b_KACaWX"
- },
- "source": [
- "### Exploratory Data Analysis"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "tRBC2X79qXTH"
- },
- "source": [
- "It is always a good idea to look at your data before diving in to building your model. Remember that our data is divided across four NumPy arrays, two of which are three-dimensional arrays:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "both",
- "colab": {},
- "colab_type": "code",
- "id": "zW5k_xz1CaWX"
- },
- "outputs": [],
- "source": [
- "print('Training images:', train_images.shape)\n",
- "print('Training labels:', train_labels.shape)\n",
- "print('Test images:', test_images.shape)\n",
- "print('Test labels:', test_labels.shape)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "d4LAdvnOrO3a"
- },
- "source": [
- "To make our exploration tasks a little easier, let's put the data into a Pandas `DataFrame`. One way to do this is to flatten the `28` by `28` image into a flat array of `784` pixels, with the pixel number being the column name. We then add the labels to a `target` column."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "wC7C8_6Fqwg3"
- },
- "outputs": [],
- "source": [
- "import numpy as np\n",
- "import pandas as pd\n",
- "\n",
- "train_df = pd.DataFrame(\n",
- " np.array([x.flatten() for x in train_images]),\n",
- " columns=[i for i in range(784)]\n",
- ")\n",
- "train_df['target'] = train_labels\n",
- "\n",
- "train_df.describe()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "aEUjCBEZstGT"
- },
- "source": [
- "With so many columns, reading the output of `describe()` is nearly impossible. Let's instead do our analysis a little differently.\n",
- "\n",
- "To begin, we will find the minimum value of every pixel column and output the sorted list of unique values."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "UP6QFiMXruBz"
- },
- "outputs": [],
- "source": [
- "FEATURES = train_df.columns[:-1]\n",
- "\n",
- "sorted(train_df.loc[:, FEATURES].min().unique())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "ifk1Cu4ns8qD"
- },
- "source": [
- "All of the values were `0`.\n",
- "\n",
- "Let's do the same for the maximum values."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "6uF0vRn1tE8B"
- },
- "outputs": [],
- "source": [
- "sorted(train_df.loc[:, FEATURES].max().unique())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "A_8-Qwu2tEBm"
- },
- "source": [
- "That is more interesting. We seem to have values ranging from `16` through `255`. These values represent color intensities for grayscale images. `0`, which we saw as a minimum value, maps to black in the color map that we will use, while `255` is white.\n",
- "\n",
- "Let's see a histogram distribution of our max pixel values."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "fGIe1xy3thhB"
- },
- "outputs": [],
- "source": [
- "import matplotlib.pyplot as plt\n",
- "\n",
- "_ = plt.hist(train_df.loc[:, FEATURES].max().unique())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "RrLPDMpetr-r"
- },
- "source": [
- "Unsurprisingly, higher intensity values seem to be more prevalent as maximum pixel values than lower intensity values."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "a7hdwFBtt5CO"
- },
- "source": [
- "#### Exercise 1: Charting Pixel Intensities\n",
- "\n",
- "In the example above, we created a histogram containing the maximum pixel intensities. In this exercise you will create a histogram for all pixel intensities in the training dataset.\n",
- "\n",
- "If some intensities are outliers, remove them to get a more meaningful histogram.\n",
- "\n",
- "Hint: The NumPy [`where`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html) and [`flatten`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html) can come in handy for this exercise. "
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "UjvjiRQ_uJy4"
- },
- "source": [
- "##### **Student Solution**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "Wn2Z6_QMuMH_"
- },
- "outputs": [],
- "source": [
- "# Your code goes here"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "VDbc80D8uOWI"
- },
- "source": [
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "TNKwNfv7xQ9d"
- },
- "source": [
- "#### Continuing on With EDA"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "zgPxBAdY1uX-"
- },
- "source": [
- "Now that we have a basic idea of the values in our dataset, let's see if any are missing."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "PirHFXhFxbuc"
- },
- "outputs": [],
- "source": [
- "train_df.isna().any().any()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "gB_Zwi5kxkNe"
- },
- "source": [
- "Good. We now know we aren't missing any values, and our pixel values range from `0` through `255`.\n",
- "\n",
- "Let's now see if our target values are what we expect."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "o-r8BxRKxryp"
- },
- "outputs": [],
- "source": [
- "sorted(train_df['target'].unique())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "omUG0nkIzP3_"
- },
- "source": [
- "Let's see the distribution."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "-NlAF5BZzBgc"
- },
- "outputs": [],
- "source": [
- "_ = train_df['target'].hist(bins=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "5LrAHbI1zSON"
- },
- "source": [
- "The class types seem evenly distributed. We have `6,000` of each."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "Doat5-gY3rPw"
- },
- "source": [
- "The numeric values should map to these clothing types:\n",
- "\n",
- "Label\t| Class\n",
- "------|------------\n",
- "0 | T-shirt/top\n",
- "1 | Trouser\n",
- "2 | Pullover\n",
- "3 | Dress\n",
- "4 | Coat\n",
- "5 | Sandal\n",
- "6 | Shirt\n",
- "7 | Sneaker\n",
- "8 | Bag\n",
- "9 | Ankle boot"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "IAqKCu8ix42l"
- },
- "source": [
- "We can spot check this by looking at some of the images. Let's check a random 'T-shirt/top'.\n",
- "\n",
- "To do this we select a random index from the 'T-shirt/top' items (`target = 0`). We then reshape the pixel columns back into a `28` by `28` two-dimensional array, which are the dimensions of the image. We then use `imshow()` to display the image."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "Ex7WQYrgyz7O"
- },
- "outputs": [],
- "source": [
- "index = np.random.choice(train_df[train_df['target'] == 0].index.values)\n",
- "\n",
- "pixels = train_df.loc[index, FEATURES].to_numpy().reshape(28, 28)\n",
- "\n",
- "_ = plt.imshow(pixels, cmap='gray')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "2n1ZJeoK1CZ6"
- },
- "source": [
- "In our sample we got an image that looked like a very low resolution t-shirt. You should see the same. Note: every time you rerun the above cell, a new random index will be chosen, so feel free to cycle through some of the values to see the different types of t-shirt/top images included in the dataset."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "0bVHDcVv1Kaq"
- },
- "source": [
- "This single image spot checking is okay, but it doesn't scale well.\n",
- "\n",
- "We can view multiple images at a time using the [`GridSpec`](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.gridspec.GridSpec.html) class from Matplotlib.\n",
- "\n",
- "In the code below, we build a visualization with a `10` by `10` grid of images in our t-shirt class.\n",
- "\n",
- "The code imports `gridspec`, sets the number of rows and columns, and then sets the figure size so the image is large enough for us to actually see different samples.\n",
- "\n",
- "After that bit of setup, we create a `10` by `10` `GridSpec`. The other parameters to the constructor are there to ensure the images are tightly packed into the grid. Try experimenting with some other values.\n",
- "\n",
- "Next we randomly choose `100` indexes from items labelled with class `0` our training data.\n",
- "\n",
- "The remainder of the code should look pretty familiar. We used similar code above to show a single image. The difference in this code is that we are adding `100` subplots using the `GridSpec`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "n8NmTQzavWuS"
- },
- "outputs": [],
- "source": [
- "from matplotlib import gridspec\n",
- "\n",
- "# Row and column count (100 samples)\n",
- "rows = 10\n",
- "cols = 10\n",
- "\n",
- "# Size of the final output image\n",
- "plt.figure(figsize=(12, 12)) \n",
- "\n",
- "# Grid that will be used to organize our samples\n",
- "gspec = gridspec.GridSpec(\n",
- " rows,\n",
- " cols,\n",
- " wspace = 0.0,\n",
- " hspace = 0.0,\n",
- " top = 1.0,\n",
- " bottom = 0.0,\n",
- " left = 0.00, \n",
- " right = 1.0,\n",
- ") \n",
- "\n",
- "# Randomly choose a sample of t-shirts\n",
- "T_SHIRTS = 0\n",
- "indexes = np.random.choice(\n",
- " train_df[train_df['target'] == T_SHIRTS].index.values,\n",
- " rows*cols\n",
- ")\n",
- "\n",
- "# Add each sample to a plot using the GridSpec\n",
- "cnt = 0\n",
- "for r in range(rows):\n",
- " for c in range(cols):\n",
- " row = train_df.loc[indexes[cnt], FEATURES]\n",
- " img = row.to_numpy().reshape((28, 28))\n",
- "\n",
- " ax = plt.subplot(gspec[r, c])\n",
- " ax.imshow(img, cmap='gray')\n",
- " ax.xaxis.set_visible(False)\n",
- " ax.yaxis.set_visible(False)\n",
- " cnt = cnt + 1\n",
- "\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "kOpQEsRI6yla"
- },
- "source": [
- "#### Exercise 2: Visualizing Every Class\n",
- "\n",
- "In this exercise, you'll take the code that we used above to visualize t-shirts and use it to visualize every class represented in our dataset. You'll need to print out the class name and then show a `10` by `10` grid of samples from that class. Try to minimize the amount of repeated code in your solution."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "6f1pYNnXGBBy"
- },
- "source": [
- "##### **Student Solution**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "E91p1ryQ7KfR"
- },
- "outputs": [],
- "source": [
- "# Your code goes here"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "ydpE8MjX7Lt-"
- },
- "source": [
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "aI-rapAs7UPG"
- },
- "source": [
- "#### Wrapping Up EDA"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "FvCRgygY7V_p"
- },
- "source": [
- "From our visual analysis, our samples seem reasonable.\n",
- "\n",
- "First off, class names seem to match pictures. This can give us some confidence that our data is labelled correctly.\n",
- "\n",
- "Another nice thing is that all of the clothing items seem to be oriented in the same direction for the most part. If shoes were pointing in different directions, or if any images were rotated, then we would have had a lot more processing to do.\n",
- "\n",
- "And finally, all of our images are the same dimensions and are encoded with a single numeric grayscale intensity. In the real world, you'll likely not get so lucky. Images are acquired in different sizes and with different color encodings. We'll get to some examples of this in future labs.\n",
- "\n",
- "Based on our analysis so far, we can end our EDA and move on to model building."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "59veuiEZCaW4"
- },
- "source": [
- "## Modeling"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "9KGrH87n8Qmk"
- },
- "source": [
- "We have many options for building a multiclass classification model for images. In this lab we will build a deep neural network using TensorFlow Keras."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "Oy5hRX8WCQ7a"
- },
- "source": [
- "### Preparing the Data"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "5knAfp_WCUAd"
- },
- "source": [
- "Our feature data is on a scale from `0` to `255`, and our target data is categorically encoded. Fortunately, all of the features are on the same scale, so we don't have to worry about standardizing scale. However, we'll need to do a little data preprocessing in order to get our data ready for modeling."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "18B1xVFgDvbK"
- },
- "source": [
- "The first bit of data preprocessing we'll do is bring the feature values into the range of `0.0` and `1.0`. We could perform normalization to do this, but normalization actually isn't the only solution in this case.\n",
- "\n",
- "We know that all of our features are pixel values in the range of `0` to `255`. We also know from our EDA that every feature has a minimum value of `0`, but that the max values have a pretty wide range. It is possible we would make our model worse by normalizing, since we'd be making the same values across pixels not map to the same color.\n",
- "\n",
- "Instead of normalizing, we can just divide every feature by `255.0`. This keeps the relative values the same across pixels."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "mGIomoO4EmbK"
- },
- "outputs": [],
- "source": [
- "train_df[FEATURES] = train_df[FEATURES] / 255.0\n",
- "\n",
- "train_df[FEATURES].describe()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "M8BZRIlUEw8x"
- },
- "source": [
- "#### Exercise 3: One-Hot Encoding\n",
- "\n",
- "Our target values are categorical values in a column named `target`. In this exercise, you will one-hot encode the target values. Your code should:\n",
- "\n",
- "1. Create ten new columns named `target_0` through `target_9`.\n",
- "1. Create a variable called `TARGETS` that contains the `10` target column names.\n",
- "1. `describe()` the ten new target column values to ensure that they have values between 0 and 1 and that the one-hot encoding looks evenly distributed."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "eCJVMjO3F8_T"
- },
- "source": [
- "##### **Student Solution**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "s-xkoR97FXbu"
- },
- "outputs": [],
- "source": [
- "# Your code goes here"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "l2qLaT8zFV-I"
- },
- "source": [
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "Gxg1XGm0eOBy"
- },
- "source": [
- "### Configure and Compile the Model"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "NPY2MBfCGP2A"
- },
- "source": [
- "We'll be relying on the TensorFlow Keras [`Sequential` model](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential) and [`Dense` layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense) that we used in previous labs.\n",
- "\n",
- "In this case our input shape needs to be the size of our feature count. We'll then add a few hidden layers and then use a softmax layer the same width as our target count. This layer should output the probability that a given set of input features maps to each of our targets. The sum of the probabilities will equal `1.0`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "both",
- "colab": {},
- "colab_type": "code",
- "id": "9ODch-OFCaW4"
- },
- "outputs": [],
- "source": [
- "model = tf.keras.Sequential([\n",
- " tf.keras.layers.Dense(128, input_shape=(len(FEATURES),)),\n",
- " tf.keras.layers.Dense(64, activation=tf.nn.relu),\n",
- " tf.keras.layers.Dense(32, activation=tf.nn.relu),\n",
- " tf.keras.layers.Dense(len(TARGETS), activation=tf.nn.softmax)\n",
- "])\n",
- "\n",
- "model.summary()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "pTyZGa-eH3xJ"
- },
- "source": [
- "Note that our images are actually `28` by `28` images. We flattened the images when we loaded them into a dataframe for EDA. However, flattening outside of the model isn't necessary. TensorFlow works in many dimensions. If we wanted to keep our images as `28` by `28` matrices, we could have added a [`Flatten`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Flatten) layer as shown below.\n",
- "\n",
- "```python\n",
- "model = tf.keras.Sequential([\n",
- " tf.keras.layers.Flatten(input_shape=(28, 28)),\n",
- " tf.keras.layers.Dense(128, activation=tf.nn.relu),\n",
- " tf.keras.layers.Dense(64, activation=tf.nn.relu),\n",
- " tf.keras.layers.Dense(32, activation=tf.nn.relu),\n",
- " tf.keras.layers.Dense(len(TARGETS), activation=tf.nn.softmax)\n",
- "])\n",
- "\n",
- "model.summary()\n",
- "```\n",
- "\n",
- "```\n",
- "Model: \"sequential_8\"\n",
- "_________________________________________________________________\n",
- "Layer (type) Output Shape Param # \n",
- "=================================================================\n",
- "flatten_5 (Flatten) (None, 784) 0 \n",
- "_________________________________________________________________\n",
- "dense_28 (Dense) (None, 128) 100480 \n",
- "_________________________________________________________________\n",
- "dense_29 (Dense) (None, 64) 8256 \n",
- "_________________________________________________________________\n",
- "dense_30 (Dense) (None, 32) 2080 \n",
- "_________________________________________________________________\n",
- "dense_31 (Dense) (None, 10) 330 \n",
- "=================================================================\n",
- "Total params: 111,146\n",
- "Trainable params: 111,146\n",
- "Non-trainable params: 0\n",
- "_________________________________________________________________\n",
- "```\n",
- "\n",
- "This model results in the same number of trainable parameters as our pre-flattened model; it just saves us from having to flatten the images outside of TensorFlow.\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "gut8A_7rCaW6"
- },
- "source": [
- "Before the model is ready for training, it needs a few more settings. These are added during the model's *compile* step:\n",
- "\n",
- "* *Loss function* — This measures how well the model is doing during training. We want to minimize this function to \"steer\" the model in the right direction. A large loss would indicate the model is performing poorly in classification tasks, meaning it is not matching input images to the correct class names. (It might classify a boot as a coat, for example.)\n",
- "* *Optimizer* — This is how the model is updated based on the data it sees and its loss function.\n",
- "* *Metrics* — This is used to monitor the training and testing steps. The following example uses *accuracy*, the fraction of the images that are correctly classified."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "both",
- "colab": {},
- "colab_type": "code",
- "id": "Lhan11blCaW7"
- },
- "outputs": [],
- "source": [
- "model.compile(\n",
- " loss='categorical_crossentropy',\n",
- " optimizer='Adam',\n",
- " metrics=['accuracy'],\n",
- ")\n",
- "\n",
- "model.summary()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "qKF6uW-BCaW-"
- },
- "source": [
- "## Train the Model\n",
- "\n",
- "Training a Keras API neural network model to classify images looks just like all of the other Keras models we have worked with so far. We call this the `model.fit` method, passing it our training data and any other parameters we'd like to use."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "both",
- "colab": {},
- "colab_type": "code",
- "id": "xvwvpA64CaW_"
- },
- "outputs": [],
- "source": [
- "callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)\n",
- "\n",
- "history = model.fit(\n",
- " train_df[FEATURES],\n",
- " train_df[TARGETS],\n",
- " epochs=500,\n",
- " callbacks=[callback]\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "W3ZVOhugCaXA"
- },
- "source": [
- "As the model trains, the loss and accuracy metrics are displayed. We also store the progression in history.\n",
- "\n",
- "You'll notice that this took longer to train per epoch than many of the models we've built previously in this course. That's because of the large number of features. Even with these tiny `28` by `28` grayscale images, we are still dealing with `784` features. This is orders of magnitude larger than the 10 or so features we are used to using."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "i6WTGUBIJpzc"
- },
- "source": [
- "### Exercise 4: Graph Model Progress\n",
- "\n",
- "In this exercise you'll create two graphs. The first will show the model loss over each epoch. The second will show the model accuracy over each epoch. Feel free to use any graphical toolkit we have used so far."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "vT42LGuNJ7d1"
- },
- "outputs": [],
- "source": [
- "# Your code goes here"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "29fLrNFaJ9c6"
- },
- "source": [
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "BuTr50NNM3kV"
- },
- "source": [
- "## Evaluate the Model\n",
- "\n",
- "Now that our model is trained, let's evaluate it using an independent test data set. Then let's see if the model quality holds up. We'll use `model.evaluate()` and pass in the test dataset. `model.evaluate()` returns a `test_loss` and `test_accuracy`.\n",
- "\n",
- "Also note that we need to apply the same feature preprocessing to the test data that we did to the train data."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "both",
- "colab": {},
- "colab_type": "code",
- "id": "VflXLEeECaXC"
- },
- "outputs": [],
- "source": [
- "test_df = pd.DataFrame(\n",
- " np.array([x.flatten() for x in test_images]),\n",
- " columns=[i for i in range(784)]\n",
- ")\n",
- "test_df['target'] = test_labels\n",
- "\n",
- "test_df[FEATURES] = test_df[FEATURES] / 255.0\n",
- "\n",
- "for class_i in sorted(test_df['target'].unique()):\n",
- " column_name = f'target_{class_i}'\n",
- " test_df[column_name] = (test_df['target'] == class_i).astype(int)\n",
- "\n",
- "train_loss = history.history['loss'][-1]\n",
- "train_accuracy = history.history['accuracy'][-1]\n",
- "(test_loss, test_accuracy) = model.evaluate(test_df[FEATURES], test_df[TARGETS])\n",
- "\n",
- "print('Training loss:', train_loss)\n",
- "print('Training accuracy:', train_accuracy)\n",
- "\n",
- "print('Test loss:', test_loss)\n",
- "print('Test accuracy:', test_accuracy)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "yWfgsmVXCaXG"
- },
- "source": [
- "The accuracy on the test dataset is noticeably less than the accuracy on the training dataset. This gap between training accuracy and test accuracy is an example of **overfitting**. Overfitting is when a machine learning model tends to perform worse on new data than on the training data. The trained model is unable to **generalize** to data that it has not seen before.\n",
- "\n",
- "There are many ways to try to reduce overfitting. One that we have seen is **early stopping**. This causes training to stop when loss stops changing significantly for a model. Without early stopping, the model would continue training, becoming more and more tailored to the training data and likely less to be able to generalize across new data.\n",
- "\n",
- "Another method for reducing overfitting in deep neural networks is **dropout**. A dropout layer is a layer that sits between two regular layers (in our case, dense layers) and randomly sets some of the values passed between layers to `0`.\n",
- "\n",
- "In TensorFlow the [`Dropout`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout) class is capable of doing this. To use `Dropout` you simply add a `Dropout` layer between other layers of the model. Each dropout layer has a percentage of values that it will set to `0`.\n",
- "\n",
- "```python\n",
- "model = tf.keras.Sequential([\n",
- " ...\n",
- " tf.keras.layers.Dense(name='L14', 431, activation=tf.nn.relu),\n",
- " # Randomly sets 15% of values between L14 and L15 to 0\n",
- " tf.keras.layers.Dropout(rate=0.15),\n",
- " tf.keras.layers.Dense(name='L15', 257, activation=tf.nn.relu),\n",
- " # Randomly sets 1% of values between L15 and L16 to 0\n",
- " tf.keras.layers.Dropout(rate=0.01),\n",
- " tf.keras.layers.Dense(name='L16', 57, activation=tf.nn.relu),\n",
- " ...\n",
- "])\n",
- "```"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "IlBarsvH6Wee"
- },
- "source": [
- "### Exercise 5: Dropout Layers\n",
- "\n",
- "In this exercise take the model from above and add a [`Dropout`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout) layer or layers between the `Dense` layers. See if you can find a configuration that reduces the gap between the training loss and accuracy and the test loss and accuracy. Document your findings."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "6VO8ZidHmK_X"
- },
- "source": [
- "#### **Student Solution**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "JSeNZ3RC6vGD"
- },
- "outputs": [],
- "source": [
- "# Your code goes here"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "3yNtz-8e87Fk"
- },
- "source": [
- "Iterate a few times and find a dropout model that seems to bring the testing and training numbers closer together. When you are done, document your findings in the table below. The **?**s are placeholders for accuracy and loss values."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "INWU28f477nR"
- },
- "source": [
- "Dropout (Y/N) | Train/Test | Accuracy | Loss\n",
- "--------------|------------| ---------|------\n",
- "N | Train | *?* | *?*\n",
- "N | Test | *?* | *?*\n",
- "Y | Train | *?* | *?*\n",
- "Y | Test | *?* | *?*\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "Jy5-RDqg6w4Q"
- },
- "source": [
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "xsoS7CPDCaXH"
- },
- "source": [
- "## Make Predictions\n",
- "\n",
- "We have now trained the model while trying to reduce overfitting. Let's say we're happy with our numbers and are ready to deploy the model. Now it is time to make predictions.\n",
- "\n",
- "We could now snap an image of a clothing item, resize it to `28` by `28`, and grayscale it. But that is a lot of work and outside the scope of this class. For simplicity, let's use the test images as input to the model and see what predictions we get.\n",
- "\n",
- "We'll use the `model.predict()` function to do this. Let's make our predictions and peek at the first result."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "both",
- "colab": {},
- "colab_type": "code",
- "id": "Gl91RPhdCaXI"
- },
- "outputs": [],
- "source": [
- "predictions = model.predict(test_df[FEATURES])\n",
- "\n",
- "predictions[0]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "C7Z21UdulbIg"
- },
- "source": [
- "What are those numbers?\n",
- "\n",
- "For each image:\n",
- " * the prediction result is in the form of 10 numbers, one for each possible label\n",
- " * each number represents the level of confidence that a label is the correct label for the particular image\n",
- " * all 10 numbers should add up to the sum of 1\n",
- "\n",
- "Let's see if that is true."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "6ls2RrRlAo0f"
- },
- "outputs": [],
- "source": [
- "sum(predictions[0]), sum(predictions[1])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "FySF9kRTAvhl"
- },
- "source": [
- "Well, maybe not `1`, but the result definitely approaches `1`. Floating point math makes summing to exactly `1` a little difficult."
- ]
- },
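- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "float-sum-demo"
- },
- "source": [
- "A classic one-line illustration of why exact sums are elusive in binary floating point:\n",
- "\n",
- "```python\n",
- "0.1 + 0.2 == 0.3  # False: 0.1 and 0.2 have no exact binary representation\n",
- "```"
- ]
- },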
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "-hw1hgeSCaXN"
- },
- "source": [
- "Let's find out which label has the highest predicted number and whether it matches with the actual test label.\n",
- "\n",
- "To find the highest predicted number we will use Numpy's [`argmax`](https://docs.scipy.org/doc/numpy-1.9.3/reference/generated/numpy.argmax.html) function which returns the index of the maximum value in an array."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "both",
- "colab": {},
- "colab_type": "code",
- "id": "qsqenuPnCaXO"
- },
- "outputs": [],
- "source": [
- "import numpy as np\n",
- "\n",
- "print('Label with the highest confidence: {predicted_label}'.format(\n",
- " predicted_label = np.argmax(predictions[0])))\n",
- "\n",
- "print('Actual label: {actual_label}'.format(actual_label = test_labels[0]))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "E51yS7iCCaXO"
- },
- "source": [
- "With our model the predicted class was class `9`, and the actual class was class `9`. Success!"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "wbhFDQYEDomf"
- },
- "source": [
- "### Exercise 6: Thresholds\n",
- "\n",
- "When making our predictions, we blindly accepted the output of `argmax` without really understanding what `argmax` was doing.\n",
- "\n",
- "`argmax` returns the index of the maximum value in an array. What if there are ties? What happens for a 10-element array that looks like:\n",
- "\n",
- "```python\n",
- " [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]\n",
- "```\n",
- "\n",
- "In this case it is a virtual tie between all of the classes. `argmax` will return the first value in the case of a tie. This is problematic for a few reasons. In this case we clearly have little confidence in any class, yet an algorithm that relies on `argmax` would naively predict the first class.\n",
- "\n",
- "For this exercise, discuss ways we can get around relying solely on `argmax`. Are there better ways of finding a prediction algorithm?"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "CFkLIeLpLKZK"
- },
- "source": [
- "#### **Student Solution**"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "zcoXZXgPLPDF"
- },
- "source": [
- "> *Your argument goes here*"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "iUJrlcqDLT9L"
- },
- "source": [
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "izM_qZ6DNORz"
- },
- "source": [
- "## Exercise 7: MNIST Digits"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "0xsLlZ5vOwlo"
- },
- "source": [
- "Another popular MNIST dataset is the [digits dataset](https://en.wikipedia.org/wiki/MNIST_database). This dataset consists of images of labelled, hand-written digits ranging from `0` through `9`.\n",
- "\n",
- "In this exercise you will build a model that predicts the class of MNIST digit images.\n",
- "\n",
- "The dataset is part of scikit-learn."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "L8emy15ndfaN"
- },
- "outputs": [],
- "source": [
- "from sklearn import datasets\n",
- "\n",
- "import pandas as pd\n",
- "\n",
- "digits_bunch = datasets.load_digits()\n",
- "digits = pd.DataFrame(digits_bunch.data)\n",
- "digits['digit'] = digits_bunch.target\n",
- "\n",
- "digits.describe()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "Pp_e6eMrdtOD"
- },
- "source": [
- "You will need to:\n",
- "\n",
- "* Perform EDA on the data\n",
- "* Choose a model (or models) to use to predict digits\n",
- "* Perform any model-specific data manipulation\n",
- "* Train the model and, if possible, visualize training progression\n",
- "* Perform a final test of the model on holdout data\n",
- "\n",
- "Use as many code and text cells as you need to. Explain your work."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "UBE2iEK9ojL1"
- },
- "source": [
- "### **Student Solution**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "CTuaRd71eKLN"
- },
- "outputs": [],
- "source": [
- "# Your code goes here"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "0W3di7FXeMJ7"
- },
- "source": [
- "---"
- ]
- }
- ],
- "metadata": {
- "colab": {
- "collapsed_sections": [
- "copyright",
- "blY3aYfJuPMl",
- "30Jqtsyj7M7_",
- "5hr08k9NFZ6B",
- "KnOr9nc46yKQ",
- "Uprlp83fLVfH",
- "rwOXrQvteM7e"
- ],
- "include_colab_link": true,
- "name": "Introduction to Image Classification",
- "private_outputs": true,
- "provenance": [],
- "toc_visible": true
- },
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "view-in-github"
+ },
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "copyright"
+ },
+ "source": [
+ "#### Copyright 2020 Google LLC."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "pDv-M1JQH0nc"
+ },
+ "outputs": [],
+ "source": [
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "PB5N5exO33YO"
+ },
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "CVmV0M74xwm7"
+ },
+ "source": [
+ "# Introduction to Image Classification"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "sdfkasjfskjf"
+ },
+ "source": [
+ "We have learned about binary and multiclass classification, and we've done so using datasets consisting of feature columns that contain numeric and string values. The numbers could be continuous or categorical. The strings we have used so far were all categorical features.\n",
+ "\n",
+ "In this lab we will perform another type of classification: **image classification**.\n",
+ "\n",
+ "Image classification can be binary: \"*Is this an image of a dog?*\" \n",
+ "It can also be multiclass: \"*Is this an image of a cat, dog, horse, or cow?*\"\n",
+ "\n",
+ "The questions above assume there is only one item in an image. There is an even more advanced form of multiclass classification that answers the following question: What are all of the classes in an image and where are they located? For example: \"*Where are all of the cats, dogs, horses, and cows in this image?*\".\n",
+ "\n",
+ "In this introduction to image classification, we'll focus on classification where there is only one item depicted in each image. In future labs we'll learn about the more advanced forms of image classification.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DLdCchMdCaWQ"
+ },
+ "source": [
+ "## The Dataset\n",
+ "\n",
+ "The dataset we'll use for this Colab is the [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset, which contains 70,000 grayscale images labeled with one of ten categories.\n",
+ "\n",
+ "The categories are:\n",
+ "\n",
+ "Label\t| Class\n",
+ "------|------------\n",
+ "0 | T-shirt/top\n",
+ "1 | Trouser\n",
+ "2 | Pullover\n",
+ "3 | Dress\n",
+ "4 | Coat\n",
+ "5 | Sandal\n",
+ "6 | Shirt\n",
+ "7 | Sneaker\n",
+ "8 | Bag\n",
+ "9 | Ankle boot\n",
+ "\n",
+ "\n",
+ "The images show individual articles of clothing at low resolution (28 by 28 pixels), as seen here:\n",
+ "\n",
+ "\n",
+ " \n",
+ " \n",
+ " |
\n",
+ " \n",
+ " Figure 1. Fashion-MNIST samples (by Zalando, MIT License). \n",
+ " |
\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "t9FDsUlxCaWW"
+ },
+ "source": [
+ "### Load the Data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "f-0Gffhfmotj"
+ },
+ "source": [
+ "Now that we have a rough understanding of the data we're going to use in our model, let's load the data into this lab. The Fashion MNIST dataset is conveniently available from the [Keras Datasets repository](https://keras.io/datasets/) along with a [utility function](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/fashion_mnist/load_data) for downloading and loading the data into NumPy arrays.\n",
+ "\n",
+ "In the code cell below, we import TensorFlow and download the Fashion-MNIST data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "both",
+ "id": "7MqDQO0KCaWS"
+ },
+ "outputs": [],
+ "source": [
+ "import tensorflow as tf\n",
+ "\n",
+ "(train_images, train_labels), (test_images, test_labels) = \\\n",
+ " tf.keras.datasets.fashion_mnist.load_data()\n",
+ "\n",
+ "print(train_images.shape)\n",
+ "print(train_labels.shape)\n",
+ "print(test_images.shape)\n",
+ "print(test_labels.shape)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "j5gG9S5Wb6cM"
+ },
+ "source": [
+ "`load_data()` returns two tuples, one for the training dataset and the other for the testing dataset. As you can see from the output of the code cell above, we have `60,000` training samples and `10,000` testing samples. This makes for a `14%` holdout of the data.\n",
+ "\n",
+ "You might be wondering what that `28, 28` is in the image data. That is a two-dimensional representation of the image. This is our feature data. Each pixel of the image is a feature. A `28` by `28` image has `784` pixels.\n",
+ "\n",
+ "As you can see, even a tiny image generates quite a few features. If we were processing 4k-resolution images, which are often `3840` by `2160` pixels, then we would have `8,294,400` features! Over eight million features is quite a bit. In later labs we'll address some strategies for working with this massive amount of data."
+ ]
+ },
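+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "feature-count-check"
+ },
+ "source": [
+ "As a quick sanity check of those feature counts:\n",
+ "\n",
+ "```python\n",
+ "print(28 * 28)      # 784 features per Fashion-MNIST image\n",
+ "print(3840 * 2160)  # 8,294,400 features per 4k image\n",
+ "```"
+ ]
+ },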
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Brm0b_KACaWX"
+ },
+ "source": [
+ "### Exploratory Data Analysis"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tRBC2X79qXTH"
+ },
+ "source": [
+ "It is always a good idea to look at your data before diving in to building your model. Remember that our data is divided across four NumPy arrays, two of which are three-dimensional arrays:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "both",
+ "id": "zW5k_xz1CaWX"
+ },
+ "outputs": [],
+ "source": [
+ "#There are 60,000 images and they are each 28 by 28 which is 784 pixels\n",
+ "print('Training images:', train_images.shape)\n",
+ "print('Training labels:', train_labels.shape)\n",
+ "#There are 10,000 images and each are 28 by 28 resulting in a total \n",
+ "#of 784\n",
+ "print('Test images:', test_images.shape)\n",
+ "print('Test labels:', test_labels.shape)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "d4LAdvnOrO3a"
+ },
+ "source": [
+ "To make our exploration tasks a little easier, let's put the data into a Pandas `DataFrame`. One way to do this is to flatten the `28` by `28` image into a flat array of `784` pixels, with the pixel number being the column name. We then add the labels to a `target` column."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "wC7C8_6Fqwg3"
+ },
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "import pandas as pd\n",
+ "\n",
+ "train_df = pd.DataFrame(\n",
+ " np.array([x.flatten() for x in train_images]),\n",
+ " columns=[i for i in range(784)]\n",
+ ")\n",
+ "train_df['target'] = train_labels\n",
+ "\n",
+ "train_df.describe()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "aEUjCBEZstGT"
+ },
+ "source": [
+ "With so many columns, reading the output of `describe()` is nearly impossible. Let's instead do our analysis a little differently.\n",
+ "\n",
+ "To begin, we will find the minimum value of every pixel column and output the sorted list of unique values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "UP6QFiMXruBz"
+ },
+ "outputs": [],
+ "source": [
+ "FEATURES = train_df.columns[:-1]\n",
+ "\n",
+ "min_features = sorted(train_df.loc[:, FEATURES].min().unique())\n",
+ "min_features"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ifk1Cu4ns8qD"
+ },
+ "source": [
+ "All of the values were `0`.\n",
+ "\n",
+ "Let's do the same for the maximum values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "6uF0vRn1tE8B"
+ },
+ "outputs": [],
+ "source": [
+ "max_features = sorted(train_df.loc[:, FEATURES].max().unique())\n",
+ "max_features"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "A_8-Qwu2tEBm"
+ },
+ "source": [
+ "That is more interesting. We seem to have values ranging from `16` through `255`. These values represent color intensities for grayscale images. `0`, which we saw as a minimum value, maps to black in the color map that we will use, while `255` is white.\n",
+ "\n",
+ "Let's see a histogram distribution of our max pixel values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "fGIe1xy3thhB"
+ },
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "\n",
+ "_ = plt.hist(train_df.loc[:, FEATURES].max().unique())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "RrLPDMpetr-r"
+ },
+ "source": [
+ "Unsurprisingly, higher intensity values seem to be more prevalent as maximum pixel values than lower intensity values."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a7hdwFBtt5CO"
+ },
+ "source": [
+ "#### Exercise 1: Charting Pixel Intensities\n",
+ "\n",
+ "In the example above, we created a histogram containing the maximum pixel intensities. In this exercise you will create a histogram for all pixel intensities in the training dataset.\n",
+ "\n",
+ "If some intensities are outliers, remove them to get a more meaningful histogram.\n",
+ "\n",
+ "Hint: The NumPy [`where`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html) and [`flatten`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html) can come in handy for this exercise. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "UjvjiRQ_uJy4"
+ },
+ "source": [
+ "##### **Student Solution**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "hlDuo-8pvbYO"
+ },
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import numpy as np\n",
+ "\n",
+ "# Flatten all of the training images into one long 1-D array of pixel\n",
+ "# intensities (60,000 images * 784 pixels each).\n",
+ "all_pixels = train_df.loc[:, FEATURES].to_numpy().flatten()\n",
+ "\n",
+ "# The vast majority of pixels are background (intensity 0), which would\n",
+ "# dwarf every other bar in the histogram. Treat the zeros as outliers and\n",
+ "# use np.where to keep only the nonzero intensities.\n",
+ "nonzero_pixels = all_pixels[np.where(all_pixels > 0)]\n",
+ "\n",
+ "# Histogram of every remaining pixel intensity in the training set.\n",
+ "_ = plt.hist(nonzero_pixels, bins=50)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "VDbc80D8uOWI"
+ },
+ "source": [
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "TNKwNfv7xQ9d"
+ },
+ "source": [
+ "#### Continuing on With EDA"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "zgPxBAdY1uX-"
+ },
+ "source": [
+ "Now that we have a basic idea of the values in our dataset, let's see if any are missing."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PirHFXhFxbuc"
+ },
+ "outputs": [],
+ "source": [
+ "train_df.isna().any().any()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "gB_Zwi5kxkNe"
+ },
+ "source": [
+ "Good. We now know we aren't missing any values, and our pixel values range from `0` through `255`.\n",
+ "\n",
+ "Let's now see if our target values are what we expect."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "o-r8BxRKxryp"
+ },
+ "outputs": [],
+ "source": [
+ "sorted(train_df['target'].unique())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "omUG0nkIzP3_"
+ },
+ "source": [
+ "Let's see the distribution."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "-NlAF5BZzBgc"
+ },
+ "outputs": [],
+ "source": [
+ "_ = train_df['target'].hist(bins=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5LrAHbI1zSON"
+ },
+ "source": [
+ "The class types seem evenly distributed. We have `6,000` of each."
+ ]
+ },
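+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "target-count-check"
+ },
+ "source": [
+ "If you'd rather confirm the exact counts than eyeball the histogram, pandas' `value_counts()` gives a quick check:\n",
+ "\n",
+ "```python\n",
+ "# Each of the ten classes should appear exactly 6,000 times.\n",
+ "train_df['target'].value_counts()\n",
+ "```"
+ ]
+ },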
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Doat5-gY3rPw"
+ },
+ "source": [
+ "The numeric values should map to these clothing types:\n",
+ "\n",
+ "Label\t| Class\n",
+ "------|------------\n",
+ "0 | T-shirt/top\n",
+ "1 | Trouser\n",
+ "2 | Pullover\n",
+ "3 | Dress\n",
+ "4 | Coat\n",
+ "5 | Sandal\n",
+ "6 | Shirt\n",
+ "7 | Sneaker\n",
+ "8 | Bag\n",
+ "9 | Ankle boot"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "IAqKCu8ix42l"
+ },
+ "source": [
+ "We can spot check this by looking at some of the images. Let's check a random 'T-shirt/top'.\n",
+ "\n",
+ "To do this we select a random index from the 'T-shirt/top' items (`target = 0`). We then reshape the pixel columns back into a `28` by `28` two-dimensional array, which are the dimensions of the image. We then use `imshow()` to display the image."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Ex7WQYrgyz7O"
+ },
+ "outputs": [],
+ "source": [
+ "index = np.random.choice(train_df[train_df['target'] == 0].index.values)\n",
+ "\n",
+ "pixels = train_df.loc[index, FEATURES].to_numpy().reshape(28, 28)\n",
+ "\n",
+ "_ = plt.imshow(pixels, cmap='gray')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2n1ZJeoK1CZ6"
+ },
+ "source": [
+ "In our sample we got an image that looked like a very low resolution t-shirt. You should see the same. Note: every time you rerun the above cell, a new random index will be chosen, so feel free to cycle through some of the values to see the different types of t-shirt/top images included in the dataset."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0bVHDcVv1Kaq"
+ },
+ "source": [
+ " This single image spot checking is okay, but it doesn't scale well.\n",
+ "\n",
+ "We can view multiple images at a time using the [`GridSpec`](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.gridspec.GridSpec.html) class from Matplotlib.\n",
+ "\n",
+ "In the code below, we build a visualization with a `10` by `10` grid of images in our t-shirt class.\n",
+ "\n",
+ "The code imports `gridspec`, sets the number of rows and columns, and then sets the figure size so the image is large enough for us to actually see different samples.\n",
+ "\n",
+ "After that bit of setup, we create a `10` by `10` `GridSpec`. The other parameters to the constructor are there to ensure the images are tightly packed into the grid. Try experimenting with some other values.\n",
+ "\n",
+ "Next we randomly choose `100` indexes from items labelled with class `0` our training data.\n",
+ "\n",
+ "The remainder of the code should look pretty familiar. We used similar code above to show a single image. The difference in this code is that we are adding `100` subplots using the `GridSpec`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "n8NmTQzavWuS"
+ },
+ "outputs": [],
+ "source": [
+ "from matplotlib import gridspec\n",
+ "\n",
+ "# Row and column count (100 samples)\n",
+ "rows = 10\n",
+ "cols = 10\n",
+ "\n",
+ "# Size of the final output image\n",
+ "plt.figure(figsize=(12, 12)) \n",
+ "\n",
+ "# Grid that will be used to organize our samples\n",
+ "gspec = gridspec.GridSpec(\n",
+ " rows,\n",
+ " cols,\n",
+ " wspace = 0.0,\n",
+ " hspace = 0.0,\n",
+ " top = 1.0,\n",
+ " bottom = 0.0,\n",
+ " left = 0.00, \n",
+ " right = 1.0,\n",
+ ") \n",
+ "\n",
+ "# Randomly choose a sample of t-shirts\n",
+ "T_SHIRTS = 0\n",
+ "indexes = np.random.choice(\n",
+ " train_df[train_df['target'] == T_SHIRTS].index.values,\n",
+ " rows*cols\n",
+ ")\n",
+ "\n",
+ "# Add each sample to a plot using the GridSpec\n",
+ "cnt = 0\n",
+ "for r in range(rows):\n",
+ " for c in range(cols):\n",
+ " row = train_df.loc[indexes[cnt], FEATURES]\n",
+ " img = row.to_numpy().reshape((28, 28))\n",
+ "\n",
+ " ax = plt.subplot(gspec[r, c])\n",
+ " ax.imshow(img, cmap='gray')\n",
+ " ax.xaxis.set_visible(False)\n",
+ " ax.yaxis.set_visible(False)\n",
+ " cnt = cnt + 1\n",
+ "\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "kOpQEsRI6yla"
+ },
+ "source": [
+ "#### Exercise 2: Visualizing Every Class\n",
+ "\n",
+ "In this exercise, you'll take the code that we used above to visualize t-shirts and use it to visualize every class represented in our dataset. You'll need to print out the class name and then show a `10` by `10` grid of samples from that class. Try to minimize the amount of repeated code in your solution."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6f1pYNnXGBBy"
+ },
+ "source": [
+ "##### **Student Solution**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "E91p1ryQ7KfR"
+ },
+ "outputs": [],
+ "source": [
+ "from numpy.random.mtrand import sample\n",
+ "from matplotlib import gridspec\n",
+ "import matplotlib.pyplot as plt\n",
+ "import pandas as pd\n",
+ "import numpy as np\n",
+ "\n",
+ "# Row and column count (100 samples)\n",
+ "rows = 10\n",
+ "cols = 10\n",
+ "\n",
+ "# Size of the final output image\n",
+ "plt.figure(figsize=(12, 12)) \n",
+ "\n",
+ "\n",
+ "# Grid that will be used to organize our samples\n",
+ "gspec = gridspec.GridSpec(\n",
+ " rows,\n",
+ " cols,\n",
+ " wspace = 0.0,\n",
+ " hspace = 0.0,\n",
+ " top = 1.0,\n",
+ " bottom = 0.0,\n",
+ " left = 0.00, \n",
+ " right = 1.0,\n",
+ ") \n",
+ "\n",
+ "#for each sample run it through and display the index\n",
+ "#values\n",
+ "T_SHIRTS, TROUSER, PULLOVER, DRESS, COAT, SANDAL, SHIRT, SNEAKER, BAG, ANKLE_BOOT = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9\n",
+ "clothing = [T_SHIRTS, TROUSER, PULLOVER, DRESS, COAT, SANDAL, SHIRT, SNEAKER, BAG, ANKLE_BOOT]\n",
+ "clothing_labels = ['T_SHIRTS', 'TROUSERS', 'PULLOVERS', 'DRESSES', 'COATS', 'SANDALS', 'SHIRTS', 'SNEAKERS', 'BAGS', 'ANKLE_BOOTS']\n",
+ "for item in clothing_labels:\n",
+ " print(item)\n",
+ "for item in clothing:\n",
+ " #Changing the indexes changes what is displayed\n",
+ " indexes = np.random.choice(\n",
+ " train_df[train_df['target'] == item].index.values,\n",
+ " rows*cols\n",
+ " )\n",
+ "\n",
+ " # Add each sample to a plot using the GridSpec\n",
+ " cnt = 0\n",
+ " for r in range(rows):\n",
+ " for c in range(cols):\n",
+ " row = train_df.loc[indexes[cnt], FEATURES]\n",
+ " img = row.to_numpy().reshape((28, 28))\n",
+ "\n",
+ " ax = plt.subplot(gspec[r, c])\n",
+ " ax.imshow(img, cmap='gray')\n",
+ " ax.xaxis.set_visible(False)\n",
+ " ax.yaxis.set_visible(False)\n",
+ " cnt = cnt + 1\n",
+ "\n",
+ " plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ydpE8MjX7Lt-"
+ },
+ "source": [
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "aI-rapAs7UPG"
+ },
+ "source": [
+ "#### Wrapping Up EDA"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FvCRgygY7V_p"
+ },
+ "source": [
+ "From our visual analysis, our samples seem reasonable.\n",
+ "\n",
+ "First off, class names seem to match pictures. This can give us some confidence that our data is labelled correctly.\n",
+ "\n",
+ "Another nice thing is that all of the clothing items seem to be oriented in the same direction for the most part. If shoes were pointing in different directions, or if any images were rotated, then we would have had a lot more processing to do.\n",
+ "\n",
+ "And finally, all of our images are the same dimensions and are encoded with a single numeric grayscale intensity. In the real world, you'll likely not get so lucky. Images are acquired in different sizes and with different color encodings. We'll get to some examples of this in future labs.\n",
+ "\n",
+ "Based on our analysis so far, we can end our EDA and move on to model building."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "59veuiEZCaW4"
+ },
+ "source": [
+ "## Modeling"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "9KGrH87n8Qmk"
+ },
+ "source": [
+ "We have many options for building a multiclass classification model for images. In this lab we will build a deep neural network using TensorFlow Keras."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Oy5hRX8WCQ7a"
+ },
+ "source": [
+ "### Preparing the Data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5knAfp_WCUAd"
+ },
+ "source": [
+ "Our feature data is on a scale from `0` to `255`, and our target data is categorically encoded. Fortunately, all of the features are on the same scale, so we don't have to worry about standardizing scale. However, we'll need to do a little data preprocessing in order to get our data ready for modeling."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "18B1xVFgDvbK"
+ },
+ "source": [
+ "The first bit of data preprocessing we'll do is bring the feature values into the range of `0.0` and `1.0`. We could perform normalization to do this, but normalization actually isn't the only solution in this case.\n",
+ "\n",
+ "We know that all of our features are pixel values in the range of `0` to `255`. We also know from our EDA that every feature has a minimum value of `0`, but that the max values have a pretty wide range. It is possible we would make our model worse by normalizing, since we'd be making the same values across pixels not map to the same color.\n",
+ "\n",
+ "Instead of normalizing, we can just divide every feature by `255.0`. This keeps the relative values the same across pixels."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "mGIomoO4EmbK"
+ },
+ "outputs": [],
+ "source": [
+ "train_df[FEATURES] = train_df[FEATURES] / 255.0\n",
+ "\n",
+ "train_df[FEATURES].describe()\n",
+ "train_df['target']"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "M8BZRIlUEw8x"
+ },
+ "source": [
+ "#### Exercise 3: One-Hot Encoding\n",
+ "\n",
+ "Our target values are categorical values in a column named `target`. In this exercise, you will one-hot encode the target values. Your code should:\n",
+ "\n",
+ "1. Create ten new columns named `target_0` through `target_9`.\n",
+ "1. Create a variable called `TARGETS` that contains the `10` target column names.\n",
+ "1. `describe()` the ten new target column values to ensure that they have values between 0 and 1 and that the one-hot encoding looks evenly distributed."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "eCJVMjO3F8_T"
+ },
+ "source": [
+ "##### **Student Solution**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "s-xkoR97FXbu"
+ },
+ "outputs": [],
+ "source": [
+ "i = 0\n",
+ "while i <= 9:\n",
+ " train_df[str(\"target_\" + str(i))] = list(map(lambda x: 1 if x == i else 0, train_df['target']))\n",
+ " i=i+1\n",
+ "TARGETS = train_df.columns[len(train_df.columns)-10:].tolist()\n",
+ "TARGETS\n",
+ "train_df"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "l2qLaT8zFV-I"
+ },
+ "source": [
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Gxg1XGm0eOBy"
+ },
+ "source": [
+ "### Configure and Compile the Model"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "NPY2MBfCGP2A"
+ },
+ "source": [
+ "We'll be relying on the TensorFlow Keras [`Sequential` model](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential) and [`Dense` layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense) that we used in previous labs.\n",
+ "\n",
+ "In this case our input shape needs to be the size of our feature count. We'll then add a few hidden layers and then use a softmax layer the same width as our target count. This layer should output the probability that a given set of input features maps to each of our targets. The sum of the probabilities will equal `1.0`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "both",
+ "id": "9ODch-OFCaW4"
+ },
+ "outputs": [],
+ "source": [
+ "model = tf.keras.Sequential([\n",
+ " tf.keras.layers.Dense(128, input_shape=(len(FEATURES),)),\n",
+ " tf.keras.layers.Dense(64, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dense(32, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dense(len(TARGETS), activation=tf.nn.softmax)\n",
+ "])\n",
+ "\n",
+ "model.summary()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "pTyZGa-eH3xJ"
+ },
+ "source": [
+ "Note that our images are actually `28` by `28` images. We flattened the images when we loaded them into a dataframe for EDA. However, flattening outside of the model isn't necessary. TensorFlow works in many dimensions. If we wanted to keep our images as `28` by `28` matrices, we could have added a [`Flatten`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Flatten) layer as shown below.\n",
+ "\n",
+ "```python\n",
+ "model = tf.keras.Sequential([\n",
+ " tf.keras.layers.Flatten(input_shape=(28, 28)),\n",
+ " tf.keras.layers.Dense(128, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dense(64, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dense(32, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dense(len(TARGETS), activation=tf.nn.softmax)\n",
+ "])\n",
+ "\n",
+ "model.summary()\n",
+ "```\n",
+ "\n",
+ "```\n",
+ "Model: \"sequential_8\"\n",
+ "_________________________________________________________________\n",
+ "Layer (type) Output Shape Param # \n",
+ "=================================================================\n",
+ "flatten_5 (Flatten) (None, 784) 0 \n",
+ "_________________________________________________________________\n",
+ "dense_28 (Dense) (None, 128) 100480 \n",
+ "_________________________________________________________________\n",
+ "dense_29 (Dense) (None, 64) 8256 \n",
+ "_________________________________________________________________\n",
+ "dense_30 (Dense) (None, 32) 2080 \n",
+ "_________________________________________________________________\n",
+ "dense_31 (Dense) (None, 10) 330 \n",
+ "=================================================================\n",
+ "Total params: 111,146\n",
+ "Trainable params: 111,146\n",
+ "Non-trainable params: 0\n",
+ "_________________________________________________________________\n",
+ "```\n",
+ "\n",
+ "This model results in the same number of trainable parameters as our pre-flattened model; it just saves us from having to flatten the images outside of TensorFlow.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "gut8A_7rCaW6"
+ },
+ "source": [
+ "Before the model is ready for training, it needs a few more settings. These are added during the model's *compile* step:\n",
+ "\n",
+ "* *Loss function* — This measures how well the model is doing during training. We want to minimize this function to \"steer\" the model in the right direction. A large loss would indicate the model is performing poorly in classification tasks, meaning it is not matching input images to the correct class names. (It might classify a boot as a coat, for example.)\n",
+ "* *Optimizer* — This is how the model is updated based on the data it sees and its loss function.\n",
+ "* *Metrics* — This is used to monitor the training and testing steps. The following example uses *accuracy*, the fraction of the images that are correctly classified."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "both",
+ "id": "Lhan11blCaW7"
+ },
+ "outputs": [],
+ "source": [
+ "model.compile(\n",
+ " loss='categorical_crossentropy',\n",
+ " optimizer='Adam',\n",
+ " metrics=['accuracy'],\n",
+ ")\n",
+ "\n",
+ "model.summary()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "qKF6uW-BCaW-"
+ },
+ "source": [
+ "## Train the Model\n",
+ "\n",
+ "Training a Keras API neural network model to classify images looks just like all of the other Keras models we have worked with so far. We call this the `model.fit` method, passing it our training data and any other parameters we'd like to use."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "both",
+ "id": "xvwvpA64CaW_"
+ },
+ "outputs": [],
+ "source": [
+ "callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)\n",
+ "\n",
+ "history = model.fit(\n",
+ " train_df[FEATURES],\n",
+ " train_df[TARGETS],\n",
+ " epochs=500,\n",
+ " callbacks=[callback]\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "W3ZVOhugCaXA"
+ },
+ "source": [
+ "As the model trains, the loss and accuracy metrics are displayed. We also store the progression in history.\n",
+ "\n",
+ "You'll notice that this took longer to train per epoch than many of the models we've built previously in this course. That's because of the large number of features. Even with these tiny `28` by `28` grayscale images, we are still dealing with `784` features. This is orders of magnitude larger than the 10 or so features we are used to using."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "i6WTGUBIJpzc"
+ },
+ "source": [
+ "### Exercise 4: Graph Model Progress\n",
+ "\n",
+ "In this exercise you'll create two graphs. The first will show the model loss over each epoch. The second will show the model accuracy over each epoch. Feel free to use any graphical toolkit we have used so far."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "vT42LGuNJ7d1"
+ },
+ "outputs": [],
+ "source": [
+ "# Your code goes here"
+ ]
+ },
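+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "One possible solution sketch, using Matplotlib and the `history` object returned by `model.fit()`. The metric keys `'loss'` and `'accuracy'` match the metrics the model was compiled with."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "\n",
+ "# Loss per epoch\n",
+ "plt.figure()\n",
+ "plt.plot(history.history['loss'])\n",
+ "plt.title('Model loss')\n",
+ "plt.xlabel('epoch')\n",
+ "plt.ylabel('loss')\n",
+ "plt.show()\n",
+ "\n",
+ "# Accuracy per epoch\n",
+ "plt.figure()\n",
+ "plt.plot(history.history['accuracy'])\n",
+ "plt.title('Model accuracy')\n",
+ "plt.xlabel('epoch')\n",
+ "plt.ylabel('accuracy')\n",
+ "plt.show()"
+ ]
+ },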
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "29fLrNFaJ9c6"
+ },
+ "source": [
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "BuTr50NNM3kV"
+ },
+ "source": [
+ "## Evaluate the Model\n",
+ "\n",
+ "Now that our model is trained, let's evaluate it using an independent test data set. Then let's see if the model quality holds up. We'll use `model.evaluate()` and pass in the test dataset. `model.evaluate()` returns a `test_loss` and `test_accuracy`.\n",
+ "\n",
+ "Also note that we need to apply the same feature preprocessing to the test data that we did to the train data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "both",
+ "id": "VflXLEeECaXC"
+ },
+ "outputs": [],
+ "source": [
+ "test_df = pd.DataFrame(\n",
+ " np.array([x.flatten() for x in test_images]),\n",
+ " columns=[i for i in range(784)]\n",
+ ")\n",
+ "test_df['target'] = test_labels\n",
+ "\n",
+ "test_df[FEATURES] = test_df[FEATURES] / 255.0\n",
+ "\n",
+ "for class_i in sorted(test_df['target'].unique()):\n",
+ " column_name = f'target_{class_i}'\n",
+ " test_df[column_name] = (test_df['target'] == class_i).astype(int)\n",
+ "\n",
+ "train_loss = history.history['loss'][-1]\n",
+ "train_accuracy = history.history['accuracy'][-1]\n",
+ "(test_loss, test_accuracy) = model.evaluate(test_df[FEATURES], test_df[TARGETS])\n",
+ "\n",
+ "print('Training loss:', train_loss)\n",
+ "print('Training accuracy:', train_accuracy)\n",
+ "\n",
+ "print('Test loss:', test_loss)\n",
+ "print('Test accuracy:', test_accuracy)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "yWfgsmVXCaXG"
+ },
+ "source": [
+ "The accuracy on the test dataset is noticeably less than the accuracy on the training dataset. This gap between training accuracy and test accuracy is an example of **overfitting**. Overfitting is when a machine learning model tends to perform worse on new data than on the training data. The trained model is unable to **generalize** to data that it has not seen before.\n",
+ "\n",
+ "There are many ways to try to reduce overfitting. One that we have seen is **early stopping**. This causes training to stop when loss stops changing significantly for a model. Without early stopping, the model would continue training, becoming more and more tailored to the training data and likely less to be able to generalize across new data.\n",
+ "\n",
+ "Another method for reducing overfitting in deep neural networks is **dropout**. A dropout layer is a layer that sits between two regular layers (in our case, dense layers) and randomly sets some of the values passed between layers to `0`.\n",
+ "\n",
+ "In TensorFlow the [`Dropout`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout) class is capable of doing this. To use `Dropout` you simply add a `Dropout` layer between other layers of the model. Each dropout layer has a percentage of values that it will set to `0`.\n",
+ "\n",
+ "```python\n",
+ "model = tf.keras.Sequential([\n",
+ " ...\n",
+ " tf.keras.layers.Dense(name='L14', 431, activation=tf.nn.relu),\n",
+ " # Randomly sets 15% of values between L14 and L15 to 0\n",
+ " tf.keras.layers.Dropout(rate=0.15),\n",
+ " tf.keras.layers.Dense(name='L15', 257, activation=tf.nn.relu),\n",
+ " # Randomly sets 1% of values between L15 and L16 to 0\n",
+ " tf.keras.layers.Dropout(rate=0.01),\n",
+ " tf.keras.layers.Dense(name='L16', 57, activation=tf.nn.relu),\n",
+ " ...\n",
+ "])\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "IlBarsvH6Wee"
+ },
+ "source": [
+ "### Exercise 5: Dropout Layers\n",
+ "\n",
+ "In this exercise take the model from above and add a [`Dropout`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout) layer or layers between the `Dense` layers. See if you can find a configuration that reduces the gap between the training loss and accuracy and the test loss and accuracy. Document your findings."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6VO8ZidHmK_X"
+ },
+ "source": [
+ "#### **Student Solution**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "JSeNZ3RC6vGD"
+ },
+ "outputs": [],
+ "source": [
+ "model = tf.keras.Sequential([\n",
+ " tf.keras.layers.Dropout(rate=0.15),\n",
+ " tf.keras.layers.Dense(128, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dropout(rate=0.15),\n",
+ " tf.keras.layers.Dense(64, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dropout(rate=0.15),\n",
+ " tf.keras.layers.Dense(32, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dropout(rate=0.15),\n",
+ " tf.keras.layers.Dense(len(TARGETS), activation=tf.nn.softmax)\n",
+ "])\n",
+ "\n",
+ "model.compile(\n",
+ " loss='categorical_crossentropy',\n",
+ " optimizer='Adam',\n",
+ " metrics=['accuracy']\n",
+ ")\n",
+ "\n",
+ "callback = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=5)\n",
+ "model.fit(train_df[FEATURES],train_df[TARGETS],callbacks=[callback],epochs = 500)\n",
+ "\n",
+ "model.summary()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3yNtz-8e87Fk"
+ },
+ "source": [
+ "Iterate a few times and find a dropout model that seems to bring the testing and training numbers closer together. When you are done, document your findings in the table below. The **?**s are placeholders for accuracy and loss values."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "INWU28f477nR"
+ },
+ "source": [
+ "Dropout (Y/N) | Train/Test | Accuracy | Loss\n",
+ "--------------|------------| ---------|------\n",
+ "N | Train | *?* | *?*\n",
+ "N | Test | *?* | *?*\n",
+ "Y | Train | *?* | *?*\n",
+ "Y | Test | *?* | *?*\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Jy5-RDqg6w4Q"
+ },
+ "source": [
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "xsoS7CPDCaXH"
+ },
+ "source": [
+ "## Make Predictions\n",
+ "\n",
+ "We have now trained the model while trying to reduce overfitting. Let's say we're happy with our numbers and are ready to deploy the model. Now it is time to make predictions.\n",
+ "\n",
+ "We could now snap an image of a clothing item, resize it to `28` by `28`, and grayscale it. But that is a lot of work and outside the scope of this class. For simplicity, let's use the test images as input to the model and see what predictions we get.\n",
+ "\n",
+ "We'll use the `model.predict()` function to do this. Let's make our predictions and peek at the first result."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "both",
+ "id": "Gl91RPhdCaXI"
+ },
+ "outputs": [],
+ "source": [
+ "predictions = model.predict(test_df[FEATURES])\n",
+ "\n",
+ "predictions[0]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "C7Z21UdulbIg"
+ },
+ "source": [
+ "What are those numbers?\n",
+ "\n",
+ "For each image:\n",
+ " * the prediction result is in the form of 10 numbers, one for each possible label\n",
+ " * each number represents the level of confidence that a label is the correct label for the particular image\n",
+ " * all 10 numbers should add up to the sum of 1\n",
+ "\n",
+ "Let's see if that is true."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "6ls2RrRlAo0f"
+ },
+ "outputs": [],
+ "source": [
+ "sum(predictions[0]), sum(predictions[1])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FySF9kRTAvhl"
+ },
+ "source": [
+ "Well, maybe not `1`, but the result definitely approaches `1`. Floating point math makes summing to exactly `1` a little difficult."
+ ]
+ },
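+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Rather than eyeballing the sums, we can check them programmatically with NumPy's `isclose()`, which compares values within floating point tolerance. This is just a quick sanity-check sketch."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "\n",
+ "# True if each sum is within floating point tolerance of 1.0\n",
+ "np.isclose(sum(predictions[0]), 1.0), np.isclose(sum(predictions[1]), 1.0)"
+ ]
+ },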
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-hw1hgeSCaXN"
+ },
+ "source": [
+ "Let's find out which label has the highest predicted number and whether it matches with the actual test label.\n",
+ "\n",
+ "To find the highest predicted number we will use Numpy's [`argmax`](https://docs.scipy.org/doc/numpy-1.9.3/reference/generated/numpy.argmax.html) function which returns the index of the maximum value in an array."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "both",
+ "id": "qsqenuPnCaXO"
+ },
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "\n",
+ "print('Label with the highest confidence: {predicted_label}'.format(\n",
+ " predicted_label = np.argmax(predictions[0])))\n",
+ "\n",
+ "print('Actual label: {actual_label}'.format(actual_label = test_labels[0]))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "E51yS7iCCaXO"
+ },
+ "source": [
+ "With our model the predicted class was class `9`, and the actual class was class `9`. Success!"
+ ]
+ },
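+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Numeric labels are hard to read on their own. Since we know the label-to-class mapping from the table above, we can translate a prediction into a class name. The `class_names` list below is our own helper, ordered to match the numeric labels."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',\n",
+ "               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']\n",
+ "\n",
+ "print('Predicted class:', class_names[np.argmax(predictions[0])])\n",
+ "print('Actual class:', class_names[test_labels[0]])"
+ ]
+ },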
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "wbhFDQYEDomf"
+ },
+ "source": [
+ "### Exercise 6: Thresholds\n",
+ "\n",
+ "When making our predictions, we blindly accepted the output of `argmax` without really understanding what `argmax` was doing.\n",
+ "\n",
+ "`argmax` returns the index of the maximum value in an array. What if there are ties? What happens for a 10-element array that looks like:\n",
+ "\n",
+ "```python\n",
+ " [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]\n",
+ "```\n",
+ "\n",
+ "In this case it is a virtual tie between all of the classes. `argmax` will return the first value in the case of a tie. This is problematic for a few reasons. In this case we clearly have little confidence in any class, yet an algorithm that relies on `argmax` would naively predict the first class.\n",
+ "\n",
+ "For this exercise, discuss ways we can get around relying solely on `argmax`. Are there better ways of finding a prediction algorithm?"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "CFkLIeLpLKZK"
+ },
+ "source": [
+ "#### **Student Solution**"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "zcoXZXgPLPDF"
+ },
+ "source": [
+ "> *Your argument goes here*"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We could simply create a minimum threshold. If the confidence is below 50% then a different argument should be used. We could make it an else if type statement in an if statement. "
+ ]
+ },
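+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a minimal sketch of that idea, the hypothetical helper below (our own, not part of any library API) returns the `argmax` class only when its probability clears a confidence threshold, and abstains otherwise."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "\n",
+ "# Return the most likely class, or None if no class clears the threshold\n",
+ "def predict_with_threshold(probabilities, threshold=0.5):\n",
+ "    best = np.argmax(probabilities)\n",
+ "    if probabilities[best] >= threshold:\n",
+ "        return best\n",
+ "    return None  # abstain so a human (or a fallback model) can decide\n",
+ "\n",
+ "print(predict_with_threshold(predictions[0]))\n",
+ "print(predict_with_threshold(np.full(10, 0.1)))  # a ten-way tie: abstains"
+ ]
+ },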
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "iUJrlcqDLT9L"
+ },
+ "source": [
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "izM_qZ6DNORz"
+ },
+ "source": [
+ "## Exercise 7: MNIST Digits"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0xsLlZ5vOwlo"
+ },
+ "source": [
+ "Another popular MNIST dataset is the [digits dataset](https://en.wikipedia.org/wiki/MNIST_database). This dataset consists of images of labelled, hand-written digits ranging from `0` through `9`.\n",
+ "\n",
+ "In this exercise you will build a model that predicts the class of MNIST digit images.\n",
+ "\n",
+ "The dataset is part of scikit-learn."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "L8emy15ndfaN"
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn import datasets\n",
+ "\n",
+ "import pandas as pd\n",
+ "\n",
+ "digits_bunch = datasets.load_digits()\n",
+ "digits = pd.DataFrame(digits_bunch.data)\n",
+ "digits['digit'] = digits_bunch.target\n",
+ "\n",
+ "digits.describe()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Pp_e6eMrdtOD"
+ },
+ "source": [
+ "You will need to:\n",
+ "\n",
+ "* Perform EDA on the data\n",
+ "* Choose a model (or models) to use to predict digits\n",
+ "* Perform any model-specific data manipulation\n",
+ "* Train the model and, if possible, visualize training progression\n",
+ "* Perform a final test of the model on holdout data\n",
+ "\n",
+ "Use as many code and text cells as you need to. Explain your work."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "UBE2iEK9ojL1"
+ },
+ "source": [
+ "### **Student Solution**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "CTuaRd71eKLN"
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn import datasets\n",
+ "import pandas as pd\n",
+ "\n",
+ "digits_bunch = datasets.load_digits()\n",
+ "digits = pd.DataFrame(digits_bunch.data)\n",
+ "digits['digit'] = digits_bunch.target\n",
+ "\n",
+ "digits.describe()\n",
+ "\n",
+ "digits.columns = [str(i).replace(' ', '') for i in digits.columns]\n",
+ "FEATURES = [str(i) for i in range(64)]\n",
+ "\n",
+ "from sklearn import datasets\n",
+ "import pandas as pd\n",
+ "\n",
+ "digits_bunch = datasets.load_digits()\n",
+ "digits = pd.DataFrame(digits_bunch.data)\n",
+ "digits['digit'] = digits_bunch.target\n",
+ "\n",
+ "digits.describe()\n",
+ "\n",
+ "digits.columns = [str(i).replace(' ', '') for i in digits.columns]\n",
+ "FEATURES = [str(i) for i in range(64)]\n",
+ "\n",
+ "import tensorflow as tf\n",
+ "from tensorflow.keras.models import Sequential\n",
+ "from tensorflow.keras.layers import Dense, Dropout\n",
+ "\n",
+ "\n",
+ "model = tf.keras.Sequential([\n",
+ " tf.keras.layers.Dense(128, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dense(64, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dense(32, activation=tf.nn.relu),\n",
+ " tf.keras.layers.Dense(len(TARGETS), activation=tf.nn.softmax)\n",
+ "])\n",
+ "\n",
+ "model.compile(\n",
+ " loss='categorical_crossentropy',\n",
+ " optimizer='Adam',\n",
+ " metrics=['accuracy'],\n",
+ ")\n",
+ "\n",
+ "callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)\n",
+ "\n",
+ "history = model.fit(\n",
+ " digits[FEATURES],\n",
+ " digits[TARGETS],\n",
+ " epochs=500,\n",
+ " callbacks=[callback]\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "map = model.history.history['accuracy']\n",
+ "plt.plot(map, 'g-')\n",
+ "map = model.history.history['loss']\n",
+ "plt.plot(map, 'y-')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0W3di7FXeMJ7"
+ },
+ "source": [
+ "---"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "collapsed_sections": [
+ "copyright"
+ ],
+ "name": "ImageClass",
+ "private_outputs": true,
+ "provenance": [],
+ "toc_visible": true
+ },
+ "gpuClass": "standard",
+ "kernelspec": {
+ "display_name": "Python 3.10.4 64-bit",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "name": "python",
+ "version": "3.10.4"
+ },
+ "vscode": {
+ "interpreter": {
+ "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
+ }
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
}