Update regression.ipynb to explicitly cast one-hot encoding values

jperk224 · web-flow · commit fe0d2851998c · 2024-06-23T00:07:55.000-04:00
As currently written, the one-hot encoding step leaves the user with boolean dummy values, which cause errors later in the tutorial when the np.array() result is passed to Normalization.adapt(). The values need to be cast to a numeric type (e.g., int) either at the one-hot encoding step or at the adapt() step.
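
A minimal sketch of the dtype difference, for reviewers who want to reproduce it without running the whole notebook. The small DataFrame below is a hypothetical stand-in for the tutorial's dataset, and the boolean default assumes pandas 2.0 or later (earlier versions defaulted to uint8). It only inspects dtypes and does not call Normalization.adapt() itself.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the tutorial's dataset; the values are illustrative only.
df = pd.DataFrame({
    "Horsepower": [130.0, 165.0, 95.0],
    "Origin": ["USA", "Europe", "Japan"],
})

# Without dtype: on pandas 2.0+ the dummy columns are expected to be boolean.
default_dummies = pd.get_dummies(df, columns=["Origin"], prefix="", prefix_sep="")

# With dtype=int, as in this commit: the dummy columns are integers.
int_dummies = pd.get_dummies(
    df, columns=["Origin"], prefix="", prefix_sep="", dtype=int
)

print(default_dummies.dtypes)
print(int_dummies.dtypes)

# Float and integer columns combine into a single float array.
numeric = np.array(int_dummies)
print(numeric.dtype)

# Alternative mentioned in the commit message: keep the default dummies
# and cast when building the array that is handed to adapt().
cast_later = np.array(default_dummies.astype("float32"))
print(cast_later.dtype)
```

Casting at the get_dummies step keeps every later cell unchanged, which is why the diff below takes that route rather than casting at the adapt() call.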
diff --git a/site/en/tutorials/keras/regression.ipynb b/site/en/tutorials/keras/regression.ipynb
@@ -251,6 +251,7 @@
       },
       "source": [
         "The `\"Origin\"` column is categorical, not numeric. So the next step is to one-hot encode the values in the column with [pd.get_dummies](https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html).\n",
+        "If you do not specify a numeric data type with the `dtype` argument, the resulting columns will hold boolean values, which cause errors during normalization unless the feature values are cast before being passed to `tf.keras.layers.Normalization.adapt()`.\n",
         "\n",
         "Note: You can set up the `tf.keras.Model` to do this kind of transformation for you but that's beyond the scope of this tutorial. Check out the [Classify structured data using Keras preprocessing layers](../structured_data/preprocessing_layers.ipynb) or [Load CSV data](../load_data/csv.ipynb) tutorials for examples."
       ]
@@ -274,7 +275,7 @@
       },
       "outputs": [],
       "source": [
-        "dataset = pd.get_dummies(dataset, columns=['Origin'], prefix='', prefix_sep='')\n",
+        "dataset = pd.get_dummies(dataset, columns=['Origin'], prefix='', prefix_sep='', dtype=int)\n",
         "dataset.tail()"
       ]
     },