Commit fe0d285

Update regression.ipynb to explicitly cast one-hot encoding values
As currently written, the one-hot encoding step leaves the user with boolean dummy values. These cause errors later in the tutorial, when the np.array() argument is passed to Normalization.adapt(). The values need to be cast to a numerical type (e.g., int), either at the one-hot encoding step or at the adapt() step.
1 parent 75b2672 commit fe0d285

1 file changed: +2 -1 lines changed

site/en/tutorials/keras/regression.ipynb

Lines changed: 2 additions & 1 deletion
@@ -251,6 +251,7 @@
 },
 "source": [
 "The `\"Origin\"` column is categorical, not numeric. So the next step is to one-hot encode the values in the column with [pd.get_dummies](https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html).\n",
+"Neglecting to specify a data type by way of a `dtype` argument will leave you with boolean values, causing errors during normalization if the feature values are not cast when passing them into to `tf.keras.layers.Normalization.adapt()`.\n",
 "\n",
 "Note: You can set up the `tf.keras.Model` to do this kind of transformation for you but that's beyond the scope of this tutorial. Check out the [Classify structured data using Keras preprocessing layers](../structured_data/preprocessing_layers.ipynb) or [Load CSV data](../load_data/csv.ipynb) tutorials for examples."
 ]
@@ -274,7 +275,7 @@
 },
 "outputs": [],
 "source": [
-"dataset = pd.get_dummies(dataset, columns=['Origin'], prefix='', prefix_sep='')\n",
+"dataset = pd.get_dummies(dataset, columns=['Origin'], prefix='', prefix_sep='', dtype=int)\n",
 "dataset.tail()"
 ]
 },
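
To illustrate the change, the following minimal sketch runs both `pd.get_dummies` calls from the diff on a small, hypothetical stand-in for the tutorial's dataset (the three rows and their values are assumptions, not tutorial data). It also shows the alternative the commit message mentions: casting the array to a numeric type before it would be passed to `adapt()`. The default dummy dtype depends on the installed pandas version, so the sketch prints it rather than assuming it.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row stand-in for the tutorial's dataset (values are made up).
dataset = pd.DataFrame({
    "MPG": [18.0, 26.0, 31.0],
    "Origin": ["USA", "Europe", "Japan"],
})

# Old call: no dtype, so the dummy columns take pandas' default dtype
# (boolean, according to the commit message).
default_dummies = pd.get_dummies(
    dataset, columns=["Origin"], prefix="", prefix_sep="")

# New call from this commit: dtype=int yields integer dummy columns.
int_dummies = pd.get_dummies(
    dataset, columns=["Origin"], prefix="", prefix_sep="", dtype=int)

# Alternative from the commit message: keep the old call and cast the
# features to a numeric type at the point where they would reach adapt().
features_as_float = np.array(default_dummies).astype("float32")

print(default_dummies.dtypes)
print(int_dummies.dtypes)
print(np.array(int_dummies).dtype, features_as_float.dtype)
```

Either approach should give the normalization step a purely numeric array; setting `dtype=int` at the encoding step, as the commit does, avoids having to remember the cast later.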
