Commit 3ec9a34

Update section on nested models in migration shim guide.
PiperOrigin-RevId: 411117072
1 parent 69ddbac commit 3ec9a34

site/en/guide/migrate/model_mapping.ipynb

Lines changed: 110 additions & 61 deletions
@@ -106,7 +106,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 2,
 "metadata": {
 "id": "PzkV-2cna823"
 },
@@ -594,65 +594,71 @@
 "source": [
 "## Nesting `tf.Variable`s, `tf.Module`s, `tf.keras.layers` & `tf.keras.models` in decorated calls\n",
 "\n",
-"Decorating your layer call in `tf.compat.v1.keras.utils.track_tf1_style_variables` will only add automatic implicit tracking of variables created (and reused) via `tf.compat.v1.get_variable`. It will not capture weights directly created by `tf.Variable` calls, such as those used by typical Keras layers and most `tf.Module`s. You still need to explicitly track these in the same way you would for any other Keras layer or `tf.Module`.\n",
-"\n",
-"If you need to embed `tf.Variable` calls, Keras layers/models, or `tf.Module`s in your decorators (either because you are following the incremental migration to Native TF2 described later in this guide, or because your TF1.x code partially consisted of Keras modules):\n",
-"* Explicitly make sure that the variable/module/layer is only created once\n",
-"* Explicitly attach them as instance attributes just as you would when defining a [typical module/layer](https://www.tensorflow.org/guide/intro_to_modules#defining_models_and_layers_in_tensorflow)\n",
-"* Explicitly reuse the already-created object in follow-on calls\n",
+"Decorating your layer call in `tf.compat.v1.keras.utils.track_tf1_style_variables` will only add automatic implicit tracking of variables created (and reused) via `tf.compat.v1.get_variable`. It will not capture weights directly created by `tf.Variable` calls, such as those used by typical Keras layers and most `tf.Module`s. This section describes how to handle these nested cases.\n"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {
+"id": "Azxza3bVOZlv"
+},
+"source": [
+"### (Pre-existing usages) `tf.keras.layers` and `tf.keras.models`\n",
 "\n",
-"This ensures that weights are not created new and are correctly resued. Additionally, this also ensures that existing weights and regularization losses get tracked.\n",
+"For pre-existing usages of nested Keras layers and models, use `tf.compat.v1.keras.utils.get_or_create_layer`. This is only recommended for easing migration of existing TF1.x nested Keras usages; new code should use explicit attribute setting as described below for tf.Variables and tf.Modules.\n",
 "\n",
-"Here is an example of how this could look:"
+"To use `tf.compat.v1.keras.utils.get_or_create_layer`, wrap the code that constructs your nested model into a method, and pass the method in to `get_or_create_layer`. Example:"
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"id": "mrRPPoJ5ap5U"
+"id": "LN15TcRgHKsq"
 },
 "outputs": [],
 "source": [
-"class WrappedDenseLayer(tf.keras.layers.Layer):\n",
+"class NestedModel(tf.keras.Model):\n",
 "\n",
-"  def __init__(self, units, **kwargs):\n",
-"    super().__init__(**kwargs)\n",
+"  def __init__(self, units, *args, **kwargs):\n",
+"    super().__init__(*args, **kwargs)\n",
 "    self.units = units\n",
-"    self._dense_model = None\n",
+"\n",
+"  def build_model(self):\n",
+"    inp = tf.keras.Input(shape=(5, 5))\n",
+"    dense_layer = tf.keras.layers.Dense(\n",
+"        10, name=\"dense\", kernel_regularizer=\"l2\",\n",
+"        kernel_initializer=tf.compat.v1.ones_initializer())\n",
+"    model = tf.keras.Model(inputs=inp, outputs=dense_layer(inp))\n",
+"    return model\n",
 "\n",
 "  @tf.compat.v1.keras.utils.track_tf1_style_variables\n",
 "  def call(self, inputs):\n",
-"    # Create the nested tf.variable/module/layer/model\n",
-"    # only if it has not been created already\n",
-"    if not self._dense_model:\n",
-"      inp = tf.keras.Input(shape=inputs.shape)\n",
-"      dense_layer = tf.keras.layers.Dense(\n",
-"          self.units, name=\"dense\",\n",
-"          kernel_regularizer=\"l2\")\n",
-"      self._dense_model = tf.keras.Model(\n",
-"          inputs=inp, outputs=dense_layer(inp))\n",
-"    return self._dense_model(inputs)\n",
+"    # Get or create a nested model without assigning it as an explicit property\n",
+"    model = tf.compat.v1.keras.utils.get_or_create_layer(\n",
+"        \"dense_model\", self.build_model)\n",
+"    return model(inputs)\n",
 "\n",
-"layer = WrappedDenseLayer(10)\n",
-"\n",
-"layer(tf.ones(shape=(5, 5)))"
+"layer = NestedModel(10)\n",
+"layer(tf.ones(shape=(5, 5)))"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
-"id": "Lo9h6wc6bmEF"
+"id": "DgsKlltPHI8z"
 },
 "source": [
-"The weights are correctly tracked:"
+"This method ensures that these nested layers are correctly reused and tracked by TensorFlow. Note that the `@track_tf1_style_variables` decorator is still required on the appropriate method. The model builder method passed into `get_or_create_layer` (in this case, `self.build_model`) should take no arguments.\n",
+"\n",
+"Weights are tracked:"
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"id": "Qt6USaTVbauM"
+"id": "3zO5A78MJsqO"
 },
 "outputs": [],
 "source": [
@@ -667,55 +673,46 @@
 {
 "cell_type": "markdown",
 "metadata": {
-"id": "oyH4lIcPb45r"
+"id": "o3Xsi-JbKTuj"
 },
 "source": [
-"As is the regularization loss (if present):"
+"And the regularization loss as well:"
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"id": "N7cmuhRGbfFt"
+"id": "mdK5RGm5KW5C"
 },
 "outputs": [],
 "source": [
-"regularization_loss = tf.add_n(layer.losses)\n",
-"regularization_loss"
+"tf.add_n(layer.losses)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
-"id": "FsTgnydkdezQ"
+"id": "J_VRycQYJrXu"
 },
 "source": [
-"### Guidance on variable names\n",
+"### Incremental migration: `tf.Variables` and `tf.Modules`\n",
 "\n",
-"Explicit `tf.Variable` calls and Keras layers use a different layer name / variable name autogeneration mechanism than you may be used to from the combination of `get_variable` and `variable_scopes`. Although the shim will make your variable names match for variables created by `get_variable` even when going from TF1.x graphs to TF2 eager execution & `tf.function`, it cannot guarantee the same for the variable names generated for `tf.Variable` calls and Keras layers that you embed within your method decorators. It is even possible for multiple variables to share the same name in TF2 eager execution and `tf.function`.\n",
-"\n",
-"You should take special care with this when following the sections on validating correctness and mapping TF1.x checkpoints later on in this guide."
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {
-"id": "mSFaHTCvhUso"
-},
-"source": [
-"### Nesting layers/modules that use `@track_tf1_style_variables`\n",
+"If you need to embed `tf.Variable` calls or `tf.Module`s in your decorated methods (for example, if you are following the incremental migration to non-legacy TF2 APIs described later in this guide), you still need to explicitly track these, with the following requirements:\n",
+"* Explicitly make sure that the variable/module/layer is only created once\n",
+"* Explicitly attach them as instance attributes just as you would when defining a [typical module or layer](https://www.tensorflow.org/guide/intro_to_modules#defining_models_and_layers_in_tensorflow)\n",
+"* Explicitly reuse the already-created object in follow-on calls\n",
 "\n",
-"If you are nesting one layer that uses the `@track_tf1_style_variables` decorator inside of another, you should treat it the same way you would treat any Keras layer or `tf.Module` that did not use `get_variable` to create its variables.\n",
+"This ensures that weights are not created anew on each call and are correctly reused. Additionally, this also ensures that existing weights and regularization losses get tracked.\n",
 "\n",
-"For example,"
+"Here is an example of how this could look:"
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"id": "SI5V-1JLhTfW"
+"id": "mrRPPoJ5ap5U"
 },
 "outputs": [],
 "source": [
@@ -726,9 +723,9 @@
 "    self.units = units\n",
 "\n",
 "  @tf.compat.v1.keras.utils.track_tf1_style_variables\n",
-"  def call(self, inputs):\n",
+"  def __call__(self, inputs):\n",
 "    out = inputs\n",
-"    with tf.compat.v1.variable_scope(\"dense\"):\n",
+"    with tf.compat.v1.variable_scope(\"inner_dense\"):\n",
 "      # The weights are created with a `regularizer`,\n",
 "      # so the layer should track their regularization losses\n",
 "      kernel = tf.compat.v1.get_variable(\n",
@@ -762,29 +759,81 @@
 "\n",
 "layer = WrappedDenseLayer(10)\n",
 "\n",
-"layer(tf.ones(shape=(5, 5)))\n",
+"layer(tf.ones(shape=(5, 5)))"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {
+"id": "Lo9h6wc6bmEF"
+},
+"source": [
+"Note that explicit tracking of the nested module is needed even though it is decorated with the `track_tf1_style_variables` decorator. This is because each module/layer with decorated methods has its own variable store associated with it.\n",
+"\n",
+"The weights are correctly tracked:"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "Qt6USaTVbauM"
+},
+"outputs": [],
+"source": [
+"assert len(layer.weights) == 6\n",
+"weights = {x.name: x for x in layer.variables}\n",
+"\n",
+"assert set(weights.keys()) == {\"outer/inner_dense/bias:0\",\n",
+"                               \"outer/inner_dense/kernel:0\",\n",
+"                               \"outer/dense/bias:0\",\n",
+"                               \"outer/dense/kernel:0\",\n",
+"                               \"outer/dense_1/bias:0\",\n",
+"                               \"outer/dense_1/kernel:0\"}\n",
 "\n",
-"# Recursively track weights and regularization losses\n",
-"layer.trainable_weights\n",
+"layer.trainable_weights"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {
+"id": "dHn-bJoNJw7l"
+},
+"source": [
+"As well as the regularization loss:"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "pq5GFtXjJyut"
+},
+"outputs": [],
+"source": [
 "layer.losses"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
-"id": "DkEkLnGbipSS"
+"id": "p7VKJj3JOCEk"
 },
 "source": [
-"Notice that `variable_scope`s set in the outer layer may affect the naming of variables set in the nested layer, *but* `get_variable` will not share variables by name across the outer shim-based layer and the nested shim-based layer even if they have the same name, because the nested and outer layer utilize different internal variable stores."
+"Note that if the `NestedLayer` were a non-Keras `tf.Module` instead, variables would still be tracked but regularization losses would not be automatically tracked, so you would have to explicitly track them separately."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
-"id": "PfbiY08UizLz"
+"id": "FsTgnydkdezQ"
 },
 "source": [
-"As mentioned previously, if you are using a shim-decorated `tf.Module` there is no `losses` property to recursively and automatically track the regularization loss of your nested layer, and you will have to track it separately."
+"### Guidance on variable names\n",
+"\n",
+"Explicit `tf.Variable` calls and Keras layers use a different layer name / variable name autogeneration mechanism than you may be used to from the combination of `get_variable` and `variable_scopes`. Although the shim will make your variable names match for variables created by `get_variable` even when going from TF1.x graphs to TF2 eager execution & `tf.function`, it cannot guarantee the same for the variable names generated for `tf.Variable` calls and Keras layers that you embed within your method decorators. It is even possible for multiple variables to share the same name in TF2 eager execution and `tf.function`.\n",
+"\n",
+"You should take special care with this when following the sections on validating correctness and mapping TF1.x checkpoints later on in this guide."
 ]
 },
 {

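A note on the changed markdown cell above about non-Keras `tf.Module`s: a `tf.Module` has no Keras `losses` property, so regularization losses must be tracked by hand. A minimal sketch of what that could look like, assuming a shim-decorated module (the `DenseModule` class, the "dense" scope name, and the 0.01 L2 factor are illustrative, not part of this commit):

import tensorflow as tf

class DenseModule(tf.Module):
  """Hypothetical shim-decorated module: variables are tracked, losses are not."""

  def __init__(self, units, name=None):
    super().__init__(name=name)
    self.units = units

  @tf.compat.v1.keras.utils.track_tf1_style_variables
  def __call__(self, inputs):
    with tf.compat.v1.variable_scope("dense"):
      kernel = tf.compat.v1.get_variable(
          "kernel", shape=[inputs.shape[-1], self.units])
      bias = tf.compat.v1.get_variable(
          "bias", shape=[self.units],
          initializer=tf.compat.v1.zeros_initializer())
    # There is no automatically populated `losses` collection on a
    # tf.Module, so store the regularization term explicitly.
    self.regularization_loss = 0.01 * tf.reduce_sum(tf.square(kernel))
    return tf.matmul(inputs, kernel) + bias

module = DenseModule(8)
out = module(tf.ones([2, 4]))
print(len(module.variables))       # 2: kernel and bias are tracked
print(module.regularization_loss)  # tracked by hand, not via `losses`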
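Similarly, for the "Guidance on variable names" cell retained at the end of the diff: in TF2 eager execution the names passed to `tf.Variable` are not uniquified the way `get_variable` names are in a TF1.x graph, so distinct variables can report the same name. A short illustrative check (the name "v" is arbitrary):

import tensorflow as tf

v1 = tf.Variable(1.0, name="v")
v2 = tf.Variable(2.0, name="v")
# Both report "v:0" in eager execution; they are still distinct objects.
print(v1.name, v2.name)
assert v1 is not v2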