|
4 | 4 | "cell_type": "markdown",
|
5 | 5 | "metadata": {},
|
6 | 6 | "source": [
|
7 |
| - "## How to run a bert model under ONNX\n", |
| 7 | + "# Converting a Tensorflow Bert model to ONNX\n", |
8 | 8 | "\n",
|
9 |
| - "This tutorial shows how to convert the original tensorflow bert model to ONNX. In this example we use a bert model that is fine tuned for squad-1.1 on top of [BERT-Base, Uncased](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip).\n", |
| 9 | + "This tutorial shows how to convert the original Tensorflow Bert model to ONNX. \n", |
| 10 | + "In this example we fine tune Bert for squad-1.1 on top of [BERT-Base, Uncased](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip).\n", |
10 | 11 | "\n",
|
11 |
| - "To keep this tuturial at a resonable size we reuse tokenizer and utilities defined in the bert source tree for onnx.\n", |
12 |
| - "We used the following versions:\n", |
| 12 | + "Since this tutorial cares mostly about the conversion process we reuse tokenizer and utilities defined in the Bert source tree as much as possible.\n", |
| 13 | + "\n", |
| 14 | + "This should work with all versions supported by the [tensorflow-onnx converter](https://github.com/onnx/tensorflow-onnx), we used the following versions while writing the tutorial:\n", |
13 | 15 | "```\n",
|
14 | 16 | "tensorflow-gpu: 1.13.1\n",
|
15 | 17 | "onnx: 1.5.1\n",
|
16 | 18 | "tf2onnx: 1.5.1\n",
|
17 | 19 | "onnxruntime: 0.4\n",
|
18 | 20 | "```\n",
|
19 | 21 | "\n",
|
20 |
| - "The steps to convert the models:\n", |
21 |
| - "1. setup our environment\n", |
22 |
| - "2. clone the tensorflow bert model from https://github.com/google-research/bert\n", |
23 |
| - "3. download the pretrained model and the squad-1.1 dataset\n", |
24 |
| - "4. fine tune on squad\n", |
25 |
| - "5. export the inference graph as saved_model format\n", |
26 |
| - "6. convert the saved_model to onnx\n", |
27 |
| - "7. run the converted model in onnxruntime" |
| 22 | + "To make the fine tuning work on my Gtx-1080 gpu, we changed the MAX_SEQ_LENGTH to 256 and used a training batch size of 8." |
28 | 23 | ]
|
29 | 24 | },
|
30 | 25 | {
|
31 | 26 | "cell_type": "markdown",
|
32 | 27 | "metadata": {},
|
33 | 28 | "source": [
|
34 |
| - "## Step 1\n", |
35 |
| - "Before we start, lets setup some varibales where to find things." |
| 29 | + "## Step 1 - define some environment variables\n", |
| 30 | + "Before we start, lets setup some variables where to find things." |
36 | 31 | ]
|
37 | 32 | },
|
38 | 33 | {
|
39 | 34 | "cell_type": "code",
|
40 |
| - "execution_count": 4, |
| 35 | + "execution_count": 1, |
41 | 36 | "metadata": {},
|
42 | 37 | "outputs": [],
|
43 | 38 | "source": [
|
|
62 | 57 | "cell_type": "markdown",
|
63 | 58 | "metadata": {},
|
64 | 59 | "source": [
|
65 |
| - "## Step 2 \n", |
66 |
| - "Clone https://github.com/google-research/bert" |
| 60 | + "## Step 2 - clone the Bert github repository" |
67 | 61 | ]
|
68 | 62 | },
|
69 | 63 | {
|
|
92 | 86 | "cell_type": "markdown",
|
93 | 87 | "metadata": {},
|
94 | 88 | "source": [
|
95 |
| - "## Step 3\n", |
96 |
| - "Download the pretrained bert model and the squad-1.1 dataset" |
| 89 | + "## Step 3 - download the pretrained Bert model and squad-1.1 dataset" |
97 | 90 | ]
|
98 | 91 | },
|
99 | 92 | {
|
|
112 | 105 | "!wget -O squad-1.1/evaluate-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/evaluate-v1.1.json "
|
113 | 106 | ]
|
114 | 107 | },
|
115 |
| - { |
116 |
| - "cell_type": "code", |
117 |
| - "execution_count": null, |
118 |
| - "metadata": {}, |
119 |
| - "outputs": [], |
120 |
| - "source": [ |
121 |
| - "!mkdir squad-1.1 out\n", |
122 |
| - "!wget -O squad-1.1/train-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json \n", |
123 |
| - "!wget -O squad-1.1/dev-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json \n", |
124 |
| - "!wget -O squad-1.1/evaluate-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/evaluate-v1.1.json " |
125 |
| - ] |
126 |
| - }, |
127 | 108 | {
|
128 | 109 | "cell_type": "markdown",
|
129 | 110 | "metadata": {},
|
130 | 111 | "source": [
|
131 |
| - "## Step 4\n", |
132 |
| - "Fine tune the bert model on squad-1.1. This is the same as described in the bert repository. We use a smaller MAX_SEQ_LENGTH and batch size so this trains nicely on a Gtx1080. If you already have a fined tuned model you can just copy it into the ```out``` folder." |
| 112 | + "## Step 4 - fine tune the Bert model for squad-1.1\n", |
| 113 | + "This is the same as described in the [Bert repository](https://github.com/google-research/bert). You need to do this only once.\n" |
133 | 114 | ]
|
134 | 115 | },
|
135 | 116 | {
|
|
164 | 145 | "cell_type": "markdown",
|
165 | 146 | "metadata": {},
|
166 | 147 | "source": [
|
167 |
| - "## Step 5\n", |
168 |
| - "With a fined tuned model in hands we want to create a inference graph for it and save it to saved_model format." |
| 148 | + "## Step 5 - create the inference graph and save it\n", |
| 149 | + "With a fined tuned model in hands we want to create the inference graph for it and save it as saved_model format.\n", |
| 150 | + "\n", |
| 151 | + "***We assune that after 2 epochs the checkpoint is model.ckpt-21899 - if the following code does not find it, check the $OUT directory for the higest checkpoint***." |
169 | 152 | ]
|
170 | 153 | },
|
171 | 154 | {
|
|
297 | 280 | }
|
298 | 281 | ],
|
299 | 282 | "source": [
|
| 283 | + "# N is the number of examples we are evaluating. On the CPU this might take a bit.\n", |
| 284 | + "# During development you can set N to some more practical\n", |
300 | 285 | "N = len(eval_features)\n",
|
301 |
| - "N = 100\n", |
302 | 286 | "\n",
|
303 | 287 | "all_results = []\n",
|
304 | 288 | "for result in estimator.predict(predict_input_fn, yield_single_examples=True):\n",
|
305 | 289 | " if len(all_results) % 1000 == 0:\n",
|
306 |
| - " print(\"example: %d\" % (len(all_results)))\n", |
| 290 | + " print(\"sample: %d\" % (len(all_results)))\n", |
307 | 291 | " unique_id = int(result[\"unique_ids\"])\n",
|
308 | 292 | " start_logits = [float(x) for x in result[\"start_logits\"].flat]\n",
|
309 | 293 | " end_logits = [float(x) for x in result[\"end_logits\"].flat]\n",
|
|
346 | 330 | " }\n",
|
347 | 331 | " return tf.estimator.export.ServingInputReceiver(receiver_tensors, receiver_tensors)\n",
|
348 | 332 | "\n",
|
349 |
| - "#estimator._export_to_tpu = False\n", |
350 | 333 | "path = estimator.export_savedmodel(os.path.join(OUT, \"export\"), serving_input_fn)\n",
|
351 | 334 | "os.environ['LAST_SAVED_MODEL'] = path.decode('utf-8')"
|
352 | 335 | ]
|
|
366 | 349 | "metadata": {},
|
367 | 350 | "outputs": [],
|
368 | 351 | "source": [
|
369 |
| - "# install tf2onnx if needed\n", |
370 |
| - "!pip install tf2onnx" |
| 352 | + "# install the latest version of tf2onnx if needed\n", |
| 353 | + "!pip install -U tf2onnx" |
371 | 354 | ]
|
372 | 355 | },
|
373 | 356 | {
|
|
394 | 377 | ],
|
395 | 378 | "source": [
|
396 | 379 | "# convert model\n",
|
| 380 | + "# because we still have a tensorflow session open in this notebook, force the converter to use the CPU.\n", |
| 381 | + "#\n", |
397 | 382 | "!CUDA_VISIBLE_DEVICES='' python -m tf2onnx.convert --saved-model $LAST_SAVED_MODEL --output $OUT/bert.onnx --opset 8"
|
398 | 383 | ]
|
399 | 384 | },
|
|
408 | 393 | "cell_type": "markdown",
|
409 | 394 | "metadata": {},
|
410 | 395 | "source": [
|
411 |
| - "Lets look at the inputs to the ONNX model. The input 'unique_ids' is special and creates some issue in onnx: the input is passed directly to the output and in tensorflow both have the same name. In ONNX that is not supported and the converter creates a name. We need to use that created name so we remember it." |
| 396 | + "Lets look at the inputs to the ONNX model. The input 'unique_ids' is special and creates some issue in ONNX: the input is passed directly to the output and in Tensorflow both have the same name. In ONNX that is not supported and the converter creates a new name for the input. We need to use that created name so we remember it." |
412 | 397 | ]
|
413 | 398 | },
|
414 | 399 | {
|
|
456 | 441 | "source": [
|
457 | 442 | "RawResult = collections.namedtuple(\"RawResult\", [\"unique_id\", \"start_logits\", \"end_logits\"])\n",
|
458 | 443 | "\n",
|
459 |
| - "batch_size = 1\n", |
460 |
| - "N = len(eval_features)\n", |
461 |
| - "N = 100\n", |
462 |
| - "\n", |
463 | 444 | "all_results = []\n",
|
464 | 445 | "for idx in range(0, N):\n",
|
465 | 446 | " item = eval_features[idx]\n",
|
466 | 447 | " # this is using batch_size=1\n",
|
| 448 | + " # feed the input data as int64\n", |
467 | 449 | " data = {\"unique_ids_raw_output___9:0\": np.array([item.unique_id], dtype=np.int64),\n",
|
468 | 450 | " \"input_ids:0\": np.array([item.input_ids], dtype=np.int64),\n",
|
469 | 451 | " \"input_mask:0\": np.array([item.input_mask], dtype=np.int64),\n",
|
470 | 452 | " \"segment_ids:0\": np.array([item.segment_ids], dtype=np.int64)}\n",
|
471 | 453 | " result = sess.run([\"unique_ids:0\", \"unstack:0\", \"unstack:1\"], data)\n",
|
472 | 454 | " unique_id = result[0][0]\n",
|
473 |
| - " start_logits = result[1][0]\n", |
474 |
| - " end_logits = result[2][0]\n", |
475 |
| - " start_logits = [float(x) for x in start_logits.flat]\n", |
476 |
| - " end_logits = [float(x) for x in end_logits.flat]\n", |
477 |
| - "\n", |
478 |
| - " # all_results.append(RawResult(unique_id=unique_id, start_logits=result[0][0][i], end_logits=result[1][0][i]))\n", |
| 455 | + " start_logits = [float(x) for x in result[1][0].flat]\n", |
| 456 | + " end_logits = [float(x) for x in result[2][0].flat]\n", |
479 | 457 | " all_results.append(RawResult(unique_id=unique_id, start_logits=start_logits, end_logits=end_logits))\n",
|
480 | 458 | " if unique_id % 1000 == 0:\n",
|
481 |
| - " print(\"example: %d\" % (len(all_results)))\n", |
| 459 | + " print(\"sample: %d\" % (len(all_results)))\n", |
482 | 460 | " if len(all_results) >= N:\n",
|
483 | 461 | " break\n",
|
484 | 462 | "\n",
|
|
493 | 471 | "cell_type": "markdown",
|
494 | 472 | "metadata": {},
|
495 | 473 | "source": [
|
496 |
| - "Compare some results between tensorflow and ONNX:" |
| 474 | + "Compare some results between Tensorflow and ONNX:" |
497 | 475 | ]
|
498 | 476 | },
|
499 | 477 | {
|
|
568 | 546 | "!head -20 $OUT/onnx_predictions.json"
|
569 | 547 | ]
|
570 | 548 | },
|
| 549 | + { |
| 550 | + "cell_type": "markdown", |
| 551 | + "metadata": {}, |
| 552 | + "source": [ |
| 553 | + "## Summary\n", |
| 554 | + "\n", |
| 555 | + "That was all it takes to convert a relativly complex model from Tensorflow to ONNX. \n", |
| 556 | + "\n", |
| 557 | + "You find more documentation about tensorflow-onnx [here](https://github.com/onnx/tensorflow-onnx)." |
| 558 | + ] |
| 559 | + }, |
571 | 560 | {
|
572 | 561 | "cell_type": "code",
|
573 | 562 | "execution_count": null,
|
|