Commit 18b11ff - cleanup

1 parent 53bf20d


tutorials/BertTutorial.ipynb

Lines changed: 42 additions & 53 deletions
@@ -4,40 +4,35 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## How to run a bert model under ONNX\n",
+"# Converting a TensorFlow BERT model to ONNX\n",
 "\n",
-"This tutorial shows how to convert the original tensorflow bert model to ONNX. In this example we use a bert model that is fine tuned for squad-1.1 on top of [BERT-Base, Uncased](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip).\n",
+"This tutorial shows how to convert the original TensorFlow BERT model to ONNX.\n",
+"In this example we fine-tune BERT for SQuAD-1.1 on top of [BERT-Base, Uncased](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip).\n",
 "\n",
-"To keep this tuturial at a resonable size we reuse tokenizer and utilities defined in the bert source tree for onnx.\n",
-"We used the following versions:\n",
+"Since this tutorial focuses mostly on the conversion process, we reuse the tokenizer and utilities defined in the BERT source tree as much as possible.\n",
+"\n",
+"This should work with all versions supported by the [tensorflow-onnx converter](https://github.com/onnx/tensorflow-onnx); we used the following versions while writing the tutorial:\n",
 "```\n",
 "tensorflow-gpu: 1.13.1\n",
 "onnx: 1.5.1\n",
 "tf2onnx: 1.5.1\n",
 "onnxruntime: 0.4\n",
 "```\n",
 "\n",
-"The steps to convert the models:\n",
-"1. setup our environment\n",
-"2. clone the tensorflow bert model from https://github.com/google-research/bert\n",
-"3. download the pretrained model and the squad-1.1 dataset\n",
-"4. fine tune on squad\n",
-"5. export the inference graph as saved_model format\n",
-"6. convert the saved_model to onnx\n",
-"7. run the converted model in onnxruntime"
+"To make the fine-tuning work on a GTX 1080 GPU, we changed MAX_SEQ_LENGTH to 256 and used a training batch size of 8."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Step 1\n",
-"Before we start, lets setup some varibales where to find things."
+"## Step 1 - define some environment variables\n",
+"Before we start, let's set up some variables that tell us where to find things."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": 1,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -62,8 +57,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Step 2 \n",
-"Clone https://github.com/google-research/bert"
+"## Step 2 - clone the BERT GitHub repository"
 ]
 },
 {
@@ -92,8 +86,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Step 3\n",
-"Download the pretrained bert model and the squad-1.1 dataset"
+"## Step 3 - download the pretrained BERT model and the SQuAD-1.1 dataset"
 ]
 },
 {
@@ -112,24 +105,12 @@
 "!wget -O squad-1.1/evaluate-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/evaluate-v1.1.json "
 ]
 },
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"!mkdir squad-1.1 out\n",
-"!wget -O squad-1.1/train-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json \n",
-"!wget -O squad-1.1/dev-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json \n",
-"!wget -O squad-1.1/evaluate-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/evaluate-v1.1.json "
-]
-},
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Step 4\n",
-"Fine tune the bert model on squad-1.1. This is the same as described in the bert repository. We use a smaller MAX_SEQ_LENGTH and batch size so this trains nicely on a Gtx1080. If you already have a fined tuned model you can just copy it into the ```out``` folder."
+"## Step 4 - fine-tune the BERT model for SQuAD-1.1\n",
+"This is the same as described in the [BERT repository](https://github.com/google-research/bert). You need to do this only once.\n"
 ]
 },
 {
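The fine-tuning cell itself is not part of this hunk. A sketch of what the invocation could look like, following the run_squad.py example in the BERT README and the settings mentioned in the introduction (MAX_SEQ_LENGTH 256, batch size 8, 2 epochs); the clone directory and path variables are assumptions from the earlier steps:

```
# Hypothetical fine-tuning cell; flag names come from google-research/bert run_squad.py,
# paths and hyperparameters follow the assumptions stated above.
!python bert/run_squad.py \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --do_train=True \
  --train_file=squad-1.1/train-v1.1.json \
  --do_predict=True \
  --predict_file=squad-1.1/dev-v1.1.json \
  --train_batch_size=8 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --max_seq_length=256 \
  --doc_stride=128 \
  --output_dir=$OUT
```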
@@ -164,8 +145,10 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Step 5\n",
-"With a fined tuned model in hands we want to create a inference graph for it and save it to saved_model format."
+"## Step 5 - create the inference graph and save it\n",
+"With a fine-tuned model in hand, we want to create the inference graph for it and save it in saved_model format.\n",
+"\n",
+"***We assume that after 2 epochs the checkpoint is model.ckpt-21899 - if the following code does not find it, check the $OUT directory for the highest checkpoint***."
 ]
 },
 {
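Instead of checking the $OUT directory by hand, you can ask TensorFlow for the newest checkpoint. A small sketch, assuming the OUT variable from Step 1:

```
import tensorflow as tf

# The exact step count depends on dataset size, MAX_SEQ_LENGTH and batch size,
# so look up the newest checkpoint rather than hard-coding model.ckpt-21899.
latest_checkpoint = tf.train.latest_checkpoint(OUT)
print(latest_checkpoint)   # e.g. out/model.ckpt-21899
```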
@@ -297,13 +280,14 @@
 }
 ],
 "source": [
+"# N is the number of examples we are evaluating. On the CPU this might take a bit.\n",
+"# During development you can set N to a smaller, more practical value.\n",
 "N = len(eval_features)\n",
-"N = 100\n",
 "\n",
 "all_results = []\n",
 "for result in estimator.predict(predict_input_fn, yield_single_examples=True):\n",
 "    if len(all_results) % 1000 == 0:\n",
-"        print(\"example: %d\" % (len(all_results)))\n",
+"        print(\"sample: %d\" % (len(all_results)))\n",
 "    unique_id = int(result[\"unique_ids\"])\n",
 "    start_logits = [float(x) for x in result[\"start_logits\"].flat]\n",
 "    end_logits = [float(x) for x in result[\"end_logits\"].flat]\n",
@@ -346,7 +330,6 @@
 "    }\n",
 "    return tf.estimator.export.ServingInputReceiver(receiver_tensors, receiver_tensors)\n",
 "\n",
-"#estimator._export_to_tpu = False\n",
 "path = estimator.export_savedmodel(os.path.join(OUT, \"export\"), serving_input_fn)\n",
 "os.environ['LAST_SAVED_MODEL'] = path.decode('utf-8')"
 ]
@@ -366,8 +349,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# install tf2onnx if needed\n",
-"!pip install tf2onnx"
+"# install the latest version of tf2onnx if needed\n",
+"!pip install -U tf2onnx"
 ]
 },
 {
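To confirm the environment roughly matches the versions listed at the top of the tutorial, a quick sanity-check cell could look like this (a sketch; it assumes all four packages expose a __version__ attribute):

```
import tensorflow as tf
import onnx
import tf2onnx
import onnxruntime

# Print the installed versions to compare against the list at the top of the tutorial.
print("tensorflow:", tf.__version__)
print("onnx:", onnx.__version__)
print("tf2onnx:", tf2onnx.__version__)
print("onnxruntime:", onnxruntime.__version__)
```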
@@ -394,6 +377,8 @@
 ],
 "source": [
 "# convert model\n",
+"# because we still have a tensorflow session open in this notebook, force the converter to use the CPU.\n",
+"#\n",
 "!CUDA_VISIBLE_DEVICES='' python -m tf2onnx.convert --saved-model $LAST_SAVED_MODEL --output $OUT/bert.onnx --opset 8"
 ]
 },
@@ -408,7 +393,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Lets look at the inputs to the ONNX model. The input 'unique_ids' is special and creates some issue in onnx: the input is passed directly to the output and in tensorflow both have the same name. In ONNX that is not supported and the converter creates a name. We need to use that created name so we remember it."
+"Let's look at the inputs to the ONNX model. The input 'unique_ids' is special and creates an issue in ONNX: the input is passed directly to the output, and in TensorFlow both have the same name. In ONNX that is not supported, so the converter creates a new name for the input. We need to use that generated name, so let's remember it."
 ]
 },
 {
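A sketch of how one might list the converted model's input names with onnxruntime, assuming the model was written to $OUT/bert.onnx by the conversion step above:

```
import os
import onnxruntime

# Load the converted model and print its inputs; the renamed unique_ids input shows up
# under a generated name such as 'unique_ids_raw_output___9:0' (used in the cell below).
sess = onnxruntime.InferenceSession(os.path.join(OUT, "bert.onnx"))
for i in sess.get_inputs():
    print(i.name, i.shape, i.type)
```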
@@ -456,29 +441,22 @@
 "source": [
 "RawResult = collections.namedtuple(\"RawResult\", [\"unique_id\", \"start_logits\", \"end_logits\"])\n",
 "\n",
-"batch_size = 1\n",
-"N = len(eval_features)\n",
-"N = 100\n",
-"\n",
 "all_results = []\n",
 "for idx in range(0, N):\n",
 "    item = eval_features[idx]\n",
 "    # this is using batch_size=1\n",
+"    # feed the input data as int64\n",
 "    data = {\"unique_ids_raw_output___9:0\": np.array([item.unique_id], dtype=np.int64),\n",
 "            \"input_ids:0\": np.array([item.input_ids], dtype=np.int64),\n",
 "            \"input_mask:0\": np.array([item.input_mask], dtype=np.int64),\n",
 "            \"segment_ids:0\": np.array([item.segment_ids], dtype=np.int64)}\n",
 "    result = sess.run([\"unique_ids:0\", \"unstack:0\", \"unstack:1\"], data)\n",
 "    unique_id = result[0][0]\n",
-"    start_logits = result[1][0]\n",
-"    end_logits = result[2][0]\n",
-"    start_logits = [float(x) for x in start_logits.flat]\n",
-"    end_logits = [float(x) for x in end_logits.flat]\n",
-"\n",
-"    # all_results.append(RawResult(unique_id=unique_id, start_logits=result[0][0][i], end_logits=result[1][0][i]))\n",
+"    start_logits = [float(x) for x in result[1][0].flat]\n",
+"    end_logits = [float(x) for x in result[2][0].flat]\n",
 "    all_results.append(RawResult(unique_id=unique_id, start_logits=start_logits, end_logits=end_logits))\n",
 "    if unique_id % 1000 == 0:\n",
-"        print(\"example: %d\" % (len(all_results)))\n",
+"        print(\"sample: %d\" % (len(all_results)))\n",
 "    if len(all_results) >= N:\n",
 "        break\n",
 "\n",
@@ -493,7 +471,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Compare some results between tensorflow and ONNX:"
+"Compare some results between TensorFlow and ONNX:"
 ]
 },
 {
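The comparison code itself is not part of this diff. A rough sketch of such a check, assuming tf_results and onnx_results hold the RawResult lists collected from the TensorFlow and ONNX runs (the variable names are illustrative, not from the notebook):

```
import numpy as np

# Compare the logits of the first few examples from both runs.
for tf_res, onnx_res in zip(tf_results[:5], onnx_results[:5]):
    assert tf_res.unique_id == onnx_res.unique_id
    print(tf_res.unique_id,
          np.allclose(tf_res.start_logits, onnx_res.start_logits, atol=1e-4),
          np.allclose(tf_res.end_logits, onnx_res.end_logits, atol=1e-4))
```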
@@ -568,6 +546,17 @@
 "!head -20 $OUT/onnx_predictions.json"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Summary\n",
+"\n",
+"That was all it takes to convert a relatively complex model from TensorFlow to ONNX.\n",
+"\n",
+"You can find more documentation about tensorflow-onnx [here](https://github.com/onnx/tensorflow-onnx)."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
