
Commit a37ca95

Merge pull request #99493 from PatrickFarley/formre-updates

[cog serv] Formre updates

2 parents 9460c2b + 950f307, commit a37ca95

File tree: 8 files changed, +92 / -70 lines changed


articles/cognitive-services/form-recognizer/includes/python-custom-analyze.md

Lines changed: 19 additions & 17 deletions
@@ -9,11 +9,11 @@ ms.author: pafarley
 
 ## Analyze forms for key-value pairs and tables
 
-Next, you'll use your newly trained model to analyze a document and extract key-value pairs and tables from it. Call the **Analyze Form** API by running the following code in a new Python script. Before you run the script, make these changes:
+Next, you'll use your newly trained model to analyze a document and extract key-value pairs and tables from it. Call the **[Analyze Form](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/AnalyzeWithCustomForm)** API by running the following code in a new Python script. Before you run the script, make these changes:
 
-1. Replace `<path to your form>` with the file path of your form (for example, C:\temp\file.pdf). This can also be the URL of a remote file. For this quickstart, you can use the files under the **Test** folder of the [sample data set](https://go.microsoft.com/fwlink/?linkid=2090451).
-1. Replace `<modelID>` with the model ID you received in the previous section.
-1. Replace `<Endpoint>` with the endpoint that you obtained with your Form Recognizer subscription key. You can find it on your Form Recognizer resource **Overview** tab.
+1. Replace `<file path>` with the file path of your form (for example, C:\temp\file.pdf). This can also be the URL of a remote file. For this quickstart, you can use the files under the **Test** folder of the [sample data set](https://go.microsoft.com/fwlink/?linkid=2090451).
+1. Replace `<model_id>` with the model ID you received in the previous section.
+1. Replace `<endpoint>` with the endpoint that you obtained with your Form Recognizer subscription key. You can find it on your Form Recognizer resource **Overview** tab.
 1. Replace `<file type>` with the file type. Supported types: `application/pdf`, `image/jpeg`, `image/png`, `image/tiff`.
 1. Replace `<subscription key>` with your subscription key.

@@ -24,12 +24,11 @@ Next, you'll use your newly trained model to analyze a document and extract key-
 from requests import get, post
 
 # Endpoint URL
-endpoint = r"<Endpoint>"
-apim_key = "<Subscription Key>"
-model_id = "<modelID>"
+endpoint = r"<endpoint>"
+apim_key = "<subscription key>"
+model_id = "<model_id>"
 post_url = endpoint + "/formrecognizer/v2.0-preview/custom/models/%s/analyze" % model_id
-source = r"<path or url to your form>"
-prefix = "<prefix string>"
+source = r"<file path>"
 params = {
     "includeTextDetails": True
 }
@@ -45,7 +44,7 @@ Next, you'll use your newly trained model to analyze a document and extract key-
 try:
     resp = post(url = post_url, data = data_bytes, headers = headers, params = params)
     if resp.status_code != 202:
-        print("POST analyze failed:\n%s" % resp.text)
+        print("POST analyze failed:\n%s" % json.dumps(resp.json()))
         quit()
     print("POST analyze succeeded:\n%s" % resp.headers)
     get_url = resp.headers["operation-location"]
@@ -65,28 +64,31 @@ When you call the **Analyze Form** API, you'll receive a `201 (Success)` respons
 Add the following code to the bottom of your Python script. This uses the ID value from the previous call in a new API call to retrieve the analysis results. The **Analyze Form** operation is asynchronous, so this script calls the API at regular intervals until the results are available. We recommend an interval of one second or more.
 
 ```python
-n_tries = 10
+n_tries = 15
 n_try = 0
-wait_sec = 6
+wait_sec = 5
+max_wait_sec = 60
 while n_try < n_tries:
     try:
         resp = get(url = get_url, headers = {"Ocp-Apim-Subscription-Key": apim_key})
-        resp_json = json.loads(resp.text)
+        resp_json = resp.json()
         if resp.status_code != 200:
-            print("GET analyze results failed:\n%s" % resp_json)
+            print("GET analyze results failed:\n%s" % json.dumps(resp_json))
             quit()
         status = resp_json["status"]
         if status == "succeeded":
-            print("Analysis succeeded:\n%s" % resp_json)
+            print("Analysis succeeded:\n%s" % json.dumps(resp_json))
             quit()
         if status == "failed":
-            print("Analysis failed:\n%s" % resp_json)
+            print("Analysis failed:\n%s" % json.dumps(resp_json))
             quit()
         # Analysis still running. Wait and retry.
         time.sleep(wait_sec)
-        n_try += 1
+        n_try += 1
+        wait_sec = min(2*wait_sec, max_wait_sec)
     except Exception as e:
         msg = "GET analyze results failed:\n%s" % str(e)
         print(msg)
         quit()
+print("Analyze operation did not complete within the allocated time.")
 ```
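
The retry loop above now backs off: the wait between polls doubles on every attempt and is capped at `max_wait_sec`. A minimal standalone sketch of that pattern, assuming `get_url` and `apim_key` are set as in the script; the helper name and the exceptions it raises are illustrative, not part of the quickstart:

```python
import json
import time

from requests import get

def poll_analyze_result(get_url, apim_key, n_tries=15, wait_sec=5, max_wait_sec=60):
    # Poll the Get Analyze Form Result URL until the operation reaches a terminal status,
    # doubling the wait between attempts up to max_wait_sec.
    for _ in range(n_tries):
        resp = get(url=get_url, headers={"Ocp-Apim-Subscription-Key": apim_key})
        resp_json = resp.json()
        if resp.status_code != 200:
            raise RuntimeError("GET analyze results failed:\n%s" % json.dumps(resp_json))
        if resp_json["status"] in ("succeeded", "failed"):
            return resp_json
        # Analysis still running. Wait and retry with a capped exponential backoff.
        time.sleep(wait_sec)
        wait_sec = min(2 * wait_sec, max_wait_sec)
    raise TimeoutError("Analyze operation did not complete within the allocated time.")
```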

articles/cognitive-services/form-recognizer/quickstarts/curl-receipts.md

Lines changed: 2 additions & 2 deletions
@@ -35,7 +35,7 @@ To complete this quickstart, you must have:
 
 ## Analyze a receipt
 
-To start analyzing a receipt, you call the **Analyze Receipt** API using the cURL command below. Before you run the command, make these changes:
+To start analyzing a receipt, you call the **[Analyze Receipt](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/AnalyzeReceiptAsync)** API using the cURL command below. Before you run the command, make these changes:
 
 1. Replace `<Endpoint>` with the endpoint that you obtained with your Form Recognizer subscription.
 1. Replace `<your receipt URL>` with the URL address of a receipt image.
@@ -53,7 +53,7 @@ https://cognitiveservice/formrecognizer/v2.0-preview/prebuilt/receipt/operations
 
 ## Get the receipt results
 
-After you've called the **Analyze Receipt** API, you call the **Get Receipt Result** API to get the status of the operation and the extracted data. Before you run the command, make these changes:
+After you've called the **Analyze Receipt** API, you call the **[Get Analyze Receipt Result](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/GetAnalyzeReceiptResult)** API to get the status of the operation and the extracted data. Before you run the command, make these changes:
 
 1. Replace `<Endpoint>` with the endpoint that you obtained with your Form Recognizer subscription key. You can find it on your Form Recognizer resource **Overview** tab.
 1. Replace `<operationId>` with the operation ID from the previous step.
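
This quickstart uses cURL, but the same two calls map directly onto Python's `requests`. A rough sketch, assuming the v2.0-preview prebuilt receipt route and the same `<Endpoint>`, `<subscription key>`, and `<your receipt URL>` placeholders described above:

```python
import time

from requests import get, post

endpoint = "<Endpoint>"            # e.g. https://<resource-name>.cognitiveservices.azure.com
apim_key = "<subscription key>"
receipt_url = "<your receipt URL>"

# Analyze Receipt: submit the receipt by URL and capture the operation location.
resp = post(
    url=endpoint + "/formrecognizer/v2.0-preview/prebuilt/receipt/analyze",
    json={"source": receipt_url},
    headers={"Ocp-Apim-Subscription-Key": apim_key},
)
resp.raise_for_status()
get_url = resp.headers["Operation-Location"]

# Get Analyze Receipt Result: poll until the operation succeeds or fails.
while True:
    result = get(url=get_url, headers={"Ocp-Apim-Subscription-Key": apim_key}).json()
    if result.get("status") in ("succeeded", "failed"):
        print(result)
        break
    time.sleep(1)
```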

articles/cognitive-services/form-recognizer/quickstarts/curl-train-extract.md

Lines changed: 3 additions & 3 deletions
@@ -40,7 +40,7 @@ First, you'll need a set of training data in an Azure Storage blob. You should h
 > [!NOTE]
 > You can use the labeled data feature to manually label some or all of your training data beforehand. This is a more complex process but results in a better trained model. See the [Train with labels](../overview.md#train-with-labels) section of the overview to learn more about this feature.
 
-To train a Form Recognizer model with the documents in your Azure blob container, call the **Train Custom Model** API by running the following cURL command. Before you run the command, make these changes:
+To train a Form Recognizer model with the documents in your Azure blob container, call the **[Train Custom Model](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/TrainCustomModelAsync)** API by running the following cURL command. Before you run the command, make these changes:
 
 1. Replace `<Endpoint>` with the endpoint that you obtained with your Form Recognizer subscription.
 1. Replace `<subscription key>` with the subscription key you copied from the previous step.
@@ -54,7 +54,7 @@ You'll receive a `201 (Success)` response with a **Location** header. The value
 
 ## Get training results
 
-After you've started the train operation, you use a new operation, **Get Custom Model** to check the training status. Pass the model ID into this API call to check the training status:
+After you've started the train operation, you use a new operation, **[Get Custom Model](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/GetCustomModel)**, to check the training status. Pass the model ID into this API call to check the training status:
 
 1. Replace `<Endpoint>` with the endpoint that you obtained with your Form Recognizer subscription key.
 1. Replace `<subscription key>` with your subscription key.
@@ -136,7 +136,7 @@ The `"modelId"` field contains the ID of the model you're training. You'll need
 
 ## Analyze forms for key-value pairs and tables
 
-Next, you'll use your newly trained model to analyze a document and extract key-value pairs and tables from it. Call the **Analyze Form** API by running the following cURL command. Before you run the command, make these changes:
+Next, you'll use your newly trained model to analyze a document and extract key-value pairs and tables from it. Call the **[Analyze Form](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/AnalyzeWithCustomForm)** API by running the following cURL command. Before you run the command, make these changes:
 
 1. Replace `<Endpoint>` with the endpoint that you obtained from your Form Recognizer subscription key. You can find it on your Form Recognizer resource **Overview** tab.
 1. Replace `<model ID>` with the model ID that you received in the previous section.
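
For readers who prefer scripting over cURL, the train-then-poll sequence in this article looks roughly like the following in Python `requests`. The route is the v2.0-preview custom models endpoint shown elsewhere in this commit; the request body sketches the Train Custom Model schema and should be checked against the linked API reference, and the placeholders mirror the ones above:

```python
import time

from requests import get, post

endpoint = "<Endpoint>"
apim_key = "<subscription key>"
headers = {"Content-Type": "application/json", "Ocp-Apim-Subscription-Key": apim_key}

# Train Custom Model: point the service at the blob container that holds the training forms.
body = {
    "source": "<SAS URL>",
    "sourceFilter": {"prefix": "<Blob folder name>", "includeSubFolders": False},
}
resp = post(url=endpoint + "/formrecognizer/v2.0-preview/custom/models", json=body, headers=headers)
resp.raise_for_status()
get_url = resp.headers["Location"]   # Get Custom Model URL; it includes the new model ID

# Get Custom Model: poll until training reaches a terminal status.
while True:
    model = get(url=get_url, headers=headers).json()
    status = model["modelInfo"]["status"]
    if status in ("ready", "invalid"):
        print(status, model["modelInfo"]["modelId"])
        break
    time.sleep(1)
```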

articles/cognitive-services/form-recognizer/quickstarts/python-labeled-data.md

Lines changed: 33 additions & 22 deletions
@@ -55,8 +55,8 @@ All of these files should occupy the same sub-folder and be in the following for
 
 You need OCR result files in order for the service to consider the corresponding input files for labeled training. To obtain OCR results for a given source form, follow the steps below:
 
-1. Call the **/formrecognizer/v2.0-preview/layout/analyze** API on the read Layout container with the input file as part of the request body. Save the ID found in the response's **Operation-Location** header.
-1. Call the **/formrecognizer/v2.0-preview/layout/analyzeResults/{id}** API, using operation ID from the previous step.
+1. Call the **[Analyze Layout](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/AnalyzeLayoutAsync)** API on the read Layout container with the input file as part of the request body. Save the ID found in the response's **Operation-Location** header.
+1. Call the **[Get Analyze Layout Result](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/GetAnalyzeLayoutResult)** API, using the operation ID from the previous step.
 1. Get the response and write the contents to a file. For each source form, the corresponding OCR file should have the original file name appended with `.ocr.json`. The OCR JSON output should have the following format. See the [sample OCR file](https://github.com/Azure-Samples/cognitive-services-REST-api-samples/blob/master/curl/form-recognizer/Invoice_1.pdf.ocr.json) for a full example.
 
 ```json
@@ -187,11 +187,11 @@ For each source form, the corresponding label file should have the original file
 
 ## Train a model using labeled data
 
-To train a model with labeled data, call the **Train Custom Model** API by running the following python code. Before you run the code, make these changes:
+To train a model with labeled data, call the **[Train Custom Model](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/TrainCustomModelAsync)** API by running the following Python code. Before you run the code, make these changes:
 
 1. Replace `<Endpoint>` with the endpoint URL for your Form Recognizer resource.
 1. Replace `<SAS URL>` with the Azure Blob storage container's shared access signature (SAS) URL. To retrieve the SAS URL, open the Microsoft Azure Storage Explorer, right-click your container, and select **Get shared access signature**. Make sure the **Read** and **List** permissions are checked, and click **Create**. Then copy the value in the **URL** section. It should have the form: `https://<storage account>.blob.core.windows.net/<container name>?<SAS value>`.
-1. Replace `<prefix>` with the folder name in your blob container where the input data is located. Or, if your data is at the root, leave this blank and remove the `"prefix"` field from the body of the HTTP request.
+1. Replace `<Blob folder name>` with the folder name in your blob container where the input data is located. Or, if your data is at the root, leave this blank and remove the `"prefix"` field from the body of the HTTP request.
 
 ```python
 ########### Python Form Recognizer Labeled Async Train #############
@@ -203,14 +203,14 @@ from requests import get, post
 endpoint = r"<Endpoint>"
 post_url = endpoint + r"/formrecognizer/v2.0-preview/custom/models"
 source = r"<SAS URL>"
-prefix = "<folder name>"
+prefix = "<Blob folder name>"
 includeSubFolders = False
 useLabelFile = True
 
 headers = {
     # Request headers
     'Content-Type': 'application/json',
-    'Ocp-Apim-Subscription-Key': '<Subscription Key>',
+    'Ocp-Apim-Subscription-Key': '<subscription key>',
 }
 
 body = {
@@ -225,7 +225,7 @@ body = {
 try:
     resp = post(url = post_url, json = body, headers = headers)
     if resp.status_code != 201:
-        print("POST model failed:\n%s" % resp.text)
+        print("POST model failed (%s):\n%s" % (resp.status_code, json.dumps(resp.json())))
         quit()
     print("POST model succeeded:\n%s" % resp.headers)
     get_url = resp.headers["location"]
@@ -236,25 +236,36 @@ except Exception as e:
 
 ## Get training results
 
-After you've started the train operation, you use the returned ID to get the status of the operation. Add the following code to the bottom of your Python script. This extracts the ID value from the training call and passes it to a new API call. The training operation is asynchronous, so this script calls the API at regular intervals until the training status is completed. We recommend an interval of one second or more.
+After you've started the train operation, you use the returned ID to get the status of the operation. Add the following code to the bottom of your Python script. This uses the ID value from the training call in a new API call. The training operation is asynchronous, so this script calls the API at regular intervals until the training status is completed. We recommend an interval of one second or more.
 
 ```python
-operationId = operationURL.split("operations/")[1]
-
-conn = http.client.HTTPSConnection('<Endpoint>')
-while True:
+n_tries = 15
+n_try = 0
+wait_sec = 5
+max_wait_sec = 60
+while n_try < n_tries:
     try:
-        conn.request("GET", f"/formrecognizer/v1.0-preview/custom/models/{operationId}", "", headers)
-        responseString = conn.getresponse().read().decode('utf-8')
-        responseDict = json.loads(responseString)
-        conn.close()
-        print(responseString)
-        if 'status' in responseDict and responseDict['status'] not in ['creating','created']:
-            break
-        time.sleep(1)
+        resp = get(url = get_url, headers = headers)
+        resp_json = resp.json()
+        if resp.status_code != 200:
+            print("GET model failed (%s):\n%s" % (resp.status_code, json.dumps(resp_json)))
+            quit()
+        model_status = resp_json["modelInfo"]["status"]
+        if model_status == "ready":
+            print("Training succeeded:\n%s" % json.dumps(resp_json))
+            quit()
+        if model_status == "invalid":
+            print("Training failed. Model is invalid:\n%s" % json.dumps(resp_json))
+            quit()
+        # Training still running. Wait and retry.
+        time.sleep(wait_sec)
+        n_try += 1
+        wait_sec = min(2*wait_sec, max_wait_sec)
     except Exception as e:
-        print(e)
-        exit()
+        msg = "GET model failed:\n%s" % str(e)
+        print(msg)
+        quit()
+print("Train operation did not complete within the allocated time.")
 ```
 
 When the training process is completed, you'll receive a `201 (Success)` response with JSON content like the following. The response has been shortened for simplicity.
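
The OCR-file prerequisite described near the top of this file's diff can be scripted in the same style. A rough sketch that produces `<form name>.ocr.json` next to one source form; the file path, content type, and polling interval are illustrative:

```python
import time

from requests import get, post

endpoint = "<Endpoint>"
apim_key = "<subscription key>"
form_path = r"<path to your form>"   # e.g. C:\temp\Invoice_1.pdf

# Analyze Layout: send the source form in the request body.
with open(form_path, "rb") as f:
    resp = post(
        url=endpoint + "/formrecognizer/v2.0-preview/layout/analyze",
        data=f.read(),
        headers={"Ocp-Apim-Subscription-Key": apim_key, "Content-Type": "application/pdf"},
    )
resp.raise_for_status()
get_url = resp.headers["Operation-Location"]   # contains the operation ID

# Get Analyze Layout Result: poll, then write the OCR JSON next to the source form.
while True:
    result = get(url=get_url, headers={"Ocp-Apim-Subscription-Key": apim_key})
    status = result.json().get("status")
    if status == "succeeded":
        with open(form_path + ".ocr.json", "w") as out:
            out.write(result.text)
    if status in ("succeeded", "failed"):
        break
    time.sleep(1)
```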

articles/cognitive-services/form-recognizer/quickstarts/python-layout.md

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ To complete this quickstart, you must have:
 
 ## Analyze the form layout
 
-To start analyzing the layout, you call the **Analyze Layout** API using the Python script below. Before you run the script, make these changes:
+To start analyzing the layout, you call the **[Analyze Layout](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/AnalyzeLayoutAsync)** API using the Python script below. Before you run the script, make these changes:
 
 1. Replace `<Endpoint>` with the endpoint that you obtained with your Form Recognizer subscription.
 1. Replace `<path to your form>` with the path to your local form document.
@@ -82,7 +82,7 @@ https://cognitiveservice/formrecognizer/v2.0-preview/layout/operations/54f0b076-
 
 ## Get the layout results
 
-After you've called the **Analyze Layout** API, you call the **Get Analyze Layout Result** API to get the status of the operation and the extracted data. Add the following code to the bottom of your Python script. This uses the operation ID value in a new API call. This script calls the API at regular intervals until the results are available. We recommend an interval of one second or more.
+After you've called the **Analyze Layout** API, you call the **[Get Analyze Layout Result](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/GetAnalyzeLayoutResult)** API to get the status of the operation and the extracted data. Add the following code to the bottom of your Python script. This uses the operation ID value in a new API call. This script calls the API at regular intervals until the results are available. We recommend an interval of one second or more.
 
 ```python
 n_tries = 10

articles/cognitive-services/form-recognizer/quickstarts/python-receipts.md

Lines changed: 2 additions & 2 deletions
@@ -35,7 +35,7 @@ To complete this quickstart, you must have:
 
 ## Analyze a receipt
 
-To start analyzing a receipt, you call the **Analyze Receipt** API using the Python script below. Before you run the script, make these changes:
+To start analyzing a receipt, you call the **[Analyze Receipt](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/AnalyzeReceiptAsync)** API using the Python script below. Before you run the script, make these changes:
 
 1. Replace `<Endpoint>` with the endpoint that you obtained with your Form Recognizer subscription.
 1. Replace `<your receipt URL>` with the URL address of a receipt image.
@@ -91,7 +91,7 @@ https://cognitiveservice/formrecognizer/v2.0-preview/prebuilt/receipt/operations
 
 ## Get the receipt results
 
-After you've called the **Analyze Receipt** API, you call the **Get Receipt Result** API to get the status of the operation and the extracted data. Add the following code to the bottom of your Python script. This uses the operation ID value in a new API call. This script calls the API at regular intervals until the results are available. We recommend an interval of one second or more.
+After you've called the **Analyze Receipt** API, you call the **[Get Analyze Receipt Result](https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/GetAnalyzeReceiptResult)** API to get the status of the operation and the extracted data. Add the following code to the bottom of your Python script. This uses the operation ID value in a new API call. This script calls the API at regular intervals until the results are available. We recommend an interval of one second or more.
 
 ```python
 n_tries = 10
