Merge pull request #180490 from magrefaat/patch-65

v-shils · web-flow · commit 73608f1a5452 · 2021-11-19T10:36:03.000-08:00
Update data-formats.md
diff --git a/articles/cognitive-services/language-service/custom-classification/concepts/data-formats.md b/articles/cognitive-services/language-service/custom-classification/concepts/data-formats.md
@@ -34,7 +34,7 @@ Your tags file should be in the `json` format below.
     ],
     "documents": [
         {
-            "location": "doc1.txt",
+            "location": "file1.txt",
             "language": "en-us",
             "classifiers": [
                 {
@@ -44,6 +44,15 @@ Your tags file should be in the `json` format below.
                     "classifierName": "Class1"
                 }
             ]
+        },
+        {
+            "location": "file2.txt",
+            "language": "en-us",
+            "classifiers": [
+                {
+                    "classifierName": "Class2"
+                }
+            ]
         }
     ]
 }
@@ -52,10 +61,10 @@ Your tags file should be in the `json` format below.
 ### Data description
 
 * `classifiers`: An array of classifiers for your data. Each classifier represents one of the classes you want to tag your data with.
-* `documents`: An array of tagged documents. For example:
-  * `location`: The path of the JSON file containing tags. The tags file has to be in root of the storage container.
-  * `language`: Language of the document. Use one of the [supported culture locales](../language-support.md).
-  * `classifiers`: Array of classifier objects assigned to the document. If you're working on a single classification project, there should be one classifier only.
+* `documents`: An array of tagged documents.
+  * `location`: The path of the file. The file has to be in the root of the storage container.
+  * `language`: Language of the file. Use one of the [supported culture locales](../language-support.md).
+  * `classifiers`: Array of classifier objects assigned to the file. If you're working on a single classification project, there should be only one classifier per file.
 
 ## Next steps
 
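Review note: the rules in the updated data description for `documents` can be checked mechanically before a tags file is imported. The following is a minimal Python sketch, not part of the commit. The helper name `check_documents` is hypothetical, and reading "in the root of the storage container" as "the `location` contains no path separator" is an assumption.

```python
def check_documents(documents, single_label=True):
    """Return a list of problems found in the `documents` array of a tags file.

    Hypothetical helper: it only checks the fields named in the data description
    (`location`, `language`, `classifiers`, `classifierName`).
    """
    problems = []
    for i, doc in enumerate(documents):
        where = f"documents[{i}]"
        location = doc.get("location", "")
        if not location:
            problems.append(f"{where}: missing location")
        elif "/" in location or "\\" in location:
            # Assumption: a file in the container root has no path separator.
            problems.append(f"{where}: {location!r} is not in the container root")
        if not doc.get("language"):
            problems.append(f"{where}: missing language")
        classifiers = doc.get("classifiers", [])
        if any("classifierName" not in c for c in classifiers):
            problems.append(f"{where}: classifier without classifierName")
        # Single classification projects allow one classifier per file.
        if single_label and len(classifiers) != 1:
            problems.append(
                f"{where}: expected exactly one classifier, found {len(classifiers)}"
            )
    return problems


# The two documents from the updated sample in this hunk.
sample = [
    {"location": "file1.txt", "language": "en-us",
     "classifiers": [{"classifierName": "Class1"}]},
    {"location": "file2.txt", "language": "en-us",
     "classifiers": [{"classifierName": "Class2"}]},
]
for problem in check_documents(sample):
    print(problem)
```

Pass `single_label=False` to skip the one-classifier-per-file check for projects that are not single classification.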
diff --git a/articles/cognitive-services/language-service/custom-classification/includes/quickstarts/rest-api.md b/articles/cognitive-services/language-service/custom-classification/includes/quickstarts/rest-api.md
@@ -72,23 +72,13 @@ Create a **POST** request using the following URL, headers, and JSON body to cre
 Use the following URL to create a project and import your tags file. Replace the placeholder values below with your own values. 
 
 ```rest
-{YOUR-ENDPOINT}/language/analyze-text/projects/{projectName}/:import. 
+{YOUR-ENDPOINT}/language/analyze-text/projects/{projectName}/:import?api-version=2021-11-01-preview
 ```
 
 |Placeholder  |Value  | Example |
 |---------|---------|---------|
 |`{YOUR-ENDPOINT}`     | The endpoint for authenticating your API request.   | `https://<your-custom-subdomain>.cognitiveservices.azure.com` |
 
-### Parameters
-
-Pass the following parameter with your request. 
-
-|Key|Explanation|Value|
-|--|--|--|
-|`api-version`| The API version used.| `2021-11-01-preview` |
-
-To pass the parameter, add `?api-version=2021-11-01-preview` to the end of your request URL.
-
 ### Headers
 
 Use the following header to authenticate your request. 
@@ -153,24 +143,14 @@ After your project has been created, you can begin training a text classificatio
 Use the following URL when creating your API request. Replace the placeholder values below with your own values. 
 
 ```rest
-{YOUR-ENDPOINT}/language/analyze-text/projects/{PROJECT-NAME}/:train
+{YOUR-ENDPOINT}/language/analyze-text/projects/{PROJECT-NAME}/:train?api-version=2021-11-01-preview
 ```
 
 |Placeholder  |Value  | Example |
 |---------|---------|---------|
 |`{YOUR-ENDPOINT}`     | The endpoint for authenticating your API request.   | `https://<your-custom-subdomain>.cognitiveservices.azure.com` |
 |`{PROJECT-NAME}`     | The name for your project. This value is case-sensitive.  | `myProject` |
 
-### Parameters
-
-Pass the following parameter with your request. 
-
-|Key|Explanation|Value|
-|--|--|--|
-|`api-version`| The API version used.| `2021-11-01-preview` |
-
-To pass the parameter, add `?api-version=2021-11-01-preview` to the end of your request URL.
-
 ### Headers
 
 Use the following header to authenticate your request. 
@@ -198,7 +178,7 @@ Use the following JSON in your request. The model will be named `MyModel` once t
 Once you send your API request, you will receive a `202` response indicating success. In the response headers, extract the `location` value. It will be formatted like this: 
 
 ```rest
-{YOUR-ENDPOINT}/language/analyze-text/projects/{YOUR-PROJECT-NAME}/train/jobs/{JOB-ID}?api-version=xxxx-xx-xx-xxxxxxx
+{YOUR-ENDPOINT}/language/analyze-text/projects/{YOUR-PROJECT-NAME}/train/jobs/{JOB-ID}?api-version=2021-11-01-preview
 ``` 
 
 `JOB-ID` is used to identify your request, since this operation is asynchronous. You will use this URL in the next step to get the training status. 
@@ -209,7 +189,7 @@ Use the following **GET** request to query the status of your model's training p
 
 
 ```rest
-{YOUR-ENDPOINT}/language/analyze-text/projects/{YOUR-PROJECT-NAME}/train/jobs/{JOB-ID}
+{YOUR-ENDPOINT}/language/analyze-text/projects/{YOUR-PROJECT-NAME}/train/jobs/{JOB-ID}?api-version=2021-11-01-preview
 ```
 
 |Placeholder  |Value  | Example |
@@ -218,16 +198,6 @@ Use the following **GET** request to query the status of your model's training p
 |`{PROJECT-NAME}`     | The name for your project. This value is case-sensitive.  | `myProject` |
 |`{JOB-ID}`     | The ID for locating your model's training status. This is in the `location` header value you received in the previous step.  | `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx` |
 
-### Parameters
-
-Pass the following parameter with your request. 
-
-|Key|Explanation|Value|
-|--|--|--|
-|`api-version`| The API version used.| `2021-11-01-preview` |
-
-To pass the parameter, add `?api-version=2021-11-01-preview` to the end of your request URL.
-
 ### Headers
 
 Use the following header to authenticate your request. 
@@ -277,7 +247,7 @@ Once you send the request, you will get the following response.
 Create a **PUT** request using the following URL, headers, and JSON body to start deploying a text classification model.
 
 ```rest
-{YOUR-ENDPOINT}/language/analyze-text/projects/{PROJECT-NAME}/deployments/{DEPLOYMENT-NAME}
+{YOUR-ENDPOINT}/language/analyze-text/projects/{PROJECT-NAME}/deployments/{DEPLOYMENT-NAME}?api-version=2021-11-01-preview
 ```
 
 |Placeholder  |Value  | Example |
@@ -286,16 +256,6 @@ Create a **PUT** request using the following URL, headers, and JSON body to star
 |`{PROJECT-NAME}`     | The name for your project. This value is case-sensitive.  | `myProject` |
 |`{DEPLOYMENT-NAME}`     | The name of your deployment. This value is case-sensitive.  | `prod` |
 
-### Parameters
-
-Pass the following parameter with your request. 
-
-|Key|Explanation|Value|
-|--|--|--|
-|`api-version`| The API version used.| `2021-11-01-preview` |
-
-To pass the parameter, add `?api-version=2021-11-01-preview` to the end of your request URL.
-
 ### Headers
 
 Use the following header to authenticate your request. 
@@ -318,7 +278,7 @@ Use the following JSON in your request. The model will be named `MyModel` once t
 Once you send your API request, you will receive a `202` response indicating success. In the response headers, extract the `location` value. It will be formatted like this: 
 
 ```rest
-{YOUR-ENDPOINT}/language/analyze-text/projects/{YOUR-PROJECT-NAME}/deployments/{DEPLOYMENT-NAME}/jobs/{JOB-ID}?api-version=xxxx-xx-xx-xxxxxxx
+{YOUR-ENDPOINT}/language/analyze-text/projects/{YOUR-PROJECT-NAME}/deployments/{DEPLOYMENT-NAME}/jobs/{JOB-ID}?api-version=2021-11-01-preview
 ``` 
 
 `JOB-ID` is used to identify your request, since this operation is asynchronous. You will use this URL in the next step to get the publishing status.
@@ -328,7 +288,7 @@ Once you send your API request, you will receive a `202` response indicating suc
 Use the following **GET** request to query the status of your model's publishing process. You can use the URL you received from the previous step, or replace the placeholder values below with your own values. 
 
 ```rest
-{YOUR-ENDPOINT}/language/analyze-text/projects/{YOUR-PROJECT-NAME}/deployments/{DEPLOYMENT-NAME}/jobs/{JOB-ID}
+{YOUR-ENDPOINT}/language/analyze-text/projects/{YOUR-PROJECT-NAME}/deployments/{DEPLOYMENT-NAME}/jobs/{JOB-ID}?api-version=2021-11-01-preview
 ```
 
 |Placeholder  |Value  | Example |
@@ -338,16 +298,6 @@ Use the following **GET** request to query the status of your model's publishing
 |`{DEPLOYMENT-NAME}`     | The name of your deployment. This value is case-sensitive.  | `prod` |
 |`{JOB-ID}`     | The ID for locating your model's training status. This is in the `location` header value you received in the previous step.  | `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx` |
 
-### Parameters
-
-Pass the following parameter with your request. 
-
-|Key|Explanation|Value|
-|--|--|--|
-|`api-version`| The API version used.| `2021-11-01-preview` |
-
-To pass the parameter, add `?api-version=2021-11-01-preview` to the end of your request URL.
-
 ### Headers
 
 Use the following header to authenticate your request. 
@@ -525,4 +475,4 @@ Use the following header to authenticate your request.
 
 |Key|Value|
 |--|--|
-|Ocp-Apim-Subscription-Key| The key to your resource. Used for authenticating your API requests.|
+|Ocp-Apim-Subscription-Key| The key to your resource. Used for authenticating your API requests.|
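
Review note: with the separate Parameters sections removed, every request URL in this file now carries `?api-version=2021-11-01-preview` inline. A minimal Python sketch of building those URLs and of pulling `JOB-ID` out of the `location` response header follows. The helper names `project_url` and `job_id_from_location` are hypothetical, the example endpoint is a placeholder, and no request is sent.

```python
from urllib.parse import quote, urlencode, urlsplit

API_VERSION = "2021-11-01-preview"  # value used throughout the updated quickstart


def project_url(endpoint, project_name, suffix):
    """Build a project-scoped URL, e.g. suffix=':train' or 'deployments/prod'."""
    base = endpoint.rstrip("/")
    # Escape the project name; the suffix is taken as-is so ':train' stays intact.
    path = f"/language/analyze-text/projects/{quote(project_name, safe='')}/{suffix}"
    return f"{base}{path}?{urlencode({'api-version': API_VERSION})}"


def job_id_from_location(location):
    """Return the last path segment of the `location` header value (the JOB-ID)."""
    path = urlsplit(location).path
    return path.rstrip("/").rsplit("/", 1)[-1]


train_url = project_url(
    "https://example.cognitiveservices.azure.com", "myProject", ":train"
)
print(train_url)
```

The same `project_url` call covers the import, train, and deployment URLs in this diff by changing only the suffix.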
diff --git a/articles/cognitive-services/language-service/custom-named-entity-recognition/concepts/data-formats.md b/articles/cognitive-services/language-service/custom-named-entity-recognition/concepts/data-formats.md
@@ -24,48 +24,76 @@ When you tag entities, the tags are saved as in the following JSON format. If yo
 
 ```json
 {
-    //List of entity names. Their index within this array is used as an ID. 
-    "entityNames": [
-        "entity_name1",
-        "entity_name2"
+    "extractors": [
+        {
+            "name": "Entity1"
+        },
+        {
+            "name": "Entity2"
+        }
     ],
-    "documents": "path_to_document", //Relative file path to get the text.
-    "culture": "en-US", //Standard culture strings supported by CultureInfo.
-    "entities": [
+    "documents": [
         {
-            "regionStart": 0,
-            "regionLength": 69,
-            "labels": [
+            "location": "file1.txt",
+            "language": "en-us",
+            "extractors": [
                 {
-                    "entity": 0, // Index of the entity in the "entityNames" array. Positions are relative to the original text (not bounding box)
-                    "start": 4,
-                    "length": 10
-                },
+                    "regionOffset": 0,
+                    "regionLength": 5129,
+                    "labels": [
+                        {
+                            "extractorName": "Entity1",
+                            "offset": 77,
+                            "length": 10
+                        },
+                        {
+                            "extractorName": "Entity2",
+                            "offset": 3062,
+                            "length": 8
+                        }
+                    ]
+                }
+            ]
+        },
+        {
+            "location": "file2.txt",
+            "language": "en-us",
+            "extractors": [
                 {
-                    "entity": 1,
-                    "start": 18,
-                    "length": 11
+                    "regionOffset": 0,
+                    "regionLength": 6873,
+                    "labels": [
+                        {
+                            "extractorName": "Entity2",
+                            "offset": 60,
+                            "length": 7
+                        },
+                        {
+                            "extractorName": "Entity1",
+                            "offset": 2805,
+                            "length": 10
+                        }
+                    ]
                 }
             ]
         }
-    ]    
+    ]
 }
 ```
 
-The following list describes the various JSON properties of the sample above.
+### Data description
 
-* `entityNames`: An array of entity names. Index of the entity within the array is used as its ID.
+* `extractors`: An array of extractors for your data. Each extractor represents one of the entities you want to extract from your data.
 * `documents`: An array of tagged documents.
-  * `location`: The path of the document relative to the JSON file. For example, docs on the same level as the tags file `file.txt`, for docs inside one directory level `dir1/file.txt`.
-  * `culture`: culture/language of the document. <!-- See [language support](../language-support.md) for more information. -->
-  * `entities`: Specifies the entity recognition tags.
-    * `regionStart`: The inclusive character position of the start of the text.
-    * `regionLength`: The length of the bounding box in terms of UTF16 characters. Training only considers the data in this region, so if this is a tagged file, set the `regionStart` to 0 and the `regionLength` to the last index of last character in the file. You can also set this region if you want to introduce a negative sample to the training, by defining the region as a portion of the file with no tags.
-
-    * `labels`: All tags occurring within the bounding box.
-      * `entity`: The index of the entity in the `entityNames` array.
-      * `start`: The inclusive character position of the start of the tag in the document text. This is not relative to the bounding box.
-      * `length`: The length of the tag in terms of UTF16 characters.
+  * `location`: The path of the file. The file has to be in the root of the storage container.
+  * `language`: Language of the file. Use one of the [supported culture locales](../language-support.md).
+  * `extractors`: Array of extractor objects to be extracted from the file.
+    * `regionOffset`: The inclusive character position of the start of the text.
+    * `regionLength`: The length of the bounding box in terms of UTF16 characters. Training only considers the data in this region.
+    * `labels`: Array of all the tagged entities within the specified region.
+      * `extractorName`: Type of the entity to be extracted.
+      * `offset`: The inclusive character position of the start of the entity. This is not relative to the bounding box.
+      * `length`: The length of the entity in terms of UTF16 characters.
 
 ## Next steps
 
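Review note: the new description measures `length` in UTF-16 characters and states that `offset` is not relative to the bounding box. A minimal Python sketch of producing one entry for `labels` under those rules follows. The helper names are hypothetical, and counting `offset` in UTF-16 code units as well is an assumption, since the description only specifies the unit for `length` and `regionLength`.

```python
def utf16_len(text):
    """Number of UTF-16 code units in `text` (2 bytes per unit, no BOM)."""
    return len(text.encode("utf-16-le")) // 2


def make_label(document_text, entity_text, extractor_name):
    """Build a label for the first occurrence of `entity_text` in the document."""
    start = document_text.find(entity_text)
    if start < 0:
        raise ValueError(f"{entity_text!r} does not occur in the document")
    return {
        "extractorName": extractor_name,
        # Assumption: offsets are counted in UTF-16 code units from the start
        # of the document, not from the start of the region.
        "offset": utf16_len(document_text[:start]),
        "length": utf16_len(entity_text),
    }


def label_in_region(label, region_offset, region_length):
    """True if the label lies entirely inside the given region."""
    end = label["offset"] + label["length"]
    return region_offset <= label["offset"] and end <= region_offset + region_length


text = "Hello \U0001F600 Contoso Ltd"  # the emoji takes two UTF-16 code units
label = make_label(text, "Contoso", "Entity1")
print(label, label_in_region(label, 0, utf16_len(text)))
```

Python string indices count code points, so they differ from UTF-16 positions whenever the text contains characters outside the Basic Multilingual Plane; that is why the sketch converts through `utf16_len` instead of using `find` results directly.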
diff --git a/articles/cognitive-services/language-service/custom-named-entity-recognition/includes/quickstarts/rest-api.md b/articles/cognitive-services/language-service/custom-named-entity-recognition/includes/quickstarts/rest-api.md
