|
21 | 21 | "\n", |
22 | 22 | "To train the aspect-based sentiment model, you will be using the [nlp recipes repository](https://github.com/microsoft/nlp-recipes/tree/master/examples/sentiment_analysis/absa). The model will then be deployed as an endpoint on an Azure Kubernetes cluster. Once deployed, the model is added to the enrichment pipeline as a custom skill for use by the Cognitive Search service.\n", |
23 | 23 | "\n", |
24 | | - "There are two datasets provided. To train the model the larger of the two datasets, the hotel_reviews_1000.csv, is required. Prefer to skip the training step? Download the hotel_reviews_100.csv.\n", |
| 24 | + "There are two datasets provided. Training the model requires the larger of the two, `hotel_reviews_1000.csv`. Prefer to skip the training step? Download `hotel_reviews_100.csv` instead.\n", |
25 | 25 | "\n", |
26 | 26 | "## Prerequisites\n", |
27 | 27 | "\n", |
|
177 | 177 | "\n", |
178 | 178 | "headers = {'api-key': api_key, 'Content-Type': content_type}\n", |
179 | 179 | "# Test out the URLs to ensure that the configuration works\n", |
180 | | - "print(construct_Url(search_service, \"indexes\", \"custom-ner\", \"analyze\", api_version))\n", |
181 | | - "print(construct_Url(search_service, \"indexes\", \"custom-ner\", None, api_version))\n", |
| 180 | + "print(construct_Url(search_service, \"indexes\", \"azureml-sentiment\", \"analyze\", api_version))\n", |
| 181 | + "print(construct_Url(search_service, \"indexes\", \"azureml-sentiment\", None, api_version))\n", |
182 | 182 | "print(construct_Url(search_service, \"indexers\", None, None, api_version))" |
183 | 183 | ] |
184 | 184 | }, |
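`construct_Url` is defined in an earlier cell that this diff does not touch. For orientation, a minimal sketch of how such a helper would assemble Azure Cognitive Search REST URLs (an assumption based on how it is called above, not the notebook's exact implementation):

```python
# Hypothetical reconstruction -- the real helper lives outside this diff.
def construct_Url(search_service, resource, name=None, action=None, api_version=None):
    # e.g. https://<service>.search.windows.net/indexes/azureml-sentiment/analyze?api-version=...
    url = f"https://{search_service}.search.windows.net/{resource}"
    if name:
        url += f"/{name}"
    if action:
        url += f"/{action}"
    return f"{url}?api-version={api_version}"
```

With that shape, the three `print` calls above resolve to the `indexes/azureml-sentiment/analyze`, `indexes/azureml-sentiment`, and `indexers` endpoints respectively.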
185 | 185 | { |
186 | 186 | "cell_type": "markdown", |
187 | 187 | "metadata": {}, |
188 | 188 | "source": [ |
189 | | - "## 1.1 Create datasource" |
| 189 | + "## 1.1 Create datasource\n", |
| 190 | + "Be sure to upload `hotel_reviews_1000.csv` or `hotel_reviews_100.csv` to your storage account before continuing." |
190 | 191 | ] |
191 | 192 | }, |
192 | 193 | { |
|
195 | 196 | "metadata": {}, |
196 | 197 | "outputs": [], |
197 | 198 | "source": [ |
198 | | - "# Replace with the container name, storage account name, storage account connection string\n", |
199 | 199 | "container = datasource_container\n", |
200 | 200 | "\n", |
201 | | - "\n", |
202 | 201 | "datsource_def = {\n", |
203 | 202 | " 'name': f'{datasource_container}-ds',\n", |
204 | | - " 'description': f'km-aml solution accelerator- Datasource to label and extract custom entities from a corpus in STORAGEACCOUNTNAME, container datasource_container',\n", |
| 203 | + " 'description': 'Datasource with hotel reviews',\n", |
205 | 204 | " 'type': 'azureblob',\n", |
206 | 205 | " 'subtype': None,\n", |
207 | 206 | " 'credentials': {\n", |
208 | | - " 'connectionString': f'STORAGECONNSTRING'\n", |
| 207 | + " 'connectionString': f'{STORAGECONNSTRING}'\n", |
209 | 208 | " },\n", |
210 | 209 | " 'container': {\n", |
211 | 210 | " 'name': f'{datasource_container}'\n", |
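The request that actually submits this datasource definition falls outside the hunk above. As a hedged illustration (assuming the same `headers`, `construct_Url` helper, and `requests`/`json` imports used elsewhere in the notebook), creating or updating it would look roughly like:

```python
# Sketch only -- the notebook's real create call is not shown in this diff.
r = requests.put(
    construct_Url(search_service, "datasources", f"{datasource_container}-ds", None, api_version),
    data=json.dumps(datsource_def),   # variable is spelled 'datsource_def' in the notebook
    headers=headers,
)
print(r.status_code)   # 201 on create, 204 on update
```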
|
233 | 232 | "source": [ |
234 | 233 | "skillset_name = f'{datasource_container}-ss'\n", |
235 | 234 | "skillset_def = {\n", |
236 | | - " '@odata.etag': '\\'0x8D7F6C769AC06B2\\'',\n", |
237 | | - " 'name': f'skillset_name',\n", |
| 235 | + " 'name': f'{skillset_name}',\n", |
238 | 236 | " 'description': 'Skillset to enrich hotel reviews with aspect based sentiment',\n", |
239 | 237 | " 'skills': [\n", |
240 | 238 | " {\n", |
|
482 | 480 | " 'cognitiveServices': {\n", |
483 | 481 | " '@odata.type': '#Microsoft.Azure.Search.CognitiveServicesByKey',\n", |
484 | 482 | " 'description': '/subscriptions/subscription_id/resourceGroups/resource_group/providers/Microsoft.CognitiveServices/accounts/cog_svcs_acct',\n", |
485 | | - " 'key': f'cog_svcs_key'\n", |
| 483 | + " 'key': f'{cog_svcs_key}'\n", |
486 | 484 | " },\n", |
487 | 485 | " 'knowledgeStore': {\n", |
488 | | - " 'storageConnectionString': f'STORAGECONNSTRING',\n", |
| 486 | + " 'storageConnectionString': f'{STORAGECONNSTRING}',\n", |
489 | 487 | " 'projections': [\n", |
490 | 488 | " {\n", |
491 | 489 | " 'tables': [],\n", |
492 | 490 | " 'objects': [\n", |
493 | 491 | " {\n", |
494 | | - " 'storageContainer': f'datasource_container-enriched',\n", |
| 492 | + " 'storageContainer': f'{datasource_container}-enriched',\n", |
495 | 493 | " 'referenceKeyName': None,\n", |
496 | 494 | " 'generatedKeyName': None,\n", |
497 | 495 | " 'source': '/document/objectprojection',\n", |
|
527 | 525 | "source": [ |
528 | 526 | "indexname = f'{datasource_container}-idx'\n", |
529 | 527 | "index_def = {\n", |
530 | | - " \"name\":f'indexname',\n", |
| 528 | + " \"name\": f'{indexname}',\n", |
531 | 529 | " \"defaultScoringProfile\": \"\",\n", |
532 | 530 | " \"fields\": [\n", |
533 | 531 | " {\n", |
|
1024 | 1022 | "source": [ |
1025 | 1023 | "indexername = f'{datasource_container}-idxr'\n", |
1026 | 1024 | "indexer_def = {\n", |
1027 | | - " \"name\": f'indexername',\n", |
| 1025 | + " \"name\": f'{indexername}',\n", |
1028 | 1026 | " \"description\": \"Indexer to enrich hotel reviews\",\n", |
1029 | 1027 | " \"dataSourceName\": f'{datasource_container}-ds',\n", |
1030 | 1028 | " \"skillsetName\": f'{datasource_container}-ss',\n", |
|
1076 | 1074 | " ],\n", |
1077 | 1075 | " \"cache\": {\n", |
1078 | 1076 | " \"enableReprocessing\": True,\n", |
1079 | | - " \"storageConnectionString\": f'know_store_cache'\n", |
| 1077 | + " \"storageConnectionString\": f'{know_store_cache}'\n", |
1080 | 1078 | " }\n", |
1081 | 1079 | "}\n", |
1082 | 1080 | "r = requests.post(construct_Url(search_service, \"indexers\", None, None, api_version), data=json.dumps(indexer_def), headers=headers)\n", |
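Once the POST above has created the indexer, it is worth confirming that the run succeeded before querying the index. A hedged sketch using the same helper and headers (the status API is `GET /indexers/{name}/status`):

```python
# Sketch -- not part of this diff.
status = requests.get(
    construct_Url(search_service, "indexers", f"{datasource_container}-idxr", "status", api_version),
    headers=headers,
).json()
print(status.get("status"), (status.get("lastResult") or {}).get("status"))
```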
|
1142 | 1140 | }, |
1143 | 1141 | "outputs": [], |
1144 | 1142 | "source": [ |
1145 | | - "!pip install azure-storage-blob\n", |
1146 | | - "!pip install --upgrade azureml-sdk" |
| 1143 | + "%pip install azure-storage-blob\n", |
| 1144 | + "%pip install --upgrade azureml-sdk" |
1147 | 1145 | ] |
1148 | 1146 | }, |
1149 | 1147 | { |
|
1161 | 1159 | "source": [ |
1162 | 1160 | "from azureml.core import Workspace, Datastore\n", |
1163 | 1161 | "try:\n", |
1164 | | - " ws = Workspace.from_config()\n", |
| 1162 | + " ws = Workspace(subscription_id, resource_group, workspace_name)\n", |
1165 | 1163 | " print(ws.name, ws.location, ws.resource_group, ws.location, sep='\\t')\n", |
1166 | 1164 | " print('Library configuration succeeded')\n", |
1167 | 1165 | "except:\n", |
1168 | 1166 | " print('Workspace not found')\n", |
| 1167 | + " \n", |
1169 | 1168 | "ds = ws.get_default_datastore()" |
1170 | 1169 | ] |
1171 | 1170 | }, |
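The `Workspace(subscription_id, resource_group, workspace_name)` call above assumes those three variables were set in an earlier cell, and the bare `except:` only prints a message, so `get_default_datastore()` would still fail if the lookup did not succeed. If the default credential flow does not pick up your identity, an explicit interactive login can be passed in; a hedged sketch with the v1 `azureml-sdk`:

```python
# Sketch -- optional explicit authentication; subscription_id, resource_group and
# workspace_name are assumed to be defined earlier in the notebook.
from azureml.core import Workspace
from azureml.core.authentication import InteractiveLoginAuthentication

auth = InteractiveLoginAuthentication()   # opens a browser / device-code prompt
ws = Workspace(subscription_id, resource_group, workspace_name, auth=auth)
print(ws.name, ws.resource_group, ws.location, sep='\t')
```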
|
1496 | 1495 | "metadata": {}, |
1497 | 1496 | "outputs": [], |
1498 | 1497 | "source": [ |
1499 | | - "docs = [ \"Great\", \"The bathroom sink splashes water, need to be fixed. The bathtub/shower make loud noises. Other than that, it was a good stay.\" ]\n", |
| 1498 | + "%pip install git+https://github.com/NervanaSystems/nlp-architect.git@absa --user" |
| 1499 | + ] |
| 1500 | + }, |
| 1501 | + { |
| 1502 | + "cell_type": "code", |
| 1503 | + "execution_count": null, |
| 1504 | + "metadata": {}, |
| 1505 | + "outputs": [], |
| 1506 | + "source": [ |
| 1507 | + "from nlp_architect.models.absa.inference.inference import SentimentInference\n", |
1500 | 1508 | "\n", |
| 1509 | + "aspect_lex_path = \"../models/hotel_aspect_lex.csv\"\n", |
| 1510 | + "opinion_lex_path = \"../models/hotel_opinion_lex_reranked.csv\"\n", |
| 1511 | + "inference = SentimentInference(aspect_lex_path, opinion_lex_path)\n", |
| 1512 | + "\n", |
| 1513 | + "docs = [ \"Great\", \"The bathroom sink splashes water, need to be fixed. The bathtub/shower make loud noises. Other than that, it was a good stay.\" ]\n", |
1501 | 1514 | "sentiment_docs = []\n", |
1502 | 1515 | "\n", |
1503 | 1516 | "for doc_raw in docs:\n", |
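The body of this loop sits outside the hunk. For orientation, the typical pattern with nlp-architect's `SentimentInference` is to call `run()` per document and keep the non-empty results; a sketch under the assumption that this is what the notebook does:

```python
# Typical SentimentInference usage; the notebook's exact loop body is not shown here.
for doc_raw in docs:
    sentiment_doc = inference.run(doc=doc_raw)   # SentimentDoc, or None if no aspects matched
    if sentiment_doc is not None:
        sentiment_docs.append(sentiment_doc)
```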
|
1772 | 1785 | "source": [ |
1773 | 1786 | "## Endpoint Deployed!\n", |
1774 | 1787 | "\n", |
1775 | | - "Next, add the endpoint in as a Azure ML skill. Either follow the instructions in (the tutorial)[https://docs.microsoft.com/azure/search/cognitive-search-tutorial-aml-custom-skill] to add the skill in the portal or continue through the notebook for the API experience.\n", |
| 1788 | + "Next, add the endpoint as an Azure ML skill. Either follow the instructions in [the tutorial](https://docs.microsoft.com/azure/search/cognitive-search-tutorial-aml-custom-skill) to add the skill in the portal, or continue through the notebook for the API experience.\n", |
1776 | 1789 | "\n", |
1777 | 1790 | "## 4.0 Add AML Skill - Update Skillset" |
1778 | 1791 | ] |
|
1795 | 1808 | "outputs": [], |
1796 | 1809 | "source": [ |
1797 | 1810 | "skillset_def = {\n", |
1798 | | - " '@odata.etag': '\\'0x8D7F6C769AC06B2\\'',\n", |
1799 | 1811 | " 'name': f'{skillset_name}',\n", |
1800 | 1812 | " 'description': 'Skillset to enrich hotel reviews with aspect based sentiment',\n", |
1801 | 1813 | " 'skills': [\n", |
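The individual skill entries are beyond this hunk. For reference, the Azure ML custom skill that this step appends to `skillset_def['skills']` generally takes the following shape; this is a hedged example following the `#Microsoft.Skills.Custom.AmlSkill` contract, with `scoring_uri`, `scoring_key`, and the input/output mappings standing in as illustrative values rather than the notebook's actual ones:

```python
# Illustrative only -- the notebook's real skill entry may use different inputs/outputs.
aml_skill = {
    '@odata.type': '#Microsoft.Skills.Custom.AmlSkill',
    'name': 'absa-sentiment-skill',      # hypothetical skill name
    'context': '/document',
    'uri': scoring_uri,                  # scoring endpoint of the deployed AKS service (assumed variable)
    'key': scoring_key,                  # endpoint key (assumed variable)
    'timeout': 'PT30S',
    'inputs': [{'name': 'text', 'source': '/document/content'}],
    'outputs': [{'name': 'sentiment', 'targetName': 'absa_sentiment'}]
}
```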
|
2207 | 2219 | "name": "python", |
2208 | 2220 | "nbconvert_exporter": "python", |
2209 | 2221 | "pygments_lexer": "ipython3", |
2210 | | - "version": "3.6.5" |
| 2222 | + "version": "3.7.3" |
2211 | 2223 | } |
2212 | 2224 | }, |
2213 | 2225 | "nbformat": 4, |
|