Commit 30ef7a7

Freshness, correctness, and format updates
1 parent f75dbac commit 30ef7a7

1 file changed: +23 -14 lines changed

articles/synapse-analytics/machine-learning/tutorial-text-analytics-use-mmlspark.md

Lines changed: 23 additions & 14 deletions
@@ -4,14 +4,15 @@ description: Learn how to use text analytics in Azure Synapse Analytics.
 ms.service: azure-synapse-analytics
 ms.subservice: machine-learning
 ms.topic: tutorial
-ms.date: 11/02/2021
+ms.date: 11/19/2024
 author: ruixinxu
 ms.author: ruxu
+# customer intent: As a Synapse Analytics user, I want to be able to analyze my text using Azure AI services.
 ---

 # Tutorial: Text Analytics with Azure AI services

-[Text Analytics](/azure/ai-services/language-service/) is an [Azure AI services](/azure/ai-services/) that enables you to perform text mining and text analysis with Natural Language Processing (NLP) features. In this tutorial, you'll learn how to use [Text Analytics](/azure/ai-services/language-service/) to analyze unstructured text on Azure Synapse Analytics.
+In this tutorial, you learn how to use [Text Analytics](/azure/ai-services/language-service/) to analyze unstructured text on Azure Synapse Analytics. [Text Analytics](/azure/ai-services/language-service/) is an [Azure AI services](/azure/ai-services/) that enables you to perform text mining and text analysis with Natural Language Processing (NLP) features.

 This tutorial demonstrates using text analytics with [SynapseML](https://github.com/microsoft/SynapseML) to:

@@ -29,34 +30,35 @@ If you don't have an Azure subscription, [create a free account before you begin

 - [Azure Synapse Analytics workspace](../get-started-create-workspace.md) with an Azure Data Lake Storage Gen2 storage account configured as the default storage. You need to be the *Storage Blob Data Contributor* of the Data Lake Storage Gen2 file system that you work with.
 - Spark pool in your Azure Synapse Analytics workspace. For details, see [Create a Spark pool in Azure Synapse](../quickstart-create-sql-pool-studio.md).
-- Pre-configuration steps described in the tutorial [Configure Azure AI services in Azure Synapse](tutorial-configure-cognitive-services-synapse.md).
-
+- Preconfiguration steps described in the tutorial [Configure Azure AI services in Azure Synapse](tutorial-configure-cognitive-services-synapse.md).

 ## Get started
-Open Synapse Studio and create a new notebook. To get started, import [SynapseML](https://github.com/microsoft/SynapseML).
+
+Open Synapse Studio and create a new notebook. To get started, import [SynapseML](https://github.com/microsoft/SynapseML).

 ```python
 import synapse.ml
-from synapse.ml.cognitive import *
+from synapse.ml.services import *
 from pyspark.sql.functions import col
 ```

 ## Configure text analytics

-Use the linked text analytics you configured in the [pre-configuration steps](tutorial-configure-cognitive-services-synapse.md) .
+Use the linked text analytics you configured in the [preconfiguration steps](tutorial-configure-cognitive-services-synapse.md).

 ```python
-ai_service_name = "<Your linked service for text analytics>"
+linked_service_name = "<Your linked service for text analytics>"
 ```

 ## Text Sentiment
-The Text Sentiment Analysis provides a way for detecting the sentiment labels (such as "negative", "neutral" and "positive") and confidence scores at the sentence and document-level. See the [Supported languages in Text Analytics API](/azure/ai-services/language-service/language-detection/overview?tabs=sentiment-analysis) for the list of enabled languages.
+
+The Text Sentiment Analysis provides a way for detecting the sentiment labels (such as "negative", "neutral", and "positive") and confidence scores at the sentence and document-level. See the [Supported languages in Text Analytics API](/azure/ai-services/language-service/language-detection/overview?tabs=sentiment-analysis) for the list of enabled languages.

 ```python

 # Create a dataframe that's tied to it's column names
 df = spark.createDataFrame([
-("I am so happy today, its sunny!", "en-US"),
+("I am so happy today, it's sunny!", "en-US"),
 ("I am frustrated by this rush hour traffic", "en-US"),
 ("The Azure AI services on spark aint bad", "en-US"),
 ], ["text", "language"])
@@ -77,13 +79,14 @@ display(results
 .select("text", "sentiment"))

 ```
+
 ### Expected results

 |text|sentiment|
 |---|---|
-|I am so happy today, its sunny!|positive|
-|I am frustrated by this rush hour traffic|negative|
-|The Azure AI services on spark aint bad|positive|
+|I'm so happy today, it's sunny!|positive|
+|I'm frustrated by this rush hour traffic|negative|
+|The Azure AI services on spark aint bad|neutral|

 ---

@@ -186,12 +189,15 @@ ner = (NER()

 display(ner.transform(df).select("text", col("replies").getItem("document").getItem("entities").alias("entities")))
 ```
+
 ### Expected results
+
 ![Expected results for named entity recognition v3.1](./media/tutorial-text-analytics-use-mmlspark/expected-output-ner-v-31.png)

 ---

 ## Personally Identifiable Information (PII) V3.1
+
 The PII feature is part of NER and it can identify and redact sensitive entities in text that are associated with an individual person such as: phone number, email address, mailing address, passport number. See the [Supported languages in Text Analytics API](/azure/ai-services/language-service/language-detection/overview?tabs=pii) for the list of enabled languages.

 ```python
@@ -209,17 +215,20 @@ pii = (PII()

 display(pii.transform(df).select("text", col("replies").getItem("document").getItem("entities").alias("entities")))
 ```
+
 ### Expected results
+
 ![Expected results for personal identifiable information v3.1](./media/tutorial-text-analytics-use-mmlspark/expected-output-pii-v-31.png)

 ---

 ## Clean up resources
+
 To ensure the Spark instance is shut down, end any connected sessions(notebooks). The pool shuts down when the **idle time** specified in the Apache Spark pool is reached. You can also select **stop session** from the status bar at the upper right of the notebook.

 ![Screenshot showing the Stop session button on the status bar.](./media/tutorial-build-applications-use-mmlspark/stop-session.png)

 ## Next steps
+## Related content

 * [Check out Synapse sample notebooks](https://github.com/Azure-Samples/Synapse/tree/main/MachineLearning)
 * [SynapseML GitHub Repo](https://github.com/microsoft/SynapseML)
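
Note on the import change: this commit switches the notebook from `synapse.ml.cognitive` to `synapse.ml.services`. A notebook that has to run on Spark pools with either an older or a newer SynapseML build could resolve the package at run time instead of hard-coding one name. The following is a minimal sketch, not part of the commit; the helper name `import_first_available` is hypothetical, and it assumes the older runtime still ships `synapse.ml.cognitive`.

```python
import importlib
from types import ModuleType
from typing import Iterable


def import_first_available(module_names: Iterable[str]) -> ModuleType:
    """Return the first module in `module_names` that imports successfully.

    Hypothetical helper for illustration; it is not part of SynapseML.
    """
    tried = []
    for name in module_names:
        try:
            # import_module raises ModuleNotFoundError (an ImportError subclass)
            # when the package or any parent package is missing.
            return importlib.import_module(name)
        except ImportError:
            tried.append(name)
    raise ImportError(f"None of these modules could be imported: {tried}")
```

In a Synapse notebook this might be called as `services = import_first_available(["synapse.ml.services", "synapse.ml.cognitive"])`, preferring the new package name and falling back to the old one. Unlike the `import *` form in the tutorial, names would then be accessed through the returned module object.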
