Commit 8c9dd33

Update how-to-nlp-processing-batch.md
1 parent 5542e90 commit 8c9dd33

File tree

1 file changed: +4 −4 lines changed

articles/machine-learning/how-to-nlp-processing-batch.md

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
 ---
-title: "Text processing with batch endpoints"
+title: "Deploy and run language models in batch endpoints"
 titleSuffix: Azure Machine Learning
-description: Learn how to use batch deployments to process text and output results.
+description: Learn how to use batch deployments to process text with large language models.
 services: machine-learning
 ms.service: machine-learning
 ms.subservice: core
@@ -17,7 +17,7 @@ ms.custom: devplatv2
 
 [!INCLUDE [cli v2](../../includes/machine-learning-dev-v2.md)]
 
-Batch Endpoints can be used to deploy expensive models, like language models, over text data. In this tutorial you'll learn how to deploy a model that can perform text summarization of long sequences of text using a model from HuggingFace.
+Batch Endpoints can be used to deploy expensive models, like language models, over text data. In this tutorial, you'll learn how to deploy a HuggingFace model that can summarize long sequences of text. You'll also learn how to optimize inference by using the HuggingFace `optimum` and `accelerate` libraries.
 
 ## About this sample
 
@@ -141,7 +141,7 @@ Let's create the deployment that will host the model:
 
 > [!div class="checklist"]
 > * Indicates an `init` function that detects the hardware configuration (CPU vs GPU) and loads the model accordingly. Both the model and the tokenizer are loaded in global variables. We are not using a `pipeline` object from HuggingFace to account for the limitation in the sequence lengths of the model we are currently using.
-> * Notice that we are doing performing model optimizations to improve the performance using `optimum` and accelerate libraries. If the model or hardware doesn't support it, we will run the deployment without such optimizations.
+> * Notice that we are performing **model optimizations** to improve performance using the `optimum` and `accelerate` libraries. If the model or hardware doesn't support them, we will run the deployment without such optimizations.
 > * Indicates a `run` function that is executed for each mini-batch the batch deployment provides.
 > * The `run` function reads the entire batch using the `datasets` library. The text we need to summarize is in the column `text`.
 > * The `run` method iterates over each row of the text and runs the prediction. Since this is a very expensive model, running the prediction over entire files would result in an out-of-memory exception. Notice that the model is not executed with the `pipeline` object from `transformers`. This is done to account for long sequences of text and the limitation of 1024 tokens in the underlying model we are using.
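
For reference, here is a minimal, hypothetical sketch of the kind of `init`/`run` scoring script the checklist above describes. It is not the sample's actual code: the `facebook/bart-large-cnn` checkpoint is only a placeholder, and the optional optimization is illustrated with `optimum`'s BetterTransformer path, which may differ from what the article uses.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder checkpoint; the article's sample may use a different model.
MODEL_NAME = "facebook/bart-large-cnn"

tokenizer = None
model = None
device = None


def init():
    """Load the tokenizer and model once per worker, picking CPU or GPU."""
    global tokenizer, model, device

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    # Best-effort optimization with `optimum`; if the model or hardware
    # doesn't support it, keep the unoptimized model.
    try:
        from optimum.bettertransformer import BetterTransformer

        model = BetterTransformer.transform(model)
    except Exception:
        pass

    model.to(device)
    model.eval()


def run(mini_batch):
    """Summarize the `text` column of each file in the mini-batch, row by row."""
    results = []
    for file_path in mini_batch:
        data = load_dataset("csv", data_files={"batch": file_path})["batch"]
        for row in data:
            inputs = tokenizer(
                row["text"],
                truncation=True,
                max_length=1024,  # respect the model's sequence-length limit
                return_tensors="pt",
            ).to(device)
            with torch.no_grad():
                summary_ids = model.generate(**inputs, max_new_tokens=128)
            results.append(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
    return results
```

In a batch deployment, `run` receives a list of file paths for each mini-batch and returns one prediction per processed row, which is why the sketch iterates row by row instead of passing whole files to the model.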
