Commit 92632ca

authored
Update how-to-nlp-processing-batch.md
1 parent 4295a1b commit 92632ca

File tree

1 file changed: +6 −6 lines

articles/machine-learning/how-to-nlp-processing-batch.md

Lines changed: 6 additions & 6 deletions
@@ -21,7 +21,7 @@ Batch Endpoints can be used for processing tabular data, but also any other file

 ## About this sample

-The model we are going to work with was built using the popular library transformers from HuggingFace along with [a pre-trained model from Facebook with the BART architecture](https://huggingface.co/facebook/bart-large-cnn). It was introduced in the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation](https://arxiv.org/abs/1910.13461). This model has the following constrains which are important to keep in mind for deployment:
+The model we are going to work with was built using the popular library transformers from HuggingFace along with [a pre-trained model from Facebook with the BART architecture](https://huggingface.co/facebook/bart-large-cnn). It was introduced in the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation](https://arxiv.org/abs/1910.13461). This model has the following constraints which are important to keep in mind for deployment:

 * It can work with sequences up to 1024 tokens.
 * It is trained for summarization of text in English.
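The first constraint means long documents must be truncated or split before scoring. As a dependency-free illustration only: a real deployment would count tokens with the model's own tokenizer, but here a simple whitespace split stands in for it.

```python
# Illustrative sketch: the model caps input at 1024 tokens, so long
# documents must be split (or truncated) before scoring. A whitespace
# split stands in for the model's real tokenizer to keep this example
# dependency-free.
def chunk_text(text: str, max_tokens: int = 1024) -> list[str]:
    tokens = text.split()
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]

chunks = chunk_text("word " * 2500)
print(len(chunks))  # 3 chunks for 2500 pseudo-tokens
```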
@@ -90,7 +90,7 @@ model_local_path = 'model'
 summarizer.save_pretrained(model_local_path)
 ```

-We can now register this model in the Azure Machine Leanring registry:
+We can now register this model in the Azure Machine Learning registry:

 # [Azure CLI](#tab/cli)

@@ -135,7 +135,7 @@ We are going to create a batch endpoint named `text-summarization-batch` where t

 # [Azure CLI](#tab/azure-cli)

-The following YAML file defines a batch endpoint:.
+The following YAML file defines a batch endpoint:

 __endpoint.yml__

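The `endpoint.yml` file itself is not shown in this diff. For orientation, a batch endpoint definition of the kind it refers to typically looks like the following sketch; only the endpoint name (`text-summarization-batch`) comes from this page, and the description text is illustrative, not copied from the tutorial.

```yaml
# Hypothetical sketch of endpoint.yml; values other than the endpoint
# name are illustrative, not taken from the tutorial's actual file.
$schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
name: text-summarization-batch
description: A batch endpoint to summarize long sequences of text.
auth_mode: aad_token
```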
@@ -183,7 +183,7 @@ Let's create the deployment that will host the model:
 > [!TIP]
 > Although files are provided in mini-batches by the deployment, this scoring script processes one row at a time. This is a common pattern when dealing with expensive models (like transformers) as trying to load the entire batch and send it to the model at once may result in high-memory pressure on the batch executor (OOM exceptions).
-1. We need to indicate over which environment we are going to run the deployment. In our case, our model runs on `Torch` and it requires the libraries `transformers`, `accelerate`, and `optimium` from HuggingFace. Azure Machine Learning already has an environment with Torch and GPU support available. We are just going to add a couple of dependencies in a `conda.yml` file.
+1. We need to indicate over which environment we are going to run the deployment. In our case, our model runs on `Torch` and it requires the libraries `transformers`, `accelerate`, and `optimum` from HuggingFace. Azure Machine Learning already has an environment with Torch and GPU support available. We are just going to add a couple of dependencies in a `conda.yml` file.

 __environment/conda.yml__

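The one-row-at-a-time pattern the tip in this hunk describes can be sketched as follows. This is an illustrative stand-in, not the tutorial's actual scoring script: `summarize` is a stub replacing the real transformers pipeline, and a real batch scoring script would load the model once in an `init()` function.

```python
# Illustrative sketch of the one-row-at-a-time scoring pattern.
# `summarize` is a stub standing in for the real BART summarizer.
def summarize(text: str) -> str:
    # Stub: a real implementation would call the transformers pipeline.
    return text[:50]

def run(mini_batch: list[str]) -> list[str]:
    results = []
    # Score one row at a time instead of sending the whole mini-batch to
    # the model at once, which avoids memory pressure (OOM) on the
    # batch executor.
    for row in mini_batch:
        results.append(summarize(row))
    return results

print(run(["a long document " * 10, "another document"]))
```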
@@ -239,7 +239,7 @@ Let's create the deployment that will host the model:
 > [!NOTE]
 > You are not charged for compute at this point as the cluster will remain at 0 nodes until a batch endpoint is invoked and a batch scoring job is submitted. Learn more about [manage and optimize cost for AmlCompute](./how-to-manage-optimize-cost.md#use-azure-machine-learning-compute-cluster-amlcompute).

-1. Now, let create the deployment.
+1. Now, let's create the deployment.

 # [Azure CLI](#tab/cli)

@@ -374,7 +374,7 @@ As mentioned in some of the notes along this tutorial, processing text may have

 ## Considerations for MLflow models that process text

-The same considerations mentioned above apply to MLflow models. However, since you are not required to provide a scoring script for your MLflow model deployment, some of the recommendation mentioned may require a different approach.
+The same considerations mentioned above apply to MLflow models. However, since you are not required to provide a scoring script for your MLflow model deployment, some of the recommendations mentioned may require a different approach.

 * MLflow models in Batch Endpoints support reading tabular data as input data, which may contain long sequences of text. See [File's types support](how-to-mlflow-batch.md#files-types-support) for details about which file types are supported.
 * Batch deployments will call your MLflow model's predict function with the content of an entire file as a Pandas dataframe. If your input data contains many rows, chances are that running a complex model (like the one presented in this tutorial) will result in an out-of-memory exception. If this is your case, you can consider:
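One way to avoid the out-of-memory situation the last bullet warns about is to score the DataFrame in fixed-size row chunks rather than all at once. A hypothetical sketch: `fake_predict` is a stand-in for an MLflow model's predict function, not MLflow's actual API.

```python
import pandas as pd

# Illustrative sketch: score a large DataFrame in fixed-size row chunks
# instead of passing all rows to the model at once, limiting peak memory.
# `fake_predict` is a stand-in for an MLflow model's predict function.
def fake_predict(df: pd.DataFrame) -> list[str]:
    return [text[:30] for text in df["text"]]

def predict_in_chunks(df: pd.DataFrame, chunk_size: int = 100) -> list[str]:
    results = []
    for start in range(0, len(df), chunk_size):
        chunk = df.iloc[start:start + chunk_size]
        results.extend(fake_predict(chunk))
    return results

df = pd.DataFrame({"text": ["some long text"] * 250})
print(len(predict_in_chunks(df)))  # 250
```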