Commit 00be434

committed
Update code and text
1 parent 64010a9 commit 00be434

1 file changed: +51 -9 lines changed

articles/machine-learning/how-to-use-batch-model-openai-embeddings.md

Lines changed: 51 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -292,8 +292,28 @@ For testing our endpoint, we are going to use a sample of the dataset [BillSum:
292292
293293
# [Azure CLI](#tab/cli)
294294
295-
:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/imagenet-classifier/deploy-and-run.sh" ID="show_job_in_studio" :::
296-
295+
1. Create a YAML file, bill-summarization.yml:
296+
297+
```yml
298+
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
299+
name: bill-summarization
300+
description: A sample of a dataset for summarization of US Congressional and California state bills.
301+
type: uri_file
302+
path: data/billsum-0.csv
303+
```
304+
305+
1. Create a data asset.
306+
307+
```azurecli
308+
az ml data create -f bill-summarization.yml
309+
```
310+
311+
1. Get the ID of the data asset.
312+
313+
```azurecli
314+
DATA_ASSET_ID=$(az ml data show -n bill-summarization --label latest | jq -r .id)
315+
```
316+
297317
# [Python](#tab/python)
298318
299319
[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=configure_inputs)]
@@ -302,7 +322,9 @@ For testing our endpoint, we are going to use a sample of the dataset [BillSum:
302322
303323
# [Azure CLI](#tab/cli)
304324
305-
:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deploy-and-run.sh" ID="start_batch_scoring_job" :::
325+
```azurecli
326+
JOB_NAME=$(az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $DATA_ASSET_ID --query name -o tsv)
327+
```
306328

307329
# [Python](#tab/python)
308330

@@ -331,15 +353,35 @@ For testing our endpoint, we are going to use a sample of the dataset [BillSum:
331353

332354
# [Python](#tab/python)
333355

356+
The deployment creates a child job that implements the scoring. Get a reference to that child job:
357+
358+
```python
359+
scoring_job = list(ml_client.jobs.list(parent_job_name=job.name))[0]
360+
```
361+
362+
Download the scores:
363+
334364
[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=download_outputs)]
335365

336366
1. The output predictions look like the following.
337367

338368
```python
339-
import pandas as pd
340-
341-
embeddings = pd.read_json("named-outputs/score/embeddings.jsonl", lines=True)
342-
embeddings
369+
import pandas as pd
370+
from io import StringIO
371+
372+
# Read the output data into an object.
373+
with open('sample-output.jsonl', 'r') as f:
374+
    json_lines = f.readlines()
375+
string_io = StringIO()
376+
for line in json_lines:
377+
    string_io.write(line)
378+
string_io.seek(0)
379+
380+
# Read the data into a data frame.
381+
embeddings = pd.read_json(string_io, lines=True)
382+
383+
# Print the data frame.
384+
print(embeddings)
343385
```
344386

345387
__embeddings.jsonl__
@@ -349,14 +391,14 @@ For testing our endpoint, we are going to use a sample of the dataset [BillSum:
349391
"file": "billsum-0.csv",
350392
"row": 0,
351393
"embeddings": [
352-
[0, 0, 0 ,0 , 0, 0, 0 ]
394+
[0, 0, 0, 0, 0, 0, 0 ]
353395
]
354396
},
355397
{
356398
"file": "billsum-0.csv",
357399
"row": 1,
358400
"embeddings": [
359-
[0, 0, 0 ,0 , 0, 0, 0 ]
401+
[0, 0, 0, 0, 0, 0, 0 ]
360402
]
361403
},
362404
```
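
The updated snippet in this commit reads `sample-output.jsonl` into a data frame. As a dependency-free way to sanity-check a file of that shape, the sketch below parses it with the standard library only. It assumes one JSON object per line with the `file`, `row`, and `embeddings` keys shown in the sample above. The helper name `load_embeddings` and the sample rows it writes to a temporary file are hypothetical, for illustration only.

```python
import json
import tempfile
from pathlib import Path


def load_embeddings(path):
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Hypothetical sample rows, shaped like the output shown in the article.
sample_rows = [
    {"file": "billsum-0.csv", "row": 0, "embeddings": [[0, 0, 0, 0, 0, 0, 0]]},
    {"file": "billsum-0.csv", "row": 1, "embeddings": [[0, 0, 0, 0, 0, 0, 0]]},
]

with tempfile.TemporaryDirectory() as tmp:
    # Write the sample as JSON Lines: one object per line.
    sample_path = Path(tmp) / "sample-output.jsonl"
    sample_path.write_text(
        "\n".join(json.dumps(row) for row in sample_rows) + "\n",
        encoding="utf-8",
    )
    records = load_embeddings(sample_path)

# Report the source file, row index, and vector length for each record.
for record in records:
    vector = record["embeddings"][0]
    print(record["file"], record["row"], len(vector))
```

With real job output, the same helper could be pointed at the downloaded file instead of the temporary sample. The actual embedding length depends on the deployed model.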
