articles/ai-services/content-understanding/tutorial/RAG-tutorial.md
29 additions & 62 deletions
@@ -324,74 +324,40 @@ With the analyzers created for each modality, we can now process files to extrac
 
 ---
 
-## Analyze a file
-
-You can analyze files using the custom analyzer you created to extract the fields defined in the schema.
-
-Before running the cURL command, make the following changes to the HTTP request:
-
-# [Document](#tab/document)
-
-1. Replace `{endpoint}` and `{key}` with the endpoint and key values from your Azure portal Azure AI Services instance.
-1. Replace `{analyzerId}` with the name of the custom analyzer created earlier.
-1. Replace `{fileUrl}` with a publicly accessible URL of the file to analyze, such as a path to an Azure Storage Blob with a shared access signature (SAS) or the sample URL `https://github.com/Azure-Samples/cognitive-services-REST-api-samples/raw/master/curl/form-recognizer/rest-api/invoice.pdf`.
-
-# [Image](#tab/image)
-
-1. Replace `{endpoint}` and `{key}` with the endpoint and key values from your Azure portal Azure AI Services instance.
-1. Replace `{analyzerId}` with the name of the custom analyzer created earlier.
-1. Replace `{fileUrl}` with a publicly accessible URL of the file to analyze, such as a path to an Azure Storage Blob with a shared access signature (SAS).
-
-# [Audio](#tab/audio)
-
-1. Replace `{endpoint}` and `{key}` with the endpoint and key values from your Azure portal Azure AI Services instance.
-1. Replace `{analyzerId}` with the name of the custom analyzer created earlier.
-1. Replace `{fileUrl}` with a publicly accessible URL of the file to analyze, such as a path to an Azure Storage Blob with a shared access signature (SAS).
-
-# [Video](#tab/video)
-
-1. Replace `{endpoint}` and `{key}` with the endpoint and key values from your Azure portal Azure AI Services instance.
-1. Replace `{analyzerId}` with the name of the custom analyzer created earlier.
-1. Replace `{fileUrl}` with a publicly accessible URL of the file to analyze, such as a path to an Azure Storage Blob with a shared access signature (SAS).
-
----
-
-### POST request
-```bash
-curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2024-12-01-preview" \
-  -H "Ocp-Apim-Subscription-Key: {key}" \
-  -H "Content-Type: application/json" \
-  -d "{\"url\":\"{fileUrl}\"}"
-```
-
-### POST response
-
-The 202 (`Accepted`) response includes an `Operation-Location` header containing a URL that you can use to track the status of this asynchronous analyze operation.
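
For readers following the tutorial in Python rather than cURL, the removed POST request can be approximated as below. This is a minimal sketch, not the tutorial's own code: `build_analyze_request` and `operation_location` are hypothetical helper names, the request is only built (not sent), and the URL, headers, and body simply mirror the cURL command shown in this hunk.

```python
import json
import urllib.request


def build_analyze_request(endpoint, key, analyzer_id, file_url,
                          api_version="2024-12-01-preview"):
    """Build (but do not send) the analyze POST request from the cURL example."""
    url = (f"{endpoint.rstrip('/')}/contentunderstanding/analyzers/"
           f"{analyzer_id}:analyze?api-version={api_version}")
    # Same JSON body as the cURL -d argument: {"url": "<fileUrl>"}
    body = json.dumps({"url": file_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


def operation_location(response_headers):
    """Return the Operation-Location value; header names are case-insensitive."""
    for name, value in response_headers.items():
        if name.lower() == "operation-location":
            return value
    raise KeyError("Operation-Location header not found")
```

Sending the request (for example with `urllib.request.urlopen`) and passing the response headers to `operation_location` would give the URL used to track the asynchronous operation.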
print("Error in creating analyzer. Please double-check your analysis settings.\nIf there is a conflict, you can delete the analyzer and then recreate it, or move to the next cell and use the existing analyzer.")
 
-1. Replace `{endpoint}` and `{key}` with the endpoint and key values from your Azure portal Azure AI Services instance.
-1. Replace `{analyzerId}` with the name of the custom analyzer created earlier.
-1. Replace `{resultId}` with the `resultId` returned from the `POST` request.
The 200 (`OK`) JSON response includes a `status` field indicating the status of the operation. If the operation isn't complete, the value of `status` is `running` or `notStarted`. In such cases, you should call the API again, either manually or through a script. Wait an interval of one second or more between calls.
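
The polling behaviour described above can be sketched as a small loop. This is an illustration under stated assumptions: `poll_until_done` is a hypothetical helper, and `get_status` stands in for whatever function performs the GET request and returns the parsed JSON body as a dict. Only the `running` and `notStarted` values come from the text; any other status is treated as final.

```python
import time


def poll_until_done(get_status, interval_seconds=1.0, max_attempts=60,
                    sleep=time.sleep):
    """Call get_status() until the operation is no longer running/notStarted.

    get_status: hypothetical callable returning the JSON body (a dict with a
    "status" key) of the GET result request.
    """
    for attempt in range(max_attempts):
        result = get_status()
        # Compare case-insensitively so "notStarted" and "NotStarted" both match.
        status = str(result.get("status", "")).lower()
        if status not in ("running", "notstarted"):
            return result
        if attempt < max_attempts - 1:
            # Wait at least one second between calls, as the text recommends.
            sleep(interval_seconds)
    raise TimeoutError(f"operation still pending after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays; in normal use the default `time.sleep` applies.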
-
+---
 ### Extraction Results
 The result below demonstrates the output of content and field extraction using Azure AI Content Understanding. The JSON response contains multiple fields, each serving a specific purpose in representing the extracted data.
 
@@ -646,7 +612,8 @@ The result shows the extraction of video segments into meaningful units, spoken
 
 ## Pre-processing the Output from Content Understanding
 
-Once the data has been extracted using Azure AI Content Understanding, the next step is to prepare the analysis output for embedding within a search system. Pre-processing the output ensures that the extracted content is transformed into a format suitable for indexing and retrieval. This step involves converting the JSON output from the analyzers into structured strings, preserving both the content and metadata for seamless integration into downstream workflows.
+Once the data has been extracted using Azure AI Content Understanding, the next step is to prepare the analysis output for embedding within a search system. Pre-processing the output ensures that the extracted content is transformed into a format suitable for indexing and retrieval. This step involves converting the JSON output from the analyzers into structured strings, preserving both the content and metadata for seamless integration into downstream workflows.
+
 The following example demonstrates how to pre-process the output data from the analyzers, including documents, images, audio, and video. By converting each JSON output into a structured string, this process lays the groundwork for embedding the data into a vector-based search system, enabling efficient retrieval and enhanced RAG workflows.
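
The tutorial's own pre-processing example is not visible in this diff, so the following is only a rough sketch of the idea of turning one analyzer output into a structured string. The helper name `to_structured_string` is hypothetical, and the JSON layout it reads (`result.contents[]` entries carrying `markdown` and `fields`) is an assumption that should be checked against the actual analyzer output.

```python
import json


def to_structured_string(analysis, source_name):
    """Flatten one analyzer output (a dict) into a single string for embedding.

    Assumed layout (not confirmed by this diff): analysis["result"]["contents"]
    is a list of segments, each optionally holding "markdown" and "fields".
    """
    parts = [f"source: {source_name}"]
    contents = analysis.get("result", {}).get("contents", [])
    for index, content in enumerate(contents):
        parts.append(f"--- segment {index} ---")
        markdown = content.get("markdown")
        if markdown:
            parts.append(markdown.strip())
        fields = content.get("fields") or {}
        if fields:
            # Keep extracted fields as compact JSON so metadata stays searchable.
            parts.append("fields: " + json.dumps(fields, sort_keys=True,
                                                 ensure_ascii=False))
    return "\n".join(parts)
```

Keeping the source name and the extracted fields in the same string as the content means both are preserved when the string is later embedded and indexed.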