Commit c8667d4

Update on-your-data-configuration.md
1 parent bc83649 commit c8667d4

1 file changed: +3 -11 lines changed

articles/ai-services/openai/how-to/on-your-data-configuration.md

Lines changed: 3 additions & 11 deletions
@@ -25,20 +25,12 @@ Use this article to learn how to configure networking and access when using Azur

 When you use Azure OpenAI On Your Data to ingest data from Azure blob storage, local files or URLs into Azure AI Search, the following process is used to process the data.

-:::image type="content" source="../media/use-your-data/ingestion-architecture.png" alt-text="A diagram showing the process of ingesting data." lightbox="../media/use-your-data/ingestion-architecture.png":::
+:::image type="content" source="../media/use-your-data/ingestion-architecture.png" alt-text="A diagram showing the process of ingesting data." lightbox="../media/use-your-data/ingestion-architecture-iv.png":::

 * Steps 1 and 2 are only used for file upload.
 * Downloading URLs to your blob storage is not illustrated in this diagram. After web pages are downloaded from the internet and uploaded to blob storage, steps 3 onward are the same.
-* Two indexers, two indexes, two data sources and a [custom skill](/azure/search/cognitive-search-custom-skill-interface) are created in the Azure AI Search resource.
-* The chunks container is created in the blob storage.
-* If the schedule triggers the ingestion, the ingestion process starts from step 7.
-* Azure OpenAI's `preprocessing-jobs` API implements the [Azure AI Search customer skill web API protocol](/azure/search/cognitive-search-custom-skill-web-api), and processes the documents in a queue.
-* Azure OpenAI:
-1. Internally uses the first indexer created earlier to crack the documents.
-1. Uses a heuristic-based algorithm to perform chunking. It honors table layouts and other formatting elements in the chunk boundary to ensure the best chunking quality.
-1. If you choose to enable vector search, Azure OpenAI uses the selected embedding setting to vectorize the chunks.
-* When all the data that the service is monitoring are processed, Azure OpenAI triggers the second indexer.
-* The indexer stores the processed data into an Azure AI Search service.
+* One indexer, one index, and one data sources in the Azure AI Search resource are created using prebuilt skills and [integrated vectorization](/azure/search/vector-search-integrated-vectorization.md).
+* Azure AI Search handles the extraction, chunking, and vectorization of chunked documents through integrated vectorization. If a scheduling interval is specified, the indexer will run accordingly.

 For the managed identities used in service calls, only system assigned managed identities are supported. User assigned managed identities aren't supported.
