articles/ai-services/openai/how-to/on-your-data-configuration.md
4 additions & 12 deletions
@@ -29,16 +29,8 @@ When you use Azure OpenAI On Your Data to ingest data from Azure blob storage, l
* Steps 1 and 2 are only used for file upload.
* Downloading URLs to your blob storage is not illustrated in this diagram. After web pages are downloaded from the internet and uploaded to blob storage, steps 3 onward are the same.
- * Two indexers, two indexes, two data sources, and a [custom skill](/azure/search/cognitive-search-custom-skill-interface) are created in the Azure AI Search resource.
- * The chunks container is created in the blob storage.
- * If the schedule triggers the ingestion, the ingestion process starts from step 7.
- * Azure OpenAI's `preprocessing-jobs` API implements the [Azure AI Search custom skill web API protocol](/azure/search/cognitive-search-custom-skill-web-api), and processes the documents in a queue.
- * Azure OpenAI:
-     1. Internally uses the first indexer created earlier to crack the documents.
-     1. Uses a heuristic-based algorithm to perform chunking. It honors table layouts and other formatting elements in the chunk boundaries to ensure the best chunking quality.
-     1. If you choose to enable vector search, Azure OpenAI uses the selected embedding setting to vectorize the chunks.
- * When all the data that the service is monitoring is processed, Azure OpenAI triggers the second indexer.
- * The indexer stores the processed data into an Azure AI Search service.
+ * One indexer, one index, and one data source in the Azure AI Search resource are created using prebuilt skills and [integrated vectorization](/azure/search/vector-search-integrated-vectorization.md).
+ * Azure AI Search handles extraction, chunking, and vectorization of the documents through integrated vectorization. If a scheduling interval is specified, the indexer runs accordingly.
For the managed identities used in service calls, only system assigned managed identities are supported. User assigned managed identities aren't supported.
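The added lines describe the integrated vectorization flow: one skillset chunks each document and calls an Azure OpenAI embedding deployment, and one scheduled indexer drives the run. Azure OpenAI On Your Data provisions these objects for you; the sketch below only illustrates the approximate shape of such a skillset and indexer against the Azure AI Search REST API. All resource names, the embedding deployment, and the API version are placeholder assumptions, and the index, data source, and index projections are omitted for brevity, so verify the property names against the current Azure AI Search REST reference.

```python
# Illustrative only: On Your Data creates equivalent objects automatically.
# All names are placeholders; verify schema details against the current
# Azure AI Search REST API version.
import requests

SEARCH_ENDPOINT = "https://contoso-search.search.windows.net"  # placeholder
SEARCH_ADMIN_KEY = "<search-admin-key>"                        # placeholder
API_VERSION = "2024-07-01"                                     # assumed version
HEADERS = {"Content-Type": "application/json", "api-key": SEARCH_ADMIN_KEY}

# A skillset that chunks documents and vectorizes the chunks with an
# Azure OpenAI embedding deployment (integrated vectorization).
skillset = {
    "name": "contoso-skillset",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
            "textSplitMode": "pages",
            "maximumPageLength": 2000,
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "textItems", "targetName": "pages"}],
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
            "context": "/document/pages/*",
            "resourceUri": "https://contoso-openai.openai.azure.com",  # placeholder
            "deploymentId": "text-embedding-ada-002",                  # placeholder
            "modelName": "text-embedding-ada-002",                     # placeholder
            "inputs": [{"name": "text", "source": "/document/pages/*"}],
            "outputs": [{"name": "embedding", "targetName": "vector"}],
        },
    ],
}

# A single indexer that applies the skillset and runs on a schedule.
indexer = {
    "name": "contoso-indexer",
    "dataSourceName": "contoso-blob-datasource",  # placeholder
    "targetIndexName": "contoso-index",           # placeholder
    "skillsetName": "contoso-skillset",
    "schedule": {"interval": "PT1H"},             # hourly ingestion runs
}

for path, body in [("skillsets/contoso-skillset", skillset),
                   ("indexers/contoso-indexer", indexer)]:
    resp = requests.put(f"{SEARCH_ENDPOINT}/{path}?api-version={API_VERSION}",
                        headers=HEADERS, json=body)
    resp.raise_for_status()
    print(path, resp.status_code)
```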
@@ -167,7 +159,7 @@ To set the managed identities via the management API, see [the management API re
### Enable trusted service
- To allow your Azure AI Search to call your Azure OpenAI `preprocessing-jobs` as custom skill web API, while Azure OpenAI has no public network access, you need to set up Azure OpenAI to bypass Azure AI Search as a trusted service based on managed identity. Azure OpenAI identifies the traffic from your Azure AI Search by verifying the claims in the JSON Web Token (JWT). Azure AI Search must use the system assigned managed identity authentication to call the custom skill web API.
+ To allow your Azure AI Search to call your Azure OpenAI embedding model, while Azure OpenAI has no public network access, you need to set up Azure OpenAI to bypass Azure AI Search as a trusted service based on managed identity. Azure OpenAI identifies the traffic from your Azure AI Search by verifying the claims in the JSON Web Token (JWT). Azure AI Search must use the system assigned managed identity authentication to call the embedding endpoint.
Set `networkAcls.bypass` as `AzureServices` from the management API. For more information, see [Virtual networks article](/azure/ai-services/cognitive-services-virtual-networks?tabs=portal#grant-access-to-trusted-azure-services-for-azure-openai).
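As a rough illustration of that management API call, the sketch below patches the Azure OpenAI resource so that public access stays denied while trusted Azure services such as Azure AI Search can still reach it. The subscription, resource group, and account names are placeholders, and the API version is an assumption; adjust both to your environment.

```python
# Illustrative sketch: set networkAcls.bypass to AzureServices on the
# Azure OpenAI resource via Azure Resource Manager. Names are placeholders.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION = "<subscription-id>"    # placeholder
RESOURCE_GROUP = "<resource-group>"   # placeholder
ACCOUNT = "<azure-openai-account>"    # placeholder
API_VERSION = "2023-05-01"            # assumed; check for newer versions

token = DefaultAzureCredential().get_token(
    "https://management.azure.com/.default").token

url = (
    "https://management.azure.com"
    f"/subscriptions/{SUBSCRIPTION}/resourceGroups/{RESOURCE_GROUP}"
    f"/providers/Microsoft.CognitiveServices/accounts/{ACCOUNT}"
    f"?api-version={API_VERSION}"
)

body = {
    "properties": {
        "networkAcls": {
            "defaultAction": "Deny",    # public network access stays blocked
            "bypass": "AzureServices",  # trusted Azure services may still call in
        }
    }
}

resp = requests.patch(url, json=body,
                      headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
print(resp.json()["properties"]["networkAcls"])
```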
@@ -268,7 +260,7 @@ So far you have already setup each resource work independently. Next you need to
|`Search Index Data Reader`| Azure OpenAI | Azure AI Search | Inference service queries the data from the index. |
|`Search Service Contributor`| Azure OpenAI | Azure AI Search | Inference service queries the index schema for auto fields mapping. Data ingestion service creates index, data sources, skill set, indexer, and queries the indexer status. |
|`Storage Blob Data Contributor`| Azure OpenAI | Storage Account | Reads from the input container, and writes the preprocessed result to the output container. |
|`Cognitive Services OpenAI Contributor`| Azure AI Search | Azure OpenAI | Allows the Azure AI Search resource access to the Azure OpenAI embedding endpoint. |
|`Storage Blob Data Reader`| Azure AI Search | Storage Account | Reads document blobs and chunk blobs. |
|`Reader`| Azure AI Foundry Project | Azure Storage Private Endpoints (Blob & File) | Read search indexes created in blob storage within an Azure AI Foundry Project. |
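Each row in this table corresponds to a role assignment on the target resource. The sketch below shows one such assignment made with the Azure RBAC REST API: it grants a built-in role to a resource's system assigned managed identity at a given scope. The scope, principal ID, and role definition ID are placeholders you would look up for your own resources (for example with `az role definition list --name "Search Index Data Reader"`).

```python
# Illustrative sketch: grant one of the roles in the table above to a
# system assigned managed identity, using the Azure RBAC REST API.
# Scope, principal ID, and role definition ID values are placeholders.
import uuid
import requests
from azure.identity import DefaultAzureCredential

ARM = "https://management.azure.com"
API_VERSION = "2022-04-01"  # assumed RBAC API version

token = DefaultAzureCredential().get_token(f"{ARM}/.default").token
headers = {"Authorization": f"Bearer {token}"}

def assign_role(scope: str, role_definition_id: str, principal_id: str) -> None:
    """Create a role assignment for principal_id at scope."""
    # The assignment name must be a new GUID.
    assignment = requests.put(
        f"{ARM}{scope}/providers/Microsoft.Authorization/roleAssignments/{uuid.uuid4()}",
        params={"api-version": API_VERSION},
        headers=headers,
        json={"properties": {"roleDefinitionId": role_definition_id,
                             "principalId": principal_id}},
    )
    assignment.raise_for_status()

# Example: let Azure OpenAI's managed identity query the search index
# (the Search Index Data Reader row above). All values are placeholders.
assign_role(
    scope="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Search/searchServices/<search-service>",
    role_definition_id="<role-definition-id>",      # look up with az role definition list
    principal_id="<azure-openai-principal-id>",     # system assigned identity object ID
)
```

Repeating this call for each row, with the assignee's principal ID and the resource scope from that row, reproduces the full set of assignments.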