articles/search/tutorial-document-extraction-image-verbalization.md
Lines changed: 28 additions & 28 deletions
@@ -51,7 +51,7 @@ This tutorial demonstrates a lower-cost approach for indexing multimodal content
 
 ## Prepare data
 
-The following instructions apply to Azure Storage which provides the sample data and also hosts the knowledge store. A search service identity needs read access to Azure Storage to retrieve the sample data, and it needs write access to create the knowledge store. The search service creates the container for cropped images during skillset processing.
+The following instructions apply to Azure Storage which provides the sample data and also hosts the knowledge store. A search service identity needs read access to Azure Storage to retrieve the sample data, and it needs write access to create the knowledge store. The search service creates the container for cropped images during skillset processing, using the name you provide in an environment variable.
 
 1. Download the following sample PDF: [sustainable-ai-pdf](https://cdn-dynmedia-1.microsoft.com/is/content/microsoftcorp/microsoft/msc/documents/presentations/CSR/Accelerating-Sustainability-with-AI-2025.pdf)
 
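The changed paragraph above says the search service creates the cropped-image container using a name you supply. As a minimal sketch, assuming you want to check that name locally before running the skillset, the following Python helper applies the Azure Blob Storage container naming rules (3 to 63 characters; lowercase letters, digits, and hyphens; starts and ends with a letter or digit; no consecutive hyphens). The function name `is_valid_container_name` and the sample name `cropped-images` are illustrative and don't come from the tutorial.

```python
import re

# Azure Blob Storage container naming rules:
#   - 3 to 63 characters
#   - lowercase letters, digits, and hyphens only
#   - starts and ends with a letter or digit
#   - no consecutive hyphens
_CONTAINER_NAME = re.compile(r"[a-z0-9](?:[a-z0-9]|-(?=[a-z0-9])){2,62}")


def is_valid_container_name(name: str) -> bool:
    """Return True if `name` satisfies the container naming rules."""
    return _CONTAINER_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical names, used only to exercise the check.
    for candidate in ("cropped-images", "Cropped_Images", "a--b"):
        print(candidate, is_valid_container_name(candidate))
```

A placeholder left in the REST file by mistake (uppercase letters, more than 3 characters) fails this check, which can catch a forgotten substitution before the indexer runs.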
@@ -85,7 +85,7 @@ The following instructions apply to Azure Storage which provides the sample data
 
 ### Copy a search service URL and API key
 
-For this tutorial, your REST client connection to Azure AI Search requires an endpoint and an API key. You can get these values from the Azure portal. For alternative connection methods, see [Connect to a search service](search-get-started-rbac.md).
+For this tutorial, your local REST client connection to Azure AI Search requires an endpoint and an API key. You can get these values from the Azure portal. For alternative connection methods, see [Connect to a search service](search-get-started-rbac.md).
 
 1. Sign in to the [Azure portal](https://portal.azure.com), navigate to the search service **Overview** page, and copy the URL. An example endpoint might look like `https://mydemo.search.windows.net`.
 
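To illustrate how the endpoint and API key from this hunk end up in a request, here is a minimal Python sketch that builds (but doesn't send) the same kind of `docs/search` call that the `.rest` requests later in this diff make. The helper name `build_search_request` is hypothetical, the key is a placeholder, and the index name, `api-version`, and `{"search": "*"}` body are copied from the query hunks of this file.

```python
import json
import urllib.request

# API version used by the requests in this diff.
API_VERSION = "2025-05-01-preview"


def build_search_request(search_url: str, api_key: str, index_name: str,
                         body: dict) -> urllib.request.Request:
    """Build a POST request for the docs/search endpoint without sending it."""
    url = (f"{search_url.rstrip('/')}/indexes/{index_name}"
           f"/docs/search?api-version={API_VERSION}")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


req = build_search_request(
    "https://mydemo.search.windows.net",   # example endpoint from the text
    "PUT-YOUR-ADMIN-KEY-HERE",             # placeholder, not a real key
    "doc-extraction-image-verbalization-index",
    {"search": "*"},
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live search service.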
@@ -100,8 +100,8 @@ For this tutorial, your REST client connection to Azure AI Search requires an en
 1. Provide values for variables used in the request.
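The request hunks that follow rename `{{baseUrl}}` to `{{searchUrl}}` and `{{apiKey}}` to `{{searchApiKey}}`. If you have a local `.rest` file written against the older variable names, a rough Python sketch like the one below could apply the same rename. It assumes the variables are defined as `@name=value` lines, the style visible for `@imageProjectionContainer` in the second file of this diff; the definitions of the renamed variables themselves aren't shown here, and the function name is hypothetical.

```python
import re

# Old -> new variable names, as renamed in the request hunks of this diff.
RENAMES = {"baseUrl": "searchUrl", "apiKey": "searchApiKey"}


def rename_rest_variables(text: str) -> str:
    """Rename {{var}} references and @var= definitions in REST-client file text."""
    for old, new in RENAMES.items():
        # References inside requests, e.g. POST {{baseUrl}}/indexes/...
        text = text.replace("{{" + old + "}}", "{{" + new + "}}")
        # Definitions at the start of a line (assumed form: @name=value).
        text = re.sub(rf"^@{old}(?=\s*=)", f"@{new}", text, flags=re.MULTILINE)
    return text


sample = (
    "@baseUrl=https://mydemo.search.windows.net\n"
    "@apiKey=PUT-YOUR-ADMIN-KEY-HERE\n"
    "POST {{baseUrl}}/indexers/my-indexer/run HTTP/1.1\n"
    "api-key: {{apiKey}}\n"
)
print(rename_rest_variables(sample))
```

The `api-key:` header name is left alone because only the `{{...}}` references and `@...=` definitions are matched, and running the function twice gives the same result as running it once.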
@@ -638,9 +638,9 @@ You can start searching as soon as the first document is loaded.
 
 ```http
 ### Query the index
-POST {{baseUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
+POST {{searchUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
 Content-Type: application/json
-api-key: {{apiKey}}
+api-key: {{searchApiKey}}
 
 {
 "search": "*",
@@ -689,9 +689,9 @@ Here are some examples of other queries:
 
 ```http
 ### Query for only images
-POST {{baseUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
+POST {{searchUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
 Content-Type: application/json
-api-key: {{apiKey}}
+api-key: {{searchApiKey}}
 
 {
 "search": "*",
@@ -702,9 +702,9 @@ POST {{baseUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?ap
 
 ```http
 ### Query for text or images with content related to energy, returning the id, parent document, and text (extracted text for text chunks and verbalized image text for images), and the content path where the image is saved in the knowledge store (only populated for images)
-POST {{baseUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
+POST {{searchUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
 Content-Type: application/json
-api-key: {{apiKey}}
+api-key: {{searchApiKey}}
 
 {
 "search": "energy",
@@ -719,20 +719,20 @@ Indexers can be reset to clear the high-water mark, which allows a full rerun. T
 
 ```http
 ### Reset the indexer
-POST {{baseUrl}}/indexers/doc-extraction-image-verbalization-indexer/reset?api-version=2025-05-01-preview HTTP/1.1
-api-key: {{apiKey}}
+POST {{searchUrl}}/indexers/doc-extraction-image-verbalization-indexer/reset?api-version=2025-05-01-preview HTTP/1.1
+api-key: {{searchApiKey}}
 ```
 
 ```http
 ### Run the indexer
-POST {{baseUrl}}/indexers/doc-extraction-image-verbalization-indexer/run?api-version=2025-05-01-preview HTTP/1.1
-api-key: {{apiKey}}
+POST {{searchUrl}}/indexers/doc-extraction-image-verbalization-indexer/run?api-version=2025-05-01-preview HTTP/1.1
+api-key: {{searchApiKey}}
 ```
 
 ```http
 ### Check indexer status
-GET {{baseUrl}}/indexers/doc-extraction-image-verbalization-indexer/status?api-version=2025-05-01-preview HTTP/1.1
-api-key: {{apiKey}}
+GET {{searchUrl}}/indexers/doc-extraction-image-verbalization-indexer/status?api-version=2025-05-01-preview HTTP/1.1
articles/search/tutorial-document-extraction-multimodal-embeddings.md
Lines changed: 46 additions & 32 deletions
@@ -37,7 +37,7 @@ This tutorial demonstrates a lower-cost approach for indexing multimodal content
 
 ## Prerequisites
 
-+ An [Azure AI services multi-service account](/azure/ai-services/multi-service-resource#azure-ai-services-resource-for-azure-ai-search-skills). This account provides access to the Azure AI Vision multimodal embedding model used in this tutorial. You must use an Azure AI multi-service account for skillset access to this resource.
++ [Azure AI services multi-service account](/azure/ai-services/multi-service-resource#azure-ai-services-resource-for-azure-ai-search-skills). This account provides access to the Azure AI Vision multimodal embedding model used in this tutorial. You must use an Azure AI multi-service account for skillset access to this resource.
 
 + [Azure AI Search](search-create-service-portal.md). [Configure your search service](search-manage.md) for role-based access control and a managed identity for connections to Azure Storage and Azure AI Vision. Your service must be on the Basic tier or higher. This tutorial isn't supported on the Free tier. The search service must also be in the same region as your multi-service account.
 
@@ -51,7 +51,7 @@ This tutorial demonstrates a lower-cost approach for indexing multimodal content
 
 ## Prepare data
 
-The following instructions apply to Azure Storage which provides the sample data and also hosts the knowledge store. A search service identity needs read access to Azure Storage to retrieve the sample data, and it needs write access to create the knowledge store. The search service creates the container for cropped images during skillset processing.
+The following instructions apply to Azure Storage which provides the sample data and also hosts the knowledge store. A search service identity needs read access to Azure Storage to retrieve the sample data, and it needs write access to create the knowledge store. The search service creates the container for cropped images during skillset processing, using the name you provide in an environment variable.
 
 1. Download the following sample PDF: [sustainable-ai-pdf](https://cdn-dynmedia-1.microsoft.com/is/content/microsoftcorp/microsoft/msc/documents/presentations/CSR/Accelerating-Sustainability-with-AI-2025.pdf)
 
@@ -83,45 +83,59 @@ The following instructions apply to Azure Storage which provides the sample data
 }
 ```
 
-### Copy a search service URL and API key
+## Prepare models
 
-For this tutorial, your REST client connection to Azure AI Search requires an endpoint and an API key. You can get these values from the Azure portal. For alternative connection methods, see [Connect to a search service](search-get-started-rbac.md).
+This tutorial assumes you have an existing Azure AI multiservice account through which the skill calls the Azure AI Vision multimodal 4.0 embedding model. The search service connects to the model during skillset processing using its managed identity. This section gives you guidance and links for assigning roles for authorized access.
 
-1. Sign in to the [Azure portal](https://portal.azure.com), navigate to the search service **Overview** page, and copy the URL. An example endpoint might look like `https://mydemo.search.windows.net`.
+1. Sign in to the Azure portal (not the Foundry portal) and find the Azure AI multiservice account. Make sure it's in a region that provides the [multimodal 4.0 API](/azure/ai-services/computer-vision/overview-image-analysis#region-availability).
 
-1. Under **Settings** > **Keys**, copy an admin key. Admin keys are used to add, modify, and delete objects. There are two interchangeable admin keys. Copy either one.
+1. Select **Access control (IAM)**.
 
-:::image type="content" source="media/search-get-started-rest/get-url-key.png" alt-text="Screenshot of the URL and API keys in the Azure portal.":::
+1. Select **Add** and then **Add role assignment**.
+
+1. Search for **Cognitive Services User** and then select it.
+
+1. Choose **Managed identity** and then assign your [search service managed identity](search-howto-managed-identities-data-sources.md).
 
 ## Set up your REST file
 
+For this tutorial, your local REST client connection to Azure AI Search requires an endpoint and an API key. You can get these values from the Azure portal. For alternative connection methods, see [Connect to a search service](search-get-started-rbac.md).
+
+For other connections, the search service uses the role assignments you previously defined.
+
 1. Start Visual Studio Code and create a new file.
 
 1. Provide values for variables used in the request.
 @imageProjectionContainer=PUT-YOUR-IMAGE-PROJECTION-CONTAINER-HERE (Azure AI Search creates this container for you during skills processing)
 ```
 
-1. Save the file using a `.rest` or `.http` file extension.
+1. Save the file using a `.rest` or `.http` file extension. For help with the REST client, see [Quickstart: Full-text search using REST](search-get-started-text.md).
 
-For help with the REST client, see [Quickstart: Full-text search using REST](search-get-started-text.md).
+To get the Azure AI Search endpoint and API key:
+
+1. Sign in to the [Azure portal](https://portal.azure.com), navigate to the search service **Overview** page, and copy the URL. An example endpoint might look like `https://mydemo.search.windows.net`.
+
+1. Under **Settings** > **Keys**, copy an admin key. Admin keys are used to add, modify, and delete objects. There are two interchangeable admin keys. Copy either one.
+
+:::image type="content" source="media/search-get-started-rest/get-url-key.png" alt-text="Screenshot of the URL and API keys in the Azure portal.":::
 
 ## Create a data source
 
 [Create Data Source (REST)](/rest/api/searchservice/data-sources/create) creates a data source connection that specifies what data to index.
 
 ```http
 ### Create a data source
-POST {{baseUrl}}/datasources?api-version=2025-05-01-preview HTTP/1.1
+POST {{searchUrl}}/datasources?api-version=2025-05-01-preview HTTP/1.1
 Content-Type: application/json
-api-key: {{apiKey}}
+api-key: {{searchApiKey}}
 
 {
 "name": "doc-extraction-multimodal-embedding-ds",
@@ -186,9 +200,9 @@ For nested JSON, the index fields must be identical to the source fields. Curren
 
 ```http
 ### Create an index
-POST {{baseUrl}}/indexes?api-version=2025-05-01-preview HTTP/1.1
+POST {{searchUrl}}/indexes?api-version=2025-05-01-preview HTTP/1.1
@@ -596,9 +610,9 @@ You can start searching as soon as the first document is loaded.
 
 ```http
 ### Query the index
-POST {{baseUrl}}/indexes/doc-extraction-multimodal-embedding-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
+POST {{searchUrl}}/indexes/doc-extraction-multimodal-embedding-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
 Content-Type: application/json
-api-key: {{apiKey}}
+api-key: {{searchApiKey}}
 
 {
 "search": "*",
@@ -644,9 +658,9 @@ For filters, you can also use Logical operators (and, or, not) and comparison op
 
 ```http
 ### Query for only images
-POST {{baseUrl}}/indexes/doc-extraction-multimodal-embedding-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
+POST {{searchUrl}}/indexes/doc-extraction-multimodal-embedding-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
 Content-Type: application/json
-api-key: {{apiKey}}
+api-key: {{searchApiKey}}
 
 {
 "search": "*",
@@ -657,9 +671,9 @@ POST {{baseUrl}}/indexes/doc-extraction-multimodal-embedding-index/docs/search?a
 
 ```http
 ### Query for text or images with content related to energy, returning the id, parent document, and text (only populated for text chunks), and the content path where the image is saved in the knowledge store (only populated for images)
-POST {{baseUrl}}/indexes/doc-extraction-multimodal-embedding-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
+POST {{searchUrl}}/indexes/doc-extraction-multimodal-embedding-index/docs/search?api-version=2025-05-01-preview HTTP/1.1
 Content-Type: application/json
-api-key: {{apiKey}}
+api-key: {{searchApiKey}}
 
 
 {
@@ -675,20 +689,20 @@ Indexers can be reset to clear the high-water mark, which allows a full rerun. T
 
 ```http
 ### Reset the indexer
-POST {{baseUrl}}/indexers/doc-extraction-multimodal-embedding-indexer/reset?api-version=2025-05-01-preview HTTP/1.1
-api-key: {{apiKey}}
+POST {{searchUrl}}/indexers/doc-extraction-multimodal-embedding-indexer/reset?api-version=2025-05-01-preview HTTP/1.1
+api-key: {{searchApiKey}}
 ```
 
 ```http
 ### Run the indexer
-POST {{baseUrl}}/indexers/doc-extraction-multimodal-embedding-indexer/run?api-version=2025-05-01-preview HTTP/1.1
-api-key: {{apiKey}}
+POST {{searchUrl}}/indexers/doc-extraction-multimodal-embedding-indexer/run?api-version=2025-05-01-preview HTTP/1.1
+api-key: {{searchApiKey}}
 ```
 
 ```http
 ### Check indexer status
-GET {{baseUrl}}/indexers/doc-extraction-multimodal-embedding-indexer/status?api-version=2025-05-01-preview HTTP/1.1
-api-key: {{apiKey}}
+GET {{searchUrl}}/indexers/doc-extraction-multimodal-embedding-indexer/status?api-version=2025-05-01-preview HTTP/1.1