
Commit 7a2f039

Merge pull request #1574 from HeidiSteen/release-azure-search
Portal wizard data sources
2 parents f7a7972 + ee70fdd commit 7a2f039

17 files changed, with 458 additions and 118 deletions

articles/search/.openpublishing.redirection.search.json

Lines changed: 20 additions & 0 deletions
@@ -1,5 +1,25 @@
 {
   "redirections": [
+    {
+      "source_path_from_root": "/articles/search/cognitive-search-quickstart-blob.md",
+      "redirect_url": "/azure/search/search-get-started-skillset",
+      "redirect_document_id": true
+    },
+    {
+      "source_path_from_root": "/articles/search/search-howto-connecting-azure-sql-database-to-azure-search-using-indexers.md",
+      "redirect_url": "/azure/search/search-how-to-index-sql-database",
+      "redirect_document_id": true
+    },
+    {
+      "source_path_from_root": "/articles/search/search-howto-connecting-azure-sql-mi-to-azure-search-using-indexers.md",
+      "redirect_url": "/azure/search/search-how-to-index-sql-managed-instance",
+      "redirect_document_id": true
+    },
+    {
+      "source_path_from_root": "/articles/search/search-howto-connecting-azure-sql-iaas-to-azure-search-using-indexers.md",
+      "redirect_url": "/azure/search/search-how-to-index-sql-server",
+      "redirect_document_id": true
+    },
     {
       "source_path_from_root": "/articles/search/index-projections-concept-intro.md",
       "redirect_url": "/azure/search/search-how-to-define-index-projections",
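
The redirection entries in this hunk share a small, regular shape, so a reviewer could sanity-check such a file before merging. The following is a minimal sketch of that idea, not part of the publishing toolchain: the helper name `check_redirections` and the choice of required keys are assumptions made for illustration.

```python
import json

# Assumption: these two keys are treated as mandatory for this sketch only.
REQUIRED_KEYS = {"source_path_from_root", "redirect_url"}


def check_redirections(doc: dict) -> list[str]:
    """Return human-readable problems found in a redirection document."""
    problems = []
    seen_sources = set()
    for i, entry in enumerate(doc.get("redirections", [])):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
            continue
        source = entry["source_path_from_root"]
        if source in seen_sources:
            # Two redirects from the same source path would conflict.
            problems.append(f"entry {i}: duplicate source {source}")
        seen_sources.add(source)
    return problems


# Two of the entries added by this commit, used as sample input.
sample = json.loads("""
{
  "redirections": [
    {
      "source_path_from_root": "/articles/search/cognitive-search-quickstart-blob.md",
      "redirect_url": "/azure/search/search-get-started-skillset",
      "redirect_document_id": true
    },
    {
      "source_path_from_root": "/articles/search/search-howto-connecting-azure-sql-iaas-to-azure-search-using-indexers.md",
      "redirect_url": "/azure/search/search-how-to-index-sql-server",
      "redirect_document_id": true
    }
  ]
}
""")

print(check_redirections(sample))
```

An empty list means no problems were found under these assumed rules; a real check would likely also confirm that each redirect target exists.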

articles/search/cognitive-search-skill-image-analysis.md

Lines changed: 3 additions & 1 deletion
@@ -23,11 +23,13 @@ This skill uses the machine learning models provided by [Azure AI Vision](/azure
 + The file size of the image must be less than 4 megabytes (MB)
 + The dimensions of the image must be greater than 50 x 50 pixels

+Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
+
 This skill is implemented using the [AI Image Analysis API](/azure/ai-services/computer-vision/overview-image-analysis) version 3.2. If your solution requires calling a newer version of that service API (such as version 4.0), consider implementing through [Web API custom skill](cognitive-search-custom-skill-web-api.md).

 > [!NOTE]
 > This skill is bound to Azure AI services and requires [a billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing [Azure AI services pay-as-you go price](https://azure.microsoft.com/pricing/details/cognitive-services/).
->
+>
 > In addition, image extraction is [billable by Azure AI Search](https://azure.microsoft.com/pricing/details/search/).
 >
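
The two image requirements shown as context in this hunk (file size under 4 MB, dimensions greater than 50 x 50 pixels) can be expressed as a small pre-flight check. This is a sketch under stated assumptions: the function name `meets_image_limits` is hypothetical, 1 MB is taken as 1024 * 1024 bytes, and "greater than 50 x 50" is read as both width and height exceeding 50.

```python
# Assumption: 1 MB = 1024 * 1024 bytes; the article does not spell this out.
MAX_IMAGE_BYTES = 4 * 1024 * 1024
# Assumption: both width and height must exceed this value.
MIN_DIMENSION_PX = 50


def meets_image_limits(size_bytes: int, width: int, height: int) -> bool:
    """Return True if an image satisfies the documented size and dimension limits."""
    small_enough = size_bytes < MAX_IMAGE_BYTES
    large_enough = width > MIN_DIMENSION_PX and height > MIN_DIMENSION_PX
    return small_enough and large_enough


print(meets_image_limits(size_bytes=250_000, width=800, height=600))
print(meets_image_limits(size_bytes=250_000, width=50, height=600))
```

Width and height are passed in directly so the sketch stays dependency-free; in practice they would come from whatever image library the pipeline already uses.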

articles/search/cognitive-search-skill-ocr.md

Lines changed: 3 additions & 1 deletion
@@ -22,14 +22,16 @@ An OCR skill uses the machine learning models provided by [Azure AI Vision](/azu

 + For Greek and Serbian Cyrillic, the legacy [OCR in version 3.2](https://github.com/Azure/azure-rest-api-specs/tree/master/specification/cognitiveservices/data-plane/ComputerVision/stable/v3.2) API is used.

-The **OCR** skill extracts text from image files. Supported file formats include:
+The **OCR** skill extracts text from image files and embedded images. Supported file formats include:

 + .JPEG
 + .JPG
 + .PNG
 + .BMP
 + .TIFF

+Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
+
 > [!NOTE]
 > This skill is bound to Azure AI services and requires [a billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing [Azure AI services pay-as-you go price](https://azure.microsoft.com/pricing/details/cognitive-services/).
 >
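
The format list in this hunk lends itself to a simple file filter ahead of indexing. The sketch below covers only the five extensions visible in the diff; the helper name `is_supported_ocr_file` and the case-insensitive comparison are assumptions, and embedded images in PDFs are outside its scope.

```python
from pathlib import PurePosixPath

# Only the extensions listed in the hunk above.
SUPPORTED_OCR_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".tiff"}


def is_supported_ocr_file(blob_name: str) -> bool:
    """Return True if the blob name ends in one of the listed image extensions."""
    # Assumption: extension matching is case-insensitive.
    return PurePosixPath(blob_name).suffix.lower() in SUPPORTED_OCR_EXTENSIONS


for name in ["scans/receipt.PNG", "scans/notes.txt", "photo.jpeg"]:
    print(name, is_supported_ocr_file(name))
```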
4 binary files changed (previews not loaded): -9.39 KB, 20.3 KB, 129 KB, -40 Bytes

articles/search/search-get-started-portal-import-vectors.md

Lines changed: 22 additions & 27 deletions
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
 ms.custom:
   - build-2024
 ms.topic: quickstart
-ms.date: 11/19/2024
+ms.date: 11/20/2024
 ---

 # Quickstart: Vectorize text and images by using the Azure portal
@@ -27,15 +27,15 @@ This quickstart helps you get started with [integrated vectorization](vector-sea

 ### Supported data sources

-+ [Azure Data Lake Storage (ADLS) Gen2](/azure/storage/blobs/create-data-lake-storage-account) (a storage account with a hierarchical namespace).
+The **Import and vectorize data** wizard [supports a wide range of Azure data sources](search-import-data-portal.md#supported-data-sources-and-scenarios), but this quickstart provides steps for just those data sources that work with whole files:

-+ [Azure Storage](/azure/storage/common/storage-account-create) for blobs, files, and tables. Azure Storage must be a standard performance (general-purpose v2) account. Access tiers can be hot, cool, and cold.
++ [Azure Blob Storage](search-howto-indexing-azure-blob-storage.md) for blobs and tables. Azure Storage must be a standard performance (general-purpose v2) account. Access tiers can be hot, cool, and cold.

-+ [Azure Cosmos DB](/azure/cosmos-db/nosql/quickstart-portal) for NoSQL, Mongo DB, and Apache Gremlin.
++ [Azure Data Lake Storage (ADLS) Gen2](/azure/storage/blobs/create-data-lake-storage-account) (an Azure Storage account with a hierarchical namespace enabled). You can confirm that you have Data Lake Storage by checking the **Properties** tab on the **Overview** page.

-+ [Azure SQL Database](/azure/azure-sql/database/single-database-create-quickstart), [Azure SQL Managed Instance](/azure/azure-sql/managed-instance/instance-create-quickstart), and Azure SQL Server virtual machines.
+:::image type="content" source="media/search-get-started-portal-import-vectors/data-lake-storage.png" alt-text="Screenshot of the storage account properties page showing Data Lake Storage.":::

-+ [OneLake lakehouse](search-how-to-index-onelake-files.md).
++ [OneLake lakehouse (preview)](search-how-to-index-onelake-files.md).

 ### Supported embedding models

@@ -45,9 +45,9 @@ Use an embedding model on an Azure AI platform in the [same region as Azure AI S
 |---|---|
 | [Azure OpenAI Service](https://aka.ms/oai/access) | text-embedding-ada-002, text-embedding-3-large, or text-embedding-3-small. |
 | [Azure AI Studio model catalog](/azure/ai-studio/what-is-ai-studio) | Azure, Cohere, and Facebook embedding models. |
-| [Azure AI services multi-service account](/azure/ai-services/multi-service-resource) | [Azure AI Vision multimodal](/azure/ai-services/computer-vision/how-to/image-retrieval) for image and text vectorization. Azure AI Vision multimodal is available in selected regions. [Check the documentation](/azure/ai-services/computer-vision/how-to/image-retrieval?tabs=csharp) for an updated list. **To use this resource, the account must be in an available region and in the same region as Azure AI Search**. |
+| [Azure AI services multi-service account](/azure/ai-services/multi-service-resource) | [Azure AI Vision multimodal](/azure/ai-services/computer-vision/how-to/image-retrieval) for image and text vectorization. Azure AI Vision multimodal is available in selected regions. [Check the documentation](/azure/ai-services/computer-vision/how-to/image-retrieval?tabs=csharp) for an updated list. Depending on how you [attach the multi-service resource](cognitive-search-attach-cognitive-services.md), the account might need to be in the same region as Azure AI Search. |

-If using the Azure OpenAI Service, it must have an associated [custom subdomain](/azure/ai-services/cognitive-services-custom-subdomains). If the service was created through the Azure portal, this subdomain is automatically generated as part of your service setup. Ensure that your service includes a custom subdomain before using it with the Azure AI Search integration.
+If you use the Azure OpenAI Service, the endpoint must have an associated [custom subdomain](/azure/ai-services/cognitive-services-custom-subdomains). A custom subdomain is an endpoint that includes a unique name (for example, `https://hereismyuniquename.cognitiveservices.azure.com`). If the service was created through the Azure portal, this subdomain is automatically generated as part of your service setup. Ensure that your service includes a custom subdomain before using it with the Azure AI Search integration.

 Azure OpenAI Service resources (with access to embedding models) that were created in AI Studio aren't supported. Only the Azure OpenAI Service resources created in the Azure portal are compatible with the **Azure OpenAI Embedding** skill integration.
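
The revised paragraph in this hunk defines a custom subdomain as an endpoint that includes a unique name, such as `https://hereismyuniquename.cognitiveservices.azure.com`. A rough heuristic for spotting that shape is sketched below. The function name `has_custom_subdomain` is hypothetical, and the suffix list is an assumption: `.cognitiveservices.azure.com` comes from the example in the diff, while `.openai.azure.com` is added here as a commonly seen Azure OpenAI endpoint form.

```python
from urllib.parse import urlparse

# Assumption: endpoint suffixes that indicate a custom subdomain.
CUSTOM_SUBDOMAIN_SUFFIXES = (
    ".cognitiveservices.azure.com",  # from the example in the diff
    ".openai.azure.com",             # assumed additional form
)


def has_custom_subdomain(endpoint: str) -> bool:
    """Heuristically check that an endpoint host is '<unique-name><suffix>'."""
    host = (urlparse(endpoint).hostname or "").lower()
    for suffix in CUSTOM_SUBDOMAIN_SUFFIXES:
        if host.endswith(suffix):
            name = host[: -len(suffix)]
            # The unique name must be a single, non-empty DNS label.
            return bool(name) and "." not in name
    return False


print(has_custom_subdomain("https://hereismyuniquename.cognitiveservices.azure.com"))
print(has_custom_subdomain("https://westus.api.cognitive.microsoft.com"))
```

This only inspects the URL string; it doesn't confirm that the resource exists or that the wizard will accept it.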

@@ -57,9 +57,9 @@ For the purposes of this quickstart, all of the preceding resources must have pu

 If private endpoints are already present and you can't disable them, the alternative option is to run the respective end-to-end flow from a script or program on a virtual machine. The virtual machine must be on the same virtual network as the private endpoint. [Here's a Python code sample](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-python/code/integrated-vectorization) for integrated vectorization. The same [GitHub repo](https://github.com/Azure/azure-search-vector-samples/tree/main) has samples in other programming languages.

-### Role requirements
+### Permissions

-We recommend role assignments for search service connections to other resources.
+You can use key authentication and full access connection strings, or Microsoft Entra ID with role assignments. We recommend role assignments for search service connections to other resources.

 1. On Azure AI Search, [enable roles](search-security-enable-roles.md).

@@ -81,13 +81,9 @@ For more secure connections:

 If you're starting with the free service, you're limited to three indexes, data sources, skillsets, and indexers. Basic limits you to 15. Make sure you have room for extra items before you begin. This quickstart creates one of each object.

-### Check for semantic ranker
-
-The wizard supports semantic ranking, but only on the Basic tier and higher, and only if semantic ranker is already [enabled on your search service](semantic-how-to-enable-disable.md). If you're using a billable tier, check whether semantic ranker is enabled.
-
 ## Prepare sample data

-This section points you to data that works for this quickstart.
+This section points you to the content that works for this quickstart.

 ### [Azure Blob Storage](#tab/sample-data-storage)

@@ -111,10 +107,6 @@ This section points you to data that works for this quickstart.

 1. Sign in to the [Azure portal](https://portal.azure.com/) with your Azure account, and go to your Azure Storage account.

-1. You can confirm that you have Data Lake Storage by checking the **Properties** tab on the **Overview** page.
-
-:::image type="content" source="media/search-get-started-portal-import-vectors/data-lake-storage.png" alt-text="Screenshot of the storage account properties page showing Data Lake Storage.":::
-
 1. On the left pane, under **Data Storage**, select **Containers**.

 1. Create a new container and then upload the [health-plan PDF documents](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/health-plan) used for this quickstart.
@@ -231,7 +223,7 @@ The next step is to connect to a data source to use for the search index.

 ### [Azure Blob Storage](#tab/connect-data-storage)

-1. On the **Set up your data connection** page, select **Azure Blob Storage**.
+1. On **Connect to your data**, select **Azure Blob Storage**.

 1. Specify the Azure subscription.

@@ -257,7 +249,7 @@ The next step is to connect to a data source to use for the search index.

 ### [ADLS Gen2](#tab/connect-data-adlsgen2)

-1. On the **Set up your data connection** page, select **Azure Data Lake**.
+1. On **Connect to your data**, select **Azure Data Lake**.

 1. Specify the Azure subscription.

@@ -281,11 +273,11 @@ The next step is to connect to a data source to use for the search index.

 1. Select **Next**.

-### [OneLake (preview)](#tab/connect-data-onelake)
+### [OneLake](#tab/connect-data-onelake)

 Support for OneLake indexing is in preview. For more information about supported shortcuts and limitations, see ([OneLake indexing](search-how-to-index-onelake-files.md)).

-1. On the **Set up your data connection** page, select **OneLake**.
+1. On **Connect to your data**, select **OneLake**.

 1. Specify the type of connection:

@@ -304,7 +296,7 @@ Support for OneLake indexing is in preview. For more information about supported

 In this step, specify the embedding model for vectorizing chunked data.

-Chunking is built-in and nonconfigurable. The effective settings are:
+Chunking is built in and nonconfigurable. The effective settings are:

 ```json
 "textSplitMode": "pages",
@@ -336,13 +328,15 @@ Chunking is built-in and nonconfigurable. The effective settings are:

 + The identity should have a **Cognitive Services OpenAI User** role on the Azure AI multi-services account.

-1. Select the checkbox that acknowledges the billing impact of using these resources.
+1. Select the checkbox that acknowledges the billing effects of using these resources.

 1. Select **Next**.

 ## Vectorize and enrich your images

-If your content includes images, you can apply AI in two ways:
+The health plan PDFs don't include images, so you can skip this step.
+
+However, if you work with content that includes images, you can apply AI in two ways:

 + Use a supported image embedding model from the catalog, or choose the Azure AI Vision multimodal embeddings API to vectorize images.

@@ -358,7 +352,7 @@ Azure AI Search and your Azure AI resource must be in the same region.

 1. Optionally, you can crack binary images (for example, scanned document files) and [use OCR](cognitive-search-skill-ocr.md) to recognize text.

-1. Select the checkbox that acknowledges the billing impact of using these resources.
+1. Select the checkbox that acknowledges the billing effects of using these resources.

 1. Select **Next**.

@@ -470,6 +464,7 @@ Search Explorer accepts text strings as input and then vectorizes the text for v
 }
 ]
 }
+```

 ## Clean up

articles/search/search-get-started-portal.md

Lines changed: 3 additions & 3 deletions
@@ -77,7 +77,7 @@ The **Import data** wizard supports the creation of a skillset and [AI-enrichmen

 ### Configure the index

-The wizard infers a schema for the built-in hotels-sample index. Follow these steps to configure the index:
+The wizard infers a schema for the built-in hotels-sample index. To configure the index, follow these steps:

 1. Accept the system-generated values for the **Index name** (_hotels-sample-index_) and **Key** field (_HotelId_).

@@ -174,7 +174,7 @@ The following examples assume the JSON view and the 2024-05-01-preview REST API

 ### Filter examples

-Parking, tags, renovation date, rating and location are filterable.
+Parking, tags, renovation date, rating, and location are filterable.

 ```json
 {
@@ -223,7 +223,7 @@ The default syntax is [simple syntax](query-simple-syntax.md), but if you want f
 }
 ```

-By default, misspelled query terms like `seatle` for `Seattle` fail to return matches in a typical search. The `queryType=full` parameter invokes the full Lucene query parser, which supports the tilde `~` operand. When these parameters are present, the query performs a fuzzy search for the specified keyword. The query seeks matching results along with results that are similar to but not an exact match to the keyword.
+By default, misspelled query terms like `seatle` for `Seattle` fail to return matches in a typical search. The `queryType=full` parameter invokes the full Lucene query parser, which supports the tilde `~` operand. When these parameters are present, the query performs a fuzzy search for the specified keyword. The query matches on documents that are similar to but not an exact match to the keyword.

 Take a minute to try a few of these example queries for your index. To learn more about queries, see [Querying in Azure AI Search](search-query-overview.md).
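
The reworded paragraph in this hunk describes a fuzzy search that combines `queryType=full` with the tilde operand. As a rough illustration, the sketch below builds a JSON request body of that shape for the misspelled term `seatle`. The helper name `fuzzy_query` is hypothetical, and the body includes only the two parameters the paragraph mentions; the article's own example is truncated in this diff and may contain more.

```python
import json


def fuzzy_query(term: str) -> dict:
    """Build a search request body that asks for a fuzzy match on one term."""
    return {
        # The trailing tilde requests fuzzy matching in full Lucene syntax.
        "search": f"{term}~",
        # The full Lucene parser is needed for the tilde operand to apply.
        "queryType": "full",
    }


print(json.dumps(fuzzy_query("seatle"), indent=2))
```

The resulting JSON could be pasted into the JSON view of Search Explorer; without the tilde or with the default simple syntax, the misspelled term would not be expected to match `Seattle`.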

articles/search/cognitive-search-quickstart-blob.md renamed to articles/search/search-get-started-skillset.md

Lines changed: 3 additions & 1 deletion
@@ -10,7 +10,7 @@ ms.service: azure-ai-search
 ms.custom:
   - ignite-2023
 ms.topic: quickstart
-ms.date: 10/15/2024
+ms.date: 11/20/2024
 ---

 # Quickstart: Create a skillset in the Azure portal
@@ -84,6 +84,8 @@ If you get *Error detecting index schema from data source*, the indexer that pow

 Next, configure AI enrichment to invoke OCR, image analysis, and natural language processing.

+OCR and image analysis are available for blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and for image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
+
 1. For this quickstart, we're using the **Free** Azure AI services resource. The sample data consists of 14 files, so the free allotment of 20 transactions on Azure AI services is sufficient for this quickstart.

 :::image type="content" source="media/cognitive-search-quickstart-blob/cog-search-attach.png" alt-text="Screenshot of the Attach Azure AI services tab." border="true":::
