
Commit c5a637a

Table storage updates, wizards supported data sources
1 parent 51bdaec commit c5a637a

4 files changed

+89
-17
lines changed

articles/search/search-how-to-index-sql-database.md

Lines changed: 3 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -42,7 +42,7 @@ To work through the examples in this article, you need the Azure portal or a [RE
4242

4343
## Try with sample data
4444

45-
Use these instructions to create a table in Azure SQL that you can use with an indexer on Azure AI Search. The portal approach, using either import data wizard, is the quickest way to create and load an index from a table in a SQL database.
45+
Use these instructions to create and load a table in Azure SQL Database.
4646

4747
1. [Download hotels-azure-sql.sql](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/hotels/hotel-sql) from GitHub to create a table on Azure SQL Database that contains a subset of the sample hotels data set.
4848

@@ -95,6 +95,8 @@ Use these instructions to create a table in Azure SQL that you can use with an i
9595
9696
The Description field provides the most verbose content. You should target this field for full text search and optional vector queries.
9797
98+
Now that you have a database table, you can use the Azure portal, REST client, or an Azure SDK to index your data.
99+
98100
> [!TIP]
99101
> Another resource that provides sample content and code can be found on [Azure-Samples/SQL-AI-samples](https://github.com/Azure-Samples/SQL-AI-samples/tree/main/AzureSQLACSSamples/src).
100102

articles/search/search-howto-index-cosmosdb.md

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,9 @@
11
---
22
title: Azure Cosmos DB NoSQL indexer
33
titleSuffix: Azure AI Search
4-
description: Set up a search indexer to index data stored in Azure Cosmos DB for full text search in Azure AI Search. This article explains how index data using the NoSQL API protocol.
4+
description: Set up a search indexer to index data stored in Azure Cosmos DB for vector and full text search in Azure AI Search. This article explains how to index data using the NoSQL API protocol.
55

6+
manager: nitinme
67
author: mgottein
78
ms.author: magottei
89
ms.service: azure-ai-search
@@ -29,11 +30,11 @@ Because terminology can be confusing, it's worth noting that [Azure Cosmos DB in
2930

3031
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using identities (Microsoft Entra ID), make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) is assigned both **Cosmos DB Account Reader Role** and [**Cosmos DB Built-in Data Reader Role**](/azure/cosmos-db/how-to-setup-rbac#built-in-role-definitions).
3132

32-
To work through the examples in this article, you need the Azure portal or a [REST client](search-get-started-rest.md). If you're using Azure portal, make sure that access to all public networks is enabled in Cosmos DB and that the client has access via an inbound rule. For a REST client that runs locally, configure the network firewall to allow inbound access from your device IP address. Other approaches for creating a Cosmos DB indexer include Azure SDKs.
33+
To work through the examples in this article, you need the Azure portal or a [REST client](search-get-started-rest.md). If you're using the Azure portal, make sure that access to all public networks is enabled. Other approaches for creating a Cosmos DB indexer include Azure SDKs.
3334

3435
## Try with sample data
3536

36-
Use these instructions to create a container and database in Cosmos DB that you can use with an indexer on Azure AI Search. The portal approach, using either import data wizard, is the quickest way to create and load an index from a container in Cosmos DB.
37+
Use these instructions to create a container and database in Cosmos DB.
3738

3839
1. [Download HotelsData_toCosmosDB.JSON](https://github.com/HeidiSteen/azure-search-sample-data/blob/main/hotels/HotelsData_toCosmosDB.JSON) from GitHub to create a container in Cosmos DB that contains a subset of the sample hotels data set.
3940

@@ -59,7 +60,7 @@ Use these instructions to create a container and database in Cosmos DB that you
5960

6061
1. Select **Execute query** to run the query and view results. You should have 50 hotel documents.
6162

62-
You can now use this content for indexing in the Azure portal, REST client, or an Azure SDK.
63+
Now that you have a container, you can use the Azure portal, REST client, or an Azure SDK to index your data.
6364

6465
## Use the Azure portal
6566

@@ -81,7 +82,7 @@ You can use either the **Import data** wizard or **Import and vectorize data** w
8182

8283
[Change detection](#incremental-indexing-and-custom-queries) is supported by default through a `_ts` field (timestamp). If you upload content using the approach described in [Try with sample data](#try-with-sample-data), the collection is created with a `_ts` field.
8384

84-
[Deletion detection](#indexing-deleted-documents) requires that you have a pre-existing top-level field in the index that can be used as a soft-delete flag. It should be a Boolean field (you could name it IsDeleted). In the search index, add a corresponding search field called *IsDeleted* set to retrievable and filterable. Specify `true` as the soft-delete value.
85+
[Deletion detection](#indexing-deleted-documents) requires that you have a pre-existing top-level field in the collection that can be used as a soft-delete flag. It should be a Boolean field (you could name it IsDeleted). Specify `true` as the soft-delete value. In the search index, add a corresponding search field called *IsDeleted* set to retrievable and filterable.
8586

8687
1. Continue with the remaining steps to complete the wizard:
8788

articles/search/search-howto-indexing-azure-tables.md

Lines changed: 73 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,17 +1,17 @@
11
---
22
title: Azure table indexer
33
titleSuffix: Azure AI Search
4-
description: Set up a search indexer to index data stored in Azure Table Storage for full text search in Azure AI Search.
4+
description: Set up a search indexer to index data stored in Azure Table Storage for vector and full text search in Azure AI Search.
55

6-
manager: vinodva
6+
manager: nitinme
77
author: mgottein
88
ms.author: magottei
99

1010
ms.service: azure-ai-search
1111
ms.custom:
1212
- ignite-2023
1313
ms.topic: how-to
14-
ms.date: 08/23/2024
14+
ms.date: 11/20/2024
1515
---
1616

1717
# Index data from Azure Table Storage
@@ -26,11 +26,61 @@ This article supplements [**Create an indexer**](search-howto-create-indexers.md
2626

2727
+ Tables containing text. If you have binary data, consider [AI enrichment](cognitive-search-concept-intro.md) for image analysis.
2828

29-
+ Read permissions on Azure Storage. A "full access" connection string includes a key that gives access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Data and Reader** permissions.
29+
+ Read permissions on Azure Storage. A "full access" connection string includes a key that gives access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Reader and Data Access** permissions.
3030

31-
+ Use a [REST client](search-get-started-rest.md) to formulate REST calls similar to the ones shown in this article.
31+
To work through the examples in this article, you need the Azure portal or a [REST client](search-get-started-rest.md). If you're using the Azure portal, make sure that access to all public networks is enabled. Other approaches for creating an Azure Table indexer include Azure SDKs.
3232

33-
## Define the data source
33+
## Try with sample data
34+
35+
Use these instructions to create a table in Azure Storage.
36+
37+
1. Sign in to the Azure portal, navigate to your storage account, and create a table named *hotels*.
38+
39+
1. [Install Azure Storage Explorer](https://azure.microsoft.com/products/storage/storage-explorer/#Download-4).
40+
41+
1. [Download HotelsData_toAzureSearch.csv](https://github.com/HeidiSteen/azure-search-sample-data/blob/main/hotels/HotelsData_toAzureSearch.csv) from GitHub. This file is a subset of the built-in hotels sample dataset. It omits the rooms collection, translated descriptions, and geography coordinates.
42+
43+
1. In Azure Storage Explorer, sign in to Azure, select your subscription, and then select your storage account.
44+
45+
1. Open **Tables** and select *hotels*.
46+
47+
1. Select **Import** on the command bar, and then select the *HotelsData_toAzureSearch.csv* file.
48+
49+
1. Accept the defaults. Select **Import** to load the data.
50+
51+
You should have 50 hotel records in the table with an autogenerated partitionKey, rowKey, and timestamp. You can now use this content for indexing in the Azure portal, REST client, or an Azure SDK.
52+
53+
## Use the Azure portal
54+
55+
You can use either the **Import data** wizard or **Import and vectorize data** wizard to automate indexing from a table in Azure Table Storage. The data source configuration is similar for both wizards.
56+
57+
1. [Start the wizard](search-import-data-portal.md#starting-the-wizards).
58+
59+
1. On **Connect to your data**, verify that the data source type is *Azure Table Storage* or that the data selection fields prompt for tables.
60+
61+
The data source name refers to the data source connection object in Azure AI Search. If you use the vector wizard, your data source name is autogenerated using a custom prefix specified at the end of the wizard workflow.
62+
63+
1. Specify the storage account and table name. The query is optional. It's useful if you have specific columns you want to import.
64+
65+
1. Specify an authentication method, either a managed identity or built-in API key. If you don't specify a managed identity connection, the portal uses the key.
66+
67+
If you [configure Azure AI Search to use a managed identity](search-howto-managed-identities-data-sources.md), and you create a role assignment on Azure Storage that grants **Reader and Data Access** permissions to the identity, your indexer can connect to table storage using Microsoft Entra ID and roles.
68+
69+
1. For the **Import and vectorize data** wizard, you can specify options for deletion detection.
70+
71+
Deletion detection requires that you have a pre-existing field in the table that can be used as a soft-delete flag. It should be a Boolean field (you could name it IsDeleted). Specify `true` as the soft-delete value. In the search index, add a corresponding search field called *IsDeleted* set to retrievable and filterable.
72+
73+
1. Continue with the remaining steps to complete the wizard:
74+
75+
+ [Quickstart: Import data wizard](search-get-started-portal.md)
76+
77+
+ [Quickstart: Import and vectorize data wizard](search-get-started-portal-import-vectors.md)
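The soft-delete setup described in the deletion detection step can also be expressed as a policy on the data source definition. The following is a minimal sketch, assuming the `SoftDeleteColumnDeletionDetectionPolicy` type from the Azure AI Search REST API; the helper name `with_soft_delete` and the data source name are hypothetical, and nothing is sent to a service.

```python
# Policy type name as used by the Azure AI Search REST API for soft-delete columns.
SOFT_DELETE_POLICY_TYPE = (
    "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy"
)


def with_soft_delete(data_source, column="IsDeleted", marker="true"):
    """Return a copy of a data source definition with a soft-delete policy added.

    `column` is the Boolean field in the table; `marker` is the value that
    flags an entity as deleted.
    """
    updated = dict(data_source)  # shallow copy so the caller's dict is untouched
    updated["dataDeletionDetectionPolicy"] = {
        "@odata.type": SOFT_DELETE_POLICY_TYPE,
        "softDeleteColumnName": column,
        "softDeleteMarkerValue": marker,
    }
    return updated


# Hypothetical data source definition for the sample hotels table.
base = {"name": "hotels-table-ds", "type": "azuretable", "container": {"name": "hotels"}}
policy_ds = with_soft_delete(base)
print(policy_ds["dataDeletionDetectionPolicy"])
```

The marker value is passed as a string because the policy compares it against the string form of the column value.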
78+
79+
## Use the REST APIs
80+
81+
This section demonstrates the REST API calls that create a data source, index, and indexer.
82+
83+
### Define the data source
3484

3585
The data source definition specifies the source data to index, credentials, and policies for change detection. A data source is an independent resource that can be used by multiple indexers.
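As an illustration of what a data source definition contains, here's a minimal Python sketch that assembles the request path and JSON body for a table data source. It assumes the `azuretable` type and the `credentials` and `container` properties of the Create Data Source REST API; the function name, data source name, and placeholder connection string are hypothetical. The sketch only builds the payload and doesn't call the service.

```python
import json

API_VERSION = "2024-07-01"  # same API version as the status request in this article


def build_table_data_source(name, connection_string, table_name, query=None):
    """Return (path, body) for a Create Data Source request on Azure Table Storage."""
    container = {"name": table_name}
    if query:
        # Optional filter that limits which entities the indexer reads.
        container["query"] = query
    body = {
        "name": name,
        "type": "azuretable",
        "credentials": {"connectionString": connection_string},
        "container": container,
    }
    return f"/datasources?api-version={API_VERSION}", body


# Hypothetical names; replace the placeholders with your own values.
path, body = build_table_data_source(
    name="hotels-table-ds",
    connection_string="DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;",
    table_name="hotels",
)
print(path)
print(json.dumps(body, indent=2))
```

You would POST the body to the path on your search service endpoint, with an API key or bearer token in the request headers.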
3686

@@ -69,7 +119,7 @@ A data source definition can also include [soft deletion policies](search-howto-
69119

70120
<a name="Credentials"></a>
71121

72-
### Supported credentials and connection strings
122+
#### Supported credentials and connection strings
73123

74124
Indexers can connect to a table using the following connections.
75125

@@ -98,7 +148,7 @@ Indexers can connect to a table using the following connections.
98148
99149
<a name="Performance"></a>
100150

101-
### Partition for improved performance
151+
#### Partition for improved performance
102152

103153
By default, Azure AI Search uses the following internal query filter to keep track of which source entities have been updated since the last run: `Timestamp >= HighWaterMarkValue`. Because Azure tables don’t have a secondary index on the `Timestamp` field, this type of query requires a full table scan and is therefore slow for large tables.
104154

@@ -116,7 +166,7 @@ To avoid a full scan, you can use table partitions to narrow the scope of each i
116166

117167
+ With this approach, if you need to trigger a full reindex, reset the data source query in addition to [resetting the indexer](search-howto-run-reset-indexers.md).
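One way to picture the partitioning approach is to generate a separate data source definition for each partition, each with its own query. The sketch below assumes that the container query accepts an OData filter of the form `PartitionKey eq '<value>'`; the function name, naming scheme, and partition keys are hypothetical.

```python
def partitioned_data_sources(base_name, connection_string, table_name, partition_keys):
    """Build one data source definition per partition key.

    Each definition narrows the indexer's scope to a single partition so that
    a run doesn't need to scan the whole table.
    """
    sources = []
    for i, key in enumerate(partition_keys):
        escaped = key.replace("'", "''")  # OData string literals double single quotes
        sources.append(
            {
                "name": f"{base_name}-{i}",
                "type": "azuretable",
                "credentials": {"connectionString": connection_string},
                "container": {
                    "name": table_name,
                    "query": f"PartitionKey eq '{escaped}'",
                },
            }
        )
    return sources


# Hypothetical partition keys for the sample hotels table.
sources = partitioned_data_sources(
    "hotels-part", "<connection-string>", "hotels", ["east", "west"]
)
for source in sources:
    print(source["name"], "->", source["container"]["query"])
```

Each data source would then be paired with its own indexer, all targeting the same index.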
118168

119-
## Add search fields to an index
169+
### Add search fields to an index
120170

121171
In a [search index](search-what-is-an-index.md), add fields to accept the content and metadata of your table entities.
122172

@@ -156,7 +206,7 @@ In a [search index](search-what-is-an-index.md), add fields to accept the conten
156206

157207
Using the same names and compatible [data types](/rest/api/searchservice/supported-data-types) minimizes the need for [field mappings](search-indexer-field-mappings.md). When names and types are the same, the indexer can determine the data path automatically.
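To show how matching names keep the mapping implicit, here's a minimal sketch that builds an index definition whose field names mirror table columns. The column names (`HotelName`, `Description`, `IsDeleted`), the index name, and the `Key` field name are assumptions for illustration; check them against your own table before use.

```python
def build_index(name, columns):
    """Build an index definition with one field per (column_name, edm_type) pair.

    Field names are copied from the table columns so that the indexer can
    match source and destination without explicit field mappings.
    """
    # Document key field; the name "Key" is an assumption for this sketch.
    fields = [{"name": "Key", "type": "Edm.String", "key": True, "searchable": False}]
    for column_name, edm_type in columns:
        field = {"name": column_name, "type": edm_type, "retrievable": True}
        if edm_type == "Edm.String":
            field["searchable"] = True  # text columns participate in full text search
        else:
            field["filterable"] = True  # non-text columns are useful in filters
        fields.append(field)
    return {"name": name, "fields": fields}


# Assumed columns from the sample hotels table.
index = build_index(
    "hotels-table-idx",
    [
        ("HotelName", "Edm.String"),
        ("Description", "Edm.String"),
        ("IsDeleted", "Edm.Boolean"),
    ],
)
print([field["name"] for field in index["fields"]])
```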
158208

159-
## Configure and run the table indexer
209+
### Configure and run the table indexer
160210

161211
Once you have an index and data source, you're ready to create the indexer. Indexer configuration specifies the inputs, parameters, and properties controlling run time behaviors.
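As a rough sketch of how these pieces fit together, the following Python function builds an indexer definition that references a data source and an index by name. It assumes the `dataSourceName`, `targetIndexName`, `disabled`, and `schedule` properties of the Create Indexer REST API; the function name and resource names are hypothetical.

```python
def build_indexer(name, data_source_name, index_name, run_on_create=True, interval=None):
    """Build an indexer definition that links a data source to a target index."""
    body = {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": index_name,
        # A disabled indexer is created but doesn't run until it's enabled.
        "disabled": not run_on_create,
    }
    if interval:
        # ISO 8601 duration, for example "PT2H" for every two hours.
        body["schedule"] = {"interval": interval}
    return body


# Hypothetical names that match the earlier sketches.
indexer = build_indexer(
    "hotels-table-idxr",
    "hotels-table-ds",
    "hotels-table-idx",
    run_on_create=False,
    interval="PT2H",
)
print(indexer)
```

Creating the indexer in a disabled state gives you a chance to review the definition before the first run.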
162212

@@ -199,7 +249,17 @@ An indexer runs automatically when it's created. You can prevent this by setting
199249

200250
## Check indexer status
201251

202-
To monitor the indexer status and execution history, send a [Get Indexer Status](/rest/api/searchservice/indexers/get-status) request:
252+
To monitor the indexer status and execution history, check the indexer execution history in the Azure portal, or send a [Get Indexer Status](/rest/api/searchservice/indexers/get-status) REST API request:
253+
254+
### [**Portal**](#tab/portal-check-indexer)
255+
256+
1. On the search service page, open **Search management** > **Indexers**.
257+
258+
1. Select an indexer to access configuration and execution history.
259+
260+
1. Select a specific indexer job to view details, warnings, and errors.
261+
262+
### [**REST**](#tab/rest-check-indexer)
203263

204264
```http
205265
GET https://myservice.search.windows.net/indexers/myindexer/status?api-version=2024-07-01
@@ -241,6 +301,8 @@ The response includes status and the number of items processed. It should look s
241301
}
242302
```
243303

304+
---
305+
244306
Execution history contains up to 50 of the most recently completed executions, which are sorted in the reverse chronological order so that the latest execution comes first.
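If you poll the status endpoint from a script, a small helper can reduce the response to the values you usually care about. This is a minimal sketch, assuming the response includes `status`, `lastResult`, and `executionHistory` properties with `itemsProcessed` and `itemsFailed` counts; the sample payload is made up for illustration and isn't real service output.

```python
import json

# Execution statuses that this sketch treats as failures.
FAILURE_STATUSES = {"transientFailure", "persistentFailure"}


def summarize_status(status):
    """Reduce a Get Indexer Status response to a few headline values."""
    last = status.get("lastResult") or {}
    history = status.get("executionHistory") or []
    failed = [run for run in history if run.get("status") in FAILURE_STATUSES]
    return {
        "indexer_status": status.get("status"),
        "last_run_status": last.get("status"),
        "items_processed": last.get("itemsProcessed", 0),
        "items_failed": last.get("itemsFailed", 0),
        "runs_in_history": len(history),
        "failed_runs": len(failed),
    }


# Illustrative payload only.
sample = json.loads(
    """
    {
      "status": "running",
      "lastResult": {"status": "success", "itemsProcessed": 50, "itemsFailed": 0},
      "executionHistory": [
        {"status": "success", "itemsProcessed": 50, "itemsFailed": 0},
        {"status": "transientFailure", "itemsProcessed": 10, "itemsFailed": 2}
      ]
    }
    """
)
summary = summarize_status(sample)
print(summary)
```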
245307

246308
## Next steps

articles/search/search-import-data-portal.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,13 @@ This article isn't a step by step. For help with using the wizard with sample da
2929
+ [Quickstart: Create a vector index](search-get-started-portal-import-vectors.md)
3030
+ [Quickstart: image search (vectors)](search-get-started-portal-image-search.md)
3131

32+
## Supported data sources and scenarios
33+
34+
| Wizard | Skills | Azure blobs | ADLS Gen2 | Azure tables | Azure files | Cosmos DB | Azure SQL | OneLake | SharePoint | MySQL |
35+
|--|--|--|--|--|--|--|--|--|--|--|
36+
|Import data | No embedding skills||||||||||
37+
|Import and vectorize data | All skills ||||||||||
38+
3239
## What the wizards create
3340

3441
The import wizards create the objects described in the following table. After the objects are created, you can review their JSON definitions in the portal or call them from code.
