Commit d4cd7e0

Refresh Azure SQL indexer doc
1 parent fa9bdd3 commit d4cd7e0

1 file changed

+51
-19
lines changed

articles/search/search-howto-connecting-azure-sql-database-to-azure-search-using-indexers.md

Lines changed: 51 additions & 19 deletions
@@ -8,31 +8,42 @@ author: HeidiSteen
88
ms.author: heidist
99
ms.service: cognitive-search
1010
ms.topic: how-to
11-
ms.date: 06/09/2022
11+
ms.date: 07/25/2022
1212
---
1313

1414
# Index data from Azure SQL
1515

1616
In this article, learn how to configure an [**indexer**](search-indexer-overview.md) that imports content from Azure SQL Database or an Azure SQL managed instance and makes it searchable in Azure Cognitive Search.
1717

18-
This article supplements [**Create an indexer**](search-howto-create-indexers.md) with information that's specific to Azure SQL. It uses the REST APIs to demonstrate a three-part workflow common to all indexers: create a data source, create an index, create an indexer. Data extraction occurs when you submit the Create Indexer request.
18+
This article supplements [**Create an indexer**](search-howto-create-indexers.md) with information that's specific to Azure SQL. It uses the REST APIs to demonstrate a three-part workflow common to all indexers: create a data source, create an index, create an indexer.
19+
20+
This article also provides:
21+
22+
+ A description of the change detection policies supported by the Azure SQL indexer so that you can set up incremental indexing.
23+
24+
+ A frequently-asked-questions (FAQ) section for answers to questions about feature compatibility.
1925

2026
> [!NOTE]
2127
> [Always Encrypted](/sql/relational-databases/security/encryption/always-encrypted-database-engine) columns are not currently supported by Cognitive Search indexers.
2228
2329
## Prerequisites
2430

25-
+ An [Azure SQL database](/azure/azure-sql/database/sql-database-paas-overview) with data in a single table or view. Use a table if you want the ability to [index incremental updates](#CaptureChangedRows) using SQL's native change detection capabilities. If you use a view, take into consideration that large views are not ideal for SQL indexer. For such cases, it is suggested to change your application to create an additional single table just for ingestion into your Cognitive Search index with integrated change tracking enabled, where each column matches a column in the index, so processing is optimized. This approach will help using SQL integrated change tracking, which is easier to implement than High Water Mark.
31+
+ An [Azure SQL database](/azure/azure-sql/database/sql-database-paas-overview) with data in a single table or view.
2632

27-
+ Read permissions. Azure Cognitive Search supports SQL Server authentication, where the user name and password are provided on the connection string. Alternatively, you can [set up a managed identity and use Azure roles](search-howto-managed-identities-sql.md) to omit credentials on the connection.
33+
Use a table if your data is over 100,000 rows or if you need [incremental indexing](#CaptureChangedRows) using SQL's native change detection capabilities.
2834

29-
+ A REST client, such as [Postman](search-get-started-rest.md) or [Visual Studio Code with the extension for Azure Cognitive Search](search-get-started-vs-code.md) to send REST calls that create the data source, index, and indexer.
35+
Use a view if you need to consolidate data from multiple tables. Large views are not ideal for the SQL indexer. A workaround is to create a new single table just for ingestion into your Cognitive Search index. With a table, you can use SQL integrated change tracking, which is easier to implement than High Water Mark.
3036

31-
+ If you're using the [Azure portal](https://portal.azure.com/) to create the data source, make sure that access to all public networks is enabled in the Azure SQL firewall. Alternatively, you can use REST API from a device with an authorized IP in the firewall rules to perform these operations. If the Azure SQL firewall has public networks access disabled, there will be errors when connecting from the portal to it.
37+
+ Read permissions. Azure Cognitive Search supports SQL Server authentication, where the user name and password are provided on the connection string. Alternatively, you can [set up a managed identity and use Azure roles](search-howto-managed-identities-sql.md).
3238

33-
<!-- Real-time data synchronization must not be an application requirement. An indexer can reindex your table at most every five minutes. If your data changes frequently, and those changes need to be reflected in the index within seconds or single minutes, we recommend using the [REST API](/rest/api/searchservice/AddUpdate-or-Delete-Documents) or [.NET SDK](search-get-started-dotnet.md) to push updated rows directly.
39+
To work through the examples in this article, you'll need a REST client, such as [Postman](search-get-started-rest.md) or [Visual Studio Code with the extension for Azure Cognitive Search](search-get-started-vs-code.md).
3440

35-
Incremental indexing is possible. If you have a large data set and plan to run the indexer on a schedule, Azure Cognitive Search must be able to efficiently identify new, changed, or deleted rows. Non-incremental indexing is only allowed if you're indexing on demand (not on schedule), or indexing fewer than 100,000 rows. For more information, see [Capturing Changed and Deleted Rows](#CaptureChangedRows) below. -->
41+
Other approaches for creating an Azure SQL indexer include the Azure SDKs or the [Import data wizard](search-get-started-portal.md) in the Azure portal. If you're using the Azure portal, make sure that access to all public networks is enabled in the Azure SQL firewall and that the client has access via an inbound rule.
42+
43+
> [!NOTE]
44+
> Real-time data synchronization isn't possible with an indexer. An indexer can reindex your table at most every five minutes. If data updates need to be reflected in the index sooner, we recommend [pushing updated rows directly](tutorial-optimize-indexing-push-api.md).
45+
46+
<!-- Incremental indexing is possible. If you have a large data set and plan to run the indexer on a schedule, Azure Cognitive Search must be able to efficiently identify new, changed, or deleted rows. Full indexing is only allowed if you're indexing on demand (not on schedule), or indexing fewer than 100,000 rows. For more information, see [Capturing Changed and Deleted Rows](#CaptureChangedRows) below. -->
3647

3748
## Define the data source
3849

@@ -47,25 +58,35 @@ The data source definition specifies the data to index, credentials, and policie
4758
4859
{
4960
"name" : "myazuresqldatasource",
61+
"description" : "A database for testing Azure Cognitive Search indexes.",
5062
"type" : "azuresql",
5163
"credentials" : { "connectionString" : "Server=tcp:<your server>.database.windows.net,1433;Database=<your database>;User ID=<your user name>;Password=<your password>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30;" },
52-
"container" : { "name" : "name of the table or view that you want to index" }
64+
"container" : {
65+
"name" : "name of the table or view that you want to index",
66+
"query" : null (not supported in the Azure SQL indexer)
67+
},
68+
"dataChangeDetectionPolicy": null,
69+
"dataDeletionDetectionPolicy": null,
70+
"encryptionKey": null,
71+
"identity": null
5372
}
5473
```
5574

75+
1. Provide a unique name for the data source that follows Azure Cognitive Search [naming conventions](/rest/api/searchservice/naming-rules).
76+
5677
1. Set "type" to `"azuresql"` (required).
5778

5879
1. Set "credentials" to a connection string:
5980

60-
+ You can get the connection string from the [Azure portal](https://portal.azure.com). Use the `ADO.NET connection string` option.
81+
+ You can get a full access connection string from the [Azure portal](https://portal.azure.com). Use the `ADO.NET connection string` option. Set the user name and password.
6182

62-
+ You can specify a managed identity connection string that does not include database secrets with the following format: `Initial Catalog|Database=<your database name>;ResourceId=/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>/providers/Microsoft.Sql/servers/<your SQL Server name>/;Connection Timeout=connection timeout length;`.
83+
+ Alternatively, you can specify a managed identity connection string that does not include database secrets with the following format: `Initial Catalog|Database=<your database name>;ResourceId=/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>/providers/Microsoft.Sql/servers/<your SQL Server name>/;Connection Timeout=connection timeout length;`.
6384

64-
To use this connection string, follow the instructions for [Setting up an indexer connection to an Azure SQL Database using a managed identity](search-howto-managed-identities-sql.md).
85+
For more information, see [Connect to Azure SQL Database indexer using a managed identity](search-howto-managed-identities-sql.md).
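
The two connection string options described above can be sketched in Python as follows. This is only an illustration: the helper names (`sql_auth_connection_string`, `managed_identity_connection_string`, `data_source_definition`) are hypothetical and not part of any SDK, the sample values are placeholders, and the connection timeout is assumed to be expressed in seconds.

```python
import json


def sql_auth_connection_string(server, database, user, password):
    """Build a SQL Server authentication connection string (ADO.NET style)."""
    return (
        f"Server=tcp:{server}.database.windows.net,1433;"
        f"Database={database};User ID={user};Password={password};"
        "Trusted_Connection=False;Encrypt=True;Connection Timeout=30;"
    )


def managed_identity_connection_string(
    database, subscription_id, resource_group, sql_server, timeout_seconds=30
):
    """Build a managed identity connection string that carries no secrets."""
    resource_id = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Sql/servers/{sql_server}/"
    )
    # Assumption: the timeout value is a number of seconds.
    return (
        f"Database={database};ResourceId={resource_id};"
        f"Connection Timeout={timeout_seconds};"
    )


def data_source_definition(name, connection_string, table_or_view):
    """Assemble the request body for the data source definition."""
    return {
        "name": name,
        "type": "azuresql",  # required
        "credentials": {"connectionString": connection_string},
        "container": {"name": table_or_view},
    }


# Placeholder values for illustration only.
body = data_source_definition(
    "myazuresqldatasource",
    managed_identity_connection_string(
        "mydatabase", "<subscription ID>", "<resource group>", "<SQL Server name>"
    ),
    "mytable",
)
print(json.dumps(body, indent=2))
```

The resulting dictionary corresponds to the JSON body shown earlier in this section, minus the optional properties that are set to null there.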
6586

6687
## Add search fields to an index
6788

68-
In a [search index](search-what-is-an-index.md), add fields to accept values from corresponding fields in the SQL database. Ensure that the search index schema is compatible with source schema, with [equivalent data types](#TypeMapping).
89+
In a [search index](search-what-is-an-index.md), add fields that correspond to the fields in the SQL database. Ensure that the search index schema is compatible with the source schema by using [equivalent data types](#TypeMapping).
6990

7091
1. [Create or update an index](/rest/api/searchservice/create-index) to define search fields that will store data:
7192

@@ -94,9 +115,9 @@ In a [search index](search-what-is-an-index.md), add fields to accept values fro
94115
}
95116
```
96117
97-
1. Create a document key field ("key": true) that uniquely identifies each search document. This is the only field that's required. Typically, the table's primary key is mapped to the index key field. The document key must be unique and non-null. The values can be numeric in source data, but in a search index, a key is always a string.
118+
1. Create a document key field ("key": true) that uniquely identifies each search document. This is the only field that's required in a search index. Typically, the table's primary key is mapped to the index key field. The document key must be unique and non-null. The values can be numeric in source data, but in a search index, a key is always a string.
98119
99-
1. Create additional fields for more searchable content. See [Create an index](search-how-to-create-search-index.md) for details.
120+
1. Create more fields to add more searchable content. See [Create an index](search-how-to-create-search-index.md) for guidance.
100121
101122
<a name="TypeMapping"></a>
102123
@@ -138,7 +159,8 @@ Once the index and data source have been created, you're ready to create the ind
138159
"maxFailedItemsPerBatch": 0,
139160
"base64EncodeKeys": false,
140161
"configuration": {
141-
"queryTimeout": "00:05:00",
162+
"queryTimeout": "00:04:00",
163+
"convertHighWaterMarkToRowVersion": false,
142164
"disableOrderByHighWaterMarkColumn": false
143165
}
144166
},
@@ -147,7 +169,13 @@ Once the index and data source have been created, you're ready to create the ind
147169
}
148170
```
149171
150-
1. Under parameter configuration, you can set a timeout for SQL query execution. In the example above, the timeout is 5 minutes. The second configuration setting is "disableOrderByHighWaterMarkColumn". It causes the SQL query used by the [high water mark policy](#HighWaterMarkPolicy) to omit the ORDER BY clause.
172+
1. Under parameters, the configuration section has parameters that are specific to Azure SQL:
173+
174+
+ "queryTimeout" sets the timeout for SQL query execution. The default is 5 minutes, which you can override. The example above sets it to 4 minutes.
175+
176+
+ "convertHighWaterMarkToRowVersion" optimizes for the [High Water Mark change detection policy](#HighWaterMarkPolicy). Change detection policies are set in the data source. If you're using the native change detection policy, this parameter has no effect.
177+
178+
+ "disableOrderByHighWaterMarkColumn" causes the SQL query used by the [high water mark policy](#HighWaterMarkPolicy) to omit the ORDER BY clause. If you're using the native change detection policy, this parameter has no effect.
151179
152180
1. [Specify field mappings](search-indexer-field-mappings.md) if there are differences in field name or type, or if you need multiple versions of a source field in the search index.
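
As a hedged sketch of the Azure SQL-specific settings described above, the following Python snippet builds the indexer's "parameters" section. The helper names `format_query_timeout` and `indexer_parameters` are hypothetical; only the property names come from the example in this article.

```python
from datetime import timedelta


def format_query_timeout(timeout):
    """Format a timedelta as the hh:mm:ss string used by "queryTimeout"."""
    total_seconds = int(timeout.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def indexer_parameters(
    query_timeout=timedelta(minutes=5),
    convert_high_water_mark_to_row_version=False,
    disable_order_by_high_water_mark_column=False,
):
    """Build the "parameters" section of an Azure SQL indexer definition."""
    return {
        "maxFailedItemsPerBatch": 0,
        "base64EncodeKeys": False,
        "configuration": {
            "queryTimeout": format_query_timeout(query_timeout),
            # The next two settings only matter for the high water mark policy.
            "convertHighWaterMarkToRowVersion": convert_high_water_mark_to_row_version,
            "disableOrderByHighWaterMarkColumn": disable_order_by_high_water_mark_column,
        },
    }


# Override the 5-minute default with a 4-minute timeout.
params = indexer_parameters(query_timeout=timedelta(minutes=4))
print(params["configuration"]["queryTimeout"])
```

Serializing `params` with `json.dumps` gives a fragment that can be placed under "parameters" in the Create Indexer request body.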
153181
@@ -291,9 +319,9 @@ api-key: admin-key
291319

292320
If you're using a [rowversion](/sql/t-sql/data-types/rowversion-transact-sql) data type for the high water mark column, consider setting the `convertHighWaterMarkToRowVersion` property in indexer configuration. Setting this property to true results in the following behaviors:
293321

294-
* Uses the rowversion data type for the high water mark column in the indexer SQL query. Using the correct data type improves indexer query performance.
322+
+ Uses the rowversion data type for the high water mark column in the indexer SQL query. Using the correct data type improves indexer query performance.
295323

296-
* Subtracts one from the rowversion value before the indexer query runs. Views with one-to-many joins may have rows with duplicate rowversion values. Subtracting one ensures the indexer query doesn't miss these rows.
324+
+ Subtracts one from the rowversion value before the indexer query runs. Views with one-to-many joins may have rows with duplicate rowversion values. Subtracting one ensures the indexer query doesn't miss these rows.
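
To see why subtracting one helps, consider the simplified model below. It is an assumption made for illustration, not the indexer's actual implementation: rows are plain dictionaries, `fetch_changed` is a hypothetical stand-in for the change detection query, and the previous run is assumed to have stopped between two rows that share a rowversion value.

```python
def fetch_changed(rows, high_water_mark, subtract_one=False):
    """Return rows whose rowversion is greater than the effective mark."""
    mark = high_water_mark - 1 if subtract_one else high_water_mark
    return [row for row in rows if row["rv"] > mark]


# A view with a one-to-many join: both items of order-1 share rowversion 7.
rows = [
    {"id": "order-1/item-a", "rv": 7},
    {"id": "order-1/item-b", "rv": 7},
    {"id": "order-2/item-a", "rv": 9},
]

# Assume the previous run processed only the first row and stored 7.
without_adjustment = fetch_changed(rows, 7)
with_adjustment = fetch_changed(rows, 7, subtract_one=True)

# Without the adjustment, order-1/item-b is skipped.
print([row["id"] for row in without_adjustment])
# With the adjustment, rows at rowversion 7 are read again.
print([row["id"] for row in with_adjustment])
```

In this model the adjusted query reprocesses a row that was already indexed, which is the trade-off for not missing rows with duplicate rowversion values.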
297325

298326
To enable this property, create or update the indexer with the following configuration:
299327

@@ -358,6 +386,10 @@ If you are setting up a soft delete policy from the Azure portal, don't add quot
358386

359387
## FAQ
360388

389+
**Q: Can I index Always Encrypted columns?**
390+
391+
No. [Always Encrypted](/sql/relational-databases/security/encryption/always-encrypted-database-engine) columns are not currently supported by Cognitive Search indexers.
392+
361393
**Q: Can I use Azure SQL indexer with SQL databases running on IaaS VMs in Azure?**
362394

363395
Yes. However, you need to allow your search service to connect to your database. For more information, see [Configure a connection from an Azure Cognitive Search indexer to SQL Server on an Azure VM](search-howto-connecting-azure-sql-iaas-to-azure-search-using-indexers.md).
