articles/search/search-howto-connecting-azure-sql-database-to-azure-search-using-indexers.md
Lines changed: 51 additions & 19 deletions
Original file line number
Diff line number
Diff line change
@@ -8,31 +8,42 @@ author: HeidiSteen
8
8
ms.author: heidist
9
9
ms.service: cognitive-search
10
10
ms.topic: how-to
11
-
ms.date: 06/09/2022
11
+
ms.date: 07/25/2022
12
12
---
13
13
14
14
# Index data from Azure SQL
15
15
16
16
In this article, learn how to configure an [**indexer**](search-indexer-overview.md) that imports content from Azure SQL Database or an Azure SQL managed instance and makes it searchable in Azure Cognitive Search.
17
17
18
-
This article supplements [**Create an indexer**](search-howto-create-indexers.md) with information that's specific to Azure SQL. It uses the REST APIs to demonstrate a three-part workflow common to all indexers: create a data source, create an index, create an indexer. Data extraction occurs when you submit the Create Indexer request.
18
+
This article supplements [**Create an indexer**](search-howto-create-indexers.md) with information that's specific to Azure SQL. It uses the REST APIs to demonstrate a three-part workflow common to all indexers: create a data source, create an index, create an indexer.
19
+
20
+
This article also provides:
21
+
22
+
+ A description of the change detection policies supported by the Azure SQL indexer so that you can set up incremental indexing.
23
+
24
+
+ A frequently-asked-questions (FAQ) section for answers to questions about feature compatibility.
19
25
20
26
> [!NOTE]
21
27
> [Always Encrypted](/sql/relational-databases/security/encryption/always-encrypted-database-engine) columns are not currently supported by Cognitive Search indexers.
22
28
23
29
## Prerequisites
24
30
25
-
+ An [Azure SQL database](/azure/azure-sql/database/sql-database-paas-overview) with data in a single table or view. Use a table if you want the ability to [index incremental updates](#CaptureChangedRows) using SQL's native change detection capabilities. If you use a view, take into consideration that large views are not ideal for SQL indexer. For such cases, it is suggested to change your application to create an additional single table just for ingestion into your Cognitive Search index with integrated change tracking enabled, where each column matches a column in the index, so processing is optimized. This approach will help using SQL integrated change tracking, which is easier to implement than High Water Mark.
31
+
+ An [Azure SQL database](/azure/azure-sql/database/sql-database-paas-overview) with data in a single table or view.
26
32
27
-
+ Read permissions. Azure Cognitive Search supports SQL Server authentication, where the user name and password are provided on the connection string. Alternatively, you can [set up a managed identity and use Azure roles](search-howto-managed-identities-sql.md) to omit credentials on the connection.
33
+
Use a table if your data is over 100,000 rows or if you need [incremental indexing](#CaptureChangedRows) using SQL's native change detection capabilities.
28
34
29
-
+ A REST client, such as [Postman](search-get-started-rest.md) or [Visual Studio Code with the extension for Azure Cognitive Search](search-get-started-vs-code.md) to send REST calls that create the data source, index, and indexer.
35
+
Use a view if you need to consolidate data from multiple tables. Large views are not ideal for SQL indexer. A workaround is to create a new single table just for ingestion into your Cognitive Search index. You'll be able to use SQL integrated change tracking, which is easier to implement than High Water Mark.
30
36
31
-
+ If you're using the [Azure portal](https://portal.azure.com/) to create the data source, make sure that access to all public networks is enabled in the Azure SQL firewall. Alternatively, you can use REST API from a device with an authorized IP in the firewall rules to perform these operations. If the Azure SQL firewall has public networks access disabled, there will be errors when connecting from the portal to it.
37
+
+ Read permissions. Azure Cognitive Search supports SQL Server authentication, where the user name and password are provided on the connection string. Alternatively, you can [set up a managed identity and use Azure roles](search-howto-managed-identities-sql.md).
32
38
33
-
<!-- Real-time data synchronization must not be an application requirement. An indexer can reindex your table at most every five minutes. If your data changes frequently, and those changes need to be reflected in the index within seconds or single minutes, we recommend using the [REST API](/rest/api/searchservice/AddUpdate-or-Delete-Documents) or [.NET SDK](search-get-started-dotnet.md) to push updated rows directly.
39
+
To work through the examples in this article, you'll need a REST client, such as [Postman](search-get-started-rest.md) or [Visual Studio Code with the extension for Azure Cognitive Search](search-get-started-vs-code.md).
34
40
35
-
Incremental indexing is possible. If you have a large data set and plan to run the indexer on a schedule, Azure Cognitive Search must be able to efficiently identify new, changed, or deleted rows. Non-incremental indexing is only allowed if you're indexing on demand (not on schedule), or indexing fewer than 100,000 rows. For more information, see [Capturing Changed and Deleted Rows](#CaptureChangedRows) below. -->
41
+
Other approaches for creating an Azure SQL indexer include Azure SDKs or [Import data wizard](search-get-started-portal.md) in the Azure portal. If you're using Azure portal, make sure that access to all public networks is enabled in the Azure SQL firewall and that the client has access via an inbound rule.
42
+
43
+
> [!NOTE]
44
+
> Real-time data synchronization isn't possible with an indexer. An indexer can reindex your table at most every five minutes. If data updates need to be reflected in the index sooner, we recommend [pushing updated rows directly](tutorial-optimize-indexing-push-api.md).
45
+
46
+
<!-- Incremental indexing is possible. If you have a large data set and plan to run the indexer on a schedule, Azure Cognitive Search must be able to efficiently identify new, changed, or deleted rows. Full indexing is only allowed if you're indexing on demand (not on schedule), or indexing fewer than 100,000 rows. For more information, see [Capturing Changed and Deleted Rows](#CaptureChangedRows) below. -->
36
47
37
48
## Define the data source
38
49
@@ -47,25 +58,35 @@ The data source definition specifies the data to index, credentials, and policie
47
58
48
59
{
49
60
"name" : "myazuresqldatasource",
61
+
"description" : "A database for testing Azure Cognitive Search indexes.",
"container" : { "name" : "name of the table or view that you want to index" }
64
+
"container" : {
65
+
"name" : "name of the table or view that you want to index",
66
+
"query" : null (not supported in the Azure SQL indexer)
67
+
},
68
+
"dataChangeDetectionPolicy": null,
69
+
"dataDeletionDetectionPolicy": null,
70
+
"encryptionKey": null,
71
+
"identity": null
53
72
}
54
73
```
55
74
75
+
1. Provide a unique name for the data source that follows Azure Cognitive Search [naming conventions](/rest/api/searchservice/naming-rules).
76
+
56
77
1. Set "type" to `"azuresql"` (required).
57
78
58
79
1. Set "credentials" to a connection string:
59
80
60
-
+ You can get the connection string from the [Azure portal](https://portal.azure.com). Use the `ADO.NET connection string` option.
81
+
+ You can get a full access connection string from the [Azure portal](https://portal.azure.com). Use the `ADO.NET connection string` option. Set the user name and password.
61
82
62
-
+ You can specify a managed identity connection string that does not include database secrets with the following format: `Initial Catalog|Database=<your database name>;ResourceId=/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>/providers/Microsoft.Sql/servers/<your SQL Server name>/;Connection Timeout=connection timeout length;`.
83
+
+ Alternatively, you can specify a managed identity connection string that does not include database secrets with the following format: `Initial Catalog|Database=<your database name>;ResourceId=/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>/providers/Microsoft.Sql/servers/<your SQL Server name>/;Connection Timeout=connection timeout length;`.
63
84
64
-
To use this connection string, follow the instructions for [Setting up an indexer connection to an Azure SQL Database using a managed identity](search-howto-managed-identities-sql.md).
85
+
For more information, see [Connect to Azure SQL Database indexer using a managed identity](search-howto-managed-identities-sql.md).
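The data source steps above can be sketched in Python as a request body that you would send with your REST client. This is a minimal illustration, not part of the article's own samples: the helper names, the sample database, server, and table names, and the choice to express the timeout as a number of seconds are all assumptions made for the example.

```python
import json


def managed_identity_connection_string(database, subscription_id,
                                       resource_group, server,
                                       timeout_seconds=30):
    """Build a connection string in the managed identity format shown above.

    Hypothetical helper: the timeout unit (seconds) is an assumption.
    """
    resource_id = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Sql/servers/{server}/"
    )
    return (
        f"Database={database};ResourceId={resource_id};"
        f"Connection Timeout={timeout_seconds};"
    )


def build_data_source(name, table_or_view, connection_string):
    """Assemble a data source definition for the Azure SQL indexer."""
    return {
        "name": name,
        "type": "azuresql",  # required for Azure SQL
        "credentials": {"connectionString": connection_string},
        "container": {"name": table_or_view},
        "dataChangeDetectionPolicy": None,
        "dataDeletionDetectionPolicy": None,
    }


# Placeholder values, chosen only for illustration.
conn = managed_identity_connection_string(
    database="my-db",
    subscription_id="00000000-0000-0000-0000-000000000000",
    resource_group="my-rg",
    server="my-sql-server",
)
payload = build_data_source("myazuresqldatasource", "Hotels", conn)
body = json.dumps(payload, indent=2)
print(body)
```

The printed JSON is the body of the data source request; `None` values serialize to `null`, matching the unset policies in the definition above.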
65
86
66
87
## Add search fields to an index
67
88
68
-
In a [search index](search-what-is-an-index.md), add fields to accept values from corresponding fields in the SQL database. Ensure that the search index schema is compatible with source schema, with [equivalent data types](#TypeMapping).
89
+
In a [search index](search-what-is-an-index.md), add fields that correspond to the fields in the SQL database. Ensure that the search index schema is compatible with the source schema by using [equivalent data types](#TypeMapping).
69
90
70
91
1. [Create or update an index](/rest/api/searchservice/create-index) to define search fields that will store data:
71
92
@@ -94,9 +115,9 @@ In a [search index](search-what-is-an-index.md), add fields to accept values fro
94
115
}
95
116
```
96
117
97
-
1. Create a document key field ("key": true) that uniquely identifies each search document. This is the only field that's required. Typically, the table's primary key is mapped to the index key field. The document key must be unique and non-null. The values can be numeric in source data, but in a search index, a key is always a string.
118
+
1. Create a document key field ("key": true) that uniquely identifies each search document. This is the only field that's required in a search index. Typically, the table's primary key is mapped to the index key field. The document key must be unique and non-null. The values can be numeric in source data, but in a search index, a key is always a string.
98
119
99
-
1. Create additional fields for more searchable content. See [Create an index](search-how-to-create-search-index.md) for details.
120
+
1. Create more fields to add more searchable content. See [Create an index](search-how-to-create-search-index.md) for guidance.
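The document key rules above (unique, non-null, always a string in the index even when the source value is numeric) can be illustrated with a small Python sketch. It is not what the indexer runs internally; the function names and sample rows are hypothetical and exist only to make the constraints concrete.

```python
def row_to_document(row, key_column):
    """Copy a source row and express its key as a string."""
    if row.get(key_column) is None:
        raise ValueError(f"Key column '{key_column}' must be non-null")
    doc = dict(row)
    doc[key_column] = str(row[key_column])  # numeric in SQL, string in the index
    return doc


def assert_unique_keys(docs, key_column):
    """Raise if two documents share the same key value."""
    seen = set()
    for doc in docs:
        key = doc[key_column]
        if key in seen:
            raise ValueError(f"Duplicate document key: {key}")
        seen.add(key)


# Sample rows with an integer primary key (illustrative data only).
rows = [
    {"HotelId": 1, "HotelName": "Fancy Stay"},
    {"HotelId": 2, "HotelName": "Roach Motel"},
]
docs = [row_to_document(r, "HotelId") for r in rows]
assert_unique_keys(docs, "HotelId")
print(docs)
```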
100
121
101
122
<a name="TypeMapping"></a>
102
123
@@ -138,7 +159,8 @@ Once the index and data source have been created, you're ready to create the ind
138
159
"maxFailedItemsPerBatch": 0,
139
160
"base64EncodeKeys": false,
140
161
"configuration": {
141
-
"queryTimeout": "00:05:00",
162
+
"queryTimeout": "00:04:00",
163
+
"convertHighWaterMarkToRowVersion": false,
142
164
"disableOrderByHighWaterMarkColumn": false
143
165
}
144
166
},
@@ -147,7 +169,13 @@ Once the index and data source have been created, you're ready to create the ind
147
169
}
148
170
```
149
171
150
-
1. Under parameter configuration, you can set a timeout for SQL query execution. In the example above, the timeout is 5 minutes. The second configuration setting is "disableOrderByHighWaterMarkColumn". It causes the SQL query used by the [high water mark policy](#HighWaterMarkPolicy) to omit the ORDER BY clause.
172
+
1. Under parameters, the configuration section has parameters that are specific to Azure SQL:
173
+
174
+
+ Default query timeout for SQL query execution is 5 minutes, which you can override.
175
+
176
+
+ "convertHighWaterMarkToRowVersion" optimizes for the [High Water Mark change detection policy](#HighWaterMarkPolicy). Change detection policies are set in the data source. If you're using the native change detection policy, this parameter has no effect.
177
+
178
+
+ "disableOrderByHighWaterMarkColumn" causes the SQL query used by the [high water mark policy](#HighWaterMarkPolicy) to omit the ORDER BY clause. If you're using the native change detection policy, this parameter has no effect.
151
179
152
180
1. [Specify field mappings](search-indexer-field-mappings.md) if there are differences in field name or type, or if you need multiple versions of a source field in the search index.
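As a rough sketch of the Azure SQL configuration section described above, the following Python builds the `configuration` object with the three settings. The helper names are hypothetical; only the property names and the `hh:mm:ss` timeout format come from the example request.

```python
import json
from datetime import timedelta


def format_query_timeout(timeout):
    """Format a timedelta as the hh:mm:ss string used by queryTimeout."""
    total = int(timeout.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def build_sql_configuration(query_timeout=timedelta(minutes=5),
                            convert_high_water_mark=False,
                            disable_order_by=False):
    """Assemble the Azure SQL specific indexer configuration."""
    return {
        "queryTimeout": format_query_timeout(query_timeout),
        "convertHighWaterMarkToRowVersion": convert_high_water_mark,
        "disableOrderByHighWaterMarkColumn": disable_order_by,
    }


# Override the 5-minute default with a 4-minute timeout, as in the example.
configuration = build_sql_configuration(query_timeout=timedelta(minutes=4))
print(json.dumps({"parameters": {"configuration": configuration}}, indent=2))
```

The two high water mark settings only matter when the data source uses the High Water Mark change detection policy, so leaving them at `False` is harmless with the native policy.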
153
181
@@ -291,9 +319,9 @@ api-key: admin-key
291
319
292
320
If you're using a [rowversion](/sql/t-sql/data-types/rowversion-transact-sql) data type for the high water mark column, consider setting the `convertHighWaterMarkToRowVersion` property in indexer configuration. Setting this property to true results in the following behaviors:
293
321
294
-
* Uses the rowversion data type for the high water mark column in the indexer SQL query. Using the correct data type improves indexer query performance.
322
+
+ Uses the rowversion data type for the high water mark column in the indexer SQL query. Using the correct data type improves indexer query performance.
295
323
296
-
* Subtracts one from the rowversion value before the indexer query runs. Views with one-to-many joins may have rows with duplicate rowversion values. Subtracting one ensures the indexer query doesn't miss these rows.
324
+
+ Subtracts one from the rowversion value before the indexer query runs. Views with one-to-many joins may have rows with duplicate rowversion values. Subtracting one ensures the indexer query doesn't miss these rows.
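To see why subtracting one helps, consider a toy model in Python. It assumes the indexer selects rows whose rowversion is strictly greater than the stored high water mark, and that a previous run recorded the mark after processing only the first of two rows that share a rowversion. Both assumptions, along with the sample rows and function name, are illustrative rather than a description of the indexer's actual query.

```python
# Rows from a view with a one-to-many join; "b" and "c" share rowversion 5.
rows = [
    {"id": "a", "rv": 4},
    {"id": "b", "rv": 5},
    {"id": "c", "rv": 5},
    {"id": "d", "rv": 6},
]


def changed_rows(rows, high_water_mark, subtract_one):
    """Return ids of rows the next run would pick up (toy model)."""
    threshold = high_water_mark - 1 if subtract_one else high_water_mark
    return [row["id"] for row in rows if row["rv"] > threshold]


# Assume the last run stopped after row "b" and stored 5 as the mark.
without_adjustment = changed_rows(rows, 5, subtract_one=False)
with_adjustment = changed_rows(rows, 5, subtract_one=True)

print(without_adjustment)  # row "c" is skipped
print(with_adjustment)     # row "c" is included; "b" is reprocessed
```

In this model the adjusted query reprocesses row "b", which is the cost of guaranteeing that row "c" is not missed.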
297
325
298
326
To enable this property, create or update the indexer with the following configuration:
299
327
@@ -358,6 +386,10 @@ If you are setting up a soft delete policy from the Azure portal, don't add quot
358
386
359
387
## FAQ
360
388
389
+
**Q: Can I index Always Encrypted columns?**
390
+
391
+
No. [Always Encrypted](/sql/relational-databases/security/encryption/always-encrypted-database-engine) columns are not currently supported by Cognitive Search indexers.
392
+
361
393
**Q: Can I use Azure SQL indexer with SQL databases running on IaaS VMs in Azure?**
362
394
363
395
Yes. However, you need to allow your search service to connect to your database. For more information, see [Configure a connection from an Azure Cognitive Search indexer to SQL Server on an Azure VM](search-howto-connecting-azure-sql-iaas-to-azure-search-using-indexers.md).