File: articles/search/search-analyzers.md
+7 -3 (7 additions, 3 deletions)
Original file line number
Diff line number
Diff line change
@@ -1,13 +1,13 @@
1
1
---
2
2
title: Analyzers for linguistic and text processing
3
3
titleSuffix: Azure AI Search
4
-
description: Assign analyzers to searchable text fields in an index to replace default standard Lucene with custom, predefined or language-specific alternatives.
4
+
description: Assign analyzers to searchable string fields in an index to replace default standard Lucene with custom, predefined or language-specific alternatives.
5
5
author: HeidiSteen
6
6
manager: nitinme
7
7
ms.author: heidist
8
8
ms.service: cognitive-search
9
9
ms.topic: conceptual
10
-
ms.date: 07/19/2023
10
+
ms.date: 05/23/2024
11
11
ms.custom:
12
12
- devx-track-csharp
13
13
- ignite-2023
@@ -38,7 +38,11 @@ In Azure AI Search, an analyzer is automatically invoked on all string fields ma
38
38
39
39
By default, Azure AI Search uses the [Apache Lucene Standard analyzer (standard lucene)](https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/analysis/standard/StandardAnalyzer.html), which breaks text into elements following the ["Unicode Text Segmentation"](https://unicode.org/reports/tr29/) rules. The standard analyzer converts all characters to their lower case form. Both indexed documents and search terms go through the analysis during indexing and query processing.
40
40
41
-
You can override the default on a field-by-field basis. Alternative analyzers can be a [language analyzer](index-add-language-analyzers.md) for linguistic processing, a [custom analyzer](index-add-custom-analyzers.md), or a built-in analyzer from the [list of available analyzers](index-add-custom-analyzers.md#built-in-analyzers).
41
+
You can override the default on a field-by-field basis. Alternative analyzers are:
42
+
43
+
+ [language analyzer](index-add-language-analyzers.md) for linguistic processing
44
+
+ [custom analyzer](index-add-custom-analyzers.md) for custom configurations
45
+
+ [built-in analyzers](index-add-custom-analyzers.md#built-in-analyzers) for as-is usage
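Note on the field-by-field override described in this hunk: the following is a minimal Python sketch of what an index definition with a per-field analyzer could look like. The index name, field names, and helper functions are hypothetical and are not part of this commit; `en.microsoft` is used as an example language analyzer, and a field with no `analyzer` property is assumed to fall back to the default standard Lucene analyzer.

```python
import json


def build_index_definition(index_name: str) -> dict:
    """Build an index payload in which one field overrides the default analyzer."""
    return {
        "name": index_name,
        "fields": [
            # Key field: not searchable, so no analyzer applies to it.
            {"name": "id", "type": "Edm.String", "key": True, "searchable": False},
            # No "analyzer" property: the default standard Lucene analyzer is used.
            {"name": "description", "type": "Edm.String", "searchable": True},
            # Per-field override with a language analyzer (example value).
            {
                "name": "description_en",
                "type": "Edm.String",
                "searchable": True,
                "analyzer": "en.microsoft",
            },
        ],
    }


def analyzer_overrides(index_definition: dict) -> dict:
    """Return {field name: analyzer} for fields that set an explicit analyzer."""
    return {
        field["name"]: field["analyzer"]
        for field in index_definition["fields"]
        if "analyzer" in field
    }


if __name__ == "__main__":
    definition = build_index_definition("hotels-sample")  # hypothetical index name
    print(json.dumps(definition, indent=2))
    print(analyzer_overrides(definition))
```

The sketch only builds the payload; sending it to the service is left out so that it runs without credentials.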
File: articles/search/search-howto-connecting-azure-sql-database-to-azure-search-using-indexers.md
+14 -14 (14 additions, 14 deletions)
@@ -10,7 +10,7 @@ ms.service: cognitive-search
10
10
ms.custom:
11
11
- ignite-2023
12
12
ms.topic: how-to
13
-
ms.date: 07/31/2023
13
+
ms.date: 05/23/2024
14
14
---
15
15
16
16
# How to index data from Azure SQL in Azure AI Search
@@ -49,7 +49,7 @@ The data source definition specifies the data to index, credentials, and policie
49
49
1. [Create data source](/rest/api/searchservice/create-data-source) or [Update data source](/rest/api/searchservice/update-data-source) to set its definition:
50
50
51
51
```http
52
-
POST https://myservice.search.windows.net/datasources?api-version=2020-06-30
52
+
POST https://myservice.search.windows.net/datasources?api-version=2023-11-01
53
53
Content-Type: application/json
54
54
api-key: admin-key
55
55
@@ -88,7 +88,7 @@ In a [search index](search-what-is-an-index.md), add fields that correspond to t
88
88
1. [Create or update an index](/rest/api/searchservice/create-index) to define search fields that will store data:
89
89
90
90
```http
91
-
POST https://[service name].search.windows.net/indexes?api-version=2020-06-30
91
+
POST https://[service name].search.windows.net/indexes?api-version=2023-11-01
92
92
Content-Type: application/json
93
93
api-key: [Search service admin key]
94
94
{
@@ -141,7 +141,7 @@ Once the index and data source have been created, you're ready to create the ind
141
141
1. [Create or update an indexer](/rest/api/searchservice/create-indexer) by giving it a name and referencing the data source and target index:
142
142
143
143
```http
144
-
POST https://[service name].search.windows.net/indexers?api-version=2020-06-30
144
+
POST https://[service name].search.windows.net/indexers?api-version=2023-11-01
145
145
Content-Type: application/json
146
146
api-key: [search service admin key]
147
147
{
@@ -170,7 +170,7 @@ Once the index and data source have been created, you're ready to create the ind
170
170
171
171
+ Default query timeout for SQL query execution is 5 minutes, which you can override.
172
172
173
-
+ "convertHighWaterMarkToRowVersion" optimizes for the [High Water Mark change detection policy](#HighWaterMarkPolicy). Change detection policies are set in the data source. If you're using the native change detection policy, this parameter has no effect.
173
+
+ "convertHighWaterMarkToRowVersion" optimizes for the [High Water Mark Change Detection policy](#HighWaterMarkPolicy). Change detection policies are set in the data source. If you're using the native change detection policy, this parameter has no effect.
174
174
175
175
+ "disableOrderByHighWaterMarkColumn" causes the SQL query used by the [high water mark policy](#HighWaterMarkPolicy) to omit the ORDER BY clause. If you're using the native change detection policy, this parameter has no effect.
176
176
@@ -185,7 +185,7 @@ An indexer runs automatically when it's created. You can prevent this by setting
185
185
To monitor the indexer status and execution history, send a [Get Indexer Status](/rest/api/searchservice/get-indexer-status) request:
186
186
187
187
```http
188
-
GET https://myservice.search.windows.net/indexers/myindexer/status?api-version=2020-06-30
188
+
GET https://myservice.search.windows.net/indexers/myindexer/status?api-version=2023-11-01
189
189
Content-Type: application/json
190
190
api-key: [admin key]
191
191
```
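Note on the `api-version` bump in the requests above: as a rough illustration, the Get Indexer Status call could be assembled in Python as sketched below. The function name and its parameters are hypothetical; the URL shape, the `api-key` header, and the `2023-11-01` version string are taken from the request shown in this hunk. The request object is only constructed, not sent, so the sketch runs without a real service or key.

```python
import urllib.request

# API version introduced by this change; older examples used 2020-06-30.
DEFAULT_API_VERSION = "2023-11-01"


def build_indexer_status_request(
    service_name: str,
    indexer_name: str,
    admin_key: str,
    api_version: str = DEFAULT_API_VERSION,
) -> urllib.request.Request:
    """Construct (but do not send) a Get Indexer Status request."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexers/{indexer_name}/status?api-version={api_version}"
    )
    headers = {
        "Content-Type": "application/json",
        "api-key": admin_key,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    request = build_indexer_status_request("myservice", "myindexer", "admin-key")
    print(request.get_method(), request.full_url)
    # To actually call the service, pass the request to urllib.request.urlopen.
```

Keeping the version string in one constant means a later API version change touches a single line rather than every request.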
@@ -251,12 +251,12 @@ Database requirements:
251
251
+ Tables only (no views)
252
252
+ On the database, [enable change tracking](/sql/relational-databases/track-changes/enable-and-disable-change-tracking-sql-server) for the table
253
253
+ No composite primary key (a primary key containing more than one column) on the table
254
-
+ No clustered indexes on the table. As a workaround, any clustered index would have to be dropped and re-created as nonclustered index, however, performance may be affected in the source compared to having a clustered index
254
+
+ No clustered indexes on the table. As a workaround, any clustered index would have to be dropped and re-created as nonclustered index, however, performance might be affected in the source compared to having a clustered index
255
255
256
256
Change detection policies are added to data source definitions. To use this policy, create or update your data source like this:
257
257
258
258
```http
259
-
POST https://myservice.search.windows.net/datasources?api-version=2020-06-30
259
+
POST https://myservice.search.windows.net/datasources?api-version=2023-11-01
260
260
Content-Type: application/json
261
261
api-key: admin-key
262
262
{
@@ -293,7 +293,7 @@ The high water mark column must meet the following requirements:
293
293
Change detection policies are added to data source definitions. To use this policy, create or update your data source like this:
294
294
295
295
```http
296
-
POST https://myservice.search.windows.net/datasources?api-version=2020-06-30
296
+
POST https://myservice.search.windows.net/datasources?api-version=2023-11-01
297
297
Content-Type: application/json
298
298
api-key: admin-key
299
299
{
@@ -319,7 +319,7 @@ If you're using a [rowversion](/sql/t-sql/data-types/rowversion-transact-sql) da
319
319
320
320
+ Uses the rowversion data type for the high water mark column in the indexer SQL query. Using the correct data type improves indexer query performance.
321
321
322
-
+ Subtracts one from the rowversion value before the indexer query runs. Views with one-to-many joins may have rows with duplicate rowversion values. Subtracting one ensures the indexer query doesn't miss these rows.
322
+
+ Subtracts one from the rowversion value before the indexer query runs. Views with one-to-many joins might have rows with duplicate rowversion values. Subtracting one ensures the indexer query doesn't miss these rows.
323
323
324
324
To enable this property, create or update the indexer with the following configuration:
325
325
@@ -349,7 +349,7 @@ If you encounter timeout errors, set the `queryTimeout` indexer configuration se
349
349
350
350
##### disableOrderByHighWaterMarkColumn
351
351
352
-
You can also disable the `ORDER BY [High Water Mark Column]` clause. However, this isn't recommended because if the indexer execution is interrupted by an error, the indexer has to re-process all rows if it runs later, even if the indexer has already processed almost all the rows at the time it was interrupted. To disable the `ORDER BY` clause, use the `disableOrderByHighWaterMarkColumn` setting in the indexer definition:
352
+
You can also disable the `ORDER BY [High Water Mark Column]` clause. However, this isn't recommended because if the indexer execution is interrupted by an error, the indexer has to reprocess all rows if it runs later, even if the indexer has already processed almost all the rows at the time it was interrupted. To disable the `ORDER BY` clause, use the `disableOrderByHighWaterMarkColumn` setting in the indexer definition:
353
353
354
354
```http
355
355
{
@@ -363,7 +363,7 @@ You can also disable the `ORDER BY [High Water Mark Column]` clause. However, th
363
363
364
364
When rows are deleted from the source table, you probably want to delete those rows from the search index as well. If you use the SQL integrated change tracking policy, this is taken care of for you. However, the high water mark change tracking policy doesn’t help you with deleted rows. What to do?
365
365
366
-
If the rows are physically removed from the table, Azure AI Search has no way to infer the presence of records that no longer exist. However, you can use the “soft-delete” technique to logically delete rows without removing them from the table. Add a column to your table or view and mark rows as deleted using that column.
366
+
If the rows are physically removed from the table, Azure AI Search has no way to infer the presence of records that no longer exist. However, you can use the “soft-delete” technique to logically delete rows without removing them from the table. Add a column to your table or view and mark rows as deleted using that column.
367
367
368
368
When using the soft-delete technique, you can specify the soft delete policy as follows when creating or updating the data source:
369
369
@@ -386,7 +386,7 @@ If you're setting up a soft delete policy from the Azure portal, don't add quote
386
386
387
387
**Q: Can I index Always Encrypted columns?**
388
388
389
-
No. [Always Encrypted](/sql/relational-databases/security/encryption/always-encrypted-database-engine) columns aren't currently supported by Azure AI Search indexers.
389
+
No, [Always Encrypted](/sql/relational-databases/security/encryption/always-encrypted-database-engine) columns aren't currently supported by Azure AI Search indexers.
390
390
391
391
**Q: Can I use Azure SQL indexer with SQL databases running on IaaS VMs in Azure?**
392
392
@@ -412,7 +412,7 @@ If you attempt to use rowversion on a read-only replica, you'll see the followin
412
412
413
413
**Q: Can I use an alternative, non-rowversion column for high water mark change tracking?**
414
414
415
-
It's not recommended. Only **rowversion** allows for reliable data synchronization. However, depending on your application logic, it may be safe if:
415
+
It's not recommended. Only **rowversion** allows for reliable data synchronization. However, depending on your application logic, it can be safe if:
416
416
417
417
+ You can ensure that when the indexer runs, there are no outstanding transactions on the table that’s being indexed (for example, all table updates happen as a batch on a schedule, and the Azure AI Search indexer schedule is set to avoid overlapping with the table update schedule).