Commit 398b33d

Merge pull request #231893 from HeidiSteen/heidist-refresh
[azure search] GH issue resolution for table index doc key
2 parents 60edfcf + e05a271 commit 398b33d

3 files changed

+77
-40
lines changed

articles/search/search-howto-indexing-azure-tables.md

Lines changed: 75 additions & 38 deletions
Original file line number | Diff line number | Diff line change
@@ -22,34 +22,46 @@ This article supplements [**Create an indexer**](search-howto-create-indexers.md
2222

2323
+ [Azure Table Storage](../storage/tables/table-storage-overview.md)
2424

25-
+ Tables containing text. If you have binary data, you can include [AI enrichment](cognitive-search-concept-intro.md) for image analysis.
25+
+ Tables containing text. If you have binary data, consider [AI enrichment](cognitive-search-concept-intro.md) for image analysis.
2626

27-
+ Read permissions to access Azure Storage. A "full access" connection string includes a key that gives access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Data and Reader** permissions.
27+
+ Read permissions on Azure Storage. A "full access" connection string includes a key that gives access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Data and Reader** permissions.
2828

29-
+ A REST client, such as [Postman](search-get-started-rest.md), to send REST calls that create the data source, index, and indexer.
29+
+ Use a REST client, such as [Postman app](https://www.postman.com/downloads/), if you want to formulate REST calls similar to the ones shown in this article.
3030

3131
## Define the data source
3232

33-
The data source definition specifies the data to index, credentials, and policies for identifying changes in the data. A data source is defined as an independent resource so that it can be used by multiple indexers.
33+
The data source definition specifies the source data to index, credentials, and policies for change detection. A data source is an independent resource that can be used by multiple indexers.
3434

35-
1. [Create or update a data source](/rest/api/searchservice/create-data-source) to set its definition:
35+
1. [Create or update a data source](/rest/api/searchservice/create-data-source) to set its definition:
3636

37-
```json
37+
```http
38+
POST https://[service name].search.windows.net/datasources?api-version=2020-06-30
3839
{
39-
"name" : "hotel-tables",
40-
"type" : "azuretable",
41-
"credentials" : { "connectionString" : "DefaultEndpointsProtocol=https;AccountName=<account name>;AccountKey=<account key>;" },
42-
"container" : { "name" : "tblHotels", "query" : "PartitionKey eq '123'" }
40+
"name": "my-table-storage-ds",
41+
"description": null,
42+
"type": "azuretable",
43+
"subtype": null,
44+
"credentials": {
45+
"connectionString": "DefaultEndpointsProtocol=https;AccountName=<account name>"
46+
},
47+
"container": {
48+
"name": "my-table-in-azure-storage",
49+
"query": ""
50+
},
51+
"dataChangeDetectionPolicy": null,
52+
"dataDeletionDetectionPolicy": null,
53+
"encryptionKey": null,
54+
"identity": null
4355
}
44-
```
56+
```
4557

4658
1. Set "type" to `"azuretable"` (required).
4759

4860
1. Set "credentials" to an Azure Storage connection string. The next section describes the supported formats.
4961

5062
1. Set "container" to the name of the table.
5163

52-
1. Optionally, set "query" to a filter on PartitionKey. This is a best practice that improves performance. If "query" is specified any other way, the indexer will execute a full table scan, resulting in poor performance if the tables are large.
64+
1. Optionally, set "query" to a filter on PartitionKey. Setting this property is a best practice that improves performance. If "query" is null, the indexer executes a full table scan, which can result in poor performance if the tables are large.
5365

5466
A data source definition can also include [soft deletion policies](search-howto-index-changed-deleted-blobs.md), if you want the indexer to delete a search document when the source document is flagged for deletion.
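The data source steps in this hunk can be sketched in Python as a small helper that assembles the request body. This is only an illustration under stated assumptions: the function name `build_table_data_source` and its parameters are hypothetical, the connection string format is the one shown in the removed line of this diff, and doubling single quotes in the filter value follows the usual OData string-literal convention rather than anything stated in the article.

```python
import json


def build_table_data_source(name, table, account_name, account_key, partition_key=None):
    """Build a data source definition (as a dict) for an Azure Table indexer.

    Hypothetical helper for illustration; it only assembles the JSON body
    and does not call the search service.
    """
    connection_string = (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};AccountKey={account_key};"
    )
    container = {"name": table}
    if partition_key is not None:
        # Assumption: single quotes are escaped by doubling them (OData convention).
        escaped = partition_key.replace("'", "''")
        # Filtering on PartitionKey lets the indexer avoid a full table scan.
        container["query"] = f"PartitionKey eq '{escaped}'"
    return {
        "name": name,
        "type": "azuretable",  # required for table indexers
        "credentials": {"connectionString": connection_string},
        "container": container,
    }


if __name__ == "__main__":
    body = build_table_data_source(
        "my-table-storage-ds",
        "my-table-in-azure-storage",
        "<account name>",
        "<account key>",
        partition_key="123",
    )
    print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of the `POST .../datasources` request shown in the diff. Leaving `partition_key` unset omits "query", which corresponds to the full-table-scan case the article warns about.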
5567

@@ -80,7 +92,7 @@ Indexers can connect to a table using the following connections.
8092
| The SAS should have the list and read permissions on the container. For more information, see [Using Shared Access Signatures](../storage/common/storage-sas-overview.md). |
8193

8294
> [!NOTE]
83-
> If you use SAS credentials, you will need to update the data source credentials periodically with renewed signatures to prevent their expiration. If SAS credentials expire, the indexer will fail with an error message similar to "Credentials provided in the connection string are invalid or have expired".
95+
> If you use SAS credentials, you'll need to update the data source credentials periodically with renewed signatures to prevent their expiration. When SAS credentials expire, the indexer will fail with an error message similar to "Credentials provided in the connection string are invalid or have expired".
8496
8597
<a name="Performance"></a>
8698

@@ -100,7 +112,7 @@ To avoid a full scan, you can use table partitions to narrow the scope of each i
100112

101113
+ Monitor indexer progress by using [Get Indexer Status API](/rest/api/searchservice/get-indexer-status), and periodically update the `<TimeStamp>` condition of the query based on the latest successful high-water-mark value.
102114

103-
+ With this approach, if you need to trigger a complete reindexing, you need to reset the data source query in addition to resetting the indexer.
115+
+ With this approach, if you need to trigger a full reindex, reset the data source query in addition to [resetting the indexer](search-howto-run-reset-indexers.md).
104116

105117
## Add search fields to an index
106118

@@ -113,47 +125,72 @@ In a [search index](search-what-is-an-index.md), add fields to accept the conten
113125
{
114126
"name" : "my-search-index",
115127
"fields": [
116-
{ "name": "ID", "type": "Edm.String", "key": true, "searchable": false },
128+
{ "name": "Key", "type": "Edm.String", "key": true, "searchable": false },
117129
{ "name": "SomeColumnInMyTable", "type": "Edm.String", "searchable": true }
118130
]
119131
}
120132
```
121133
122-
1. Create a document key field ("key": true), but allow the indexer to populate it automatically. Do not define a field mapping to alternative unique string field in your table.
134+
1. Create a document key field ("key": true), but allow the indexer to populate it automatically. A table indexer populates the key field with concatenated partition and row keys from the table. For example, if a row’s PartitionKey is `1` and RowKey is `1_123`, then the key value is `11_123`. If the partition key is null, just the row key is used.
135+
136+
If you're using the Import data wizard to create the index, the portal infers a "Key" field for the search index and uses an implicit field mapping to connect the source and destination fields. You don't have to add the field yourself, and you don't need to set up a field mapping.
123137
124-
A table indexer populates the key field with concatenated partition and row keys from the table. For example, if a row’s PartitionKey is `PK1` and RowKey is `RK1`, then the key value is `PK1RK1`. If the partition key is null, just the row key is used.
138+
If you're using the REST APIs and you want implicit field mappings, create and name the document key field "Key" in the search index definition as shown in the previous step (`{ "name": "Key", "type": "Edm.String", "key": true, "searchable": false }`). The indexer populates the Key field automatically, with no field mappings required.
125139
126-
1. Create additional fields that correspond to entity fields. For example, if an entity looks like the following example, your search index should have fields for HotelName, Description, and Category.
140+
If you don't want a field named "Key" in your search index, add an explicit field mapping in the indexer definition with the field name you want, setting the source field to "Key":
141+
142+
```json
143+
"fieldMappings" : [
144+
{
145+
"sourceFieldName" : "Key",
146+
"targetFieldName" : "MyDocumentKeyFieldName"
147+
}
148+
]
149+
```
150+
151+
1. Now add any other entity fields that you want in your index. For example, if an entity looks like the following example, your search index should have fields for HotelName, Description, and Category to receive those values.
127152

128153
:::image type="content" source="media/search-howto-indexing-tables/table.png" alt-text="Screenshot of table content in Storage browser." border="true":::
129154

130-
Using the same names and compatible [data types](/rest/api/searchservice/supported-data-types) minimizes the need for [field mappings](search-indexer-field-mappings.md).
155+
Using the same names and compatible [data types](/rest/api/searchservice/supported-data-types) minimizes the need for [field mappings](search-indexer-field-mappings.md). When names and types are the same, the indexer can determine the data path automatically.
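The key-population rule described in step 1 of this hunk can be illustrated with a short Python sketch. It mirrors only what the article states (partition key and row key are concatenated; a null partition key yields just the row key). The function name `build_document_key` is hypothetical, and this is not the indexer's actual implementation.

```python
def build_document_key(partition_key, row_key):
    """Return the document key a table indexer would derive, per the article's description.

    Illustrative sketch only: plain concatenation of PartitionKey and RowKey,
    falling back to RowKey alone when PartitionKey is null (None).
    """
    if partition_key is None:
        return row_key
    return f"{partition_key}{row_key}"


if __name__ == "__main__":
    # Values taken from the examples in the diff (added and removed lines).
    print(build_document_key("1", "1_123"))
    print(build_document_key("PK1", "RK1"))
    print(build_document_key(None, "RK1"))
```

A helper like this can be handy when checking that documents retrieved from the index match the expected table rows.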
131156

132157
## Configure and run the table indexer
133158

134-
Once the index and data source have been created, you're ready to create the indexer. Indexer configuration specifies the inputs, parameters, and properties controlling run time behaviors.
159+
Once you have an index and data source, you're ready to create the indexer. Indexer configuration specifies the inputs, parameters, and properties controlling run time behaviors.
135160

136161
1. [Create or update an indexer](/rest/api/searchservice/create-indexer) by giving it a name and referencing the data source and target index:
137162

138163
```http
139164
POST https://[service name].search.windows.net/indexers?api-version=2020-06-30
140165
{
141-
"name" : "table-indexer",
142-
"dataSourceName" : "my-table-datasource",
166+
"name" : "my-table-indexer",
167+
"dataSourceName" : "my-table-storage-ds",
143168
"targetIndexName" : "my-search-index",
144-
"parameters": {
145-
"batchSize": null,
146-
"maxFailedItems": null,
147-
"maxFailedItemsPerBatch": null,
148-
"base64EncodeKeys": null,
149-
"configuration:" { }
169+
"disabled": null,
170+
"schedule": null,
171+
"parameters" : {
172+
"batchSize" : null,
173+
"maxFailedItems" : null,
174+
"maxFailedItemsPerBatch" : null,
175+
"base64EncodeKeys" : null,
176+
"configuration" : { }
150177
},
151-
"schedule" : { },
152-
"fieldMappings" : [ ]
178+
"fieldMappings" : [ ],
179+
"cache": null,
180+
"encryptionKey": null
153181
}
154182
```
155183
156-
1. [Specify field mappings](search-indexer-field-mappings.md) if there are differences in field name or type, or if you need multiple versions of a source field in the search index.
184+
1. [Specify field mappings](search-indexer-field-mappings.md) if there are differences in field name or type, or if you need multiple versions of a source field in the search index. The Target field is the name of the field in the search index.
185+
186+
```json
187+
"fieldMappings" : [
188+
{
189+
"sourceFieldName" : "Description",
190+
"targetFieldName" : "HotelDescription"
191+
}
192+
]
193+
```
157194

158195
1. See [Create an indexer](search-howto-create-indexers.md) for more information about other properties.
159196

@@ -177,8 +214,8 @@ The response includes status and the number of items processed. It should look s
177214
"lastResult": {
178215
"status":"success",
179216
"errorMessage":null,
180-
"startTime":"2022-02-21T00:23:24.957Z",
181-
"endTime":"2022-02-21T00:36:47.752Z",
217+
"startTime":"2023-02-21T00:23:24.957Z",
218+
"endTime":"2023-02-21T00:36:47.752Z",
182219
"errors":[],
183220
"itemsProcessed":1599501,
184221
"itemsFailed":0,
@@ -190,8 +227,8 @@ The response includes status and the number of items processed. It should look s
190227
{
191228
"status":"success",
192229
"errorMessage":null,
193-
"startTime":"2022-02-21T00:23:24.957Z",
194-
"endTime":"2022-02-21T00:36:47.752Z",
230+
"startTime":"2023-02-21T00:23:24.957Z",
231+
"endTime":"2023-02-21T00:36:47.752Z",
195232
"errors":[],
196233
"itemsProcessed":1599501,
197234
"itemsFailed":0,
@@ -207,7 +244,7 @@ Execution history contains up to 50 of the most recently completed executions, w
207244

208245
## Next steps
209246

210-
You can now [run the indexer](search-howto-run-reset-indexers.md), [monitor status](search-howto-monitor-indexers.md), or [schedule indexer execution](search-howto-schedule-indexers.md). The following articles apply to indexers that pull content from Azure Storage:
247+
Learn more about how to [run the indexer](search-howto-run-reset-indexers.md), [monitor status](search-howto-monitor-indexers.md), or [schedule indexer execution](search-howto-schedule-indexers.md). The following articles apply to indexers that pull content from Azure Storage:
211248

212-
+ [Index large data sets](search-howto-large-index.md)
213-
+ [Indexer access to content protected by Azure network security features](search-indexer-securing-resources.md)
249+
+ [Tutorial: Index JSON blobs from Azure Storage](search-semi-structured-data.md)
250+
+ [Tutorial: Index encrypted blobs in Azure Storage](search-howto-index-encrypted-blobs.md)

articles/search/search-security-trimming-for-azure-search-with-aad.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ This article covers the following tasks:
2222
> - Cache the new groups
2323
> - Index documents with associated groups
2424
> - Issue a search request with group identifiers filter
25-
>
25+
2626
> [!NOTE]
2727
> Sample code snippets in this article are written in C#. You can find the full source code [on GitHub](https://github.com/Azure-Samples/search-dotnet-getting-started).
2828

articles/search/whats-new.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ Learn about the latest updates to Azure Cognitive Search functionality, docs, an
2222

2323
| Item&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | Type | Description |
2424
|-----------------------------|------|--------------|
25-
| [**ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search (GitHub)**](https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/README.md) | Sample | Python code and a template for combining Cognitive Search with the large language models in OpenAI. For background, see this Tech Community blog post: [Revolutionize your Enterprise Data with ChatGPT](https://techcommunity.microsoft.com/t5/ai-applied-ai-blog/revolutionize-your-enterprise-data-with-chatgpt-next-gen-apps-w/ba-p/3762087). To summarize the key points: <ul><li>Use Cognitive Search to consolidate and index searchable content.</li> <li>Query the index for initial search results.</li> <li>Assemble prompts from those results and send to the gpt-35-turbo (preview) model in Azure OpenAI.</li> <li>Return a summary and provide citations and transparency in your customer-facing app so that users can evaluate the response.</li> </ul>|
25+
| [**ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search (GitHub)**](https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/README.md) | Sample | Python code and a template for combining Cognitive Search with the large language models in OpenAI. For background, see this Tech Community blog post: [Revolutionize your Enterprise Data with ChatGPT](https://techcommunity.microsoft.com/t5/ai-applied-ai-blog/revolutionize-your-enterprise-data-with-chatgpt-next-gen-apps-w/ba-p/3762087). <br><br>Key points: <br><br>Use Cognitive Search to consolidate and index searchable content.</br> <br>Query the index for initial search results.</br> <br>Assemble prompts from those results and send to the gpt-35-turbo (preview) model in Azure OpenAI.</br> <br>Return a cross-document answer and provide citations and transparency in your customer-facing app so that users can assess the response.</br>|
2626

2727
## November 2022
2828
