Commit 513fd0f

committed
Fixed links and H2s
1 parent 50e42f1 commit 513fd0f

3 files changed: +84 -23 lines changed


articles/search/search-how-to-create-search-index.md

Lines changed: 3 additions & 3 deletions
@@ -28,7 +28,7 @@ In this article, learn the steps for defining a schema for the index and pushing
2828

2929
+ A stable index location. Moving an existing index to a different search service isn't supported out-of-the-box. Revisit application requirements and make sure that the capacity and location of your existing search service are sufficient for your needs.
3030

31-
+ Finally, all service tiers have [index limits](search-limits-quotas-capacity.md#index-limits) on the number of objects that you can create. For example, if you're experimenting on the Free tier, you can only have three indexes at any given time. Within the index itself, there are [limits on vectors](search-limits-quotas-capacity.md#vector-index-size-limits) and [index limits](search-limits-quotas-capacity#index-limits) on the number of simple and complex fields.
31+
+ Finally, all service tiers have [index limits](search-limits-quotas-capacity.md#index-limits) on the number of objects that you can create. For example, if you're experimenting on the Free tier, you can only have three indexes at any given time. Within the index itself, there are [limits on vectors](search-limits-quotas-capacity.md#vector-index-size-limits) and [index limits](search-limits-quotas-capacity.md#index-limits) on the number of simple and complex fields.
3232

3333
## Document keys
3434

@@ -62,7 +62,7 @@ Use this checklist to assist the design decisions for your search index.
6262

6363
+ Filterable fields are returned in arbitrary order, so consider making them sortable as well.
6464

65-
1. For vector fields, specify a vector search configuration and the algorithms used for creating navigation paths and filling the embedding space. For more information, see [Add vector fields](vector-search-how-to-create.md).
65+
1. For vector fields, specify a vector search configuration and the algorithms used for creating navigation paths and filling the embedding space. For more information, see [Add vector fields](vector-search-how-to-create-index.md).
6666

6767
Vector fields have extra properties that nonvector fields don't have, such as which algorithms to use and vector compression.
6868

@@ -229,6 +229,6 @@ To minimize churn in the design process, the following table describes which ele
229229
Use the following links to become familiar with loading an index with data, or extending an index with a synonyms map.
230230

231231
+ [Data import overview](search-what-is-data-import.md)
232-
+ [Add vector fields](vector-search-how-to-create.md)
232+
+ [Add vector fields](vector-search-how-to-create-index.md)
233233
+ [Load documents](search-how-to-load-search-index.md)
234234
+ [Synonym maps](search-synonyms.md)

articles/search/search-how-to-load-search-index.md

Lines changed: 61 additions & 12 deletions
@@ -14,21 +14,25 @@ ms.date: 07/01/2024
1414

1515
# Load data into a search index in Azure AI Search
1616

17-
This article explains how to import, refresh, and manage content in a predefined search index. In Azure AI Search, a [search index is created first](search-how-to-create-search-index.md) with [data import](search-what-is-data-import.md) following as a second step. The exception is [Import wizards](search-import-data-portal.md) in the portal and indexer pipelines, which create and load an index in one workflow.
17+
This article explains how to import documents into a predefined search index. In Azure AI Search, a [search index is created first](search-how-to-create-search-index.md) with [data import](search-what-is-data-import.md) following as a second step. The exception is [Import wizards](search-import-data-portal.md) in the portal and indexer pipelines, which create and load an index in one workflow.
1818

19-
A search service imports and indexes plain text and vectors in JSON, used in full text search, vector search, hybrid search, and knowledge mining scenarios. Plain text content is obtainable from alphanumeric fields in the external data source, metadata that's useful in search scenarios, or enriched content created by a [skillset](cognitive-search-working-with-skillsets.md) (skills can extract or infer textual descriptions from images and unstructured content). Vector content is vectorized using an [external embedding model](vector-search-how-to-generate-embeddings.md) or [integrated vectorization (preview)](vector-search-integrated-vectorization.md) using Azure AI Search features that integrate with applied AI.
19+
## How data import works
2020

21-
Once data is indexed, the physical data structures of the index are locked in. For guidance on what can and can't be changed, see [Update and rebuild an index](search-howto-reindex.md).
22-
23-
Indexing isn't a background process. A search service will balance indexing and query workloads, but if [query latency is too high](search-performance-analysis.md#impact-of-indexing-on-queries), you can either [add capacity](search-capacity-planning.md#adjust-capacity) or identify periods of low query activity for loading an index.
21+
A search service accepts JSON documents that conform to the index schema. It imports and indexes plain text and vectors from that JSON for use in full text search, vector search, hybrid search, and knowledge mining scenarios.
2422

25-
## Load documents
23+
+ Plain text content is obtainable from alphanumeric fields in the external data source, metadata that's useful in search scenarios, or enriched content created by a [skillset](cognitive-search-working-with-skillsets.md) (skills can extract or infer textual descriptions from images and unstructured content).
2624

27-
A search service accepts JSON documents that conform to the index schema.
25+
+ Vector content is vectorized using an [external embedding model](vector-search-how-to-generate-embeddings.md) or [integrated vectorization (preview)](vector-search-integrated-vectorization.md) using Azure AI Search features that integrate with applied AI.
2826

2927
You can prepare these documents yourself, but if content resides in a [supported data source](search-indexer-overview.md#supported-data-sources), running an [indexer](search-indexer-overview.md) or using an Import wizard can automate document retrieval, JSON serialization, and indexing.
3028

31-
### [**Azure portal**](#tab/portal)
29+
Once data is indexed, the physical data structures of the index are locked in. For guidance on what can and can't be changed, see [Update and rebuild an index](search-howto-reindex.md).
30+
31+
Indexing isn't a background process. A search service will balance indexing and query workloads, but if [query latency is too high](search-performance-analysis.md#impact-of-indexing-on-queries), you can either [add capacity](search-capacity-planning.md#adjust-capacity) or identify periods of low query activity for loading an index.
32+
33+
For more information, see [Data import strategies](search-what-is-data-import.md).
34+
35+
## Load documents using the Azure portal
3236

3337
In the Azure portal, use the Import wizards to create and load indexes in a seamless workflow. If you want to load an existing index, choose an alternative approach.
3438

@@ -40,7 +44,7 @@ In the Azure portal, use the Import wizards to create and load indexes in a seam
4044

4145
If indexers are already defined, you can [reset and run an indexer](search-howto-run-reset-indexers.md) from the Azure portal, which is useful if you're adding fields incrementally. Reset forces the indexer to start over, picking up all fields from all source documents.
4246

43-
### [**REST**](#tab/import-rest)
47+
## Load documents using the REST APIs
4448

4549
[Documents - Index (REST)](/rest/api/searchservice/documents) is the means by which you can import data into a search index through the REST APIs. The `@search.action` parameter determines whether documents are added in full, or partially in terms of new or replacement values for specific fields.
4650

@@ -82,11 +86,15 @@ If indexers are already defined, you can [reset and run an indexer](search-howto
8286
8387
When the document key or ID is new, **null** becomes the value for any field that is unspecified in the document. For actions on an existing document, updated values replace the previous values. Any fields that weren't specified in a "merge" or "mergeUpload" are left intact in the search index.
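As a rough illustration of the request body described above, the following Python sketch assembles a batch for Documents - Index, where each document carries an `@search.action` value. The field names (`HotelId`, `HotelName`, `City`) and the helper `build_index_batch` are hypothetical; the sketch only builds the JSON and doesn't send it to a search service.

```python
import json

# Actions named in the article for the @search.action parameter.
ALLOWED_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}

def build_index_batch(actions):
    """Build the JSON body for a Documents - Index request.

    `actions` is a list of (action, document) pairs. Each document must
    include its key field. All field names here are hypothetical.
    """
    value = []
    for action, doc in actions:
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown @search.action: {action}")
        value.append({"@search.action": action, **doc})
    return json.dumps({"value": value})

body = build_index_batch([
    ("upload", {"HotelId": "1", "HotelName": "Sample Hotel", "City": "Seattle"}),
    ("merge", {"HotelId": "2", "City": "Portland"}),  # only City is replaced
    ("delete", {"HotelId": "3"}),                     # the key alone is enough
])
print(body)
```

The `merge` action lists only the field being changed, so the other fields of that document are left intact.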
8488
85-
### [**.NET SDK (C#)**](#tab/importcsharp)
89+
## Load documents using the Azure SDKs
8690
87-
Azure AI Search supports the following APIs for simple and bulk document uploads into an index:
91+
Programmability is provided in the following Azure SDKs.
8892
89-
+ [IndexDocumentsAsync (Azure SDK for .NET)](/dotnet/api/azure.search.documents.searchclient.indexdocumentsasync)
93+
### [**.NET**](#tab/sdk-dotnet)
94+
95+
The Azure SDK for .NET provides the following APIs for simple and bulk document uploads into an index:
96+
97+
+ [IndexDocumentsAsync](/dotnet/api/azure.search.documents.searchclient.indexdocumentsasync)
9098
+ [SearchIndexingBufferedSender](/dotnet/api/azure.search.documents.searchindexingbufferedsender-1)
9199
92100
There are several samples that illustrate indexing in context of simple and large-scale indexing:
@@ -97,6 +105,47 @@ There are several samples that illustrate indexing in context of simple and larg
97105
98106
+ [**Tutorial: Index any data**](tutorial-optimize-indexing-push-api.md) couples batch indexing with testing strategies for determining an optimum size.
99107
108+
+ Be sure to check the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) repo for code examples showing how to index vector fields.
109+
110+
### [**Python**](#tab/sdk-python)
111+
112+
The Azure SDK for Python provides the following APIs for simple and bulk document uploads into an index:
113+
114+
+ [IndexDocumentsBatch](/python/api/azure-search-documents/azure.search.documents.indexdocumentsbatch)
115+
+ [SearchIndexingBufferedSender](/python/api/azure-search-documents/azure.search.documents.searchindexingbufferedsender)
116+
117+
Code samples include:
118+
119+
+ [sample_crud_operations.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/samples/sample_crud_operations.py)
120+
121+
+ Be sure to check the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) repo for code examples showing how to index vector fields.
122+
123+
### [**JavaScript**](#tab/sdk-javascript)
124+
125+
The Azure SDK for JavaScript/TypeScript provides the following APIs for simple and bulk document uploads into an index:
126+
127+
+ [IndexDocumentsBatch](/javascript/api/%40azure/search-documents/indexdocumentsbatch)
128+
+ [SearchIndexingBufferedSender](/javascript/api/%40azure/search-documents/searchindexingbufferedsender)
129+
130+
Code samples include:
131+
132+
+ See this quickstart for basic steps: [Quickstart: Full text search using the Azure SDKs](search-get-started-text.md?tabs=javascript)
133+
134+
+ Be sure to check the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) repo for code examples showing how to index vector fields.
135+
136+
### [**Java**](#tab/sdk-java)
137+
138+
The Azure SDK for Java provides the following APIs for simple and bulk document uploads into an index:
139+
140+
+ [indexactiontype enumerator](/java/api/com.azure.search.documents.models.indexactiontype?view=azure-java-stable)
141+
+ [SearchIndexingBufferedSender](/java/api/com.azure.search.documents.searchclientbuilder.searchindexingbufferedsenderbuilder)
142+
143+
Code samples include:
144+
145+
+ [IndexContentManagementExample.java](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/search/azure-search-documents/src/samples/java/com/azure/search/documents/IndexContentManagementExample.java)
146+
147+
+ Be sure to check the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) repo for code examples showing how to index vector fields.
148+
100149
---
101150
102151
Internally during indexing, each vector field is populated with embeddings in an internal vector index, and each nonvector field's inverted index is populated with all of the unique, tokenized words from each document. Each field is associated with a document key that determines the logical structure of the document. For example, when indexing a hotels data set, an inverted index created for a City field might contain terms for Seattle, Portland, and so forth. Documents that include Seattle or Portland in the City field would have their document ID listed alongside the term. On any [Documents - Index](/rest/api/searchservice/documents) operation, the terms and document ID list are updated accordingly. For more information about inverted indexes, see [Full text search in Azure AI Search](search-lucene-query-architecture.md).
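To make the inverted index idea concrete, here is a toy Python sketch that maps each term in a `City` field to the set of document IDs that contain it. It assumes lowercase whitespace tokenization, which is much simpler than the analyzers a search service applies, and the hotel documents are invented for illustration.

```python
from collections import defaultdict

def build_inverted_index(docs, key_field, field):
    """Map each term in `field` to the set of document keys containing it."""
    inverted = defaultdict(set)
    for doc in docs:
        text = doc.get(field) or ""  # treat missing or null fields as empty
        for term in str(text).lower().split():
            inverted[term].add(doc[key_field])
    return inverted

# Hypothetical hotels data set.
hotels = [
    {"HotelId": "1", "City": "Seattle"},
    {"HotelId": "2", "City": "Portland"},
    {"HotelId": "3", "City": "Seattle"},
]

city_index = build_inverted_index(hotels, "HotelId", "City")
print(sorted(city_index["seattle"]))   # ['1', '3']
print(sorted(city_index["portland"]))  # ['2']
```

Indexing another document would add its ID to the entries for each of its terms, which mirrors how the term and document ID lists are updated on each indexing operation.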

articles/search/search-howto-reindex.md

Lines changed: 20 additions & 8 deletions
@@ -14,32 +14,44 @@ ms.date: 07/01/2024
1414

1515
# Update or rebuild an index in Azure AI Search
1616

17-
This article explains how to update an existing index in Azure AI Search. It explains the circumstances under which rebuilds are required, and provides recommendations for mitigating the effects of rebuilds on ongoing query requests. If you have to rebuild frequently, we recommend using [index aliases](search-how-to-alias.md) to make it easier to swap which index your application is pointing to.
17+
This article explains how to update an existing index in Azure AI Search with incremental indexing. It explains the circumstances under which rebuilds are required, and provides recommendations for mitigating the effects of rebuilds on ongoing query requests.
1818

1919
During active development, it's common to drop and rebuild indexes when you're iterating over index design. Most developers work with a small representative sample of their data so that reindexing goes faster.
2020

2121
For applications already in production, we recommend creating a new index that runs side by side with an existing index to avoid query downtime, and using an [index alias](search-how-to-alias.md) to avoid changing your application code.
2222

2323
## Update content
2424

25-
Incremental indexing and synchronizing an index against changes in source data is a basic requirement in search scenarios. This section explains the workflow for overwriting field contents in a search index.
25+
Incremental indexing and synchronizing an index against changes in source data is a basic requirement for most search applications. This section explains the workflow for overwriting field contents in a search index.
2626

27-
1. Use the same techniques for loading documents: [Documents - Index (REST)](/rest/api/searchservice/documents) or an equivalent API in the Azure SDKs. For more information, see [Load documents](search-how-to-load-search-index.md).
27+
1. Use the same techniques for loading documents: [Documents - Index (REST)](/rest/api/searchservice/documents) or an equivalent API in the Azure SDKs. For more information about indexing, see [Load documents](search-how-to-load-search-index.md).
2828

2929
1. Set the `@search.action` parameter to determine the effect on existing documents:
3030

31-
+ `delete` removes the entire document from the index. If you want to remove an individual field, use `merge` instead, setting the field in question to null. Deleted documents don't immediately free up space in the index. Every few minutes, a background process performs the physical deletion. Whether you use the portal or an API to return index statistics, you can expect a small delay before the deletion is reflected in the portal and through APIs.
32-
+ `merge` updates a document that already exists, and fails a document that can't be found. Merge replaces existing values. For this reason, be sure to check for collection fields that contain multiple values, such as fields of type `Collection(Edm.String)`. For example, if a `tags` field starts with a value of `["budget"]` and you execute a merge with `["economy", "pool"]`, the final value of the `tags` field is `["economy", "pool"]`. It won't be `["budget", "economy", "pool"]`.
33-
+ `mergeOrUpload` behaves like `merge` if the document exists, and `upload` if the document is new.
34-
+ `upload`, similar to an "upsert" where the document is inserted if it's new, and updated or replaced if it exists. If the document is missing values that the index requires, the document field's value is set to null.
31+
| Action | Effect |
32+
|--------|--------|
33+
| `delete` | Removes the entire document from the index. If you want to remove an individual field, use `merge` instead, setting the field in question to null. Deleted documents and fields don't immediately free up space in the index. Every few minutes, a background process performs the physical deletion. Whether you use the portal or an API to return index statistics, you can expect a small delay before the deletion is reflected in the portal and through APIs. |
34+
| `merge` | Updates a document that already exists, and fails a document that can't be found. Merge replaces existing values. For this reason, be sure to check for collection fields that contain multiple values, such as fields of type `Collection(Edm.String)`. For example, if a `tags` field starts with a value of `["budget"]` and you execute a merge with `["economy", "pool"]`, the final value of the `tags` field is `["economy", "pool"]`. It won't be `["budget", "economy", "pool"]`. |
35+
| `mergeOrUpload` | Behaves like `merge` if the document exists, and `upload` if the document is new. This is the most common action for incremental updates. |
36+
| `upload` | Similar to an "upsert" where the document is inserted if it's new, and updated or replaced if it exists. If the document is missing values that the index requires, the document field's value is set to null. |
3537

3638
1. Post the update.
3739

3840
Queries continue to run, but if you're updating or removing existing fields, you can expect mixed results and a higher incidence of throttling.
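The effect of each action can be modeled locally. The following Python sketch uses a plain dictionary as a stand-in for an index and applies the four actions to it. It's an illustration of the semantics described above, not the service's implementation, and it doesn't model setting unspecified fields to null on upload. The `apply_action` helper and the field names are hypothetical.

```python
def apply_action(index, key_field, action_doc):
    """Apply one action to a dict that stands in for a search index."""
    doc = dict(action_doc)
    action = doc.pop("@search.action")
    key = doc[key_field]
    exists = key in index

    if action == "delete":
        index.pop(key, None)
    elif action == "upload":
        index[key] = doc            # insert, or replace the whole document
    elif action == "merge":
        if not exists:
            raise KeyError(f"merge failed: document {key!r} not found")
        index[key].update(doc)      # supplied fields replace stored values
    elif action == "mergeOrUpload":
        if exists:
            index[key].update(doc)
        else:
            index[key] = doc
    else:
        raise ValueError(f"unknown @search.action: {action}")
    return index

index = {"1": {"HotelId": "1", "tags": ["budget"], "City": "Seattle"}}
apply_action(index, "HotelId",
             {"@search.action": "merge", "HotelId": "1", "tags": ["economy", "pool"]})

print(index["1"]["tags"])  # ['economy', 'pool']: the collection is replaced, not appended
print(index["1"]["City"])  # Seattle: unspecified fields are left intact
```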
3941

42+
## Tips for incremental indexing
43+
44+
+ Use `mergeOrUpload` as the search action.
45+
46+
+ The payload must include the keys or identifiers of every document you want to add, update, or delete.
47+
48+
+ For merging, avoid listing fields that contain content you want to preserve. For example, if you populated vector fields, but only need to update a few nonvector fields, the payload should list just those fields you want to update. Specifying an empty field overwrites the existing value with a null value.
49+
50+
+ [Indexers](search-indexer-overview.md) are designed for incremental indexing. If you can use an indexer, and if the data source supports change tracking, you can run the indexer on a recurring schedule to add, update, and delete documents in an index so that it stays synchronized with your external data.
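One way to follow these tips is to compute which fields changed and list only those in the payload. The Python sketch below compares old and new versions of each document and emits a `mergeOrUpload` action containing the key plus the changed fields. The helper names and the `Rating`, `City`, and `DescriptionVector` fields are assumptions for illustration; the sketch only builds the batch and doesn't post it.

```python
import json

def changed_fields(old, new, key_field):
    """Return only the non-key fields whose values differ between versions."""
    return {k: v for k, v in new.items() if k != key_field and old.get(k) != v}

def build_merge_batch(key_field, old_docs, new_docs):
    """Build a batch with one mergeOrUpload action per changed document.

    `old_docs` and `new_docs` are dicts keyed by document key.
    """
    actions = []
    for key, new in new_docs.items():
        delta = changed_fields(old_docs.get(key, {}), new, key_field)
        if delta:
            actions.append({"@search.action": "mergeOrUpload", key_field: key, **delta})
    return {"value": actions}

# The stored document has a vector field that the update source doesn't include.
old = {"1": {"HotelId": "1", "Rating": 4.0, "City": "Seattle",
             "DescriptionVector": [0.1, 0.2, 0.3]}}
new = {"1": {"HotelId": "1", "Rating": 4.5, "City": "Seattle"}}

batch = build_merge_batch("HotelId", old, new)
print(json.dumps(batch))
```

Because `DescriptionVector` isn't listed in the action, a merge leaves the existing embedding in place. Only `Rating` is overwritten.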
51+
4052
## Change an index schema
4153

42-
The index schema defines the physical data structures created on the search service, so there aren't many schema changes that you can make without incurring a full rebuild. The following list enumerates the schema changes that can be introduced seamlessly into an existing index. The list includes new fields and functionality used during query executions.
54+
The index schema defines the physical data structures created on the search service, so there aren't many schema changes that you can make without incurring a full rebuild. The following list enumerates the schema changes that can be introduced seamlessly into an existing index. Generally, the list includes new fields and functionality used during query executions.
4355

4456
+ Add a new field
4557
+ Set the **retrievable** attribute on an existing field
