articles/search/search-how-to-define-index-projections.md
Lines changed: 12 additions & 6 deletions
@@ -52,15 +52,15 @@ Indexers load indexed data into a predefined index. How you define the schema an

 ## Create an index for one-to-many indexing

-Whether you create one index that combines parent-child fields, or multiple indexes for separating parent-child fields, the primary index used for searching is designed around data chunks. It must have the following fields:
+Whether you create one index for chunks that repeats parent values, or separate indexes for parent-child field placement, the primary index used for searching is designed around data chunks. It must have the following fields:

 - A document key field uniquely identifying each document. It must be defined as type `Edm.String` with the `keyword` analyzer.

 - A field associating each chunk with its parent. It must be of type `Edm.String`. It can't be the document key field, and must have `filterable` set to true. It's referred to as parent_id in the examples and as a [projected key value](#projected-key-value) in this article.

 - Other fields for content, such as text or vectorized chunk fields.

-An index must exist on the search service before you create the skillset or run the indexer
+An index must exist on the search service before you create the skillset or run the indexer.

 ### Single index schema inclusive of parent and child fields
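The field requirements in this hunk can be sketched as a small check. This is a minimal sketch and not part of the commit: the index is written as a plain Python dict that mirrors a JSON index definition, the names `my-chunk-index`, `chunk_id`, and `chunk` are placeholders, and `check_chunk_index` is a hypothetical helper rather than an Azure SDK function.

```python
# Hypothetical chunk index, written as a plain dict that mirrors a JSON index definition.
# All index and field names here are placeholders chosen for illustration.
chunk_index = {
    "name": "my-chunk-index",
    "fields": [
        # Document key: Edm.String with the keyword analyzer.
        {"name": "chunk_id", "type": "Edm.String", "key": True, "analyzer": "keyword"},
        # Parent reference: Edm.String, filterable, and not the document key.
        {"name": "parent_id", "type": "Edm.String", "filterable": True},
        # Content field for the chunk text.
        {"name": "chunk", "type": "Edm.String", "searchable": True},
    ],
}


def check_chunk_index(index, parent_field="parent_id"):
    """Return a list of unmet requirements; an empty list means none were found."""
    problems = []
    fields = index["fields"]

    keys = [f for f in fields if f.get("key")]
    if len(keys) != 1:
        problems.append("expected exactly one document key field")
    else:
        key = keys[0]
        if key.get("type") != "Edm.String":
            problems.append("document key must be Edm.String")
        if key.get("analyzer") != "keyword":
            problems.append("document key must use the keyword analyzer")

    parent = next((f for f in fields if f["name"] == parent_field), None)
    if parent is None:
        problems.append(f"missing parent field '{parent_field}'")
    else:
        if parent.get("key"):
            problems.append("parent field can't be the document key")
        if parent.get("type") != "Edm.String":
            problems.append("parent field must be Edm.String")
        if not parent.get("filterable"):
            problems.append("parent field must be filterable")

    return problems


print(check_chunk_index(chunk_index))
```

The check only covers the rules stated in the bullets above. A real index usually carries more content fields, such as a vector field, which this sketch doesn't validate.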
@@ -142,14 +142,21 @@ This example is similar to the [RAG tutorial](tutorial-rag-build-solution-index-

 Index projections are defined inside a skillset definition and are primarily defined as an array of `selectors`, where each selector corresponds to a different target index on the search service. Each selector requires the following parameters as part of its definition:

-| Parameter | Definition |
+| Index projection parameters | Definition |
+|----------------------------|------------|
+|`selectors`| Parameters for the main search corpus, usually the one designed around chunks. |
+|`projectionMode`| An optional parameter providing instructions to the indexer. The only valid value for this parameter is `skipIndexingParentDocuments`, and it's used when the chunk index is the primary search corpus and you need to specify whether parent fields are indexed as individual search documents within the chunked index. If you don't set `skipIndexingParentDocuments`, you get extra search documents in your index that are null for chunks, but populated with parent fields only. For example, if five documents contribute 100 chunks to the index, then the number of documents in the index is 105. The five documents created or parent fields have nulls for chunk (child) fields, making them substantially different from the bulk of the documents in the index. We recommend `projectionMode` set to `skipIndexingParentDocument`. |
+
+Selectors also have parameters.
+
+| Selector parameters | Definition |
 |-----------|------------|
 |`targetIndexName`| The name of the index into which index data is projected. It's either the single chunked index with repeating parent fields, or it's the child index if you're using [separate indexes](#example-of-separate-parent-child-indexes) for parent-child content. |
 |`parentKeyFieldName`| The name of the field providing the key for the parent document.|
 |`sourceContext`| The enrichment annotation that defines the granularity at which to map data into individual search documents. For more information, see [Skill context and input annotation language](cognitive-search-skill-annotation-language.md). |
 |`mappings`| An array of mappings of enriched data to fields in the search index. Each mapping consists of: <br>`name`: The name of the field in the search index that the data should be indexed into. <br>`source`: The enrichment annotation path that the data should be pulled from. <br><br>Each `mapping` can also recursively define data with an optional `sourceContext` and `inputs` field, similar to the [knowledge store](knowledge-store-concept-intro.md) or [Shaper Skill](cognitive-search-skill-shaper.md). Depending on your application, these parameters allow you to shape data into fields of type `Edm.ComplexType` in the search index. Some LLMs don't accept a complex type in search results, so the LLM you're using determines whether a complex type mapping is helpful or not.|

-You must explicitly map every field in the child index, except for the ID fields such as document key and the parent ID.
+The `mappings` parameter is important. You must explicitly map every field in the child index, except for the ID fields such as document key and the parent ID.

 This requirement is in contrast with other field mapping conventions in Azure AI Search. For some data source types, the indexer can implicitly map fields based on similar names, or known characteristics (for example, blob indexers use the unique metadata storage path as the default document key). However, for indexer projections, you must explicitly specify every field mapping on the "many" side of the relationship.
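The parameters in the two tables, and the rule that every child field needs an explicit mapping, can be illustrated with a short sketch. It isn't part of the commit. The definition is a plain Python dict that mirrors the JSON shape of an `indexProjections` section; the index name, the field names, and the enrichment paths `/document/pages/*` and `/document/title` are assumptions for illustration. Nesting `projectionMode` under a `parameters` object reflects our understanding of the REST payload and should be checked against the API reference. `unmapped_fields` is a hypothetical helper, not an SDK function.

```python
# Hypothetical index projection definition, mirroring the JSON used in a skillset.
# Index name, field names, and enrichment paths are placeholders.
index_projections = {
    "selectors": [
        {
            "targetIndexName": "my-chunk-index",
            "parentKeyFieldName": "parent_id",
            # One search document is produced per item at this context (assumed path).
            "sourceContext": "/document/pages/*",
            "mappings": [
                {"name": "chunk", "source": "/document/pages/*"},
                {"name": "title", "source": "/document/title"},
            ],
        }
    ],
    # Assumed nesting: projectionMode sits under "parameters" in the payload.
    "parameters": {"projectionMode": "skipIndexingParentDocuments"},
}


def unmapped_fields(index_field_names, key_field, selector):
    """List index fields that have no explicit mapping in the selector.

    The document key and the parent key field are exempt, because the
    indexer populates those ID fields itself.
    """
    mapped = {m["name"] for m in selector["mappings"]}
    exempt = {key_field, selector["parentKeyFieldName"]}
    return sorted(set(index_field_names) - mapped - exempt)


selector = index_projections["selectors"][0]
fields_in_index = ["chunk_id", "parent_id", "chunk", "title", "vector"]
print(unmapped_fields(fields_in_index, "chunk_id", selector))
```

In this example the `vector` field is reported as unmapped, which is the kind of gap the explicit-mapping requirement is meant to catch before you run the indexer.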
@@ -226,8 +233,7 @@ For .NET developers, use the [IndexProjections Class](/dotnet/api/azure.search.d

 ---

-> [!TIP]
-> We recommend setting the `skipIndexingParentDocuments` parameter for the consolidated schema scenario. If you don't set parameters for skipping parent document indexing, you get extra search documents in your index that are null for chunks, but populated with parent fields only. For example, if five documents contribute 100 chunks to the index, then the number of documents in the index is 105. The five documents created or parent fields have nulls for child fields, making them substantially different from the bulk of the documents in the index.
+As a best practice, we recommend setting the `skipIndexingParentDocuments` parameter for the consolidated schema scenario. If you don't set parameters for skipping parent document indexing, you get extra search documents in your index that are null for chunks, but populated with parent fields only. For example, if five documents contribute 100 chunks to the index, then the number of documents in the index is 105. The five documents created or parent fields have nulls for child fields, making them substantially different from the bulk of the documents in the index.
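The document count in the example above follows from simple arithmetic, sketched here as a hypothetical helper (`expected_document_count` isn't part of the commit or of any SDK). It assumes each parent document adds exactly one extra search document when parent indexing isn't skipped, as the paragraph describes.

```python
def expected_document_count(parent_docs, chunks, skip_parents):
    """Estimate the number of search documents in the chunk index.

    When parent indexing is skipped, only chunk documents are written.
    Otherwise each parent adds one extra document with null chunk fields.
    """
    return chunks if skip_parents else chunks + parent_docs


# Five source documents that contribute 100 chunks in total.
print(expected_document_count(5, 100, skip_parents=False))  # 105
print(expected_document_count(5, 100, skip_parents=True))   # 100
```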