articles/search/search-howto-index-one-to-many-blobs.md
Lines changed: 24 additions & 0 deletions
@@ -112,6 +112,30 @@ If you do want to set up an explicit field mapping, make sure that the _sourceFi
> [!NOTE]
> The approach `AzureSearch_DocumentKey` uses to ensure uniqueness per extracted entity is subject to change, so you should not rely on its value for your application's needs.

## Specifying the index key field in your data

Assume the same index definition as the previous example, with **parsingMode** set to `jsonLines` and no explicit field mappings, so that the implicit mappings match those in the first example. Suppose your blob container has blobs with the following structure:

_Blob1.json_

```json
{ "id": "1", "temperature": 100, "pressure": 100, "timestamp": "2019-02-13T00:00:00Z" }
{ "id": "2", "temperature": 33, "pressure": 30, "timestamp": "2019-02-14T00:00:00Z" }
```

_Blob2.json_

```json
{ "id": "1", "temperature": 1, "pressure": 1, "timestamp": "2018-01-12T00:00:00Z" }
{ "id": "2", "temperature": 120, "pressure": 3, "timestamp": "2013-05-11T00:00:00Z" }
```

Notice that each document contains the `id` field, which is defined as the `key` field in the index. In this case, a document-unique `AzureSearch_DocumentKey` is still generated, but it isn't used as the key for the document. Instead, the value of the `id` field is mapped to the `key` field.

Similar to the example above, this mapping does _not_ result in four documents showing up in the index, because the `id` field is not unique _across blobs_. In that case, any JSON entry that specifies an existing `id` is merged into the existing document instead of being uploaded as a new document, and the index reflects the most recently read entry for each `id`.
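
The merge behavior described above can be illustrated with a small local simulation. This is only a sketch and does not call the search service: the `index_blobs` helper is hypothetical, the blobs are assumed to hold one JSON object per line (as `jsonLines` parsing expects), and the read order (Blob1, then Blob2) is an assumption made for the example.

```python
import json


def index_blobs(blobs):
    """Simulate merging entities into an index keyed on the `id` field.

    Hypothetical helper for illustration only; it mimics the merge-on-key
    behavior described in the text, not the actual indexer implementation.
    """
    index = {}
    for blob_text in blobs:
        for line in blob_text.splitlines():
            if not line.strip():
                continue  # skip blank lines
            entity = json.loads(line)
            key = str(entity["id"])
            # An entry whose key already exists is merged into that document;
            # otherwise a new document is created.
            index.setdefault(key, {}).update(entity)
    return index


# Sample data mirroring the two blobs (assumed to be read in this order).
blob1 = "\n".join([
    '{"id": "1", "temperature": 100, "pressure": 100, "timestamp": "2019-02-13T00:00:00Z"}',
    '{"id": "2", "temperature": 33, "pressure": 30, "timestamp": "2019-02-14T00:00:00Z"}',
])
blob2 = "\n".join([
    '{"id": "1", "temperature": 1, "pressure": 1, "timestamp": "2018-01-12T00:00:00Z"}',
    '{"id": "2", "temperature": 120, "pressure": 3, "timestamp": "2013-05-11T00:00:00Z"}',
])

index = index_blobs([blob1, blob2])
print(len(index))                 # 2 documents, not 4
print(index["1"]["temperature"])  # 1, the value from the entry read last
```

Four entries are read, but only two documents remain, because entries that share an `id` collapse into one document whose fields come from the entry read last.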
## Next steps
If you aren't already familiar with the basic structure and workflow of blob indexing, you should review [Indexing Azure Blob Storage with Azure Cognitive Search](search-howto-index-json-blobs.md) first. For more information about parsing modes for different blob content types, review the following articles.