articles/search/search-howto-index-one-to-many-blobs.md
24 additions & 0 deletions
@@ -112,6 +112,30 @@ If you do want to set up an explicit field mapping, make sure that the _sourceFi
> [!NOTE]
> The approach used by `AzureSearch_DocumentKey` to ensure uniqueness per extracted entity is subject to change, so you should not rely on its value for your application's needs.

## Specifying the index key field in your data

Assume the same index definition as the previous example, with **parsingMode** set to `jsonLines` and no explicit field mappings, so that the mappings look like those in the first example. Suppose your blob container has blobs with the following structure:

_Blob1.json_

```json
id, temperature, pressure, timestamp
1, 100, 100,"2019-02-13T00:00:00Z"
2, 33, 30,"2019-02-14T00:00:00Z"
```

_Blob2.json_

```json
id, temperature, pressure, timestamp
1, 1, 1,"2018-01-12T00:00:00Z"
2, 120, 3,"2013-05-11T00:00:00Z"
```

Notice that `id` is the target field for the source `AzureSearch_DocumentKey`, but an `id` field is also present in the blobs themselves. In this case, the key taken from each JSON line is used as the document key instead of the unique identifier that would otherwise be generated for `AzureSearch_DocumentKey`.

As in the example above, this mapping does _not_ result in four documents in the index, because the `id` field is not unique _across blobs_. When this is the case, any JSON entry whose `id` already exists in the index is merged into the existing document instead of being uploaded as a new document, and the index reflects the most recently read entry with that `id`.
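
The merge behavior described here can be illustrated with a small in-memory simulation. This is only a sketch, not the service's implementation: the `apply_entries` helper and the dictionary standing in for the index are hypothetical, the entries are written as already-parsed records with the values from the two blobs, and the read order (Blob1 first, then Blob2) is an assumption.

```python
def apply_entries(entries, index=None):
    """Merge parsed entries into an in-memory index keyed by the `id` field.

    Hypothetical helper for illustration only; it mimics "merge if the key
    exists, otherwise create a new document".
    """
    index = {} if index is None else index
    for entry in entries:
        key = entry["id"]
        # An existing document with this key is updated in place;
        # an unseen key creates a new document.
        index.setdefault(key, {}).update(entry)
    return index


# Values copied from Blob1.json and Blob2.json, as parsed records.
blob1 = [
    {"id": "1", "temperature": 100, "pressure": 100, "timestamp": "2019-02-13T00:00:00Z"},
    {"id": "2", "temperature": 33, "pressure": 30, "timestamp": "2019-02-14T00:00:00Z"},
]
blob2 = [
    {"id": "1", "temperature": 1, "pressure": 1, "timestamp": "2018-01-12T00:00:00Z"},
    {"id": "2", "temperature": 120, "pressure": 3, "timestamp": "2013-05-11T00:00:00Z"},
]

# Assumed read order: Blob1, then Blob2.
index = apply_entries(blob1)
index = apply_entries(blob2, index)

print(len(index))                 # 2 documents, not 4
print(index["1"]["temperature"])  # 1, from the entry read last
```

Under these assumptions the index ends up with two documents whose values come from Blob2, because each of its entries overwrites the document with the same `id` from Blob1. If the blobs were read in the opposite order, the values from Blob1 would win instead.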

## Next steps

If you aren't already familiar with the basic structure and workflow of blob indexing, you should review [Indexing Azure Blob Storage with Azure Cognitive Search](search-howto-index-json-blobs.md) first. For more information about parsing modes for different blob content types, review the following articles.