|
1 | 1 | ---
|
| 2 | +applies_to: |
| 3 | + stack: |
| 4 | + serverless: |
| 5 | +products: |
| 6 | + - id: elasticsearch |
2 | 7 | navigation_title: "Sparse vector"
|
3 | 8 | mapped_pages:
|
4 | 9 | - https://www.elastic.co/guide/en/elasticsearch/reference/current/sparse-vector.html
|
@@ -91,24 +96,81 @@ GET my-index-000001/_search
|
91 | 96 | }
|
92 | 97 | ```
|
93 | 98 |
|
94 |
| -::::{note} |
95 |
| -`sparse_vector` fields can not be included in indices that were **created** on {{es}} versions between 8.0 and 8.10 |
96 |
| -:::: |
| 99 | +## Updating `sparse_vector` fields |
97 | 100 |
|
| 101 | +When using the [Update API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update) with the `doc` parameter, `sparse_vector` fields are treated as objects: the supplied tokens are **merged** into the existing value rather than replacing it. This means: |
98 | 102 |
|
99 |
| -::::{note} |
100 |
| -`sparse_vector` fields only support strictly positive values. Negative values will be rejected. |
101 |
| -:::: |
| 103 | +- Existing tokens in the sparse vector are preserved |
| 104 | +- New tokens are added |
| 105 | +- Tokens present in both the existing and new data will have their values updated |
102 | 106 |
|
| 107 | +This differs from fields that hold arrays of primitive values (for example, a `keyword` field with multiple values), which are replaced entirely during updates. |
103 | 108 |
|
104 |
| -::::{note} |
105 |
| -`sparse_vector` fields do not support [analyzers](docs-content://manage-data/data-store/text-analysis.md), querying, sorting or aggregating. They may only be used within specialized queries. The recommended query to use on these fields are [`sparse_vector`](/reference/query-languages/query-dsl/query-dsl-sparse-vector-query.md) queries. They may also be used within legacy [`text_expansion`](/reference/query-languages/query-dsl/query-dsl-text-expansion-query.md) queries. |
106 |
| -:::: |
| 109 | +### Example of merging behavior |
107 | 110 |
|
| 111 | +Original document: |
108 | 112 |
|
109 |
| -::::{note} |
110 |
| -`sparse_vector` fields only preserve 9 significant bits for the precision, which translates to a relative error of about 0.4%. |
111 |
| -:::: |
| 113 | +```console |
| 114 | +PUT /my-index/_doc/1 |
| 115 | +{ |
| 116 | + "my_vector": { |
| 117 | + "token_a": 0.5, |
| 118 | + "token_b": 0.8 |
| 119 | + } |
| 120 | +} |
| 121 | +``` |
| 122 | + |
| 123 | +Partial update: |
| 124 | + |
| 125 | +```console |
| 126 | +POST /my-index/_update/1 |
| 127 | +{ |
| 128 | + "doc": { |
| 129 | + "my_vector": { |
| 130 | + "token_c": 0.3 |
| 131 | + } |
| 132 | + } |
| 133 | +} |
| 134 | +``` |
| 135 | + |
| 136 | +Observe that tokens are merged, not replaced: |
| 137 | + |
| 138 | +```json |
| 139 | +{ |
| 140 | + "my_vector": { |
| 141 | + "token_a": 0.5, |
| 142 | + "token_b": 0.8, |
| 143 | + "token_c": 0.3 |
| 144 | + } |
| 145 | +} |
| 146 | +``` |
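
The merge rules listed above can be modeled outside {{es}} with a few lines of Python. This is only a minimal sketch of the documented behavior, not the server's implementation: the function name `merge_partial_doc` and the `tags` field are hypothetical and used purely for illustration. It also covers the case the console example leaves out, where a token (`token_a`) is present in both the existing document and the partial update.

```python
from typing import Any, Dict


def merge_partial_doc(source: Dict[str, Any], partial: Dict[str, Any]) -> Dict[str, Any]:
    """Model of a partial `doc` update: objects merge key by key, everything else is replaced."""
    merged = dict(source)  # shallow copy so the caller's document is not mutated
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Object-like values (such as a sparse vector) are merged recursively.
            merged[key] = merge_partial_doc(merged[key], value)
        else:
            # Scalars and arrays are overwritten by the new value.
            merged[key] = value
    return merged


# `tags` is a hypothetical multi-valued keyword field, added to show the contrast.
existing = {"my_vector": {"token_a": 0.5, "token_b": 0.8}, "tags": ["red", "blue"]}
partial = {"my_vector": {"token_a": 0.9, "token_c": 0.3}, "tags": ["green"]}

result = merge_partial_doc(existing, partial)
print(result)
```

In this model `token_b` survives untouched, `token_a` takes the new weight, `token_c` is added, and the `tags` array is replaced as a whole.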
| 147 | + |
| 148 | +### Replacing the entire `sparse_vector` field |
| 149 | + |
| 150 | +To replace the entire contents of a `sparse_vector` field, use a [script](docs-content://explore-analyze/scripting/modules-scripting-using.md) in your update request: |
| 151 | + |
| 152 | +```console |
| 153 | +POST /my-index/_update/1 |
| 154 | +{ |
| 155 | + "script": { |
| 156 | + "source": "ctx._source.my_vector = params.new_vector", |
| 157 | + "params": { |
| 158 | + "new_vector": { |
| 159 | + "token_x": 1.0, |
| 160 | + "token_y": 0.6 |
| 161 | + } |
| 162 | + } |
| 163 | + } |
| 164 | +} |
| 165 | +``` |
112 | 166 |
|
| 167 | +:::{note} |
| 168 | +This same merging behavior also applies to [`rank_features` fields](/reference/elasticsearch/mapping-reference/rank-features.md), because they are also object-like structures. |
| 169 | +::: |
113 | 170 |
|
| 171 | +## Important notes and limitations |
114 | 172 |
|
| 173 | +- `sparse_vector` fields cannot be included in indices that were **created** on {{es}} versions between 8.0 and 8.10. |
| 174 | +- `sparse_vector` fields only support strictly positive values. Negative values will be rejected. |
| 175 | +- `sparse_vector` fields do not support [analyzers](docs-content://manage-data/data-store/text-analysis.md), querying, sorting or aggregating. They may only be used within specialized queries. The recommended query to use on these fields is the [`sparse_vector`](/reference/query-languages/query-dsl/query-dsl-sparse-vector-query.md) query. They may also be used within legacy [`text_expansion`](/reference/query-languages/query-dsl/query-dsl-text-expansion-query.md) queries. |
| 176 | +- `sparse_vector` fields only preserve 9 significant bits for the precision, which translates to a relative error of about 0.4%. |
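
To get a feel for the precision limit in the last item, the sketch below truncates a 32-bit float to 9 significant bits and measures the relative error. It rests on an assumption that is not stated in the text above: that the 9 bits are the implicit leading bit plus the top 8 explicit mantissa bits, with the remaining bits dropped. The helper name `truncate_to_9_significant_bits` is hypothetical, and the snippet is an illustrative model rather than a description of the actual on-disk encoding.

```python
import struct


def truncate_to_9_significant_bits(value: float) -> float:
    """Keep the exponent and the top 8 mantissa bits of a positive float32, zeroing the rest."""
    if value <= 0:
        raise ValueError("only strictly positive values are modeled here")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    # A float32 has 23 explicit mantissa bits; clearing the low 15 leaves 8.
    truncated = (bits >> 15) << 15
    return struct.unpack(">f", struct.pack(">I", truncated))[0]


weights = [0.3, 0.5, 0.6, 0.8, 1.0]
for weight in weights:
    approx = truncate_to_9_significant_bits(weight)
    relative_error = (weight - approx) / weight
    print(f"{weight} -> {approx} (relative error {relative_error:.4%})")
```

Under this assumption the relative error stays below 2^-8 (roughly 0.39%), which is in line with the "about 0.4%" figure quoted above.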