Commit ee49073

Merge branch 'main' into indexLike_final
2 parents 081b473 + baf4f2f commit ee49073

28 files changed, +665 -79 lines changed

docs/changelog/130705.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 130705
+summary: Fix `BytesRef2BlockHash`
+area: ES|QL
+type: bug
+issues: []

docs/changelog/130834.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 130834
+summary: Ensure vectors are always included in reindex actions
+area: Vector Search
+type: enhancement
+issues: []

docs/reference/elasticsearch/mapping-reference/sparse-vector.md

Lines changed: 1 addition & 1 deletion
@@ -83,7 +83,7 @@ Parameters for `index_options` are:
 : (Optional, float) Tokens whose weight is less than `tokens_weight_threshold` are considered insignificant and pruned. This value must be between 0 and 1. Default: `0.4`.
 
 ::::{note}
-The default values for `tokens_freq_ratio_threshold` and `tokens_weight_threshold` were chosen based on tests using ELSERv2 that provided the most optimal results.
+The default values for `tokens_freq_ratio_threshold` and `tokens_weight_threshold` were chosen based on tests using ELSERv2 that provided the optimal results.
 ::::
 
 When token pruning is applied, non-significant tokens will be pruned from the query.

docs/reference/elasticsearch/rest-apis/retrievers.md

Lines changed: 2 additions & 2 deletions
@@ -1200,8 +1200,8 @@ Note, however, that wildcard field patterns will only resolve to fields that eit
 
 ### Examples
 
-<!-- - [RRF with the multi-field query format](docs-content://solutions/search/retrievers-examples.md#retrievers-examples-rrf-multi-field-query-format) -->
-<!-- - [Linear retriever with the multi-field query format](docs-content://solutions/search/retrievers-examples.md#retrievers-examples-linear-multi-field-query-format) -->
+- [RRF with the multi-field query format](docs-content://solutions/search/retrievers-examples.md#retrievers-examples-rrf-multi-field-query-format)
+- [Linear retriever with the multi-field query format](docs-content://solutions/search/retrievers-examples.md#retrievers-examples-linear-multi-field-query-format)
 
 ## Common usage guidelines [retriever-common-parameters]
 

docs/reference/query-languages/query-dsl/query-dsl-sparse-vector-query.md

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ GET _search
 : (Optional, boolean) If `true` we only input pruned tokens into scoring, and discard non-pruned tokens. It is strongly recommended to set this to `false` for the main query, but this can be set to `true` for a rescore query to get more relevant results. Default: `false`.
 
 ::::{note}
-The default values for `tokens_freq_ratio_threshold` and `tokens_weight_threshold` were chosen based on tests using ELSERv2 that provided the most optimal results.
+The default values for `tokens_freq_ratio_threshold` and `tokens_weight_threshold` were chosen based on tests using ELSERv2 that provided the optimal results.
 ::::
 
 When token pruning is applied, non-significant tokens will be pruned from the query.

docs/reference/query-languages/query-dsl/query-dsl-text-expansion-query.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ GET _search
 : (Optional, boolean) [preview] If `true` we only input pruned tokens into scoring, and discard non-pruned tokens. It is strongly recommended to set this to `false` for the main query, but this can be set to `true` for a rescore query to get more relevant results. Default: `false`.
 
 ::::{note}
-The default values for `tokens_freq_ratio_threshold` and `tokens_weight_threshold` were chosen based on tests using ELSER that provided the most optimal results.
+The default values for `tokens_freq_ratio_threshold` and `tokens_weight_threshold` were chosen based on tests using ELSER that provided the optimal results.
 ::::
 
 

docs/reference/query-languages/query-dsl/query-dsl-weighted-tokens-query.md

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ POST _search
 : (Optional, boolean) If `true` we only input pruned tokens into scoring, and discard non-pruned tokens. It is strongly recommended to set this to `false` for the main query, but this can be set to `true` for a rescore query to get more relevant results. Default: `false`.
 
 ::::{note}
-The default values for `tokens_freq_ratio_threshold` and `tokens_weight_threshold` were chosen based on tests using ELSER that provided the most optimal results.
+The default values for `tokens_freq_ratio_threshold` and `tokens_weight_threshold` were chosen based on tests using ELSER that provided the optimal results.
 ::::
 
 

modules/mapper-extras/src/main/java/org/elasticsearch/index/mapper/extras/MatchOnlyTextFieldMapper.java

Lines changed: 13 additions & 10 deletions
@@ -14,7 +14,6 @@
 import org.apache.lucene.document.Field;
 import org.apache.lucene.document.FieldType;
 import org.apache.lucene.document.StoredField;
-import org.apache.lucene.index.DocValues;
 import org.apache.lucene.index.IndexOptions;
 import org.apache.lucene.index.LeafReaderContext;
 import org.apache.lucene.index.Term;
@@ -48,6 +47,7 @@
 import org.elasticsearch.index.mapper.BlockStoredFieldsReader;
 import org.elasticsearch.index.mapper.DocumentParserContext;
 import org.elasticsearch.index.mapper.FieldMapper;
+import org.elasticsearch.index.mapper.MappedFieldType;
 import org.elasticsearch.index.mapper.MapperBuilderContext;
 import org.elasticsearch.index.mapper.SourceValueFetcher;
 import org.elasticsearch.index.mapper.StringFieldType;
@@ -254,7 +254,8 @@ private IOFunction<LeafReaderContext, CheckedIntFunction<List<Object>, IOExcepti
                 if (parent.isStored()) {
                     return storedFieldFetcher(parentField);
                 } else if (parent.hasDocValues()) {
-                    return docValuesFieldFetcher(parentField);
+                    var ifd = searchExecutionContext.getForField(parent, MappedFieldType.FielddataOperation.SEARCH);
+                    return docValuesFieldFetcher(ifd);
                 } else {
                     assert false : "parent field should either be stored or have doc values";
                 }
@@ -266,7 +267,8 @@ private IOFunction<LeafReaderContext, CheckedIntFunction<List<Object>, IOExcepti
                 if (fieldType.isStored()) {
                     return storedFieldFetcher(fieldType.name());
                 } else if (fieldType.hasDocValues()) {
-                    return docValuesFieldFetcher(fieldType.name());
+                    var ifd = searchExecutionContext.getForField(fieldType, MappedFieldType.FielddataOperation.SEARCH);
+                    return docValuesFieldFetcher(ifd);
                 } else {
                     assert false : "multi field should either be stored or have doc values";
                 }
@@ -291,15 +293,16 @@ private IOFunction<LeafReaderContext, CheckedIntFunction<List<Object>, IOExcepti
             };
         }
 
-        private static IOFunction<LeafReaderContext, CheckedIntFunction<List<Object>, IOException>> docValuesFieldFetcher(String name) {
+        private static IOFunction<LeafReaderContext, CheckedIntFunction<List<Object>, IOException>> docValuesFieldFetcher(
+            IndexFieldData<?> ifd
+        ) {
             return context -> {
-                var sortedDocValues = DocValues.getSortedSet(context.reader(), name);
+                var sortedBinaryDocValues = ifd.load(context).getBytesValues();
                 return docId -> {
-                    if (sortedDocValues.advanceExact(docId)) {
-                        var values = new ArrayList<>(sortedDocValues.docValueCount());
-                        for (int i = 0; i < sortedDocValues.docValueCount(); i++) {
-                            long ord = sortedDocValues.nextOrd();
-                            values.add(sortedDocValues.lookupOrd(ord).utf8ToString());
+                    if (sortedBinaryDocValues.advanceExact(docId)) {
+                        var values = new ArrayList<>(sortedBinaryDocValues.docValueCount());
+                        for (int i = 0; i < sortedBinaryDocValues.docValueCount(); i++) {
+                            values.add(sortedBinaryDocValues.nextValue().utf8ToString());
                         }
                         return values;
                     } else {

modules/reindex/src/main/java/org/elasticsearch/reindex/AbstractAsyncBulkByScrollAction.java

Lines changed: 17 additions & 2 deletions
@@ -44,6 +44,7 @@
 import org.elasticsearch.script.Script;
 import org.elasticsearch.script.ScriptService;
 import org.elasticsearch.search.builder.SearchSourceBuilder;
+import org.elasticsearch.search.fetch.subphase.FetchSourceContext;
 import org.elasticsearch.search.sort.SortBuilder;
 import org.elasticsearch.threadpool.ThreadPool;
 
@@ -119,6 +120,7 @@ public abstract class AbstractAsyncBulkByScrollAction<
         BulkByScrollTask task,
         boolean needsSourceDocumentVersions,
         boolean needsSourceDocumentSeqNoAndPrimaryTerm,
+        boolean needsVectors,
         Logger logger,
         ParentTaskAssigningClient client,
         ThreadPool threadPool,
@@ -131,6 +133,7 @@ public abstract class AbstractAsyncBulkByScrollAction<
             task,
             needsSourceDocumentVersions,
             needsSourceDocumentSeqNoAndPrimaryTerm,
+            needsVectors,
             logger,
             client,
             client,
@@ -146,6 +149,7 @@ public abstract class AbstractAsyncBulkByScrollAction<
         BulkByScrollTask task,
         boolean needsSourceDocumentVersions,
         boolean needsSourceDocumentSeqNoAndPrimaryTerm,
+        boolean needsVectors,
         Logger logger,
         ParentTaskAssigningClient searchClient,
         ParentTaskAssigningClient bulkClient,
@@ -173,7 +177,7 @@ public abstract class AbstractAsyncBulkByScrollAction<
         bulkRetry = new Retry(BackoffPolicy.wrap(backoffPolicy, worker::countBulkRetry), threadPool);
         scrollSource = buildScrollableResultSource(
             backoffPolicy,
-            prepareSearchRequest(mainRequest, needsSourceDocumentVersions, needsSourceDocumentSeqNoAndPrimaryTerm)
+            prepareSearchRequest(mainRequest, needsSourceDocumentVersions, needsSourceDocumentSeqNoAndPrimaryTerm, needsVectors)
         );
         scriptApplier = Objects.requireNonNull(buildScriptApplier(), "script applier must not be null");
     }
@@ -186,7 +190,8 @@ public abstract class AbstractAsyncBulkByScrollAction<
     static <Request extends AbstractBulkByScrollRequest<Request>> SearchRequest prepareSearchRequest(
         Request mainRequest,
         boolean needsSourceDocumentVersions,
-        boolean needsSourceDocumentSeqNoAndPrimaryTerm
+        boolean needsSourceDocumentSeqNoAndPrimaryTerm,
+        boolean needsVectors
     ) {
         var preparedSearchRequest = new SearchRequest(mainRequest.getSearchRequest());
 
@@ -205,6 +210,16 @@ static <Request extends AbstractBulkByScrollRequest<Request>> SearchRequest prep
         sourceBuilder.version(needsSourceDocumentVersions);
         sourceBuilder.seqNoAndPrimaryTerm(needsSourceDocumentSeqNoAndPrimaryTerm);
 
+        if (needsVectors) {
+            // always include vectors in the response unless explicitly set
+            var fetchSource = sourceBuilder.fetchSource();
+            if (fetchSource == null) {
+                sourceBuilder.fetchSource(FetchSourceContext.FETCH_ALL_SOURCE);
+            } else if (fetchSource.excludeVectors() == null) {
+                sourceBuilder.excludeVectors(false);
+            }
+        }
+
         /*
          * Do not open scroll if max docs <= scroll size and not resuming on version conflicts
          */
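
The `needsVectors` branch added to `prepareSearchRequest` distinguishes three cases: no source filtering configured, source filtering configured without an explicit vector setting, and an explicit setting that is left alone. The following is a minimal, self-contained sketch of that decision logic only. `VectorFetchSketch`, `FetchSource`, and `resolve` are hypothetical stand-ins for illustration, not Elasticsearch classes, and the sketch assumes that the "fetch all source" constant leaves the vector flag unset, which the diff does not show.

```java
/** Hypothetical model of the three-way branch; none of these names exist in Elasticsearch. */
final class VectorFetchSketch {

    /** Stand-in for the fetch-source settings of a search request. */
    static final class FetchSource {
        // Assumption: the "fetch everything" constant leaves the vector flag unset.
        static final FetchSource FETCH_ALL = new FetchSource(true, null);

        final boolean fetchSource;
        final Boolean excludeVectors; // tri-state: null means "not explicitly set"

        FetchSource(boolean fetchSource, Boolean excludeVectors) {
            this.fetchSource = fetchSource;
            this.excludeVectors = excludeVectors;
        }
    }

    /** Returns the settings that would be used for the scroll search. */
    static FetchSource resolve(FetchSource requested, boolean needsVectors) {
        if (needsVectors == false) {
            return requested; // callers that pass false are left untouched
        }
        if (requested == null) {
            return FetchSource.FETCH_ALL; // nothing configured: fetch the whole source
        }
        if (requested.excludeVectors == null) {
            // filtering configured, vector flag unset: keep vectors in the response
            return new FetchSource(requested.fetchSource, false);
        }
        return requested; // an explicit true or false from the caller is respected
    }
}
```

In this model, only an unset flag is overridden, which matches the "unless explicitly set" comment in the hunk: a caller that deliberately excludes vectors keeps that choice.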

modules/reindex/src/main/java/org/elasticsearch/reindex/AsyncDeleteByQueryAction.java

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ public AsyncDeleteByQueryAction(
         ScriptService scriptService,
         ActionListener<BulkByScrollResponse> listener
     ) {
-        super(task, false, true, logger, client, threadPool, request, listener, scriptService, null);
+        super(task, false, true, false, logger, client, threadPool, request, listener, scriptService, null);
     }
 
     @Override
