Commit 5249133

[8.19] Add index_options to semantic_text field mappings (#119967) (#129626)
* Add index_options to semantic_text field mappings (#119967)
* Add index_options parameter to semantic_text field mapping
* Cleanup & tests
* Update docs
* Update docs/changelog/119967.yaml
* Addressed some PR feedback
* Update yaml tests
* Refactoring
* Cleanup
* Fix some tests
* Hack in inferring text_embedding task type from index options
* [CI] Auto commit changes from spotless
* Fix error inferring model settings
* Update docs
* Update tests
* Update docs/reference/mapping/types/semantic-text.asciidoc
  Co-authored-by: Mike Pellegrini <[email protected]>
* Address some minor PR feedback
* Remove partial model_settings with inferred task type
* Cleanup
* Remove unnecessary changes
* Fix errors from merge
* [CI] Auto commit changes from spotless
* Cleanup
* Checkpoint, saving changes before merge
* Update parsing
* [CI] Auto commit changes from spotless
* Stash changes
* Fix compile errors
* [CI] Auto commit changes from spotless
* Cleanup error
* fix test
* fix test
* Fix another test
* A bit of cleanup
* Fix tests
* Spotless
* Respect index options if set over defaults
* Cleanup
* [CI] Auto commit changes from spotless
* Support updating to compatible versions, add some cleanup and validation
* Remove test that can't be done here - needs to be unit test
* Add validation
* Cleanup
* Fix some yaml tests
* [CI] Auto commit changes from spotless
* Happy path early index validation works now; edge cases surrounding default BBQ remain
* Always emit index options, even when using defaults
* Minor cleanup
* Fix test compilation failures
* Fix some tests
* Continue to iterate on test failures
* Remove index options from inference field metadata as it is only needed at field creation time
* Fix some tests
* Remove transport version, no longer needed
* Fix yaml tests
* Add tests
* IndexOptions don't need to implement Writeable
* [CI] Auto commit changes from spotless
* Refactor - move SemanticTextIndexOptions
* Remove writeable
* Move index_options parsing to semantic text field mapper
* Cleanup
* Fix test compilation issue
* Cleanup
* Remove whitespace
* Remove writeables from index options
* Disable merging null options?
* Add docs
* [CI] Auto commit changes from spotless
* Revert "Disable merging null options?"
  This reverts commit 2ef8b1d.
* Remove default serialization
* Include default index option type to defaults
* [CI] Auto commit changes from spotless
* Go back to allowing null updates
* Cleanup
* Fix validation error
* Revert "Include default index option type to defaults"
  This reverts commit b08e2a1.
* Update tests
* Revert "Update tests"
  This reverts commit aedfafe.
* Better fix for null inputs
* Remove redundant merge validation

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: Mike Pellegrini <[email protected]>
(cherry picked from commit 813814b)

# Conflicts:
# docs/reference/elasticsearch/mapping-reference/semantic-text.md
# server/src/main/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapper.java
# x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapper.java
# x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapperTests.java

* Fix errors in backport merge
1 parent cbce7e1 commit 5249133

11 files changed: +1450 -166 lines changed

docs/changelog/119967.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 119967
+summary: Add `index_options` to `semantic_text` field mappings
+area: Mapping
+type: enhancement
+issues: [ ]

server/src/main/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapper.java

Lines changed: 55 additions & 44 deletions
Large diffs are not rendered by default.
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the "Elastic License
+ * 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
+ * Public License v 1"; you may not use this file except in compliance with, at
+ * your election, the "Elastic License 2.0", the "GNU Affero General Public
+ * License v3.0 only", or the "Server Side Public License, v 1".
+ */
+
+package org.elasticsearch.index.mapper.vectors;
+
+import org.elasticsearch.xcontent.ToXContent;
+
+/**
+ * Represents general index options that can be attached to a semantic or vector field.
+ */
+public interface IndexOptions extends ToXContent {}
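
The new `IndexOptions` interface declares no methods of its own; it only gives option classes a shared supertype that can be serialized. The following is a minimal, self-contained sketch of that marker-interface pattern. It does not use the real Elasticsearch classes: `Renderable` is a hypothetical stand-in for `ToXContent`, and `HnswOptions`, `FlatOptions`, and `renderAll` are illustrative names invented for this example.

```java
import java.util.List;

public class IndexOptionsSketch {
    // Hypothetical stand-in for ToXContent: anything that can render itself.
    interface Renderable {
        String render();
    }

    // Marker interface in the style of IndexOptions: it adds no methods.
    interface IndexOptions extends Renderable {}

    // Hypothetical concrete options; the fields are illustrative only.
    record HnswOptions(int m, int efConstruction) implements IndexOptions {
        @Override
        public String render() {
            return "{\"type\":\"hnsw\",\"m\":" + m + ",\"ef_construction\":" + efConstruction + "}";
        }
    }

    record FlatOptions() implements IndexOptions {
        @Override
        public String render() {
            return "{\"type\":\"flat\"}";
        }
    }

    // Callers can hold and render any options through the shared supertype.
    static String renderAll(List<? extends IndexOptions> options) {
        StringBuilder sb = new StringBuilder();
        for (IndexOptions option : options) {
            sb.append(option.render()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(renderAll(List.of(new HnswOptions(16, 100), new FlatOptions())));
    }
}
```

A wrapper such as `SemanticTextIndexOptions` can then store a field of the shared type without depending on any particular vector option class.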

server/src/test/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldTypeTests.java

Lines changed: 6 additions & 4 deletions
@@ -53,14 +53,14 @@ private static DenseVectorFieldMapper.RescoreVector randomRescoreVector() {
         return new DenseVectorFieldMapper.RescoreVector(randomBoolean() ? 0 : randomFloatBetween(1.0F, 10.0F, false));
     }

-    private DenseVectorFieldMapper.IndexOptions randomIndexOptionsNonQuantized() {
+    private DenseVectorFieldMapper.DenseVectorIndexOptions randomIndexOptionsNonQuantized() {
         return randomFrom(
             new DenseVectorFieldMapper.HnswIndexOptions(randomIntBetween(1, 100), randomIntBetween(1, 10_000)),
             new DenseVectorFieldMapper.FlatIndexOptions()
         );
     }

-    private DenseVectorFieldMapper.IndexOptions randomIndexOptionsAll() {
+    public static DenseVectorFieldMapper.DenseVectorIndexOptions randomIndexOptionsAll() {
         return randomFrom(
             new DenseVectorFieldMapper.HnswIndexOptions(randomIntBetween(1, 100), randomIntBetween(1, 10_000)),
             new DenseVectorFieldMapper.Int8HnswIndexOptions(
@@ -93,11 +93,13 @@ private DenseVectorFieldMapper.IndexOptions randomIndexOptionsAll() {
         );
     }

-    private DenseVectorFieldMapper.IndexOptions randomIndexOptionsHnswQuantized() {
+    private DenseVectorFieldMapper.DenseVectorIndexOptions randomIndexOptionsHnswQuantized() {
         return randomIndexOptionsHnswQuantized(randomBoolean() ? null : randomRescoreVector());
     }

-    private DenseVectorFieldMapper.IndexOptions randomIndexOptionsHnswQuantized(DenseVectorFieldMapper.RescoreVector rescoreVector) {
+    private DenseVectorFieldMapper.DenseVectorIndexOptions randomIndexOptionsHnswQuantized(
+        DenseVectorFieldMapper.RescoreVector rescoreVector
+    ) {
         return randomFrom(
             new DenseVectorFieldMapper.Int8HnswIndexOptions(
                 randomIntBetween(1, 100),

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java

Lines changed: 3 additions & 1 deletion
@@ -17,6 +17,7 @@
 import java.util.Set;

 import static org.elasticsearch.xpack.inference.mapper.SemanticTextFieldMapper.SEMANTIC_TEXT_EXCLUDE_SUB_FIELDS_FROM_FIELD_CAPS;
+import static org.elasticsearch.xpack.inference.mapper.SemanticTextFieldMapper.SEMANTIC_TEXT_INDEX_OPTIONS;
 import static org.elasticsearch.xpack.inference.mapper.SemanticTextFieldMapper.SEMANTIC_TEXT_SUPPORT_CHUNKING_CONFIG;
 import static org.elasticsearch.xpack.inference.queries.SemanticKnnVectorQueryRewriteInterceptor.SEMANTIC_KNN_FILTER_FIX;
 import static org.elasticsearch.xpack.inference.queries.SemanticKnnVectorQueryRewriteInterceptor.SEMANTIC_KNN_VECTOR_QUERY_REWRITE_INTERCEPTION_SUPPORTED;
@@ -70,7 +71,8 @@ public Set<NodeFeature> getTestFeatures() {
             SemanticTextFieldMapper.SEMANTIC_TEXT_HANDLE_EMPTY_INPUT,
             SEMANTIC_TEXT_SUPPORT_CHUNKING_CONFIG,
             SEMANTIC_TEXT_MATCH_ALL_HIGHLIGHTER,
-            SEMANTIC_TEXT_EXCLUDE_SUB_FIELDS_FROM_FIELD_CAPS
+            SEMANTIC_TEXT_EXCLUDE_SUB_FIELDS_FROM_FIELD_CAPS,
+            SEMANTIC_TEXT_INDEX_OPTIONS
         );
     }
 }

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapper.java

Lines changed: 147 additions & 16 deletions
Large diffs are not rendered by default.
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+
+package org.elasticsearch.xpack.inference.mapper;
+
+import org.elasticsearch.ElasticsearchException;
+import org.elasticsearch.common.Strings;
+import org.elasticsearch.common.xcontent.support.XContentMapValues;
+import org.elasticsearch.index.IndexVersion;
+import org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapper;
+import org.elasticsearch.index.mapper.vectors.IndexOptions;
+import org.elasticsearch.xcontent.ToXContent;
+import org.elasticsearch.xcontent.XContentBuilder;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Locale;
+import java.util.Map;
+
+/**
+ * Represents index options for a semantic_text field.
+ * We represent semantic_text index_options as nested within their respective type. For example:
+ * "index_options": {
+ *     "dense_vector": {
+ *         "type": "bbq_hnsw"
+ *     }
+ * }
+ */
+public class SemanticTextIndexOptions implements ToXContent {
+
+    private static final String TYPE_FIELD = "type";
+
+    private final SupportedIndexOptions type;
+    private final IndexOptions indexOptions;
+
+    public SemanticTextIndexOptions(SupportedIndexOptions type, IndexOptions indexOptions) {
+        this.type = type;
+        this.indexOptions = indexOptions;
+    }
+
+    public SupportedIndexOptions type() {
+        return type;
+    }
+
+    public IndexOptions indexOptions() {
+        return indexOptions;
+    }
+
+    public enum SupportedIndexOptions {
+        DENSE_VECTOR("dense_vector") {
+            @Override
+            public IndexOptions parseIndexOptions(String fieldName, Map<String, Object> map, IndexVersion indexVersion) {
+                return parseDenseVectorIndexOptionsFromMap(fieldName, map, indexVersion);
+            }
+        };
+
+        public final String value;
+
+        SupportedIndexOptions(String value) {
+            this.value = value;
+        }
+
+        public abstract IndexOptions parseIndexOptions(String fieldName, Map<String, Object> map, IndexVersion indexVersion);
+
+        public static SupportedIndexOptions fromValue(String value) {
+            return Arrays.stream(SupportedIndexOptions.values())
+                .filter(option -> option.value.equals(value))
+                .findFirst()
+                .orElseThrow(() -> new IllegalArgumentException("Unknown index options type [" + value + "]"));
+        }
+    }
+
+    @Override
+    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
+        builder.startObject();
+        builder.field(type.value.toLowerCase(Locale.ROOT));
+        indexOptions.toXContent(builder, params);
+        builder.endObject();
+        return builder;
+    }
+
+    @Override
+    public String toString() {
+        return Strings.toString(this);
+    }
+
+    private static DenseVectorFieldMapper.DenseVectorIndexOptions parseDenseVectorIndexOptionsFromMap(
+        String fieldName,
+        Map<String, Object> map,
+        IndexVersion indexVersion
+    ) {
+        try {
+            Object type = map.remove(TYPE_FIELD);
+            if (type == null) {
+                throw new IllegalArgumentException("Required " + TYPE_FIELD);
+            }
+            DenseVectorFieldMapper.VectorIndexType vectorIndexType = DenseVectorFieldMapper.VectorIndexType.fromString(
+                XContentMapValues.nodeStringValue(type, null)
+            ).orElseThrow(() -> new IllegalArgumentException("Unsupported index options " + TYPE_FIELD + " " + type));
+
+            return vectorIndexType.parseIndexOptions(fieldName, map, indexVersion);
+        } catch (Exception exc) {
+            throw new ElasticsearchException(exc);
+        }
+    }
+}
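
`SupportedIndexOptions` pairs each enum constant with its mapping key and a constant-specific parse method, and `fromValue` resolves a key to a constant or fails. A minimal sketch of that dispatch, under stated assumptions, might look as follows. It is not the commit's code: `SupportedOptions`, `parse`, and `parseNested` are hypothetical names, the parser only reads the required `type` key instead of building real dense vector options, and the handling of the outer `index_options` object is a guess, since the actual caller in `SemanticTextFieldMapper` is not rendered in this diff.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class SupportedOptionsSketch {
    // Each constant carries its mapping key and knows how to parse its own sub-object.
    enum SupportedOptions {
        DENSE_VECTOR("dense_vector") {
            @Override
            String parse(Map<String, Object> map) {
                // Hypothetical simplification: only the required "type" key is read.
                Object type = map.remove("type");
                if (type == null) {
                    throw new IllegalArgumentException("Required type");
                }
                return type.toString();
            }
        };

        final String value;

        SupportedOptions(String value) {
            this.value = value;
        }

        abstract String parse(Map<String, Object> map);

        // Resolve a mapping key to its constant, rejecting unknown keys.
        static SupportedOptions fromValue(String value) {
            return Arrays.stream(values())
                .filter(option -> option.value.equals(value))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("Unknown index options type [" + value + "]"));
        }
    }

    // Assumed outer step: parse {"dense_vector": {"type": "..."}} into "key:type".
    static String parseNested(Map<String, Object> indexOptions) {
        if (indexOptions.size() != 1) {
            throw new IllegalArgumentException("Expected exactly one index options type");
        }
        Map.Entry<String, Object> entry = indexOptions.entrySet().iterator().next();
        SupportedOptions kind = SupportedOptions.fromValue(entry.getKey());
        // Copy into a mutable map because parse() removes the keys it consumes.
        @SuppressWarnings("unchecked")
        Map<String, Object> inner = new HashMap<>((Map<String, Object>) entry.getValue());
        return kind.value + ":" + kind.parse(inner);
    }

    public static void main(String[] args) {
        Map<String, Object> inner = Map.of("type", "bbq_hnsw");
        Map<String, Object> outer = Map.of("dense_vector", inner);
        System.out.println(parseNested(outer)); // prints "dense_vector:bbq_hnsw"
    }
}
```

Keeping the parser on the enum constant means a further option family would only need a new constant, with `fromValue` left unchanged.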

x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/mapper/SemanticInferenceMetadataFieldsMapperTests.java

Lines changed: 0 additions & 1 deletion
@@ -114,5 +114,4 @@ static IndexVersion getRandomCompatibleIndexVersion(boolean useLegacyFormat) {
             );
         }
     }
-
 }
