Commit d69d151

Merge branch 'main' into share-histo-serialization
2 parents fd65f20 + 530a85d commit d69d151

17 files changed

+809
-61
lines changed

.claude/settings.local.json

Lines changed: 0 additions & 10 deletions
This file was deleted.

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -1,4 +1,7 @@
 
+# claude
+.claude
+
 # intellij files
 .idea/
 *.iml

bin/chezmoi

-32.1 MB
Binary file not shown.

docs/changelog/129693.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 129693
+summary: Add top level normalizer for linear retriever
+area: Search
+type: enhancement
+issues: []

docs/reference/elasticsearch/rest-apis/retrievers/linear-retriever.md

Lines changed: 25 additions & 7 deletions
@@ -31,9 +31,16 @@ Combining `query` and `retrievers` is not supported.
 `normalizer` {applies_to}`stack: ga 9.1`
 : (Optional, String)
 
-The normalizer to use when using the [multi-field query format](../retrievers.md#multi-field-query-format).
+The top-level normalizer to use when combining results.
 See [normalizers](#linear-retriever-normalizers) for supported values.
 Required when `query` is specified.
+
+When used with the [multi-field query format](../retrievers.md#multi-field-query-format) (`query` parameter), normalizes scores per [field grouping](../retrievers.md#multi-field-field-grouping).
+Otherwise serves as the default normalizer for any sub-retriever that doesn't specify its own normalizer. Per-retriever normalizers always take precedence over the top-level normalizer.
+
+:::{note}
+**Top-level normalizer support for sub-retrievers**: The ability to use a top-level normalizer as a default for sub-retrievers was introduced in Elasticsearch 9.2+. In earlier versions, only per-retriever normalizers are supported.
+:::
 
 ::::{warning}
 Avoid using `none` as that will disable normalization and may bias the result set towards lexical matches.
@@ -74,9 +81,10 @@ Each entry in the `retrievers` array specifies the following parameters:
 `normalizer`
 : (Optional, String)
 
-Specifies how the retrievers score will be normalized before applying the specified `weight`.
+Specifies how the retriever's score will be normalized before applying the specified `weight`.
 See [normalizers](#linear-retriever-normalizers) for supported values.
-Defaults to `none`.
+If not specified, uses the top-level `normalizer` or defaults to `none` if no top-level normalizer is set.
+{applies_to}`stack: ga 9.2`
 
 See also [this hybrid search example](retrievers-examples.md#retrievers-examples-linear-retriever) using a linear retriever on how to independently configure and apply normalizers to retrievers.
 
@@ -94,7 +102,7 @@ The `linear` retriever supports the following normalizers:
 
 ## Example
 
-This example of a hybrid search weights KNN results five times more heavily than BM25 results in the final ranking.
+This example of a hybrid search weights KNN results five times more heavily than BM25 results in the final ranking, with a top-level normalizer applied to all retrievers.
 
 ```console
 GET my_index/_search
@@ -105,23 +113,33 @@ GET my_index/_search
         {
           "retriever": {
             "knn": {
-              ...
+              "field": "title_vector",
+              "query_vector": [0.1, 0.2, 0.3],
+              "k": 10,
+              "num_candidates": 100
             }
           },
           "weight": 5 # KNN query weighted 5x
         },
         {
           "retriever": {
             "standard": {
-              ...
+              "query": {
+                "match": {
+                  "title": "elasticsearch"
+                }
+              }
             }
           },
           "weight": 1.5 # BM25 query weighted 1.5x
         }
-      ]
+      ],
+      "normalizer": "minmax"
     }
   }
 }
 ```
 
+In this example, the `minmax` normalizer is applied to both the kNN retriever and the standard retriever. The top-level normalizer serves as a default that can be overridden by individual sub-retrievers. When using the multi-field query format, the top-level normalizer is applied to all generated inner retrievers.
+
 See also [this hybrid search example](retrievers-examples.md#retrievers-examples-linear-retriever).
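
For readers who want to see the arithmetic behind the documentation example above, the following is a minimal sketch of min-max normalization followed by a weighted sum. It is an illustration only, not the Elasticsearch implementation: the class name, method names, sample scores, and the choice to return zeros when all scores are equal are assumptions made for this sketch.

```java
import java.util.Arrays;

public class LinearCombinationSketch {

    /**
     * Min-max normalization: rescales scores into [0, 1].
     * Assumption of this sketch: if all scores are equal, every normalized score is 0.
     */
    static double[] minMax(double[] scores) {
        double min = Arrays.stream(scores).min().orElse(0.0);
        double max = Arrays.stream(scores).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (scores[i] - min) / range;
        }
        return out;
    }

    /** Weighted sum of the normalized scores that each retriever assigned to one document. */
    static double combine(double[] weights, double[] normalizedScores) {
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i] * normalizedScores[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical raw scores for three documents from each retriever.
        double[] knn = minMax(new double[] { 0.9, 0.5, 0.1 });
        double[] bm25 = minMax(new double[] { 2.0, 8.0, 5.0 });
        // Weights taken from the documentation example: 5 for kNN, 1.5 for BM25.
        double[] weights = { 5.0, 1.5 };
        for (int doc = 0; doc < 3; doc++) {
            double score = combine(weights, new double[] { knn[doc], bm25[doc] });
            System.out.println("doc " + doc + " -> " + score);
        }
    }
}
```

Normalizing both score lists onto the same scale before weighting is what keeps the unbounded BM25 scores from dominating the bounded kNN similarities, which is the bias the `none` warning in the documentation refers to.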

docs/reference/elasticsearch/rest-apis/searching-with-query-rules.md

Lines changed: 10 additions & 3 deletions
@@ -16,16 +16,21 @@ $$$query-rules$$$
 * Personalized metadata about users (e.g. country, language, etc)
 * A particular topic
 * A referring site
-* etc.
+
 
 Query rules define a metadata key that will be used to match the metadata provided in the [rule retriever](/reference/elasticsearch/rest-apis/retrievers/rule-retriever.md) with the criteria specified in the rule.
 
 When a query rule matches the rule metadata according to its defined criteria, the query rule action is applied to the underlying `organic` query.
 
 For example, a query rule could be defined to match a user-entered query string of `pugs` and a country `us` and promote adoptable shelter dogs if the rule query met both criteria.
 
-Rules are defined using the [query rules API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-query_rules) and searched using the [rule retriever](/reference/elasticsearch/rest-apis/retrievers/rule-retriever.md) or the [rule query](/reference/query-languages/query-dsl/query-dsl-rule-query.md).
+You can create and manage query rules using either:
+- [Query rules API]({{es-apis}}v9/group/endpoint-query_rules)
+- [Query Rules UI](docs-content://solutions/search/query-rules-ui.md)
 
+You can search with query rules using either:
+- [Retrievers syntax](/reference/elasticsearch/rest-apis/retrievers/rule-retriever.md)
+- [Query DSL syntax](/reference/query-languages/query-dsl/query-dsl-rule-query.md)
 
 ## Rule definition [query-rule-definition]
 

@@ -68,7 +73,7 @@ The actions to take when the rule matches a query:
 Use `ids` when searching over a single index, and `docs` when searching over multiple indices. `ids` and `docs` cannot be combined in the same query.
 
 
-## Add query rules [add-query-rules]
+## Manage query rules [manage-query-rules]
 
 You can add query rules using the [Create or update query ruleset](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-query-rules-put-ruleset) call. This adds a ruleset containing one or more query rules that will be applied to queries that match their specified criteria.
 
@@ -145,6 +150,8 @@ There is a limit of 100 rules per ruleset. This can be increased up to 1000 usin
 
 You can use the [Get query ruleset](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-query-rules-get-ruleset) call to retrieve the ruleset you just created, the [List query rulesets](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-query-rules-list-rulesets) call to retrieve a summary of all query rulesets, and the [Delete query ruleset](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-query-rules-delete-ruleset) call to delete a query ruleset.
 
+To manage rules using the Query Rules UI, refer to [Manage query rules](https://www.elastic.co/docs/solutions/search/query-rules-ui#manage-existing-rules).
+
 
 ## Search using query rules [rule-query-search]
 
test/framework/src/main/java/org/elasticsearch/datageneration/FieldType.java

Lines changed: 2 additions & 1 deletion
@@ -108,7 +108,8 @@ public static FieldType tryParse(String name) {
             case "wildcard" -> FieldType.WILDCARD;
             case "passthrough" -> FieldType.PASSTHROUGH;
             case "match_only_text" -> FieldType.MATCH_ONLY_TEXT;
-            default -> throw new IllegalArgumentException("Unknown field type: " + name);
+            // Custom types will fail to parse and will return null
+            default -> null;
         };
     }
 
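
Because `tryParse` now returns `null` for unknown names instead of throwing, callers have to handle the missing value themselves. The sketch below shows one way a caller might do that with `Optional`. The trimmed-down `FieldType` enum and the `parse` helper are hypothetical stand-ins written for this illustration; they are not part of the commit.

```java
import java.util.Optional;

public class TryParseSketch {

    // Hypothetical, trimmed-down stand-in for the real FieldType enum.
    enum FieldType {
        WILDCARD,
        PASSTHROUGH,
        MATCH_ONLY_TEXT;

        // Mirrors the changed behavior: unknown (custom) names yield null instead of throwing.
        static FieldType tryParse(String name) {
            return switch (name) {
                case "wildcard" -> WILDCARD;
                case "passthrough" -> PASSTHROUGH;
                case "match_only_text" -> MATCH_ONLY_TEXT;
                default -> null;
            };
        }
    }

    // Hypothetical helper: wraps the nullable result so the custom-type case is explicit.
    static Optional<FieldType> parse(String name) {
        return Optional.ofNullable(FieldType.tryParse(name));
    }

    public static void main(String[] args) {
        System.out.println(parse("wildcard").isPresent());       // true
        System.out.println(parse("my_custom_type").isPresent()); // false
    }
}
```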

x-pack/plugin/mapper-exponential-histogram/src/test/java/org/elasticsearch/xpack/exponentialhistogram/ExponentialHistogramFieldMapperTests.java

Lines changed: 9 additions & 0 deletions
@@ -17,6 +17,7 @@
 import org.elasticsearch.plugins.Plugin;
 import org.elasticsearch.xcontent.XContentBuilder;
 import org.junit.AssumptionViolatedException;
+import org.junit.Before;
 
 import java.io.IOException;
 import java.util.ArrayList;
@@ -39,6 +40,14 @@
 
 public class ExponentialHistogramFieldMapperTests extends MapperTestCase {
 
+    @Before
+    public void setup() {
+        assumeTrue(
+            "Only when exponential_histogram feature flag is enabled",
+            ExponentialHistogramFieldMapper.EXPONENTIAL_HISTOGRAM_FEATURE.isEnabled()
+        );
+    }
+
     protected Collection<? extends Plugin> getPlugins() {
         return Collections.singletonList(new ExponentialHistogramMapperPlugin());
     }

x-pack/plugin/mapper-exponential-histogram/src/yamlRestTest/java/org/elasticsearch/xpack/exponentialhistogram/ExponentialHistogramYamlTestSuiteIT.java

Lines changed: 8 additions & 0 deletions
@@ -10,9 +10,11 @@
 import com.carrotsearch.randomizedtesting.annotations.Name;
 import com.carrotsearch.randomizedtesting.annotations.ParametersFactory;
 
+import org.elasticsearch.Build;
 import org.elasticsearch.test.cluster.ElasticsearchCluster;
 import org.elasticsearch.test.rest.yaml.ClientYamlTestCandidate;
 import org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase;
+import org.junit.Before;
 import org.junit.ClassRule;
 
 public class ExponentialHistogramYamlTestSuiteIT extends ESClientYamlSuiteTestCase {
@@ -21,6 +23,12 @@ public ExponentialHistogramYamlTestSuiteIT(@Name("yaml") ClientYamlTestCandidate
         super(testCandidate);
     }
 
+    @Before
+    public void setup() {
+        // TODO: remove when FeatureFlag is removed and add minimum required version to yaml spec
+        assumeTrue("Only when exponential_histogram feature flag is enabled", Build.current().isSnapshot());
+    }
+
     @ParametersFactory
     public static Iterable<Object[]> parameters() throws Exception {
         return ESClientYamlSuiteTestCase.createParameters();

x-pack/plugin/rank-rrf/src/internalClusterTest/java/org/elasticsearch/xpack/rank/linear/LinearRetrieverIT.java

Lines changed: 33 additions & 0 deletions
@@ -835,4 +835,37 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
         );
         assertThat(numAsyncCalls.get(), equalTo(4));
     }
+
+    public void testMixedNormalizerInheritance() throws IOException {
+        client().prepareIndex(INDEX).setId("1").setSource("field1", "elasticsearch only", "field2", "no technology here").get();
+        client().prepareIndex(INDEX).setId("2").setSource("field1", "no elasticsearch", "field2", "technology only").get();
+        client().prepareIndex(INDEX).setId("3").setSource("field1", "search term", "field2", "no technology").get();
+        refresh(INDEX);
+
+        LinearRetrieverBuilder linearRetriever = new LinearRetrieverBuilder(
+            List.of(
+                CompoundRetrieverBuilder.RetrieverSource.from(
+                    new StandardRetrieverBuilder(QueryBuilders.matchQuery("field1", "elasticsearch"))
+                ),
+                CompoundRetrieverBuilder.RetrieverSource.from(
+                    new StandardRetrieverBuilder(QueryBuilders.matchQuery("field2", "technology"))
+                ),
+                CompoundRetrieverBuilder.RetrieverSource.from(new StandardRetrieverBuilder(QueryBuilders.matchQuery("field1", "search")))
+            ),
+            null,
+            null,
+            MinMaxScoreNormalizer.INSTANCE,
+            10,
+            new float[] { 1.0f, 1.0f, 1.0f },
+            new ScoreNormalizer[] { null, L2ScoreNormalizer.INSTANCE, null }
+        );
+
+        assertThat(linearRetriever.getNormalizers()[0], equalTo(MinMaxScoreNormalizer.INSTANCE));
+        assertThat(linearRetriever.getNormalizers()[1], equalTo(L2ScoreNormalizer.INSTANCE));
+        assertThat(linearRetriever.getNormalizers()[2], equalTo(MinMaxScoreNormalizer.INSTANCE));
+
+        assertResponse(client().prepareSearch(INDEX).setSource(new SearchSourceBuilder().retriever(linearRetriever)), searchResponse -> {
+            assertThat(searchResponse.getHits().getTotalHits().value(), equalTo(3L));
+        });
+    }
 }
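
The inheritance rule that `testMixedNormalizerInheritance` asserts can be sketched in isolation, as follows. This is a simplified model that uses plain strings as normalizer labels; the class and method names are hypothetical, and it does not reproduce the actual `LinearRetrieverBuilder` logic. The rule is taken from the documentation change in this commit: a per-retriever normalizer wins, otherwise the top-level normalizer applies, otherwise `none`.

```java
import java.util.Arrays;

public class NormalizerInheritanceSketch {

    /**
     * Resolves the effective normalizer for each sub-retriever.
     * A null entry means "not specified" and inherits the top-level value,
     * or "none" when no top-level normalizer is set.
     */
    static String[] resolve(String topLevel, String[] perRetriever) {
        String fallback = topLevel != null ? topLevel : "none";
        String[] resolved = new String[perRetriever.length];
        for (int i = 0; i < perRetriever.length; i++) {
            resolved[i] = perRetriever[i] != null ? perRetriever[i] : fallback;
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Same shape as the test: top-level min-max, only the second retriever overrides it.
        // The string labels are illustrative only.
        String[] resolved = resolve("minmax", new String[] { null, "l2_norm", null });
        System.out.println(Arrays.toString(resolved)); // [minmax, l2_norm, minmax]
    }
}
```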
