Esql mv rerank#7

Closed
afoucret wants to merge 79 commits into esql-mv-rerank-poc from esql-mv-rerank

@afoucret commented Jan 13, 2026

  • Have you signed the contributor license agreement?
  • Have you followed the contributor guidelines?
  • If submitting code, have you built your formula locally prior to submission with gradle check?
  • If submitting code, is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed.
  • If submitting code, have you checked that your submission is for an OS and architecture that we support?
  • If you are submitting this code for a class then read our policy for that.

Note

  • ES|QL: Rerank operator extended for multi‑value fields with new examples (incl. TOP_SNIPPETS); vector similarity functions and TEXT_EMBEDDING/KNN docs promoted to GA; tutorial and metadata docs updated; changelog entries added.
  • Benchmarks: New TSDB codec encode/decode benchmarks for multiple data patterns; existing benchmarks refactored to use dynamic getBlockSize().
  • Compression/Histogram: Upgrade native zstd to 1.5.7; add ExponentialHistogramUtils.removeMergeNoise helper and tests; expose DEFAULT_MAX_HISTOGRAM_BUCKETS.
  • Allocator: Internal refactor in BalancedShardsAllocator/NodeSorter to carry threshold via sorter.
  • REST API: Remove project_routing param from search and async_search.submit specs.
  • Docker/IronBank: Add IronBank Dockerfile, switch path in updatecli; bump UBI base tag to 9.7; update Wolfi/FIPS image digests; hardening manifest updated.
  • Tests: TSDB synthetic IDs snapshot/restore coverage; snapshot shutdown progress logging by node role; snapshot metrics tweaks; muted tests list updated.
  • Docs/Release notes: BBQ and ILM docs tweaks; known issues; mark 9.2.4/9.1.10 released.

Written by Cursor Bugbot for commit 1335684. This will update automatically on new commits.

fang-xing-esql and others added 21 commits January 14, 2026 07:46
…stic#139058)

* remove implicit limit appended to each subquery
Continuation of elastic#139797, adding more tests for timeseries
…stic#140562)

Update IronBank Dockerfile path in updatecli configuration and bump the oblt-updatecli-policies/ironbank/templates version.
…astic#140528)

This PR removes the snapshot protection of FAIL and NULLIFY options for unmapped fields (only LOAD remains protected under snapshot).

Follow up to elastic#140463.
Related: elastic#138888.
Examples of queries that are supported now:
* `network.bytes_in * 8`
* `network.eth0.rx + network.eth0.tx`
* `max(network.total_bytes_in) * 8`
* `network.total_bytes_in{cluster!="prod"} / network.total_bytes_in{cluster!="staging"}`

Follow-up from elastic#140135
…-spec:string.Url_encode_component tests with table reads} elastic#140621
Test verifies that we can still search by id and all documents are
present after restoring index from snapshot.
…stManyRandomTextFieldsInSubqueryIntermediateResultsWithSortManyFields elastic#140664
Mikep86 and others added 8 commits January 14, 2026 11:35
Makes GetInferenceFieldsAction an indices action dependent on the indices read permission. This allows the action to be executed by users with read access to the indices queried.

---------

Co-authored-by: Elliot Barlas <elliot.barlas@elastic.co>
* Finalize docs for v9.1.10 release

* Update breaking-changes.md

* Fix heading formatting for deprecations in release notes

* Update index.md

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Charlotte Hoblik <116336412+charlotte-hoblik@users.noreply.github.com>
* Finalize docs for v9.2.4 release

* Update breaking changes for version 9.1.10 and 9.2.4

* Update deprecations.md

* Revise release notes for Elasticsearch 9.2.4

Updated release notes to reflect changes from version 9.1.10 to 9.2.4, including features, enhancements, and fixes.

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Charlotte Hoblik <116336412+charlotte-hoblik@users.noreply.github.com>
…stic#140183)

We have observed some edge cases where many inference failures can cause OOMs in ShardBulkInferenceActionFilter. This PR addresses this edge case by deduplicating the failures stored in memory.
…40625)

This is already called out, but only at the very end in a section. This
adds it right underneath the documentation for the
`max_primary_shard_docs` configuration parameter.
afoucret and others added 27 commits January 15, 2026 14:12
Zstd version 1.5.7 improved the decompression speed for small blocks: https://github.com/facebook/zstd/releases/tag/v1.5.7 . We limit binary doc value blocks to a maximum of 1024 values, to reduce the performance impact of decompressing a whole block when only a few values are needed. For small values, this can result in small blocks, which are inefficient to decompress. The Zstd improvement will help mitigate this issue.
During hollowing, segment info files (.si) are replicated into the hollow commit blob, which can trigger GETs to referenced blobs.
This patch optimizes the process by reading segment info from memory instead of performing GET requests.
One complication is that segment info serialization is not deterministic: the serialization of segment info map fields follows their internal order, which can change. To solve this, the patch enforces a serialization order for the segment info map fields (diagnostics and attributes).

Closes ES-13399
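
The idea of enforcing a map serialization order can be illustrated with a minimal sketch. It assumes sorting by key is an acceptable order; the commit message does not say which order the patch enforces, and `DeterministicMapSketch` and its text output format are hypothetical, not the actual segment info writer.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: not the segment info serialization code.
final class DeterministicMapSketch {
    // Copies the map into a TreeMap so entries are written in sorted key order,
    // which makes the output independent of the source map's internal order.
    static String serialize(Map<String, String> map) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : new TreeMap<>(map).entrySet()) {
            out.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return out.toString();
    }
}
```

With this approach, two maps holding the same entries serialize to the same bytes regardless of insertion order, which is the property the patch needs for the replicated segment info.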
Modifies SnapshotShardsService to stop logging snapshot shutting down
progress on search nodes on serverless, since they do not have snapshots.
This limits the functionality to indexing nodes only.

Relates: ES-13363
…s testPushDownMetadataTierInAndNotOperator {default} elastic#140752
…GroupingAggregatorFunctionTests testSimpleWithCranky elastic#140763
Backports for elastic#139910
were released in different releases, causing some upgrade paths to be
broken. This commit adds a note about a failure that can occur between
9.1.10 and 9.2.4.
This PR updates the FlattenedFieldMapper to use binary doc values instead
of sorted set doc values
@afoucret closed this Jan 15, 2026
Level.INFO,
"Shard snapshot completion stats since shutdown began*"
);
snapshotShutdownProgressTrackerToNotRunExpectation.awaitMatched(1000);
Test expectations never registered with mock logger

Medium Severity

The PatternNotSeenEventExpectation objects in testStatefulNodesThatDoNotContainDataDoesNotLogSnapshotShuttingDownProgress and testStatefulCoordinatingOnlyNodeDoesNotLogSnapshotShuttingDownProgress are created as local variables but never added to mockLog via mockLog.addExpectation(...). When mockLog.assertAllExpectationsMatched() is called at the end, it only checks expectations in its internal list, which doesn't include these local expectations. The tests will always pass without actually verifying that log messages were not produced.
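
The failure mode described here can be reproduced with a small self-contained sketch. `MiniMockLog` and `NotSeen` are hypothetical stand-ins written for this illustration; they are not the Elasticsearch `MockLog` API and only mimic the behavior the comment describes, namely that the final assertion consults registered expectations only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a mock logger; not the Elasticsearch MockLog API.
final class MiniMockLog {
    interface Expectation {
        void onMessage(String message);
        boolean matched();
    }

    // Matches as long as no logged message contains the given fragment.
    static final class NotSeen implements Expectation {
        private final String fragment;
        private boolean seen = false;

        NotSeen(String fragment) {
            this.fragment = fragment;
        }

        @Override
        public void onMessage(String message) {
            if (message.contains(fragment)) {
                seen = true;
            }
        }

        @Override
        public boolean matched() {
            return !seen;
        }
    }

    private final List<Expectation> expectations = new ArrayList<>();

    void addExpectation(Expectation expectation) {
        expectations.add(expectation);
    }

    // Log messages are delivered to registered expectations only.
    void log(String message) {
        for (Expectation expectation : expectations) {
            expectation.onMessage(message);
        }
    }

    // An empty expectation list passes vacuously.
    boolean allExpectationsMatched() {
        return expectations.stream().allMatch(Expectation::matched);
    }
}
```

In this sketch, a `NotSeen` object that is created but never passed to `addExpectation` receives no messages, so `allExpectationsMatched()` returns true even when the forbidden message was logged. Registering the expectation first makes the same scenario fail as intended.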


ioanatia pushed a commit that referenced this pull request Jan 27, 2026
…tic#140027)

This PR fixes the issue where `INLINE STATS GROUP BY null` was being
incorrectly pruned by `PruneLeftJoinOnNullMatchingField`.

Fixes elastic#139887

## Problem

For the query:

```
FROM employees
| INLINE STATS c = COUNT(*) BY n = null
| KEEP c, n
| LIMIT 3
```

During `LogicalPlanOptimizer`:

```
Limit[3[INTEGER],false,false]
\_EsqlProject[[c{r}#2, n{r}#4]]
  \_InlineJoin[LEFT,[n{r}#4],[n{r}#4]]
    |_Eval[[null[NULL] AS n#4]]
    | \_EsRelation[employees][<no-fields>{r$}#7]
    \_Aggregate[[n{r}#4],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS c#2, n{r}#4]]
      \_StubRelation[[<no-fields>{r$}#7, n{r}#4]]
```

The following join node:

```
InlineJoin[LEFT,[n{r}#4],[n{r}#4]]
|_Eval[[null[NULL] AS n#4]]
| \_EsRelation[employees][<no-fields>{r$}#7]
\_Aggregate[[n{r}#4],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS c#2, n{r}#4]]
  \_StubRelation[[<no-fields>{r$}#7, n{r}#4]]
```

should NOT have `PruneLeftJoinOnNullMatchingField` applied, because the
right side is an `Aggregate` (originating from `INLINE STATS`). Since
`STATS` supports `GROUP BY null`, the join key being null is a valid use
case. Pruning this join would incorrectly eliminate the aggregation
results, changing the query semantics.

During `LocalLogicalPlanOptimizer`:

```
ProjectExec[[c{r}#2, n{r}#4]]
\_LimitExec[3[INTEGER],null]
  \_ExchangeExec[[c{r}#2, n{r}#4],false]
    \_FragmentExec[filter=null, estimatedRowSize=0, reducer=[], fragment=[<>
Project[[c{r}#2, n{r}#4]]
\_Limit[3[INTEGER],false,false]
  \_InlineJoin[LEFT,[n{r}#4],[n{r}#4]]
    |_Eval[[null[NULL] AS n#4]]
    | \_EsRelation[employees][<no-fields>{r$}#7]
    \_LocalRelation[[c{r}#2, n{r}#4],Page{blocks=[LongVectorBlock[vector=ConstantLongVector[positions=1, value=100]], ConstantNullBlock[positions=1]]}]<>]]
```

The following join node:

```
InlineJoin[LEFT,[n{r}#4],[n{r}#4]]
|_Eval[[null[NULL] AS n#4]]
| \_EsRelation[employees][<no-fields>{r$}#7]
\_LocalRelation[[c{r}#2, n{r}#4],Page{blocks=[LongVectorBlock[vector=ConstantLongVector[positions=1, value=100]], ConstantNullBlock[positions=1]]}]
```

should NOT have `PruneLeftJoinOnNullMatchingField` applied, because the
right side is a `LocalRelation` (the `Aggregate` was optimized into a
`LocalRelation` containing the pre-computed aggregation results).
Pruning this join when the join key is null would discard the valid
aggregation results stored in the `LocalRelation`, incorrectly producing
null values instead of the expected count.

## Solution

The fix ensures that `PruneLeftJoinOnNullMatchingField` only
applies to `LOOKUP JOIN` nodes, where `join.right()` is an `EsRelation`.
For `INLINE STATS` joins, the right side can be:

 - `Aggregate` (before optimization), or
 - `LocalRelation` (after the aggregate is optimized)

By checking `join.right() instanceof EsRelation`, we correctly skip the
pruning optimization for `INLINE STATS` joins, preserving the expected
query results when grouping by null.
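
The guard can be sketched in isolation as follows. The plan node types below are simplified, hypothetical records that share names with the classes mentioned above; the real ES|QL classes and the actual rule body are not shown in this description, so treat the `canPrune` helper and the `joinKeyIsNull` flag as assumptions made for illustration.

```java
import java.util.List;

// Hypothetical, heavily simplified plan nodes; the real ES|QL classes carry far more state.
interface Plan {}

record EsRelation(String index) implements Plan {}

record Aggregate(List<String> groupings) implements Plan {}

record LocalRelation(int rows) implements Plan {}

// joinKeyIsNull stands in for "the join key was folded to a null literal".
record Join(Plan left, Plan right, boolean joinKeyIsNull) implements Plan {}

final class PruneSketch {
    // Pruning is allowed only when the key is null AND the right side is an
    // EsRelation, which per the description identifies a LOOKUP JOIN.
    static boolean canPrune(Join join) {
        return join.joinKeyIsNull() && join.right() instanceof EsRelation;
    }
}
```

Under these assumptions, a join whose right side is an `Aggregate` or a `LocalRelation` is never pruned, which matches the two `INLINE STATS` cases walked through in the problem section.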