carlosdelest
Member

@carlosdelest carlosdelest commented Oct 3, 2025

Closes #81960

When formatting date types, take into account Long.MAX_VALUE and Long.MIN_VALUE. These values can appear when formatting sort values for documents where the field value is missing.

Long.MIN_VALUE is transformed into the epoch, and Long.MAX_VALUE into "9999-12-31T23:59:59.999999999Z". These values have been chosen as the safest for the predefined date formats, since years are represented with at most 4 digits.
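
A minimal sketch of the mapping described above, using only `java.time`. The class and method names are hypothetical and are not the code in this PR; it assumes the sort value is a date in epoch milliseconds and uses `DateTimeFormatter.ISO_INSTANT` as a stand-in for a predefined format.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch, not the Elasticsearch implementation.
final class MissingSortDateSketch {
    // Upper bound quoted in the PR description.
    static final Instant MAX_DISPLAY = Instant.parse("9999-12-31T23:59:59.999999999Z");

    // Maps the two sentinel longs to instants that a 4-digit-year format can print.
    static Instant toDisplayInstant(long epochMillis) {
        if (epochMillis == Long.MIN_VALUE) {
            return Instant.EPOCH;
        }
        if (epochMillis == Long.MAX_VALUE) {
            return MAX_DISPLAY;
        }
        return Instant.ofEpochMilli(epochMillis);
    }

    static String format(long epochMillis) {
        return DateTimeFormatter.ISO_INSTANT.format(toDisplayInstant(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(format(Long.MIN_VALUE)); // 1970-01-01T00:00:00Z
        System.out.println(format(Long.MAX_VALUE)); // 9999-12-31T23:59:59.999999999Z
    }
}
```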

This doesn't prevent custom formats from failing, in which case an IAE (IllegalArgumentException) will be thrown.

@carlosdelest carlosdelest added the labels >bug, v9.2.0, auto-backport (Automatically create backport pull requests when merged), v8.19.6, v9.1.6, v9.0.9, Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch), :Search Relevance/Search (Catch all for Search Relevance) on Oct 3, 2025
@elasticsearchmachine
Collaborator

Hi @carlosdelest, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Hi @carlosdelest, I've updated the changelog YAML for you.

elasticsearchmachine added 2 commits October 3, 2025 12:39
…-null-values' into bugfix/sort-date-formatter-null-values
@carlosdelest carlosdelest marked this pull request as ready for review October 3, 2025 13:59
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@john-wagster
Contributor

PR looks good and makes sense. I may be misunderstanding the min value, though: does this mean we would sort desc the min value (as epoch time) before an actual indexed date that's prior to the epoch? And if so, is that a problem? It wasn't entirely clear to me from the original issue when we specifically encounter Long.MIN_VALUE or Long.MAX_VALUE.

@carlosdelest
Member Author

@john-wagster

It wasn't entirely clear to me from the original issue when we specifically encounter Long.MIN_VALUE or Long.MAX_VALUE.

Those values are encountered when we're sorting values and the field value for a doc is missing. Then, it is replaced by the max / min value possible depending on the sort order (missing: _first or missing: _last) so it is ordered accordingly.

The trick here is that Lucene does not have a date field data type, so we're using Long to store dates. Hence the returned values make sense from a Long perspective but not from a Date perspective.
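
A rough sketch of that substitution, assuming it works as described in this comment. The helper names are hypothetical and are not Lucene or Elasticsearch APIs: a missing value (`null` here) is replaced by whichever extreme long places it first or last for the requested sort order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of sentinel substitution for missing sort values.
final class MissingSentinelSketch {
    // Returns the value itself, or the extreme long that puts a missing value
    // first or last for the given sort direction.
    static long sortKey(Long value, boolean missingFirst, boolean ascending) {
        if (value != null) {
            return value;
        }
        // First in ascending order and last in descending order both need the smallest long.
        return (missingFirst == ascending) ? Long.MIN_VALUE : Long.MAX_VALUE;
    }

    static List<Long> sortedKeys(List<Long> values, boolean missingFirst, boolean ascending) {
        List<Long> keys = new ArrayList<>();
        for (Long v : values) {
            keys.add(sortKey(v, missingFirst, ascending));
        }
        Comparator<Long> order = ascending ? Comparator.naturalOrder() : Comparator.reverseOrder();
        keys.sort(order);
        return keys;
    }

    public static void main(String[] args) {
        // Ascending sort with missing first: the missing doc surfaces as Long.MIN_VALUE.
        System.out.println(sortedKeys(Arrays.asList(5L, null, -3L), true, true));
    }
}
```

The sorted keys are correct as longs, but the sentinel is what then reaches the date formatter.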

Hope it makes sense!

Member

@benwtrent benwtrent left a comment

I realize this works, but I am slightly worried about backwards compatibility.

Negative dates are valid, dates indexed in the index can be from before 1970, and with this change, missing values will no longer be sorted to before 1970.
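
To illustrate the concern, here is a small sketch with hypothetical names, assuming a missing value sorts as Long.MIN_VALUE but is displayed as the epoch. A date before 1970 has a negative epoch-millis value, so the displayed epoch appears ahead of an earlier-looking date.

```java
import java.time.Instant;

// Hypothetical sketch: displayed values vs. underlying long sort values.
final class PreEpochSketch {
    static long millis(String iso) {
        return Instant.parse(iso).toEpochMilli();
    }

    // Displays the missing-value sentinel as the epoch, other values as-is.
    static String display(long sortValue) {
        return sortValue == Long.MIN_VALUE
            ? Instant.EPOCH.toString()
            : Instant.ofEpochMilli(sortValue).toString();
    }

    public static void main(String[] args) {
        long indexed = millis("1969-07-20T20:17:00Z"); // negative: before the epoch
        long missing = Long.MIN_VALUE;                 // sentinel for a missing value

        // Ascending by long, the missing doc still comes first...
        System.out.println(missing < indexed);         // true
        // ...but its displayed date is later than the next doc's displayed date.
        System.out.println(display(missing));          // 1970-01-01T00:00:00Z
        System.out.println(display(indexed));          // 1969-07-20T20:17:00Z
    }
}
```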

@carlosdelest
Copy link
Member Author

I am slightly worried about backwards compatibility.

I understand - but right now this use case doesn't work at all, as it returns a 400 error. So users who encountered this will have worked around the problem by using a different format or by not formatting sorted results.

Negative dates are valid, dates indexed in the index can be from before 1970, and with this change, missing values will no longer be sorted to before 1970.

I see your point. Missing values will still be sorted correctly, but it's possible that indexed dates are earlier than the epoch, in which case the formatted epoch doesn't represent the actual sort value.

The only other options I see for correctly formatting dates that are less than the epoch would be:

  • Having custom min / max values depending on the format used - so that each standard date formatter has a specific value for missing values, depending on the sort order, that is used for formatting.
  • Use null as a formatted value in these cases - which can potentially break clients that expect a formatted value.
  • Do not format the value in these cases and return the raw long value - again, can potentially break clients that expect a formatted value.
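
As a rough sketch of the second and third options, with hypothetical names that are not part of this PR, the formatter could check for the sentinel values and either return null or fall back to the raw long:

```java
import java.time.Instant;

// Hypothetical sketch of the "null" and "raw long" alternatives.
final class MissingFormatOptionsSketch {
    enum MissingPolicy { NULL_VALUE, RAW_LONG }

    static boolean isMissingSentinel(long value) {
        return value == Long.MIN_VALUE || value == Long.MAX_VALUE;
    }

    // Formats epoch millis as an ISO instant unless the value is a sentinel.
    static String format(long value, MissingPolicy policy) {
        if (isMissingSentinel(value)) {
            return policy == MissingPolicy.NULL_VALUE ? null : Long.toString(value);
        }
        return Instant.ofEpochMilli(value).toString();
    }

    public static void main(String[] args) {
        System.out.println(format(Long.MIN_VALUE, MissingPolicy.NULL_VALUE)); // null
        System.out.println(format(Long.MIN_VALUE, MissingPolicy.RAW_LONG));   // -9223372036854775808
        System.out.println(format(0L, MissingPolicy.RAW_LONG));               // 1970-01-01T00:00:00Z
    }
}
```

Either way, clients that always expect a formatted date string would need to handle the special case.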

What do you think?

Labels
auto-backport (Automatically create backport pull requests when merged), >bug, :Search Relevance/Search (Catch all for Search Relevance), Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch), v8.19.6, v9.0.9, v9.1.6, v9.2.0, v9.3.0
Development

Successfully merging this pull request may close these issues.

Ascending sort with missing _first fails on datefields with missing values
4 participants