-
Notifications
You must be signed in to change notification settings - Fork 25.5k
Fix date fields sort formatting with missing values #135899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix date fields sort formatting with missing values #135899
Conversation
Hi @carlosdelest, I've created a changelog YAML for you. |
Hi @carlosdelest, I've updated the changelog YAML for you. |
…-null-values' into bugfix/sort-date-formatter-null-values
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
PR looks good and makes sense. I may be misunderstanding the min value though does this mean we would sort desc min value (as epoch time) before an actual indexed date that's prior to the epoch. And if so is that a problem? It wasn't entirely clear to me from the original issue when we specifically encounter Long.MIN_VALUE or Long.MAX_VALUE. |
Those values are encountered when we're sorting values and the field value for a doc is missing. Then, it is replaced by the max / min value possible depending on the sort order (missing: _first or missing: _last) so it is ordered accordingly. The trick here is that Lucene does not have date field data type, so we're using Long to store them. Hence the returned values make sense from a Long perspective but not from a Date perspective. Hope it makes sense! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this works, but I am slightly worried about backwards compatiblity.
Negative dates are valid, dates indexed in the index can be from before 1970, and with this change, missing values will no longer be sorted to before 1970.
I understand - but right now this use case doesn't work at all, as it is a 400 error. So users who encountered this will have navigated around this problem by using a different format or not formatting sorted results.
I see your point. Missing values will be sorted correctly, but it's a possibility that the date is less than the epoch and thus doesn't represent the actual value. The only other options I see for correctly formatting dates that are less than the epoch would be:
What do you think? |
Closes #81960
When formatting date types, take into account
Long.MAX_VALUE
andLong.MIN_VALUE
. These values can be present when formatting sort values, and the value is missing.Long.MIN_VALUE
is transformed into the epoch, andLong.MAX_VALUE
into "9999-12-31T23:59:59.999999999Z". These values have been chosen as the safest for the predefined date formats, as years are at most represented with 4 digits.This doesn't prevent custom formats to fail, in which cause an IAE will be thrown.