-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Speed up sorts on secondary sort fields #137533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 30 commits
Commits
Show all changes
36 commits
Select commit
Hold shift + click to select a range
99d1db6
WIP: pluggable competitiveDISIBuilder and secondary sort-based iterator
romseygeek 45882c8
wip
romseygeek f72677c
Force use of competitive comparators
romseygeek 4e704f1
tests
romseygeek ecea998
Add competitive sort tests to MapperTestCase
romseygeek 842d14c
iter
romseygeek 3e1c2d2
Merge branch 'sort/base-fieldmapper-sort-tests' into bug/sort-still-slow
romseygeek b04481e
Add competitive iterator check for logsdb-style timestamp field
romseygeek 63b9329
Merge remote-tracking branch 'origin/main' into bug/sort-still-slow
romseygeek 2fd86b5
NumericComparator: immediately check whether a segment is comparative…
martijnvg f9f8191
Add indexSort method to IndexFieldData
romseygeek 08cea96
Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…
romseygeek 1eb03cd
Merge remote-tracking branch 'origin/main' into bug/sort-still-slow
romseygeek 8603d58
Update docs/changelog/137533.yaml
romseygeek 0595fff
[CI] Auto commit changes from spotless
3a933b4
spotless
romseygeek 1548a2f
Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…
romseygeek 79bf9b2
tests
romseygeek 32201c0
Merge remote-tracking branch 'origin/main' into bug/sort-still-slow
romseygeek c3b270c
[CI] Auto commit changes from spotless
dd1f994
need to add iterators back in for pruning - follow up
romseygeek 0b79c60
Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…
romseygeek 4fa6f8e
dont' advance past maxdoc
romseygeek 1249286
compilation
romseygeek 0e0e392
added logic from lucene pr
martijnvg cb9a7f5
fencepost
romseygeek e32f535
Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…
romseygeek d0a2218
only run accelerator on dense fields
romseygeek e603627
Merge remote-tracking branch 'origin/main' into bug/sort-still-slow
romseygeek 2dc921e
Revert subclassing nonsense
romseygeek 25dbeb4
Add unit test for SSI
romseygeek fe98352
Merge branch 'main' into bug/sort-still-slow
romseygeek 8c495a5
Merge branch 'main' into bug/sort-still-slow
romseygeek 7485e03
Merge remote-tracking branch 'origin/main' into bug/sort-still-slow
romseygeek 3b6d308
conflicts
romseygeek 8460a14
Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…
romseygeek File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| pr: 137533 | ||
| summary: Speed up sorts on secondary sort fields | ||
| area: Search | ||
| type: enhancement | ||
| issues: [] |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
168 changes: 168 additions & 0 deletions
168
...rc/main/java/org/elasticsearch/index/fielddata/fieldcomparator/SecondarySortIterator.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,168 @@ | ||
| /* | ||
| * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| * or more contributor license agreements. Licensed under the "Elastic License | ||
| * 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side | ||
| * Public License v 1"; you may not use this file except in compliance with, at | ||
| * your election, the "Elastic License 2.0", the "GNU Affero General Public | ||
| * License v3.0 only", or the "Server Side Public License, v 1". | ||
| */ | ||
|
|
||
| package org.elasticsearch.index.fielddata.fieldcomparator; | ||
|
|
||
| import org.apache.lucene.index.DocValuesSkipper; | ||
| import org.apache.lucene.index.NumericDocValues; | ||
| import org.apache.lucene.search.DocIdSetIterator; | ||
|
|
||
| import java.io.IOException; | ||
|
|
||
| /** | ||
| * A competitive DocIdSetIterator that examines the values of a secondary | ||
| * sort field and tries to exclude documents with values outside a given | ||
| * range, using DocValueSkippers on the primary sort field to advance rapidly | ||
| * to the next block of values. | ||
| */ | ||
| class SecondarySortIterator extends DocIdSetIterator { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe add a unit test for this iterator? Maybe we can do a test duel against |
||
|
|
||
| final NumericDocValues values; | ||
|
|
||
| final DocValuesSkipper valueSkipper; | ||
| final DocValuesSkipper primaryFieldSkipper; | ||
| final long minValue; | ||
| final long maxValue; | ||
|
|
||
| int docID = -1; | ||
| boolean skipperMatch; | ||
| int primaryFieldUpTo = -1; | ||
| int valueFieldUpTo = -1; | ||
|
|
||
| SecondarySortIterator( | ||
| NumericDocValues values, | ||
| DocValuesSkipper valueSkipper, | ||
| DocValuesSkipper primaryFieldSkipper, | ||
| long minValue, | ||
| long maxValue | ||
| ) { | ||
| this.values = values; | ||
| this.valueSkipper = valueSkipper; | ||
| this.primaryFieldSkipper = primaryFieldSkipper; | ||
| this.minValue = minValue; | ||
| this.maxValue = maxValue; | ||
|
|
||
| valueFieldUpTo = valueSkipper.maxDocID(0); | ||
| primaryFieldUpTo = primaryFieldSkipper.maxDocID(0); | ||
| } | ||
|
|
||
| @Override | ||
| public int docID() { | ||
| return docID; | ||
| } | ||
|
|
||
| @Override | ||
| public int nextDoc() throws IOException { | ||
| return advance(docID + 1); | ||
| } | ||
|
|
||
| @Override | ||
| public int advance(int target) throws IOException { | ||
| skipperMatch = false; | ||
| target = values.advance(target); | ||
| if (target == DocIdSetIterator.NO_MORE_DOCS) { | ||
| return docID = target; | ||
| } | ||
| while (true) { | ||
| if (target > valueFieldUpTo) { | ||
| valueSkipper.advance(target); | ||
| valueFieldUpTo = valueSkipper.maxDocID(0); | ||
| long minValue = valueSkipper.minValue(0); | ||
| long maxValue = valueSkipper.maxValue(0); | ||
| if (minValue > this.maxValue || maxValue < this.minValue) { | ||
| // outside the desired range, skip forward | ||
| for (int level = 1; level < valueSkipper.numLevels(); level++) { | ||
| minValue = valueSkipper.minValue(level); | ||
| maxValue = valueSkipper.maxValue(level); | ||
| if (minValue > this.maxValue || maxValue < this.minValue) { | ||
| valueFieldUpTo = valueSkipper.maxDocID(level); | ||
| } else { | ||
| break; | ||
| } | ||
| } | ||
|
|
||
| int upTo = valueFieldUpTo; | ||
| if (maxValue < this.minValue) { | ||
| // We've moved past the end of the valid values in the secondary sort field | ||
| // for this primary value. Advance the primary skipper to find the starting point | ||
| // for the next primary value, where the secondary field values will have reset | ||
| primaryFieldSkipper.advance(target); | ||
| primaryFieldUpTo = primaryFieldSkipper.maxDocID(0); | ||
| if (primaryFieldSkipper.minValue(0) == primaryFieldSkipper.maxValue(0)) { | ||
| for (int level = 1; level < primaryFieldSkipper.numLevels(); level++) { | ||
| if (primaryFieldSkipper.minValue(level) == primaryFieldSkipper.maxValue(level)) { | ||
| primaryFieldUpTo = primaryFieldSkipper.maxDocID(level); | ||
| } else { | ||
| break; | ||
| } | ||
| } | ||
| } | ||
| if (primaryFieldUpTo > upTo) { | ||
| upTo = primaryFieldUpTo; | ||
| } | ||
| } | ||
|
|
||
| target = values.advance(upTo + 1); | ||
| if (target == DocIdSetIterator.NO_MORE_DOCS) { | ||
| return docID = target; | ||
| } | ||
| } else if (minValue >= this.minValue && maxValue <= this.maxValue) { | ||
| assert valueSkipper.docCount(0) == valueSkipper.maxDocID(0) - valueSkipper.minDocID(0) + 1; | ||
| skipperMatch = true; | ||
| return docID = target; | ||
| } | ||
| } | ||
|
|
||
| long value = values.longValue(); | ||
| if (value < minValue && target > primaryFieldUpTo) { | ||
| primaryFieldSkipper.advance(target); | ||
| primaryFieldUpTo = primaryFieldSkipper.maxDocID(0); | ||
| if (primaryFieldSkipper.minValue(0) == primaryFieldSkipper.maxValue(0)) { | ||
| for (int level = 1; level < primaryFieldSkipper.numLevels(); level++) { | ||
| if (primaryFieldSkipper.minValue(level) == primaryFieldSkipper.maxValue(level)) { | ||
| primaryFieldUpTo = primaryFieldSkipper.maxDocID(level); | ||
| } else { | ||
| break; | ||
| } | ||
| } | ||
| target = values.advance(primaryFieldUpTo + 1); | ||
| if (target == DocIdSetIterator.NO_MORE_DOCS) { | ||
| return docID = target; | ||
| } | ||
| } else { | ||
| target = values.nextDoc(); | ||
| if (target == DocIdSetIterator.NO_MORE_DOCS) { | ||
| return docID = target; | ||
| } | ||
| } | ||
| } else if (value >= minValue && value <= maxValue) { | ||
| return docID = target; | ||
| } else { | ||
| target = values.nextDoc(); | ||
| if (target == DocIdSetIterator.NO_MORE_DOCS) { | ||
| return docID = target; | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| @Override | ||
| public int docIDRunEnd() throws IOException { | ||
| if (skipperMatch) { | ||
| return valueFieldUpTo + 1; | ||
| } | ||
| return super.docIDRunEnd(); | ||
| } | ||
|
|
||
| @Override | ||
| public long cost() { | ||
| return values.cost(); | ||
| } | ||
|
|
||
| } | ||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case
primaryFieldSkipperis null, then the secondary sort field can be treated as primary sort and we can go faster (like inSortedNumericDocValuesRangeQuery#getDocIdSetIteratorOrNullForPrimarySort). But I don't think this happens now? Becausesuper.buildCompetitiveDISIBuilder(context);will not detect this?The same applies if primary sort field has just one value.
If this is true. Let's address this then in a followup?