Move HitQueue in TopScoreDocCollector to a LongHeap #14714
Merged
Changes from 25 commits
28 commits (all authored by gf2121):
7f0ed2a  Speed up term query
487bbc8  iter
6ad87d1  iter
b55336a  iter
ab6773d  tidy
b53ef5f  fix
8ec9930  feedback iter
8b25eb3  fix
1af4d1b  iter
6c7c2eb  iter
ac598df  iter
212d73d  Merge branch 'opt_term_query' into int_score
505e0ab  fix
8d129ce  iter
dea53e8  iter
de56623  license
d1ac4b1  Revert "Merge branch 'opt_term_query' into int_score"
270f63e  Merge remote-tracking branch 'origin/main' into int_score
5f94725  minimum override
b729ea7  iter
776e6e8  CHANGES
e650ac4  simplify
2ed31cd  reveiew iter
4b4878b  fix doc
8f1abc6  fix
784058d  review iter
a1d7699  Merge branch 'main' into int_score
ee1b72d  Merge branch 'main' into int_score
42 changes: 42 additions & 0 deletions
lucene/core/src/java/org/apache/lucene/search/DocScoreEncoder.java
@@ -0,0 +1,42 @@

    /*
     * Licensed to the Apache Software Foundation (ASF) under one or more
     * contributor license agreements. See the NOTICE file distributed with
     * this work for additional information regarding copyright ownership.
     * The ASF licenses this file to You under the Apache License, Version 2.0
     * (the "License"); you may not use this file except in compliance with
     * the License. You may obtain a copy of the License at
     *
     * http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */

    package org.apache.lucene.search;

    import org.apache.lucene.util.NumericUtils;

    /**
     * An encoder to encode a (doc, score) pair as a long whose sort order is the same as {@code
     * ((Comparator<ScoreDoc>) (o1, o2) -> Float.compare(o1.score, o2.score))
     * .thenComparing(Comparator.comparingInt((ScoreDoc o) -> o.doc).reversed())}
     */
    class DocScoreEncoder {

      // Lowest possible score paired with the largest doc ID: no encoded hit compares below it.
      static final long LEAST_COMPETITIVE_CODE = encode(Integer.MAX_VALUE, Float.NEGATIVE_INFINITY);

      static long encode(int docId, float score) {
        // High 32 bits: sortable score. Low 32 bits: Integer.MAX_VALUE - docId, so that among
        // equal scores the smaller doc ID yields the larger code.
        return (((long) NumericUtils.floatToSortableInt(score)) << 32) | (Integer.MAX_VALUE - docId);
      }

      static float toScore(long value) {
        return NumericUtils.sortableIntToFloat((int) (value >>> 32));
      }

      static int docId(long value) {
        return Integer.MAX_VALUE - ((int) value);
      }
    }
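To make the ordering concrete, here is a minimal, self-contained sketch of the same encoding. It assumes the Lucene helper `NumericUtils.floatToSortableInt` behaves like the local `floatToSortableInt` below (a sign-dependent flip of the lower 31 bits); the class name `DocScoreSketch` is hypothetical and exists only for this illustration.

```java
import java.util.Arrays;

/** Illustrative stand-in for the PR's DocScoreEncoder; not the Lucene class itself. */
final class DocScoreSketch {

  // Assumed equivalent of NumericUtils.floatToSortableInt: for negative floats, flip the
  // lower 31 bits so that signed int order matches Float.compare order.
  static int floatToSortableInt(float value) {
    int bits = Float.floatToIntBits(value);
    return bits ^ ((bits >> 31) & 0x7fffffff);
  }

  // The transformation is its own inverse, so decoding applies the same flip.
  static float sortableIntToFloat(int encoded) {
    return Float.intBitsToFloat(encoded ^ ((encoded >> 31) & 0x7fffffff));
  }

  // Score in the high 32 bits, inverted doc ID in the low 32 bits.
  static long encode(int docId, float score) {
    return (((long) floatToSortableInt(score)) << 32) | (Integer.MAX_VALUE - docId);
  }

  static float toScore(long code) {
    return sortableIntToFloat((int) (code >>> 32));
  }

  static int docId(long code) {
    return Integer.MAX_VALUE - (int) code;
  }

  public static void main(String[] args) {
    long[] codes = {encode(7, 2.5f), encode(3, 1.0f), encode(9, 2.5f), encode(1, -4.0f)};
    // Plain long order: lowest score first; among equal scores, the higher doc ID first.
    Arrays.sort(codes);
    for (long code : codes) {
      System.out.println("doc=" + docId(code) + " score=" + toScore(code));
    }
  }
}
```

Sorting the four codes as plain longs lists doc 1 (score -4.0), doc 3 (score 1.0), doc 9 (score 2.5), then doc 7 (score 2.5): the least competitive hit comes first, and a tie on score is broken in favor of the smaller doc ID.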
70 changes: 70 additions & 0 deletions
lucene/core/src/test/org/apache/lucene/search/TestDocScoreEncoder.java
@@ -0,0 +1,70 @@

    /*
     * Licensed to the Apache Software Foundation (ASF) under one or more
     * contributor license agreements. See the NOTICE file distributed with
     * this work for additional information regarding copyright ownership.
     * The ASF licenses this file to You under the Apache License, Version 2.0
     * (the "License"); you may not use this file except in compliance with
     * the License. You may obtain a copy of the License at
     *
     * http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */

    package org.apache.lucene.search;

    import org.apache.lucene.tests.util.LuceneTestCase;

    public class TestDocScoreEncoder extends LuceneTestCase {

      public void testRandom() {
        for (int i = 0; i < 1000; i++) {
          doAssert(
              Float.intBitsToFloat(random().nextInt()),
              random().nextInt(Integer.MAX_VALUE),
              Float.intBitsToFloat(random().nextInt()),
              random().nextInt(Integer.MAX_VALUE));
        }
      }

      public void testSameDoc() {
        for (int i = 0; i < 1000; i++) {
          doAssert(
              Float.intBitsToFloat(random().nextInt()), 1, Float.intBitsToFloat(random().nextInt()), 1);
        }
      }

      public void testSameScore() {
        for (int i = 0; i < 1000; i++) {
          doAssert(1f, random().nextInt(Integer.MAX_VALUE), 1f, random().nextInt(Integer.MAX_VALUE));
        }
      }

      private void doAssert(float score1, int doc1, float score2, int doc2) {
        if (Float.isNaN(score1) || Float.isNaN(score2)) {
          return;
        }

        long code1 = DocScoreEncoder.encode(doc1, score1);
        long code2 = DocScoreEncoder.encode(doc2, score2);

        assertEquals(doc1, DocScoreEncoder.docId(code1));
        assertEquals(doc2, DocScoreEncoder.docId(code2));
        assertEquals(score1, DocScoreEncoder.toScore(code1), 0f);
        assertEquals(score2, DocScoreEncoder.toScore(code2), 0f);

        if (score1 < score2) {
          assertTrue(code1 < code2);
        } else if (score1 > score2) {
          assertTrue(code1 > code2);
        } else if (doc1 == doc2) {
          assertEquals(code1, code2);
        } else {
          assertEquals(code1 > code2, doc1 < doc2);
        }
      }
    }
Nit: a for loop would be a bit more readable?
Maybe in a follow-up we can add a new ctor parameter to `LongHeap` so that it accepts an initial value.
The old ctor used to create an empty heap so I added a new ctor.
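
The idea discussed in this thread, a heap that starts full of an initial value rather than empty, can be sketched as follows. This is only an illustration under stated assumptions: `SentinelLongHeap` and its methods `top`, `updateTop`, and `insertWithOverflow` are hypothetical names, not the API of Lucene's `LongHeap`, and `Long.MIN_VALUE` stands in for a sentinel such as `LEAST_COMPETITIVE_CODE`.

```java
import java.util.Arrays;

/** Hypothetical fixed-size min-heap of longs; a sketch, not Lucene's LongHeap. */
final class SentinelLongHeap {
  private final long[] heap; // 0-based binary heap; heap[0] is always the smallest value

  /** Creates a full heap in which every slot holds initialValue (size must be at least 1). */
  SentinelLongHeap(int size, long initialValue) {
    heap = new long[size];
    Arrays.fill(heap, initialValue); // an array of equal values is already a valid heap
  }

  long top() {
    return heap[0];
  }

  /** Replaces the smallest value with the given one and sifts it down to restore heap order. */
  long updateTop(long value) {
    int i = 0;
    while (true) {
      int child = 2 * i + 1;
      if (child >= heap.length) {
        break;
      }
      if (child + 1 < heap.length && heap[child + 1] < heap[child]) {
        child++; // pick the smaller of the two children
      }
      if (heap[child] >= value) {
        break;
      }
      heap[i] = heap[child];
      i = child;
    }
    heap[i] = value;
    return heap[0];
  }

  /** Keeps the value only if it is greater than the current smallest entry. */
  boolean insertWithOverflow(long value) {
    if (value <= heap[0]) {
      return false;
    }
    updateTop(value);
    return true;
  }

  public static void main(String[] args) {
    // Keep the 3 largest values; the sentinel slots are displaced by the first real values.
    SentinelLongHeap topThree = new SentinelLongHeap(3, Long.MIN_VALUE);
    for (long v : new long[] {5, 1, 9, 7, 3}) {
      topThree.insertWithOverflow(v);
    }
    System.out.println(topThree.top()); // smallest of the retained values 5, 7, 9
  }
}
```

Because the heap is full from the start, the collecting loop needs no "is the heap full yet" branch: every candidate is simply compared against the current top, and sentinel entries lose to any real value.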