Slightly improve TrackingPostingsInMemoryBytesCodec #132905
Replace the int hash map with a counter in TrackingPostingsInMemoryBytesCodec and use an int hash set to keep track of seen fields.
Pinging @elastic/es-storage-engine (Team:StorageEngine)
jordan-powers
left a comment
LGTM, thanks Martijn!
// As far as I know only when bloom filter for _id filter gets written this method gets invoked twice for the same field.
// So maybe we can get rid of the seenFields here? And just keep track of whether _id field has been seen?
return terms;
I think it's difficult to guarantee that we'll only ever invoke this method twice for _id fields. I think only checking the _id field would potentially open us up to some subtle bugs in the future.
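To illustrate the concern, here is a minimal sketch of a general seen-fields guard combined with a single running counter. It is not the actual Elasticsearch code: the class `SeenFieldsSketch`, the `track` method, and the byte values are hypothetical, and `java.util.HashSet` stands in for `IntHashSet`. The point is that a guard keyed on the field number keeps working if some field other than _id is ever visited twice.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the tracking logic; not the real codec class. */
public class SeenFieldsSketch {
    // Stand-in for IntHashSet: remembers which field numbers were already counted.
    private final Set<Integer> seenFields = new HashSet<>();
    // A single running total instead of a per-field map.
    private long totalBytes;

    /** Adds the bytes for a field only the first time its field number is seen. */
    public void track(int fieldNumber, long bytes) {
        if (seenFields.add(fieldNumber)) {
            totalBytes += bytes;
        }
    }

    public long totalBytes() {
        return totalBytes;
    }

    public static void main(String[] args) {
        SeenFieldsSketch tracker = new SeenFieldsSketch();
        tracker.track(0, 100); // assume field 0 is _id, first visit
        tracker.track(0, 100); // second visit for the same field: ignored
        tracker.track(3, 40);  // some other field
        tracker.track(3, 40);  // hypothetical future double visit: also ignored
        System.out.println(tracker.totalBytes()); // prints 140
    }
}
```

A guard that only remembered whether _id had been seen would count field 3 twice in this scenario and report 180.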
// Alternatively, we can consider using a FixedBitSet here and size to max(fieldNumber).
// This should be faster without worrying too much about memory usage.
this.seenFields = new IntHashSet(state.fieldInfos.size());
I agree it's probably better to prioritize speed over memory efficiency here.
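The bit set alternative could look roughly like the sketch below. It uses `java.util.BitSet` as a stand-in for Lucene's `FixedBitSet`, and the class `BitSetSeenFieldsSketch` and its `markSeen` method are hypothetical names chosen for illustration. The set is sized to the largest field number plus one, so a membership check is a single bit lookup with no hashing.

```java
import java.util.BitSet;

/** Hypothetical sketch of a bit-set-backed seen-fields tracker. */
public class BitSetSeenFieldsSketch {
    private final BitSet seen;

    /** Sizes the bit set to max(fieldNumber) + 1 so every field number has a slot. */
    public BitSetSeenFieldsSketch(int[] fieldNumbers) {
        int max = -1;
        for (int fieldNumber : fieldNumbers) {
            max = Math.max(max, fieldNumber);
        }
        this.seen = new BitSet(max + 1);
    }

    /** Returns true only the first time a given field number is marked. */
    public boolean markSeen(int fieldNumber) {
        if (seen.get(fieldNumber)) {
            return false;
        }
        seen.set(fieldNumber);
        return true;
    }

    public static void main(String[] args) {
        BitSetSeenFieldsSketch tracker = new BitSetSeenFieldsSketch(new int[] {0, 3, 7});
        System.out.println(tracker.markSeen(7)); // prints true
        System.out.println(tracker.markSeen(7)); // prints false
    }
}
```

The memory cost is one bit per possible field number up to the maximum, even when field numbers are sparse, which is the trade-off discussed above.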
Replace int hashmap by a counter in TrackingPostingsInMemoryBytesCodec and use int hashset to keep track of seen fields.