[Feature Request] Multi-Field DocValues skip coordination #20666

@sgup432

Description

Is your feature request related to a problem? Please describe

Lucene's DocValues fields currently maintain a multi-level skip list (DocValuesSkipper) that stores min/max value statistics for fixed-size blocks of documents (4,096 docs per block by default, grouped into 4 levels with 8x fanout). Range queries use this skip list to classify each block as YES (all docs match), NO (no docs match), or MAYBE (per-doc evaluation needed), which avoids expensive per-document value reads for YES and NO blocks.
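To make the YES/NO/MAYBE classification concrete, here is a minimal sketch of the per-block decision. It is a simplified model, not Lucene's actual API: `BlockClassifier`, `Match`, and `classify` are hypothetical names, and it assumes every document in the block has a value for the field.

```java
/** Hypothetical, simplified model of block classification; not Lucene's API. */
final class BlockClassifier {
    enum Match { YES, NO, MAYBE }

    /**
     * Classifies one block from its min/max statistics against the inclusive
     * query range [lower, upper]. Assumes every doc in the block has a value.
     */
    static Match classify(long blockMin, long blockMax, long lower, long upper) {
        if (blockMax < lower || blockMin > upper) {
            return Match.NO;    // ranges are disjoint: no doc in the block can match
        }
        if (blockMin >= lower && blockMax <= upper) {
            return Match.YES;   // block range lies inside the query range: all docs match
        }
        return Match.MAYBE;     // partial overlap: per-doc value reads are needed
    }
}
```

For example, with a query range of [10, 50], a block whose values span [60, 90] would be NO, one spanning [20, 40] would be YES, and one spanning [5, 30] would be MAYBE.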

Today, when a query ANDs multiple range filters on different fields (e.g., price:[10,50] AND rating:[4,5] AND date:[2024-01-01, 2024-12-31]), each field's skip list is evaluated independently. If the price skip list determines that a block has no matching documents, the rating and date skip lists still read and evaluate their metadata for that same block, only for the conjunction (ConjunctionDISI) to discard it later at the document level anyway.

Describe the solution you'd like

This proposal coordinates skip list evaluation across fields. All fields' skip metadata is checked together at the block level, short-circuiting on the first field that says NO. When one field eliminates a block, the other fields skip it entirely: no skip reads and no per-doc evaluation.

This only applies when a query has two or more numeric range filters on different fields.

One way would be to rewrite the query so that multiple range queries are wrapped and their skip evaluation is coordinated via a MultiFieldDocValuesRangeIterator. This iterator would wrap the per-field skip iterators and advance them together, i.e., when the lead field skips past a block, all other fields jump to the same position without reading their skip data for the skipped blocks.
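A rough sketch of the coordination idea, under simplifying assumptions: it uses a flat, single-level list of blocks rather than Lucene's multi-level DocValuesSkipper, and every name in it (`CoordinatedSkipSketch`, `FieldRange`, `rejects`, `candidateBlocks`, `metadataReads`) is hypothetical. It is not the proposed MultiFieldDocValuesRangeIterator, only an illustration of short-circuiting on the first field that says NO.

```java
/** Hypothetical flat model of coordinated block skipping; not Lucene's API. */
final class CoordinatedSkipSketch {

    /** Per-block min/max statistics for one field, plus its inclusive query range. */
    static final class FieldRange {
        final long[] blockMin;
        final long[] blockMax;
        final long lower;
        final long upper;
        int metadataReads = 0; // counts how often this field's block metadata is read

        FieldRange(long[] blockMin, long[] blockMax, long lower, long upper) {
            this.blockMin = blockMin;
            this.blockMax = blockMax;
            this.lower = lower;
            this.upper = upper;
        }

        /** Reads this field's metadata for one block; true if the block cannot match. */
        boolean rejects(int block) {
            metadataReads++;
            return blockMax[block] < lower || blockMin[block] > upper;
        }
    }

    /**
     * Marks the blocks that no field rejects. Fields are consulted in order and
     * evaluation stops at the first field that says NO, so later fields never
     * read their metadata for that block.
     */
    static boolean[] candidateBlocks(FieldRange[] fields, int blockCount) {
        boolean[] candidates = new boolean[blockCount];
        for (int block = 0; block < blockCount; block++) {
            boolean rejected = false;
            for (FieldRange field : fields) {
                if (field.rejects(block)) {
                    rejected = true;
                    break; // short-circuit: remaining fields skip this block
                }
            }
            candidates[block] = !rejected;
        }
        return candidates;
    }
}
```

In this model, if the first field rejects two of three blocks, the second field reads metadata only for the one surviving block. Independent evaluation would have read all three. Putting the most selective field first would presumably maximize the savings, though how the lead field is chosen is left open here.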

Related component

Search:Performance

Describe alternatives you've considered

None at this point

Additional context

No response
