Summary:
X-link: facebookincubator/nimble#480
This diff implements stripe-level batched index read support for Nimble files in Velox. The key motivation is to improve index lookup performance by processing multiple lookup requests in batches at the stripe level, rather than processing each request individually.
**New `SelectiveNimbleIndexReader` class**: A format-specific index reader that handles:
- Encoding index bounds into Nimble-specific encoded keys
- Looking up stripes and row ranges using the tablet index
- Managing stripe iteration and data reading with batched processing
- Returning results in request order via an iterator pattern (`startLookup`/`hasNext`/`next`)
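
The iterator pattern named above can be illustrated with a minimal sketch. Only the method names `startLookup`, `hasNext`, and `next` come from this summary; the class `SketchIndexReader`, the `KeyRange` and `RowRange` structs, the signatures, and the use of a sorted in-memory key vector in place of the tablet index are all assumptions made for illustration, not the Velox or Nimble API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the Velox/Nimble types.
struct KeyRange {
  int lower; // inclusive lower key bound
  int upper; // inclusive upper key bound
};

struct RowRange {
  std::size_t begin; // first matching row
  std::size_t end;   // one past the last matching row
};

class SketchIndexReader {
 public:
  // Assumption: a sorted key vector plays the role of the index.
  explicit SketchIndexReader(std::vector<int> sortedKeys)
      : keys_(std::move(sortedKeys)) {}

  // Starts a batch of lookups; results come back in request order.
  void startLookup(std::vector<KeyRange> requests) {
    requests_ = std::move(requests);
    next_ = 0;
  }

  bool hasNext() const {
    return next_ < requests_.size();
  }

  // Returns the row range for the next request in the batch.
  RowRange next() {
    assert(hasNext());
    const KeyRange& request = requests_[next_++];
    auto lo = std::lower_bound(keys_.begin(), keys_.end(), request.lower);
    auto hi = std::upper_bound(keys_.begin(), keys_.end(), request.upper);
    if (hi < lo) {
      hi = lo; // inverted bounds yield an empty range
    }
    return RowRange{
        static_cast<std::size_t>(lo - keys_.begin()),
        static_cast<std::size_t>(hi - keys_.begin())};
  }

 private:
  std::vector<int> keys_;
  std::vector<KeyRange> requests_;
  std::size_t next_ = 0;
};
```

A caller would invoke `startLookup` once per batch and then loop on `hasNext()`/`next()`, so the i-th result always corresponds to the i-th request.
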
**Batched stripe processing**: Instead of loading stripes once per request, the reader:
- Maps all lookup requests to their matching stripes upfront
- Merges overlapping row ranges within stripes for efficient reading
- Tracks output references with ref-counting to share read data across requests
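
A rough sketch of the batching idea follows: requests are grouped by stripe up front, each stripe is read once, and the read data is shared across requests through reference counting. The names `StripeInfo`, `Request`, `mapRequestsToStripes`, and `processBatch` are hypothetical, and `std::shared_ptr` stands in for whatever ref-counting mechanism the diff actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical types for illustration; not the Velox/Nimble types.
struct StripeInfo {
  int minKey; // smallest key in the stripe (inclusive)
  int maxKey; // largest key in the stripe (inclusive)
};

struct Request {
  int lower; // inclusive lower key bound
  int upper; // inclusive upper key bound
};

using StripeData = std::vector<int>; // placeholder for decoded stripe data

// Maps each stripe index to the requests whose key range intersects it.
std::map<std::size_t, std::vector<std::size_t>> mapRequestsToStripes(
    const std::vector<StripeInfo>& stripes,
    const std::vector<Request>& requests) {
  std::map<std::size_t, std::vector<std::size_t>> stripeToRequests;
  for (std::size_t r = 0; r < requests.size(); ++r) {
    for (std::size_t s = 0; s < stripes.size(); ++s) {
      const bool overlaps = requests[r].lower <= stripes[s].maxKey &&
          stripes[s].minKey <= requests[r].upper;
      if (overlaps) {
        stripeToRequests[s].push_back(r);
      }
    }
  }
  return stripeToRequests;
}

struct BatchResult {
  std::size_t stripeLoads = 0;
  // For each request, shared handles to the stripe data it reads from.
  std::vector<std::vector<std::shared_ptr<const StripeData>>> perRequest;
};

// Reads each needed stripe once and shares the result across requests.
BatchResult processBatch(
    const std::vector<StripeInfo>& stripes,
    const std::vector<Request>& requests) {
  BatchResult out;
  out.perRequest.resize(requests.size());
  const auto stripeToRequests = mapRequestsToStripes(stripes, requests);
  for (const auto& entry : stripeToRequests) {
    const StripeInfo& info = stripes[entry.first];
    // Placeholder "read": real code would decode the stripe here.
    std::shared_ptr<const StripeData> data =
        std::make_shared<StripeData>(StripeData{info.minKey, info.maxKey});
    ++out.stripeLoads;
    for (std::size_t r : entry.second) {
      assert(r < out.perRequest.size());
      out.perRequest[r].push_back(data); // bumps the reference count
    }
  }
  return out;
}
```

In this sketch a stripe's data is released once the last request holding a handle to it is destroyed, which is the behavior the ref-counting bullet describes at a high level.
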
**Optimized row range handling**:
- Without filters: Merges overlapping row ranges into a single read, from which each request extracts its own portion
- With filters: Splits overlapping ranges into non-overlapping segments to preserve filter semantics
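
The two strategies can be sketched on half-open row ranges. This is an illustrative sketch only: `RowRange`, `mergeOverlapping`, and `splitIntoSegments` are hypothetical names, and the code assumes nothing about how the diff represents ranges. One plausible reason for the split is that a filter drops rows, so a request can no longer slice its portion out of a merged read by row offset; reading each non-overlapping segment separately lets a request concatenate exactly the segments it covers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical half-open row range [begin, end); not the Velox type.
struct RowRange {
  std::size_t begin;
  std::size_t end;
};

// No-filter case: coalesce strictly overlapping ranges into larger reads.
std::vector<RowRange> mergeOverlapping(std::vector<RowRange> ranges) {
  std::sort(
      ranges.begin(), ranges.end(), [](const RowRange& a, const RowRange& b) {
        return a.begin < b.begin;
      });
  std::vector<RowRange> merged;
  for (const RowRange& r : ranges) {
    assert(r.begin <= r.end);
    if (r.begin == r.end) {
      continue; // skip empty ranges
    }
    if (!merged.empty() && r.begin < merged.back().end) {
      merged.back().end = std::max(merged.back().end, r.end);
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}

// Filter case: cut at every range boundary so that no two output segments
// overlap, keeping only segments covered by at least one input range.
std::vector<RowRange> splitIntoSegments(const std::vector<RowRange>& ranges) {
  std::vector<std::size_t> cuts;
  for (const RowRange& r : ranges) {
    if (r.begin < r.end) {
      cuts.push_back(r.begin);
      cuts.push_back(r.end);
    }
  }
  std::sort(cuts.begin(), cuts.end());
  cuts.erase(std::unique(cuts.begin(), cuts.end()), cuts.end());

  std::vector<RowRange> segments;
  for (std::size_t i = 0; i + 1 < cuts.size(); ++i) {
    const RowRange seg{cuts[i], cuts[i + 1]};
    const bool covered = std::any_of(
        ranges.begin(), ranges.end(), [&seg](const RowRange& r) {
          return r.begin <= seg.begin && seg.end <= r.end;
        });
    if (covered) {
      segments.push_back(seg);
    }
  }
  return segments;
}
```

For the input ranges [0, 10), [5, 15), and [20, 30), merging yields [0, 15) and [20, 30), while splitting yields [0, 5), [5, 10), [10, 15), and [20, 30).
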
**HiveIndexReader refactoring**: Simplified to focus on index bounds creation and result assembly, delegating control logic to format-specific readers.
**KeyEncoder enhancements**: Added support for encoding index bounds with constant values for more efficient range queries.
**New runtime stats**: Added metrics for tracking index lookup performance:
- `kNumIndexLookupRequests`: Total lookup requests
- `kNumIndexLookupStripes`: Number of stripes accessed
- `kNumIndexLookupReadSegments`: Number of read segments processed
Differential Revision: D92848948