tp: add IndexedFilterIn bytecode for In on indexed columns #5158
Open
LalitMaganti wants to merge 22 commits into main
Force-pushed: f19f7dd to ad716eb
Add SetFilterValueListUnchecked to TypedCursor, allowing callers to pass a pointer+size array of FilterValue for In filters without allocation. Plumb this through the codegen'd ConstCursor/Cursor.

Optimize the In bytecode by pre-building lookup structures during CastFilterValueList instead of rebuilding them on every Execute():
- For dense Id/Uint32: BitVector (built once, not per-call)
- For large sparse integer/string lists: FlatHashMapV2 for O(1) lookup
- For small lists (<=16): linear scan (cache-friendly)

The lookup is stored as a variant in CastFilterValueListResult, replacing the separate value_list field.

Migrate experimental_slice_layout to use an In filter on track_id.
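The selection between the three lookup structures can be sketched as follows. This is a minimal illustration, not the code in this PR: `std::vector<bool>` and `std::unordered_set` stand in for BitVector and FlatHashMapV2, the names `InLookup`, `BuildInLookup`, `Contains` and the density heuristic `kDenseFactor` are hypothetical, and the order in which the three cases are checked is an assumption. Only the `<=16` threshold comes from the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the three lookup kinds.
struct LinearList { std::vector<uint32_t> values; };        // small lists
struct DenseBits { std::vector<bool> bits; };               // BitVector stand-in
struct HashedSet { std::unordered_set<uint32_t> values; };  // FlatHashMapV2 stand-in
using InLookup = std::variant<LinearList, DenseBits, HashedSet>;

constexpr std::size_t kLinearScanMax = 16;  // Threshold from the commit message.
constexpr uint32_t kDenseFactor = 64;       // Assumed density heuristic.

// Built once when the filter value list is cast; reused by every probe.
inline InLookup BuildInLookup(const std::vector<uint32_t>& values) {
  if (values.size() <= kLinearScanMax) return LinearList{values};
  const uint32_t max = *std::max_element(values.begin(), values.end());
  if (max / kDenseFactor <= values.size()) {
    // Dense enough: one bit per possible value up to max.
    DenseBits dense;
    dense.bits.assign(static_cast<std::size_t>(max) + 1, false);
    for (uint32_t v : values) dense.bits[v] = true;
    return dense;
  }
  return HashedSet{std::unordered_set<uint32_t>(values.begin(), values.end())};
}

// Membership test that dispatches on whichever structure was built.
inline bool Contains(const InLookup& lookup, uint32_t v) {
  if (const auto* list = std::get_if<LinearList>(&lookup)) {
    return std::find(list->values.begin(), list->values.end(), v) !=
           list->values.end();
  }
  if (const auto* dense = std::get_if<DenseBits>(&lookup)) {
    return v < dense->bits.size() && dense->bits[v];
  }
  return std::get<HashedSet>(lookup).values.count(v) != 0;
}
```

Because the variant is built once, the per-row cost during Execute() is only the `Contains` probe.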
Force-pushed: 0f90ef9 to 2ed402d
When a column has an index and the query uses an In filter, the planner now emits IndexedFilterIn instead of the generic In bytecode. For each value in the list, IndexedFilterIn binary-searches the index permutation vector (O(log N) per value) and concatenates the matching ranges. This reduces In filter cost from O(N) to O(k log N + matches), where k is the number of values and N is the table size.
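The per-value binary search can be sketched as below. This is an illustration under stated assumptions rather than the bytecode itself: `IndexedFilterInSketch` is a hypothetical name, the column is a plain `std::vector<int64_t>`, `index` is a permutation of row numbers sorted by column value, and the value list is assumed to contain no duplicates (duplicates would emit the same rows twice).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Returns the rows whose column value appears in `values`.
// Cost: two binary searches per value plus a copy of the matches,
// i.e. O(k log N + matches).
inline std::vector<uint32_t> IndexedFilterInSketch(
    const std::vector<int64_t>& column,
    const std::vector<uint32_t>& index,
    const std::vector<int64_t>& values) {
  std::vector<uint32_t> out;
  for (int64_t v : values) {
    // First index entry whose column value is >= v.
    auto lo = std::lower_bound(
        index.begin(), index.end(), v,
        [&](uint32_t row, int64_t val) { return column[row] < val; });
    // First index entry after `lo` whose column value is > v.
    auto hi = std::upper_bound(
        lo, index.end(), v,
        [&](int64_t val, uint32_t row) { return val < column[row]; });
    // [lo, hi) is the contiguous run of rows equal to v; append it.
    out.insert(out.end(), lo, hi);
  }
  return out;
}
```

Note that the output follows the order of the value list and, within each value, the order of the index; it is not sorted by row number.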
Force-pushed: ad716eb to c4641da
PrefixPopcount was emitted after IndexedFilterEq/In because alloc_popcount() was called inside the AddOpcode block. For SparseNull columns, this meant the popcount register was uninitialized when the filter executed, causing a SIGSEGV on LEFT JOINs over indexed columns. Move alloc_popcount() before AddOpcode so PrefixPopcount is emitted first.
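To see why the filter cannot run before the popcount is built, the sketch below shows one common way a sparse-null column maps a row number to a storage slot. It assumes that only non-null values are stored and that a per-word prefix popcount over the non-null bitmap is used for the translation; the function names are hypothetical and this is not the PR's implementation.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// prefix[i] = number of set (non-null) bits in words [0, i).
inline std::vector<uint32_t> BuildPrefixPopcount(
    const std::vector<uint64_t>& non_null_words) {
  std::vector<uint32_t> prefix(non_null_words.size());
  uint32_t running = 0;
  for (std::size_t i = 0; i < non_null_words.size(); ++i) {
    prefix[i] = running;
    running += static_cast<uint32_t>(std::bitset<64>(non_null_words[i]).count());
  }
  return prefix;
}

// Storage slot of a non-null row: set bits in earlier words plus set
// bits below `row` within its own word.
inline uint32_t StorageIndex(const std::vector<uint64_t>& non_null_words,
                             const std::vector<uint32_t>& prefix,
                             uint32_t row) {
  const uint64_t word = non_null_words[row / 64];
  const uint64_t below = word & ((uint64_t{1} << (row % 64)) - 1);
  return prefix[row / 64] +
         static_cast<uint32_t>(std::bitset<64>(below).count());
}
```

If `prefix` has not been populated when `StorageIndex` runs, the lookup reads invalid memory, which matches the failure mode described above.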
IndexedFilterIn wrote results back into the source register's memory via memcpy, but the source register points directly to the persistent index permutation vector. This corrupted the index for all subsequent queries on the same table.

Fix: allocate a separate slab+span pair (via AllocateIndices) for the dest register, following the existing EnsureIndicesAreInSlab pattern. IndexedFilterIn now writes into this pre-allocated buffer instead of the source. The dest_register is changed to RwHandle since we need to both read the pre-allocated span and write back the adjusted boundaries.

Added a regression test that verifies an Eq query on the same index returns correct results after an IN query has executed.
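The shape of the fix can be sketched as follows: matching ranges are copied out of the read-only index into a separately allocated buffer, and only the destination span's end is adjusted. `Span` and `GatherRanges` are hypothetical names for illustration; the real code uses the slab+span pair from AllocateIndices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical non-owning view over a pre-allocated output buffer.
struct Span {
  uint32_t* begin;
  uint32_t* end;
};

// Copies each [first, second) range of `index` into `dest`, then shrinks
// dest.end to the number of elements written. `index` is never modified.
inline void GatherRanges(
    const uint32_t* index,
    const std::vector<std::pair<std::size_t, std::size_t>>& ranges,
    Span& dest) {
  uint32_t* out = dest.begin;
  for (const auto& r : ranges) {
    // The pre-allocated buffer must be large enough for every match.
    assert(out + (r.second - r.first) <= dest.end);
    out = std::copy(index + r.first, index + r.second, out);
  }
  dest.end = out;
}
```

Reading the pre-allocated bounds from `dest` and then writing back the adjusted `end` is why the destination needs read-write access in this sketch.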
Summary
- Add IndexedFilterIn bytecode that uses binary search on index permutation vectors for In filters
- When a column has an index and the query uses In, the planner now emits IndexedFilterIn instead of the generic In bytecode
- Reduces In filter cost from O(N) to O(k log N + matches), where k is the number of values
Test plan
- IndexedFilterIn_Uint32_NonNull_MultipleValues, _NoMatch, _SingleValue, _String_SparseNull_MultipleValues
- PlanQuery_SingleColIndex_InFilter_NonNullInt
- TypedCursorInFilterWithIndex