feat: DH-21522: allow column region optimizations in Predicate Pushdown filtering#7666
feat: DH-21522: allow column region optimizations in Predicate Pushdown filtering#7666lbooker42 wants to merge 26 commits intodeephaven:mainfrom
Conversation
…delegate to the ColumnRegion for predicate pushdown.
No docs changes detected for 62c13dc |
There was a problem hiding this comment.
Pull request overview
This PR refactors predicate pushdown filtering for regioned column sources to support both table-location and per-column-region optimizations, enabling more granular (region-level) pushdown planning and execution.
Changes:
- Introduces a new
RegionedPushdownActionmodel (Location vs Region actions) and new regioned pushdown filter context types. - Updates regioned pushdown execution to run per-region and merge results, enabling column-region pushdown participation.
- Refactors
ParquetTableLocationpushdown logic from an internal enum to the new action-based API and updates the engine interfaces accordingly.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| extensions/parquet/table/src/test/java/io/deephaven/parquet/table/location/ParquetTableLocationTest.java | Removes the prior unit test that validated pushdown mode cost ordering. |
| extensions/parquet/table/src/main/java/io/deephaven/parquet/table/location/ParquetTableLocation.java | Implements location-level supported actions and action contexts for parquet pushdown planning/execution. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownHelper.java | Adds shared utilities for region-thread context and combining per-region pushdown results. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownFilterMatcher.java | Introduces the regioned action-based pushdown matcher API. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownFilterContext.java | Adds a regioned pushdown context carrying column definitions + rename mappings. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownFilterLocationContext.java | Extends the regioned context with access to the current table location. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownAction.java | Defines the new pushdown action abstraction (Location/Region) and related context interfaces. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceManager.java | Refactors manager-level pushdown scheduling/merging and exposes internals needed for region pushdown. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceBase.java | Refactors regioned column source pushdown to operate directly on regions with location-aware contexts. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/GenericColumnRegionBase.java | Adds default region-level pushdown orchestration combining region + location actions by cost. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/ColumnRegion.java | Makes column regions pushdown-capable via RegionedPushdownFilterMatcher. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/UnionSourceManager.java | Updates to use the new BasePushdownFilterContext#filter() accessor. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/locations/impl/AbstractTableLocation.java | Adds default action-based pushdown planning/execution for table locations. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/locations/TableLocation.java | Switches TableLocation to the new RegionedPushdownFilterMatcher API. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/PushdownResult.java | Adds a new cost constant for region-level single-value optimizations. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/PushdownFilterMatcher.java | Provides default implementations for pushdown matcher methods. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/BasePushdownFilterContext.java | Makes the base context abstract and introduces filter() accessor (encapsulation change). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
...le/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownAction.java
Outdated
Show resolved
Hide resolved
...ns/parquet/table/src/main/java/io/deephaven/parquet/table/location/ParquetTableLocation.java
Show resolved
Hide resolved
...ns/parquet/table/src/main/java/io/deephaven/parquet/table/location/ParquetTableLocation.java
Show resolved
Hide resolved
...ns/parquet/table/src/main/java/io/deephaven/parquet/table/location/ParquetTableLocation.java
Show resolved
Hide resolved
...c/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceManager.java
Show resolved
Hide resolved
...c/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceManager.java
Outdated
Show resolved
Hide resolved
.../src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceBase.java
Outdated
Show resolved
Hide resolved
…nderneath Unioned table to activate Column Region optimizations.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 64 out of 64 changed files in this pull request and generated 11 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
...rc/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceObject.java
Show resolved
Hide resolved
...ne/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/ColumnRegionByte.java
Show resolved
Hide resolved
...src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceShort.java
Show resolved
Hide resolved
.../src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceLong.java
Show resolved
Hide resolved
...e/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceInt.java
Show resolved
Hide resolved
extensions/parquet/table/src/test/java/io/deephaven/parquet/table/ParquetTableFilterTest.java
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/BasePushdownFilterContext.java
Outdated
Show resolved
Hide resolved
...rc/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceDouble.java
Show resolved
Hide resolved
.../src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceChar.java
Show resolved
Hide resolved
.../src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceByte.java
Show resolved
Hide resolved
engine/test-utils/src/main/java/io/deephaven/engine/testutil/filters/RowSetCapturingFilter.java
Outdated
Show resolved
Hide resolved
cpwright
left a comment
There was a problem hiding this comment.
Is the new code fully covered by the tests?
| import io.deephaven.chunk.attributes.Any; | ||
| import io.deephaven.chunk.util.pools.MultiChunkPool; | ||
|
|
||
| import io.deephaven.function.ArraySort; |
There was a problem hiding this comment.
You've added imports without any actual code changes here and in WritableBooleanChunk.
There was a problem hiding this comment.
These were auto-added when I executed the replication code. I'll look deeper and try to understand how/why
engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/ColumnRegion.java
Outdated
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/ColumnRegion.java
Outdated
Show resolved
Hide resolved
...ne/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/ColumnRegionChar.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| @Override | ||
| @MustBeInvokedByOverriders |
There was a problem hiding this comment.
Why the MustBeInvokedByOverrides annotation?
There was a problem hiding this comment.
Here is my logic, maybe it's wrong:
Might be incorrect annotation, but if we implement an override of this class I think we should include parent supported actions in our list (since they would also be supported). So a Null or Constant Char region would still be optimized.
engine/table/src/main/java/io/deephaven/engine/table/impl/BasePushdownFilterContext.java
Outdated
Show resolved
Hide resolved
.../src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceBase.java
Show resolved
Hide resolved
extensions/parquet/table/src/test/java/io/deephaven/parquet/table/ParquetTableFilterTest.java
Outdated
Show resolved
Hide resolved
| if (localColumnName.equals(filterColumnName)) { | ||
| continue; | ||
| /** | ||
| * Get (or create) a map from column source to column name. |
There was a problem hiding this comment.
bad c/p of javadoc
is there a reason we've decided to defer this?
There was a problem hiding this comment.
This and columnSourceToName are only used in filtering. We were deferring creating columnSourceToName but not columnNameToDefinition. Would like to be consistent between these two (either deferring both or neither).
...le/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownAction.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 74 out of 74 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
.../main/java/io/deephaven/engine/table/impl/sources/immutable/ImmutableConstantByteSource.java
Show resolved
Hide resolved
...a/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownFilterLocationContext.java
Show resolved
Hide resolved
.../table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/ColumnRegionObject.java
Outdated
Show resolved
Hide resolved
| try { | ||
| ctx.shiftedRowSet = tle.subsetAndShiftIntoLocationSpace(selection); | ||
| getRegion(regionIndex).estimatePushdownFilterCost( | ||
| filter, | ||
| ctx.shiftedRowSet, | ||
| usePrev, | ||
| newCtx, | ||
| jobScheduler, | ||
| regionCost -> { | ||
| minCost.updateAndGet(old -> Math.min(old, regionCost)); | ||
| resume.run(); | ||
| newCtx.close(); | ||
| }, | ||
| nec); | ||
| } catch (final Exception e) { | ||
| // In the case of an exception, clean up the temporary context. | ||
| newCtx.close(); | ||
| throw e; | ||
| } |
There was a problem hiding this comment.
Potential resource management issue: newCtx.close() is called both in the success callback (line 156) and in the catch block (line 161). If an exception occurs after the callback is invoked but before it completes, the context could be closed twice. Consider using try-with-resources or a more explicit cleanup pattern to avoid potential double-close scenarios.
There was a problem hiding this comment.
Can't do try with resources because of the thread switch. This is the best we can do, release on error (although this context current holds no resources).
This PR refactors predicate pushdown filtering for regioned column sources to support both table-location and per-column-region optimizations, enabling more granular (region-level) pushdown planning and execution.
Changes:
RegionedPushdownActionmodel (Location vs Region actions) and new regioned pushdown filter context types.ParquetTableLocationpushdown logic from an internal enum to the new action-based API and updates the engine interfaces accordingly.Code Coverage Summary: