feat: exec pushdown with global aggregates #124

Merged
Xuanwo merged 7 commits into main from exec-pushdown-p0-always-on
Jan 7, 2026

Collaborator

@Xuanwo Xuanwo commented Jan 6, 2026

This PR introduces P0 subplan pushdown for ungrouped aggregates over Lance scans.

Key points:

  • Add a compact ExecIR container that reuses the existing FilterIR bytes.
  • Add an optimizer rewrite that replaces eligible plans with the internal __lance_exec table function (no user-visible SQL surface).
  • The Rust side validates the ExecIR and executes it via DataFusion on top of the Lance scanner, returning a (typically tiny) Arrow stream to DuckDB.
  • The rewrite is always enabled but conservative: on any encoding failure, validation failure, or type mismatch it falls back to the original plan.
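
The conservative fallback in the last key point can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in: `Plan`, `IRBytes`, `TryEncodeExecIR`, `ValidateExecIR`, and `RewriteOrFallback` are illustrative names, not the types or functions in this PR, and validation is shown next to encoding only for brevity (the PR performs it on the Rust side).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a logical plan; not a DuckDB type.
struct Plan {
    std::string description;
    bool ungrouped_aggregate_over_lance_scan;  // eligibility for the rewrite
    bool encodable;                            // whether every expression can be encoded
};

using IRBytes = std::vector<unsigned char>;

// Hypothetical encoder: reports failure instead of throwing.
bool TryEncodeExecIR(const Plan &plan, IRBytes &out) {
    if (!plan.ungrouped_aggregate_over_lance_scan || !plan.encodable) {
        return false;
    }
    out = IRBytes{'E', 'X', 'E', 'C'};  // placeholder payload, not the real format
    return true;
}

// Hypothetical validator: checks the placeholder payload shape only.
bool ValidateExecIR(const IRBytes &ir) {
    return ir.size() == 4 && ir[0] == 'E';
}

// Rewrite only when every step succeeds; otherwise return the input unchanged.
Plan RewriteOrFallback(const Plan &original) {
    IRBytes ir;
    if (!TryEncodeExecIR(original, ir) || !ValidateExecIR(ir)) {
        return original;  // conservative fallback: keep the original plan
    }
    assert(!ir.empty());  // a validated IR always carries a payload
    Plan rewritten = original;
    rewritten.description = "__lance_exec(" + std::to_string(ir.size()) + " IR bytes)";
    return rewritten;
}
```

The point of the shape is that a failed rewrite is indistinguishable from no rewrite at all, so an always-on optimizer pass cannot change query results; the cost is that a failure is silent unless a test asserts that the rewrite happened.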

Tests:

  • GEN=ninja make test_debug

Xuanwo added 2 commits January 7, 2026 00:49
Add a small ExecIR container (reusing existing FilterIR bytes) and an optimizer rewrite that replaces ungrouped aggregates over Lance scans with the internal __lance_exec table function.

The Rust side validates the IR and executes it via DataFusion on top of the Lance scanner, returning a (typically tiny) Arrow stream back to DuckDB.

The rewrite is always enabled but remains conservative: on any encode/validate/type mismatch it falls back to the original plan.

Tests: GEN=ninja make test_debug
@Xuanwo Xuanwo changed the title from "Exec pushdown: internal __lance_exec for global aggregates" to "feat: exec pushdown with global aggregates" Jan 6, 2026
Xuanwo added 5 commits January 7, 2026 02:43
LogicalGet.table_filters uses column IDs as map keys, not scan column positions. Exec pushdown previously fed these filters into BuildLanceTableFilterIRParts with a column_ids vector sized to the projected columns, causing all_filters_pushed=false and a silent fallback for TPC-H Q6.

Fix by using an identity column_ids mapping sized by the max filter column id and collecting extra scan columns by column id.

Add a sqllogictest that asserts Q6 rewrites to __lance_exec.

Tests: GEN=ninja make test_debug
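
The column-id fix described in this commit message can be sketched roughly as follows. This is an assumption-laden illustration, not the code in the PR: `TableFilters`, `IdentityColumnIds`, and `ExtraScanColumns` are hypothetical names, filters are reduced to strings, and "extra scan columns" is read here as filter columns that are missing from the projection.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical model of table filters: keyed by column id, not by scan position.
using TableFilters = std::map<uint64_t, std::string>;

// Build an identity mapping [0, 1, ..., max_filter_column_id] so that a lookup
// by column id resolves to the same id, whatever the projection looks like.
std::vector<uint64_t> IdentityColumnIds(const TableFilters &filters) {
    if (filters.empty()) {
        return {};
    }
    const uint64_t max_id = filters.rbegin()->first;  // std::map keys are ordered
    std::vector<uint64_t> ids(max_id + 1);
    std::iota(ids.begin(), ids.end(), 0);
    assert(ids.back() == max_id);
    return ids;
}

// Collect filter columns that are not projected, so the scan can still read them.
std::vector<uint64_t> ExtraScanColumns(const TableFilters &filters,
                                       const std::vector<uint64_t> &projected) {
    std::vector<uint64_t> extra;
    for (const auto &entry : filters) {
        const uint64_t column_id = entry.first;
        if (std::find(projected.begin(), projected.end(), column_id) == projected.end()) {
            extra.push_back(column_id);
        }
    }
    return extra;
}
```

With filters on column ids 4 and 10 and a projection of columns 5 and 6, a vector sized to the projection has only two slots, so neither filter key can be resolved through it. The identity mapping has eleven slots, and both filter columns are reported as extra scan columns.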
@Xuanwo Xuanwo merged commit 5a097f0 into main Jan 7, 2026
12 checks passed
@Xuanwo Xuanwo deleted the exec-pushdown-p0-always-on branch January 7, 2026 15:19