feat(snowflake)!: Transpilation support for Snowflake's BITMAP_CONSTRUCT_AGG function to DuckDB #6745

Merged: georgesittas merged 7 commits into main from feature/transpile-bitmapconstruct on Jan 15, 2026

fivetran-kwoodbeck (Collaborator) commented Jan 14, 2026

Added transpilation of BITMAP_CONSTRUCT_AGG from Snowflake to DuckDB.

Details:
- Uses BITMAP_CONSTRUCT_AGG_TEMPLATE, a pre-parsed SQL template
- Added bitmapconstructagg_sql() method that replicates Snowflake's bitmap binary format:
  - Small (<5 values): 2-byte big-endian count + little-endian values + padding to 10 bytes
  - Large (≥5 values): 10-byte header (0x08 + 9 zeros) + little-endian values
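
The layout described above can be sketched in plain Python. This is an illustration of the description only, not the SQL template from the PR. The function name `bitmap_construct_agg`, the 2-byte value width (inferred from four values plus a 2-byte count filling exactly 10 bytes), the 0..32767 range, and the `None` result for empty input are all assumptions.

```python
import struct
from typing import Iterable, Optional

# Assumption: valid bit positions span 0..32767, so each value fits in 2 bytes.
MAX_BIT_POSITION = 32767


def bitmap_construct_agg(values: Iterable[Optional[int]]) -> Optional[bytes]:
    """Hypothetical sketch of the bitmap layout described in this PR."""
    # Phase 1: drop NULLs, deduplicate, and sort the input.
    unique = sorted({v for v in values if v is not None})
    if not unique:
        # Assumption: the PR text does not say what empty input yields.
        return None
    for v in unique:
        if not 0 <= v <= MAX_BIT_POSITION:
            raise ValueError(f"value {v} is outside 0..{MAX_BIT_POSITION}")

    # Phase 2: encode every value as a 2-byte little-endian integer.
    packed = b"".join(struct.pack("<H", v) for v in unique)

    # Phase 3: choose the layout from the number of unique values.
    if len(unique) < 5:
        # Small: 2-byte big-endian count, then values, zero-padded to 10 bytes.
        return (struct.pack(">H", len(unique)) + packed).ljust(10, b"\x00")
    # Large: 10-byte header (0x08 followed by 9 zero bytes), then values.
    return b"\x08" + b"\x00" * 9 + packed


print(bitmap_construct_agg([1, None, 1]).hex())  # 00010100000000000000
```

With four unique values the small layout needs no padding (2 + 4 × 2 = 10 bytes), which is presumably why the switch to the large layout happens at five values.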

See Jira for full testing.

fivetran-kwoodbeck changed the title from "feat(snowflake)!: Adds transpilation support for Snowflake's BITMAP_CONSTRUCT_AGG function to DuckDB" to "feat(snowflake)!: Transpilation support for Snowflake's BITMAP_CONSTRUCT_AGG function to DuckDB" on Jan 14, 2026

github-actions bot (Contributor) commented Jan 14, 2026

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:feature/transpile-bitmapconstruct, sqlglot version: feature/transpile-bitmapconstruct)
  • baseline (main, sqlglot version: 28.6.1.dev14)

⚠️ Limited to dialects: bigquery, duckdb, snowflake

By Dialect

| dialect | main | sqlglot:feature/transpile-bitmapconstruct | difference | links |
| --- | --- | --- | --- | --- |
| bigquery -> bigquery | 2576/2621 passed (98.3%) | 2576/2621 passed (98.3%) | No change | full result / delta |
| bigquery -> duckdb | 1845/2620 passed (70.4%) | 1845/2620 passed (70.4%) | No change | full result / delta |
| duckdb -> duckdb | 4003/4003 passed (100.0%) | 4003/4003 passed (100.0%) | No change | full result / delta |
| snowflake -> duckdb | 599/847 passed (70.7%) | 599/847 passed (70.7%) | No change | full result / delta |
| snowflake -> snowflake | 847/847 passed (100.0%) | 847/847 passed (100.0%) | No change | full result / delta |

Overall

main: 10938 total, 9870 passed (pass rate: 90.2%), sqlglot version: 28.6.1.dev14

sqlglot:feature/transpile-bitmapconstruct: 10938 total, 9870 passed (pass rate: 90.2%), sqlglot version: feature/transpile-bitmapconstruct

Difference: No change

# Phase 1: Data Preparation (SELECT LIST_SORT): removes nulls, deduplicates, sorts the input list
# Phase 2: Hex String Construction (LIST_TRANSFORM): builds hex representation of values
# Phase 3: Final Assembly (CASE): constructs final bitmap based on size of unique values
BITMAP_CONSTRUCT_AGG_TEMPLATE: exp.Expression = exp.maybe_parse(

Collaborator

Hey @fivetran-kwoodbeck, how did you arrive at this template? Is this documented somewhere? It looks fairly complicated.

Collaborator

I'm unable to review the logic as-is and am also hesitant about getting it in. If we need something this complicated to transpile, I'd rather we just use self.unsupported instead. These BITMAP_* functions should also likely be deprioritized, I doubt they're that frequent in Snowflake land. I'll adjust the task priorities shortly.

Collaborator Author (fivetran-kwoodbeck), Jan 14, 2026

lol, I picked the construct one because it was hard. The flow is to sanitize the input, pack it into hex, then print it out. Snowflake has some nuances that make it more complicated, but it works on an exhaustive test set. Why would we not include it if it works?

Collaborator Author

I think all of the other BITMAP functions have been transpiled. BITMAP_CONSTRUCT_AGG is the gateway to the BITMAP functions, as that's how they're able to be created in the first place.

Collaborator

What was this implementation based on? I cannot reason about this at the moment; there's a lot going on. At the very least, the template should be sufficiently documented in a docstring or something similar, to help folks debug it in the future if needed. The Snowflake docs on this function are lacking, from what I saw.

Collaborator Author

I read through the documentation here to understand what it does, and also experimented to observe the actual behavior (see Jira). I can enhance the documentation; I'm not sure how detailed you want it, but I'll update it.

Collaborator

Ok, let's make sure the implementation is properly documented with a comment next to the template and we can get it in afterwards.

Collaborator Author

I expanded the documentation. Good idea, because doing so made me realize we didn't have a range check (now added).

Collaborator

Thanks

georgesittas merged commit 6df0288 into main on Jan 15, 2026
10 of 11 checks passed
georgesittas deleted the feature/transpile-bitmapconstruct branch on January 15, 2026 15:56