feat(snowflake)!: Transpilation support for Snowflake REGEXP_COUNT to DuckDB #7054

Merged
georgesittas merged 2 commits into main from feature/transpile-regex-count
Feb 12, 2026

Conversation

@fivetran-kwoodbeck
Collaborator

Adds transpilation of Snowflake's REGEXP_COUNT function to DuckDB. The generated SQL uses DuckDB's REGEXP_EXTRACT_ALL, which requires the flags to be embedded in the pattern as (?ims) rather than passed as a separate argument.
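
As a rough illustration of the flag-embedding idea (not the code from this PR), the sketch below uses Python's standard `re` module as a stand-in for DuckDB's regex engine. The helper names `embed_flags` and `regexp_count` are hypothetical, and the 1-based `position` handling is an assumption modeled on the `position` argument that appears in the review snippet further down.

```python
import re


def embed_flags(pattern: str, flags: str) -> str:
    """Prefix `pattern` with an inline flag group such as (?im)."""
    # Hypothetical helper: with no flags, the pattern is returned unchanged.
    return f"(?{flags}){pattern}" if flags else pattern


def regexp_count(subject: str, pattern: str, position: int = 1, flags: str = "") -> int:
    """Count non-overlapping matches, starting at a 1-based position."""
    # Slicing stands in for wrapping the subject in a substring call.
    remainder = subject[position - 1:]
    # Counting the list of matches approximates counting the elements
    # that an extract-all style function would return.
    return len(re.findall(embed_flags(pattern, flags), remainder))


print(embed_flags("hello", "i"))                 # (?i)hello
print(regexp_count("Hello World", "L", 1, "i"))  # 3
```

The point of the sketch is only that a separate flags argument can be folded into the pattern text itself, so a function without a flags parameter can still honor them.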

@github-actions
Contributor

github-actions bot commented Feb 11, 2026

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:feature/transpile-regex-count, sqlglot version: feature/transpile-regex-count)
  • baseline (main, sqlglot version: 28.10.1.dev82)

⚠️ Limited to dialects: duckdb, snowflake

By Dialect

dialect | main | sqlglot:feature/transpile-regex-count | transitions | links
duckdb -> duckdb | 4003/4003 passed (100.0%) | 4003/4003 passed (100.0%) | No change | full result / delta
snowflake -> duckdb | 1553/2449 passed (63.4%) | 1560/2449 passed (63.7%) | 7 fail -> pass | full result / delta
snowflake -> snowflake | 2661/2669 passed (99.7%) | 2661/2669 passed (99.7%) | No change | full result / delta

Overall

main: 9121 total, 8217 passed (pass rate: 90.1%), sqlglot version: 28.10.1.dev82

sqlglot:feature/transpile-regex-count: 9121 total, 8224 passed (pass rate: 90.2%), sqlglot version: feature/transpile-regex-count

Transitions:
7 fail -> pass

Collaborator

@georgesittas georgesittas left a comment

Let's add tests.

Comment on lines 3228 to 3230
elif "e" in flag_str:
    self.unsupported("'e' (extract) flag is not supported in DuckDB")
    flag_str = flag_str.replace("e", "")
Collaborator

Should this branch be consolidated with the one in L3220? So, we'd pass the supported flags wherever this helper is used.

Collaborator Author

Will do, yes. It's better to explicitly specify the flags that are supported.
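
One way to picture the consolidation discussed in this thread is a single helper that receives the set of supported flags from its caller and reports everything else. This is only a sketch under assumptions: `split_flags` is a hypothetical name, not sqlglot's actual helper, and the supported set "ims" is taken from the (?ims) group mentioned in the PR description.

```python
def split_flags(flag_str: str, supported: str) -> tuple[str, list[str]]:
    """Split `flag_str` into the flags to keep and the ones to report."""
    kept: list[str] = []
    unsupported: list[str] = []
    for flag in flag_str:
        if flag in supported:
            # Keep each supported flag once, preserving first-seen order.
            if flag not in kept:
                kept.append(flag)
        else:
            # The caller decides how to surface these, e.g. as a warning.
            unsupported.append(flag)
    return "".join(kept), unsupported


# Assumed supported set for the target dialect: i, m, s.
kept, unsupported = split_flags("ie", supported="ims")
print(kept, unsupported)  # i ['e']
```

With this shape, a flag such as 'e' needs no dedicated branch: it is reported simply because it is absent from the supported set passed in.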

if position:
    this = exp.Substring(this=this, start=position)

# Embed flags in pattern (REGEXP_EXTRACT_ALL doesn't support flags argument)
Collaborator

What happens if the pattern already contains the flags in validated_flags? Does it cause any issues?

Collaborator Author

@fivetran-kwoodbeck fivetran-kwoodbeck Feb 12, 2026

I don't think that would be a valid query in Snowflake. Something like REGEXP_COUNT(text, '(?i)hello', 1, 'i') won't even run; the flags for REGEXP_COUNT are meant to be passed in the parameters argument.

Also, on the DuckDB side, prepending a second flag group seems to run fine:

select REGEXP_EXTRACT_ALL('Hello World', '(?im)L');
select REGEXP_EXTRACT_ALL('Hello World', '(?i)(?im)L');

@fivetran-kwoodbeck fivetran-kwoodbeck force-pushed the feature/transpile-regex-count branch from 099b756 to ba34492 Compare February 12, 2026 16:53
@georgesittas georgesittas merged commit 4f8a49c into main Feb 12, 2026
9 checks passed
@georgesittas georgesittas deleted the feature/transpile-regex-count branch February 12, 2026 16:59
2 participants