Chaho12 (Member) commented Nov 25, 2025

Description

  • Add regex support for query partition filter required schemas (Hive & Iceberg)

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Section
* Add regex support for query partition filter required schemas

github-actions bot added the docs, iceberg (Iceberg connector), and hive (Hive connector) labels Nov 25, 2025
ebyhr (Member) commented Nov 25, 2025

@Chaho12 Is this a backward-incompatible change? What if the schema name contains special characters such as "."?

findepi (Member) commented Nov 25, 2025

Please expand the PR description with more text and some examples.

Chaho12 (Member, Author) commented Nov 25, 2025

Is this a backward-incompatible change? What if the schema name contains special characters such as "."?

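@ebyhr Not entirely. It is partially backward compatible.

The check first calls contains on the configured schema names, so an exact match is always tried first. Only when no exact match is found are the configured entries evaluated as regular expressions.

  • e.g. SET SESSION hive.query_partition_filter_required_schemas = ARRAY['prod.analytics'];
  • In this example, the schema prod.analytics is found by the contains check. Because "." is a regex wildcard, the same entry would also match schemas such as 'prodXanalytics' or 'prod1analytics' through the regex fallback.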
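The lookup order described above can be sketched roughly as follows. This is only an illustration of "exact match first, regex second", not the code in this PR; the class name ExactThenRegexMatcher and the method isFilterRequired are hypothetical, and the sketch assumes each configured entry must match the whole schema name.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch, not the PR's implementation.
final class ExactThenRegexMatcher
{
    private ExactThenRegexMatcher() {}

    // Returns true when schemaName is covered by one of the configured entries.
    static boolean isFilterRequired(Set<String> configured, String schemaName)
    {
        // Step 1: exact match, as before the change.
        if (configured.contains(schemaName)) {
            return true;
        }
        // Step 2: fall back to treating every entry as a regex
        // that must match the whole schema name (assumption).
        for (String entry : configured) {
            if (Pattern.matches(entry, schemaName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args)
    {
        Set<String> configured = Set.of("prod.analytics");
        System.out.println(isFilterRequired(configured, "prod.analytics"));    // exact match
        System.out.println(isFilterRequired(configured, "prodXanalytics"));    // "." acts as a wildcard
        System.out.println(isFilterRequired(configured, "prod_analytics_v2")); // no full match
    }
}
```

Under these assumptions the three calls print true, true, and false. The second result is the partial incompatibility: an entry that used to name exactly one schema now also covers its regex look-alikes.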

Other ideas for supporting regex

Any recommendations?

0. Exact Match First (current PR)

Pros: Exact names are not interpreted as regex
Cons: Regex metacharacters can still cause unintended behavior

1. Explicit Prefix (e.g. regex:)

SET SESSION hive.query_partition_filter_required_schemas = ARRAY['regex:prod_.*', 'test_schema'];

Pros:

  • Backward compatible
  • Clear intent with regex: prefix
  • Safe for special characters
  • Easy to document
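A minimal sketch of how the prefix option could behave, assuming entries that start with regex: are compiled as patterns and all other entries are compared literally. The names PrefixedSchemaMatcher and isFilterRequired are hypothetical and chosen only for this example.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of option 1, not the PR's implementation.
final class PrefixedSchemaMatcher
{
    private static final String REGEX_PREFIX = "regex:";

    private PrefixedSchemaMatcher() {}

    static boolean isFilterRequired(List<String> configured, String schemaName)
    {
        for (String entry : configured) {
            if (entry.startsWith(REGEX_PREFIX)) {
                // Only prefixed entries are interpreted as regular expressions.
                String regex = entry.substring(REGEX_PREFIX.length());
                if (Pattern.matches(regex, schemaName)) {
                    return true;
                }
            }
            else if (entry.equals(schemaName)) {
                // Unprefixed entries keep literal semantics, so "." stays a plain dot.
                return true;
            }
        }
        return false;
    }
}
```

With the list ['regex:prod_.*', 'test_schema', 'prod.analytics'], this sketch accepts prod_sales, test_schema, and prod.analytics, and rejects prodXanalytics, because the unprefixed entry is never evaluated as a pattern.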

2. Separate Property

-- For exact matching
SET SESSION hive.query_partition_filter_required_schemas = ARRAY['exact_schema'];

-- For regex matching (new property)
SET SESSION hive.query_partition_filter_required_schema_patterns = ARRAY['test_.*', 'prod_.*'];

Pros: Complete backward compatibility
Cons: More properties to manage, more complex configuration
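For comparison, option 2 could look roughly like the sketch below, assuming the two properties are read into separate collections and the patterns are compiled once when the configuration is loaded. SeparatePropertyMatcher and its members are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch of option 2, not the PR's implementation.
final class SeparatePropertyMatcher
{
    private final Set<String> exactSchemas;
    private final List<Pattern> schemaPatterns;

    SeparatePropertyMatcher(Set<String> exactSchemas, List<String> schemaPatterns)
    {
        this.exactSchemas = Set.copyOf(exactSchemas);
        // Compiling up front surfaces an invalid pattern (PatternSyntaxException)
        // when the configuration is loaded rather than at query time.
        List<Pattern> compiled = new ArrayList<>();
        for (String pattern : schemaPatterns) {
            compiled.add(Pattern.compile(pattern));
        }
        this.schemaPatterns = List.copyOf(compiled);
    }

    boolean isFilterRequired(String schemaName)
    {
        // The existing property keeps its exact-match meaning.
        if (exactSchemas.contains(schemaName)) {
            return true;
        }
        // Only the new property is interpreted as regular expressions.
        for (Pattern pattern : schemaPatterns) {
            if (pattern.matcher(schemaName).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Because the existing property is never interpreted as a pattern here, a value such as 'exact_schema' cannot start matching 'exactXschema' after an upgrade.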

