@abhishekrb19 abhishekrb19 commented Nov 20, 2025

Currently, if a user runs regexp_like(c2, '[abc-d-12]'), they receive a cryptic error message like "Illegal character range near index ...". When the SQL query is quite complex, this message becomes even harder to understand and debug.

Move the RegexpExtractExprMacro.compilePattern() utility into a shared location and have all Regexp* macros use it, so that invalid regex patterns produce nicer error messages for users.

The error would now look something like this:

An invalid pattern [[default-byom-gp]]] was provided for the regexp_like function, error: [Illegal character range near index 9 [default-byom-gp]] ^]
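A minimal sketch of the idea, not the actual Druid code: a shared helper compiles the pattern and, on a syntax error, rethrows with the offending pattern and the function name in the message. The class name `RegexpPatternSketch` is hypothetical, and `IllegalArgumentException` is used here only as a stand-in for Druid's `InvalidInput` exception so the example is self-contained.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical stand-in for the shared helper; not the actual Druid class.
final class RegexpPatternSketch
{
  private RegexpPatternSketch() {}

  // Compiles the pattern, rethrowing syntax errors with the pattern and
  // the calling function's name attached to the message.
  static Pattern compilePattern(String pattern, String functionName)
  {
    try {
      return Pattern.compile(pattern);
    }
    catch (PatternSyntaxException e) {
      // Assumption: IllegalArgumentException replaces Druid's InvalidInput here.
      throw new IllegalArgumentException(
          String.format(
              "An invalid pattern [%s] was provided for the %s function, error: [%s]",
              pattern,
              functionName,
              e.getMessage()
          ),
          e
      );
    }
  }

  public static void main(String[] args)
  {
    try {
      // "t-b" is a reversed range, so compilation fails.
      compilePattern("[default-byom-gp]]", "regexp_like");
    }
    catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Keeping the original `PatternSyntaxException` as the cause preserves the stack trace for debugging while the user sees which function and which pattern were at fault.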

Release note

Nicer user-facing error messages for invalid patterns used in the regexp* functions.


This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

…essage

Move the compilePattern() utility and wire it up to all the REGEXP* functions
so any invalid regex pattern will return nicer error messages to users.
Otherwise, a user would get a cryptic error message "Illegal character range near index.."
@abhishekrb19 abhishekrb19 force-pushed the regexp_pattern_validation branch from a376b5b to 94585c9 Compare November 20, 2025 03:50
@jtuglu1 jtuglu1 self-requested a review November 20, 2025 03:56
throw InvalidInput.exception(
e,
StringUtils.format(
"An invalid pattern [%s] was provided for the %s function, error: [%s]",

maybe box the function name as well?

final String patternString = (String) patternExpr.getLiteralValue();

this.arg = args.get(0);
this.pattern = patternString != null ? Pattern.compile(patternString) : null;

We are moving from Pattern.compile(patternString) to Pattern.compile(StringUtils.nullToEmptyNonDruidDataString(patternString)). Is this ok?

@abhishekrb19 abhishekrb19 Nov 20, 2025


Yeah, this is okay because the original null semantics are retained at this call site.

RegexpExprUtils.compilePattern() is called only when patternString is not null; otherwise, pattern remains null.
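A rough sketch of the guard described above, under stated assumptions: `NullPatternSketch`, `patternOrNull`, and `compile` are hypothetical names, and `compile` only imitates the null-to-empty behavior the reviewer mentions. Because the call site checks for null first, that conversion is never reached with a null pattern.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the call-site null guard; not Druid code.
final class NullPatternSketch
{
  private NullPatternSketch() {}

  // Mirrors the call site: compile only when a pattern literal is present.
  static Pattern patternOrNull(String patternString)
  {
    return patternString != null ? compile(patternString) : null;
  }

  // Assumed stand-in for the shared helper, which maps null to "" before compiling.
  static Pattern compile(String patternString)
  {
    return Pattern.compile(patternString == null ? "" : patternString);
  }
}
```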

@abhishekrb19 abhishekrb19 merged commit c89ec0c into master Nov 20, 2025
57 checks passed
@abhishekrb19 abhishekrb19 deleted the regexp_pattern_validation branch November 20, 2025 21:55

2 participants