Support parse command with Calcite #3474
Signed-off-by: Lantao Jin <ltjin@amazon.com>
context.relBuilder.peek().getRowType().getFieldNames().stream()
    .map(
        cur -> {
          String noNumericSuffix = cur.replaceAll("\\d", "");
What if the original fields contain SAL and SAL1, and the overriding fields contain SAL?
It seems this logic will strip the numeric suffix from SAL1 as well, which is incorrect.
Done. A related IT has been added.
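To make the concern in this thread concrete, here is a minimal sketch of the collision. `stripAllDigits` mirrors the `replaceAll("\\d", "")` expression from the diff. `stripGeneratedSuffix` is a hypothetical alternative written only for illustration; it is not necessarily how the PR resolved the issue.

```java
import java.util.Set;

final class SuffixCollisionDemo {
    // Same expression as in the reviewed diff: removes every digit in the name.
    static String stripAllDigits(String name) {
        return name.replaceAll("\\d", "");
    }

    // Hypothetical alternative (an assumption, not the PR's code): leave names
    // that already exist in the input untouched, and only drop a trailing run
    // of digits from the remaining names.
    static String stripGeneratedSuffix(String name, Set<String> originalNames) {
        if (originalNames.contains(name)) {
            return name;
        }
        return name.replaceAll("\\d+$", "");
    }

    public static void main(String[] args) {
        Set<String> original = Set.of("SAL", "SAL1");
        // SAL1 is a real field, yet the unguarded version maps it onto SAL.
        System.out.println(stripAllDigits("SAL1"));
        System.out.println(stripGeneratedSuffix("SAL1", original));
        System.out.println(stripGeneratedSuffix("SAL0", original));
    }
}
```

With original fields SAL and SAL1, the unguarded version makes SAL1 indistinguishable from SAL, which is the case the reviewer describes.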
RexNode sourceField = rexVisitor.analyze(node.getSourceField(), context);
ParseMethod parseMethod = node.getParseMethod();
java.util.Map<String, Literal> arguments = node.getArguments();
assert arguments.isEmpty();
Please add an assert message so the failure is easier to understand if the error actually happens.
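For reference, Java's `assert` statement accepts a detail message after a colon. The sketch below shows that form on a stand-in map; the class name, helper name, and message wording are assumptions for illustration, not the text used in the PR.

```java
import java.util.Map;

final class AssertMessageDemo {
    // Builds the detail message separately so it can be reused and inspected.
    static String unexpectedArgumentsMessage(Map<String, ?> arguments) {
        return "parse command expects no arguments, but got: " + arguments.keySet();
    }

    static void checkNoArguments(Map<String, ?> arguments) {
        // The expression after ':' becomes the AssertionError message.
        // Assertions only run when the JVM is started with -ea.
        assert arguments.isEmpty() : unexpectedArgumentsMessage(arguments);
    }

    public static void main(String[] args) {
        checkNoArguments(Map.of());
        System.out.println(unexpectedArgumentsMessage(Map.of("someArg", 1)));
    }
}
```

Because assertions are disabled by default at runtime, a message only helps in environments (such as tests) where `-ea` is set.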
Signed-off-by: Lantao Jin <ltjin@amazon.com>
RexNode overrideField = null;
String alias =
    ((RexLiteral) ((RexCall) eval).getOperands().get(1)).getValueAs(String.class);
if (originalFieldNames.contains(alias)) {
[Non blocking] How about extracting lines 291-298 into a separate method that handles adding new columns with overriding? It should have an interface like:
void projectOverride(List<RexNode> newExprs, List<String> newColumnNames, CalcitePlanContext context)
It could then be reused in the many places that involve overriding.
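As a rough illustration of the suggested helper, the sketch below models the override rule on plain strings instead of Calcite types. Everything here is an assumption made for the example: expressions are represented as `String`, the current projection is an ordered map, and an overridden column keeps its original position while new columns are appended. The real method would work with `RexNode` and `CalcitePlanContext` and may order columns differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ProjectOverrideSketch {
    // Simplified stand-in for the proposed projectOverride(...):
    // columns whose name already exists are replaced, others are appended.
    static Map<String, String> projectOverride(
            Map<String, String> existing, List<String> newExprs, List<String> newColumnNames) {
        if (newExprs.size() != newColumnNames.size()) {
            throw new IllegalArgumentException("newExprs and newColumnNames must have equal size");
        }
        // LinkedHashMap keeps insertion order; put() on an existing key
        // replaces the value without moving the key.
        Map<String, String> result = new LinkedHashMap<>(existing);
        for (int i = 0; i < newExprs.size(); i++) {
            result.put(newColumnNames.get(i), newExprs.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("a", "a");
        existing.put("b", "b");
        System.out.println(projectOverride(existing, List.of("upper(a)", "a + b"), List.of("a", "c")));
    }
}
```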
} else {
  // Overriding the existing field if the alias has the same name with original field
  // name.
  RexNode overrideField = null;
[Minor] This variable seems to be declared in the wrong place, since it's only used in the originalFieldNames.contains(alias) branch.
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
verifyDataRows(result, rows("a@a.com", "a.com", "c@c.com"));
result =
    executeQuery(
        "source = test | parse email '.+@(?<email>.+)' | fields email, email0, email1");
What if the pattern groups have the same name? i.e. '.+@(?.+) to .+@(?.+)'
It will fall back to V2, since multiple capturing groups are currently unsupported.
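A minimal sketch of how a pattern could be checked for more than one named group before choosing the fallback path. The class and method names are hypothetical, and the PR does not show how the plugin actually makes this decision. The scan is deliberately simple and would miscount an escaped literal such as `\(?<x>`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedGroupCheck {
    // Matches the opening of a Java named group such as (?<email>.
    // Lookbehinds like (?<= and (?<! do not match because a letter is required.
    private static final Pattern NAMED_GROUP =
            Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");

    static List<String> namedGroups(String regex) {
        List<String> names = new ArrayList<>();
        Matcher m = NAMED_GROUP.matcher(regex);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    // Assumption for this sketch: more than one named group means fall back to V2.
    static boolean needsFallback(String regex) {
        return namedGroups(regex).size() > 1;
    }

    public static void main(String[] args) {
        System.out.println(namedGroups(".+@(?<email>.+)"));
        System.out.println(needsFallback("(?<streetNumber>\\d+) (?<street>.+)"));
    }
}
```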
@dai-chen could you help to review this?
Description
Support parse command with Calcite.
Limitation:
Multiple capturing groups are not allowed in the Calcite REGEXP_EXTRACT function.
| parse address '(?<streetNumber>\d+) (?<street>.+)' will fall back to V2.
Related Issues
Resolves #3463
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.