-
Notifications
You must be signed in to change notification settings - Fork 5.5k
fix(native): Fix Dereference expression index type in VeloxToPrestoExprConverter #27057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's guide (collapsed on small PRs)Reviewer's GuideAdjusts the Velox-to-Presto dereference expression index type to use INTEGER instead of BIGINT and adds end-to-end tests to cover dereference usage via array_least_frequent under the native optimizer. Sequence diagram for DEREFERENCE conversion with INTEGER indexsequenceDiagram
participant NativeOptimizer
participant VeloxToPrestoExprConverter as Converter
participant velox_core_FieldAccessTypedExpr as FieldAccessExpr
participant velox_core_ConstantTypedExpr as ConstantExpr
participant protocol_RowExpression as RowExpr
NativeOptimizer->>Converter: getDereferenceExpression(FieldAccessExpr)
Converter->>FieldAccessExpr: inputs()
FieldAccessExpr-->>Converter: std_vector_TypedExprPtr
Converter->>FieldAccessExpr: index()
FieldAccessExpr-->>Converter: index_int32
Converter->>ConstantExpr: new ConstantTypedExpr(velox_INTEGER, index_int32)
Converter->>Converter: build dereferenceInputs[0] = base_expr
Converter->>Converter: build dereferenceInputs[1] = ConstantExpr
loop for each input in dereferenceInputs
Converter->>Converter: getRowExpression(input)
Converter-->>Converter: RowExpr
end
Converter-->>NativeOptimizer: RowExpr representing DEREFERENCE with INTEGER index
Class diagram for updated dereference conversion in VeloxToPrestoExprConverterclassDiagram
class VeloxToPrestoExprConverter {
+getDereferenceExpression(dereferenceExpr : velox_core_FieldAccessTypedExprPtr) SpecialFormExpressionPtr
+getRowExpression(input : velox_core_TypedExprPtr) protocol_RowExpressionPtr
}
class velox_core_FieldAccessTypedExpr {
+inputs() std_vector_velox_core_TypedExprPtr
+index() int32_t
}
class velox_core_TypedExpr {
}
class velox_core_ConstantTypedExpr {
+ConstantTypedExpr(type : velox_TypePtr, value : int32_t)
}
class velox_Type {
}
class protocol_RowExpression {
}
VeloxToPrestoExprConverter --> velox_core_FieldAccessTypedExpr : uses
VeloxToPrestoExprConverter --> velox_core_TypedExpr : builds_dereferenceInputs
VeloxToPrestoExprConverter --> velox_core_ConstantTypedExpr : creates_index_constant
velox_core_ConstantTypedExpr --> velox_Type : has_type
VeloxToPrestoExprConverter --> protocol_RowExpression : returns
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've left some high level feedback:
- The change from BIGINT to INTEGER introduces a narrowing cast on
dereferenceExpr->index(); consider adding an assertion or explicit range check (or a comment explaining why the index is guaranteed to fit in int32) to make the assumption about index bounds clear.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The change from BIGINT to INTEGER introduces a narrowing cast on `dereferenceExpr->index()`; consider adding an assertion or explicit range check (or a comment explaining why the index is guaranteed to fit in int32) to make the assumption about index bounds clear.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@aditi-pandit, could you please help review this fix? |
pdabre12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pramodsatya Changes look good.
Seems like the test case is failing because of a known varchar(N) issue, can you instead add another test case without varchar column ?
Optionally, you can keep this test case with an assertQueryFails message , whatever you prefer.
aditi-pandit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pramodsatya. Please add more tests.
| assertQuerySucceeds(session, "SELECT row_number() OVER (PARTITION BY orderdate ORDER BY orderdate) FROM orders"); | ||
|
|
||
| // Test dereference expression with array_least_frequent function | ||
| assertQuerySucceeds(session, "SELECT array_least_frequent(quantities) from orders_ex"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pramodsatya : Can you add tests with other Dereference expressions as well ?
Description
Changes the type of index in
DEREFERENCEexpression fromBIGINTtoINTEGERto match the PrestoRowExpressiontype.Motivation and Context
Resolves #27037.
Impact
Sql invoked functions with dereference expression won't fail with native expression optimizer enabled.
Test Plan
Added e2e tests.
Summary by Sourcery
Align DEREFERENCE expression index type with Presto RowExpression and add coverage for native optimizer queries using dereference.
Bug Fixes:
Tests: