[SPARK-53786][SQL] Default value with special column name should not conflict with real column #52504
base: master
u.copy(child = newChild)
}

case d @ DefaultValueExpression(u: UnresolvedAttribute, _, _) =>
Note: before this fix, a default value expression would fall through to the UnresolvedAttribute case above. The analyzer would then treat the default value as a reference to the conflicting column name and fail.
case u @ UnresolvedAttribute(nameParts) =>
  val result = withPosition(u) {
    resolveColumnByName(nameParts)
      .orElse(LiteralFunctionResolution.resolve(nameParts))
      .map {
        // We trim unnecessary alias here. Note that, we cannot trim the alias at top-level,
        // as we should resolve `UnresolvedAttribute` to a named expression. The caller side
        // can trim the top-level alias if it's safe to do so. Since we will call
        // CleanupAliases later in Analyzer, trim non top-level unnecessary alias is safe.
        case Alias(child, _) if !isTopLevel => child
        case other => other
      }
      .getOrElse(u)
  }
  logDebug(s"Resolving $u to $result")
  result
b0350a5 to da190b3
case d @ DefaultValueExpression(u: UnresolvedAttribute, _, _) =>
  d.copy(child = LiteralFunctionResolution.resolve(u.nameParts)
    .map {
      case Alias(child, _) if !isTopLevel => child
I just copied this from the other code; we don't need it, right?
The failure may not be related: [error] org.apache.spark.sql.kafka010.KafkaMicroBatchV1SourceWithConsumerSuite. Rerunning to verify.
A better approach is here: #52530
What changes were proposed in this pull request?
Fix the analysis of default value expressions so that they are not resolved against column names.
Why are the changes needed?
The following query:
CREATE TABLE t (current_timestamp DEFAULT current_timestamp)
fails with an exception:
This is because, to create a default value DSV2 expression, the code now uses the main analyzer to analyze the default value, and the analyzer resolves it to the column current_timestamp. However, the analyzer should not try to resolve a default value to other columns.
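The resolution-order problem can be illustrated with a small, self-contained toy model. This is only a sketch: the types below mimic the shape of the Catalyst expressions in the diff but are not the Spark classes, the toy `DefaultValueExpression` takes a single child for brevity, and `resolveOld`, `resolveNew`, `resolveLiteralFunction`, and the `literalFunctions` set are hypothetical names introduced for illustration.

```scala
// Toy model only: these are NOT the Spark/Catalyst classes, just look-alikes.
sealed trait Expr
case class UnresolvedAttribute(nameParts: Seq[String]) extends Expr
case class ColumnRef(name: String) extends Expr
case class FunctionCall(name: String) extends Expr
case class DefaultValueExpression(child: Expr) extends Expr

object DefaultValueResolutionSketch {
  // Hypothetical stand-in for the names handled by LiteralFunctionResolution.
  val literalFunctions: Set[String] = Set("current_timestamp", "current_date")

  // Hypothetical stand-in for the analyzer's column lookup.
  def resolveColumnByName(nameParts: Seq[String], columns: Set[String]): Option[Expr] =
    nameParts match {
      case Seq(name) if columns.contains(name.toLowerCase) => Some(ColumnRef(name.toLowerCase))
      case _ => None
    }

  def resolveLiteralFunction(nameParts: Seq[String]): Option[Expr] =
    nameParts match {
      case Seq(name) if literalFunctions.contains(name.toLowerCase) =>
        Some(FunctionCall(name.toLowerCase))
      case _ => None
    }

  // Old behavior: the attribute inside the default value is resolved like any
  // other attribute, so a column with the same name wins over the function.
  def resolveOld(e: Expr, columns: Set[String]): Expr = e match {
    case DefaultValueExpression(child) => DefaultValueExpression(resolveOld(child, columns))
    case u @ UnresolvedAttribute(parts) =>
      resolveColumnByName(parts, columns).orElse(resolveLiteralFunction(parts)).getOrElse(u)
    case other => other
  }

  // Behavior in the spirit of this PR: an attribute directly under a default
  // value is only tried against literal functions, never against columns.
  def resolveNew(e: Expr, columns: Set[String]): Expr = e match {
    case d @ DefaultValueExpression(u: UnresolvedAttribute) =>
      d.copy(child = resolveLiteralFunction(u.nameParts).getOrElse(u))
    case u @ UnresolvedAttribute(parts) =>
      resolveColumnByName(parts, columns).orElse(resolveLiteralFunction(parts)).getOrElse(u)
    case other => other
  }

  def main(args: Array[String]): Unit = {
    val columns = Set("current_timestamp")
    val default = DefaultValueExpression(UnresolvedAttribute(Seq("current_timestamp")))
    println(resolveOld(default, columns)) // DefaultValueExpression(ColumnRef(current_timestamp))
    println(resolveNew(default, columns)) // DefaultValueExpression(FunctionCall(current_timestamp))
  }
}
```

In the toy model, matching `DefaultValueExpression` before the generic `UnresolvedAttribute` case is what keeps the column lookup from ever seeing the default value's child.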
Does this PR introduce any user-facing change?
Should fix a regression
How was this patch tested?
Added a new unit test in DataSourceV2SQLSuite.
Was this patch authored or co-authored using generative AI tooling?
No