[SPARK-53786][SQL] Default value with special column name should not conflict with real column #52504

szehon-ho · 2025-10-02T06:09:39Z

What changes were proposed in this pull request?

Fix the analysis of default value expressions so that they are not resolved against column names.

Why are the changes needed?

The following query:

CREATE TABLE t (current_timestamp TIMESTAMP DEFAULT current_timestamp)

fails with an exception:

[INVALID_DEFAULT_VALUE.NOT_CONSTANT] Failed to execute CREATE TABLE command because the destination column or variable `current_timestamp` has a DEFAULT value CURRENT_TIMESTAMP, which is not a constant expression whose equivalent value is known at query planning time. SQLSTATE: 42623;

This regression was introduced in #50631, where the ResolvedIdentifier child of CreateTable started to have output, namely the CREATE TABLE columns. As a result, the analyzer resolves the default value against those columns. Previously the CreateTable output was empty, so the resolver failed to resolve the name against any column and fell back to literal functions.

Does this PR introduce any user-facing change?

This should fix a regression in Spark 4.0.

How was this patch tested?

Added a new unit test in DataSourceV2DataFrameSuite.

Was this patch authored or co-authored using generative AI tooling?

No

szehon-ho · 2025-10-02T06:10:13Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala

+            u.copy(child = newChild)
          }

+        case d @ DefaultValueExpression(u: UnresolvedAttribute, _, _) =>


Note: before this fix, the child of a DefaultValueExpression would fall through to the UnresolvedAttribute case above. The analyzer would then treat the default value as a reference to the conflicting column name and fail.

        case u @ UnresolvedAttribute(nameParts) =>
          val result = withPosition(u) {
            resolveColumnByName(nameParts)
              .orElse(LiteralFunctionResolution.resolve(nameParts))
              .map {
                // We trim unnecessary alias here. Note that, we cannot trim the alias at top-level,
                // as we should resolve `UnresolvedAttribute` to a named expression. The caller side
                // can trim the top-level alias if it's safe to do so. Since we will call
                // CleanupAliases later in Analyzer, trim non top-level unnecessary alias is safe.
                case Alias(child, _) if !isTopLevel => child
                case other => other
              }
              .getOrElse(u)
          }
          logDebug(s"Resolving $u to $result")
          result

szehon-ho · 2025-10-02T06:22:08Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala

+        case d @ DefaultValueExpression(u: UnresolvedAttribute, _, _) =>
+          d.copy(child = LiteralFunctionResolution.resolve(u.nameParts)
+            .map {
+              case Alias(child, _) if !isTopLevel => child


I just copied this from the other code; we don't need it, right?

szehon-ho · 2025-10-02T16:06:32Z

The failure may be unrelated:

[error] org.apache.spark.sql.kafka010.KafkaMicroBatchV1SourceWithConsumerSuite

Rerunning to verify.

aokolnychyi · 2025-10-08T16:46:19Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala

+            u.copy(child = newChild)
          }

+        case d @ DefaultValueExpression(c: Expression, _, _) =>


I assume this works because we resolve expressions top to bottom, hence we see DefaultValueExpression before we see the unresolved attribute?

aokolnychyi · 2025-10-08T16:52:50Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala

    }
  }

+  private def resolveLiteralColumns(e: Expression) = {


I am a little bit worried that this doesn't fix the root problem: the ResolvedIdentifier output in CREATE is actually used as candidates for resolving default values. Let me think.

Actually, maybe it does solve the root problem. Am I right that we will not recurse into the DefaultValueExpression child, so we effectively ensure that default value resolution doesn't have access to attributes?

Yes, this case is matched before the children are visited.
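To make the "matched before children" point concrete, here is a minimal, self-contained sketch. It is a toy model, not Spark code: `Node`, `Attr`, `Col`, `Func`, `DefaultValue`, `Call`, and `TopDownSketch` are all hypothetical stand-ins, and the set of literal function names is an assumed subset. It only illustrates why placing the default-value case ahead of the generic attribute case keeps the default's child away from column resolution.

```scala
// Toy model only: hypothetical stand-ins, not Catalyst classes.
sealed trait Node
final case class Attr(name: String) extends Node          // unresolved name
final case class Col(name: String) extends Node           // resolved to a column
final case class Func(name: String) extends Node          // resolved to a literal function
final case class DefaultValue(child: Node) extends Node   // wraps a DEFAULT expression
final case class Call(name: String, args: List[Node]) extends Node

object TopDownSketch {
  // Assumed subset of names that can act as literal functions.
  private val literalFunctions = Set("current_timestamp", "current_date")

  // Literal-only resolution: never consults the column list.
  private def resolveLiterals(n: Node): Node = n match {
    case Attr(name) if literalFunctions(name) => Func(name)
    case Call(name, args)                     => Call(name, args.map(resolveLiterals))
    case other                                => other
  }

  def resolve(n: Node, columns: Set[String]): Node = n match {
    // Matched first, so the generic Attr cases below never see the default's child.
    case DefaultValue(child)                  => DefaultValue(resolveLiterals(child))
    case Attr(name) if columns(name)          => Col(name)
    case Attr(name) if literalFunctions(name) => Func(name)
    case Call(name, args)                     => Call(name, args.map(resolve(_, columns)))
    case other                                => other
  }
}
```

In this sketch a bare `Attr("current_timestamp")` still resolves to a column when one with that name exists, while the same name inside `DefaultValue` resolves to the literal function. A reference to a real column inside a default stays unresolved, which is where a real analyzer would report an error.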

aokolnychyi · 2025-10-08T16:59:38Z

sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala

    }
  }

+  test("test default value special column name conflicting with real column name") {


Do we have tests for the case where a default value references other data columns? It is illegal; I just want to make sure we throw a good error message in this case.

col1 INT, col2 INT DEFAULT col1

Yes. Do you mean something like "test default value should not refer to real column", later in the file?

sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala

sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala

aokolnychyi · 2025-10-08T17:20:03Z

@szehon-ho, could you also update the description to describe the role of ResolvedIdentifier in CREATE?

szehon-ho · 2025-10-09T00:06:07Z

@aokolnychyi thanks for looking. I addressed the comments and edited the PR description.

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala

sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2DataFrameSuite.scala

gengliangwang · 2025-10-09T16:08:45Z

Here is a detailed RCA:
When the default value expression is parsed, the expression in the reproduction produces an UnresolvedAttribute:

  private def getDefaultExpression(
      exprCtx: ExpressionContext,
      place: String): DefaultValueExpression = {
    // Make sure it can be converted to Catalyst expressions.
    val expr = expression(exprCtx)
    if (expr.containsPattern(PARAMETER)) {
      throw QueryParsingErrors.parameterMarkerNotAllowed(place, expr.origin)
    }
    DefaultValueExpression(expr, getOriginalText(exprCtx))
  }

  override def visitCurrentLike(ctx: CurrentLikeContext): Expression = withOrigin(ctx) {
    ...
      UnresolvedAttribute.quoted(ctx.name.getText)
  }

However, in ColumnResolutionHelper.innerResolve, Spark tries to resolve an UnresolvedAttribute as a literal function only if it fails to find the column in the child node:

        case u @ UnresolvedAttribute(nameParts) =>
          val result = withPosition(u) {
            resolveColumnByName(nameParts)
              .orElse(LiteralFunctionResolution.resolve(nameParts))

After commit fc1cb78, the child node of CreateTable has output, so Spark finds the column there and resolves the name as a column instead of a function.
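The resolution order described above can be sketched in isolation. This is a toy model under stated assumptions, not Spark's implementation: `Expr`, `Unresolved`, `ColumnRef`, `LiteralFunction`, and `ResolutionOrderSketch` are hypothetical names, and the `output` parameter stands in for the columns exposed by the child node.

```scala
// Toy model only: hypothetical stand-ins, not Spark's classes.
sealed trait Expr
final case class Unresolved(name: String) extends Expr
final case class ColumnRef(name: String) extends Expr
final case class LiteralFunction(name: String) extends Expr

object ResolutionOrderSketch {
  // Assumed subset of names that can act as literal functions.
  private val literalFunctions = Set("current_timestamp", "current_date", "current_user")

  private def resolveColumnByName(name: String, output: Seq[String]): Option[Expr] =
    output.find(_.equalsIgnoreCase(name)).map(ColumnRef(_))

  private def resolveLiteralFunction(name: String): Option[Expr] =
    if (literalFunctions.contains(name.toLowerCase)) Some(LiteralFunction(name.toLowerCase))
    else None

  // Columns win; the literal function is only a fallback.
  def resolve(u: Unresolved, output: Seq[String]): Expr =
    resolveColumnByName(u.name, output)
      .orElse(resolveLiteralFunction(u.name))
      .getOrElse(u)
}
```

With an empty `output`, `current_timestamp` falls through to the literal function. Once `output` contains a column of the same name, the column lookup succeeds first, which mirrors the regression.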

github-actions bot added the SQL label Oct 2, 2025

[SPARK-53786][SQL] Default value should not conflict with special column name

da190b3

szehon-ho force-pushed the default_value_conflict branch from b0350a5 to da190b3 Compare October 2, 2025 06:19

szehon-ho changed the title ~~[SPARK-53786][SQL] Default value should not conflict with special column name~~ [SPARK-53786][SQL] Default value with special column name should not conflict with real column Oct 2, 2025

szehon-ho added 4 commits October 2, 2025 14:16

Add more tests

ad8f55e

Fix another case where special column is in the function

5bb76b4

simplify repeated code in tests

2caf43f

Fix tests

8f7179c

aokolnychyi reviewed Oct 8, 2025

sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala

aokolnychyi reviewed Oct 8, 2025

sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala (outdated)

aokolnychyi reviewed Oct 8, 2025

sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala (outdated)

gengliangwang approved these changes Oct 8, 2025

Review comments/add tests

e1b38bb

cloud-fan reviewed Oct 9, 2025

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala

cloud-fan reviewed Oct 9, 2025

sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2DataFrameSuite.scala (outdated)

cloud-fan approved these changes Oct 9, 2025

aokolnychyi approved these changes Oct 9, 2025

Review comments

c6c30c8
