Skip to content

Fix COALESCE(null, 42) returning wrong type string instead of int#5316

Closed
qianheng-aws wants to merge 1 commit intomainfrom
fix/coalesce-null-type-5175
Closed

Fix COALESCE(null, 42) returning wrong type string instead of int#5316
qianheng-aws wants to merge 1 commit intomainfrom
fix/coalesce-null-type-5175

Conversation

@qianheng-aws
Copy link
Copy Markdown
Collaborator

Summary

Fixes #5175

COALESCE(null, 42) was returning "42" with schema type string instead of numeric 42 with type integer. The root cause involved three interacting issues in the Calcite engine's COALESCE handling:

  1. QualifiedNameResolver: When an unresolved field name (like the bare null keyword, which PPL parses as a field name) was encountered inside COALESCE, it was replaced with a null:VARCHAR literal. This caused the return type inference to see VARCHAR as a concrete type and include it in the common type computation.

  2. EnhancedCoalesceFunction: The return type inference called leastRestrictive on all operand types including NULL-typed operands. Calcite's leastRestrictive returns null when it cannot reconcile NULL with other types, causing a fallback to VARCHAR.

  3. CalciteRexNodeVisitor: The inCoalesceFunction context flag was not scoped to direct COALESCE arguments — it leaked into nested function calls. This meant that commands like nomv does_not_exist (which internally rewrites to a COALESCE expression) would silently replace deeply-nested unresolved fields with null instead of throwing an error.

Changes

  • QualifiedNameResolver.replaceWithNullLiteralInCoalesce: Changed the null literal type from VARCHAR to NULL, allowing the return type inference to correctly derive the type from non-null operands.

  • EnhancedCoalesceFunction.getReturnTypeInference: Filter out NULL-typed operands before calling leastRestrictive, then mark the result as nullable. If all operands are NULL, return nullable NULL type.

  • CalciteRexNodeVisitor.visitFunction: Save/restore the inCoalesceFunction flag so that only direct COALESCE arguments get the null-replacement behavior, not deeply nested expressions.

Test plan

  • Added testCoalesceNullWithInteger — verifies COALESCE(null, 42) has return type INTEGER
  • Added testCoalesceIntegerWithNull — verifies COALESCE(42, null) has return type INTEGER
  • Added testCoalesceNullWithIntegerResult — verifies execution produces numeric 42, not string "42"
  • Updated existing tests for nonexistent fields to expect null:NULL instead of null:VARCHAR
  • All existing PPL and core unit tests pass with no regressions
  • CalcitePPLNoMvTest.testNoMvNonExistentField continues to correctly throw an error

)

Three changes fix the type inference bug:

1. QualifiedNameResolver: Use NULL type (not VARCHAR) for null literals
   created from unresolved field names inside COALESCE. This allows the
   return type inference to correctly ignore NULL-typed operands.

2. EnhancedCoalesceFunction: Filter out NULL-typed operands before
   calling leastRestrictive, so COALESCE(null, 42) infers INTEGER from
   the non-null operand rather than falling back to VARCHAR.

3. CalciteRexNodeVisitor: Scope the inCoalesceFunction flag to direct
   COALESCE arguments only, clearing it for nested function calls.
   This prevents unresolved fields deep inside expressions like
   array_compact(does_not_exist) from silently becoming null.

Signed-off-by: Heng Qian <qianheng@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Fix COALESCE return type inference for NULL-typed operands

Relevant files:

  • core/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/condition/EnhancedCoalesceFunction.java

Sub-PR theme: Fix inCoalesceFunction flag scoping and add regression tests

Relevant files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLEnhancedCoalesceTest.java

⚡ Recommended focus areas for review

VARCHAR Fallback

When leastRestrictive returns null for non-null operand types (e.g., incompatible types like INTEGER and VARCHAR), the code silently falls back to VARCHAR. This may mask type mismatches and produce unexpected results. Consider whether a more explicit error or a different fallback strategy is appropriate.

var commonType = opBinding.getTypeFactory().leastRestrictive(nonNullTypes);
if (commonType != null) {
  // Ensure the result is nullable since at least one operand could be NULL
  return opBinding.getTypeFactory().createTypeWithNullability(commonType, true);
}
return opBinding.getTypeFactory().createSqlType(SqlTypeName.VARCHAR);
Flag Restoration

The wasInCoalesce flag is saved before the isCoalesce check and restored in finally. However, if an exception is thrown before the finally block is reached (e.g., during argument processing), the flag is still correctly restored. This looks correct, but it's worth verifying that the flag restoration doesn't interfere with concurrent or reentrant visitor usage if the context is shared across threads.

boolean wasInCoalesce = context.isInCoalesceFunction();
if (isCoalesce) {
  context.setInCoalesceFunction(true);
} else {
  // Clear the flag for non-COALESCE nested functions so that unresolved fields
  // inside e.g. array_compact(does_not_exist) still throw errors properly.
  context.setInCoalesceFunction(false);
}

List<RexNode> capturedVars = null;
try {
  for (UnresolvedExpression arg : args) {
    if (arg instanceof LambdaFunction) {
      CalcitePlanContext lambdaContext =
          prepareLambdaContext(
              context, (LambdaFunction) arg, arguments, node.getFuncName(), null);
      RexNode lambdaNode = analyze(arg, lambdaContext);
      if (node.getFuncName().equalsIgnoreCase("reduce")) {
        lambdaContext =
            prepareLambdaContext(
                context,
                (LambdaFunction) arg,
                arguments,
                node.getFuncName(),
                lambdaNode.getType());
        lambdaNode = analyze(arg, lambdaContext);
      }
      arguments.add(lambdaNode);
      // Capture any external variables that were referenced in the lambda
      capturedVars = lambdaContext.getCapturedVariables();
    } else {
      arguments.add(analyze(arg, context));
    }
  }
} finally {
  // Restore the previous inCoalesceFunction state
  context.setInCoalesceFunction(wasInCoalesce);
Duplicate Test

testCoalesceNullWithInteger and testCoalesceNullWithIntegerResult both test COALESCE(null, 42) — the first checks the logical plan and type, the second checks the result. Consider merging them into a single test or clarifying the distinction more explicitly in the test names and comments.

@Test
public void testCoalesceNullWithInteger() {
  // Regression test for #5175: COALESCE(null, 42) should return INTEGER, not VARCHAR
  String ppl = "source=EMP | eval x = coalesce(null, 42) | fields EMPNO, x | head 1";
  RelNode root = getRelNode(ppl);
  String expectedLogical =
      "LogicalSort(fetch=[1])\n"
          + "  LogicalProject(EMPNO=[$0], x=[COALESCE(null:NULL, 42)])\n"
          + "    LogicalTableScan(table=[[scott, EMP]])\n";
  verifyLogical(root, expectedLogical);
  // Verify the COALESCE return type is INTEGER, not VARCHAR
  org.apache.calcite.rel.logical.LogicalSort sort =
      (org.apache.calcite.rel.logical.LogicalSort) root;
  org.apache.calcite.rel.logical.LogicalProject proj =
      (org.apache.calcite.rel.logical.LogicalProject) sort.getInput();
  org.apache.calcite.rex.RexNode coalesceExpr = proj.getProjects().get(1);
  org.junit.Assert.assertEquals(
      org.apache.calcite.sql.type.SqlTypeName.INTEGER, coalesceExpr.getType().getSqlTypeName());
}

@Test
public void testCoalesceIntegerWithNull() {
  // Regression test for #5175: COALESCE(42, null) should return INTEGER, not VARCHAR
  String ppl = "source=EMP | eval x = coalesce(42, null) | fields EMPNO, x | head 1";
  RelNode root = getRelNode(ppl);
  String expectedLogical =
      "LogicalSort(fetch=[1])\n"
          + "  LogicalProject(EMPNO=[$0], x=[COALESCE(42, null:NULL)])\n"
          + "    LogicalTableScan(table=[[scott, EMP]])\n";
  verifyLogical(root, expectedLogical);
  // Verify the COALESCE return type is INTEGER, not VARCHAR
  org.apache.calcite.rel.logical.LogicalSort sort =
      (org.apache.calcite.rel.logical.LogicalSort) root;
  org.apache.calcite.rel.logical.LogicalProject proj =
      (org.apache.calcite.rel.logical.LogicalProject) sort.getInput();
  org.apache.calcite.rex.RexNode coalesceExpr = proj.getProjects().get(1);
  org.junit.Assert.assertEquals(
      org.apache.calcite.sql.type.SqlTypeName.INTEGER, coalesceExpr.getType().getSqlTypeName());
}

@Test
public void testCoalesceNullWithIntegerResult() {
  // Regression test for #5175: COALESCE(null, 42) should return numeric 42, not string "42"
  String ppl = "source=EMP | eval x = coalesce(null, 42) | fields x | head 1";
  RelNode root = getRelNode(ppl);
  verifyResult(root, "x=42\n");
}

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Ensure fallback type is consistently nullable

When leastRestrictive returns null for the non-null types, the fallback silently
returns VARCHAR, which can still produce wrong types for non-string inputs. The
nullability should also be set consistently on the fallback type, and ideally the
fallback should reflect whether any NULL operands were present.

core/src/main/java/org/opensearch/sql/expression/function/udf/condition/EnhancedCoalesceFunction.java [110-115]

 var commonType = opBinding.getTypeFactory().leastRestrictive(nonNullTypes);
 if (commonType != null) {
   // Ensure the result is nullable since at least one operand could be NULL
-  return opBinding.getTypeFactory().createTypeWithNullability(commonType, true);
+  boolean hasNullOperand = nonNullTypes.size() < operandTypes.size();
+  return opBinding.getTypeFactory().createTypeWithNullability(commonType, hasNullOperand || commonType.isNullable());
 }
-return opBinding.getTypeFactory().createSqlType(SqlTypeName.VARCHAR);
+return opBinding.getTypeFactory().createTypeWithNullability(
+    opBinding.getTypeFactory().createSqlType(SqlTypeName.VARCHAR), true);
Suggestion importance[1-10]: 4

__

Why: The suggestion to make the fallback VARCHAR type nullable is a minor improvement for consistency, but the leastRestrictive returning null for non-null concrete types is an edge case that rarely occurs in practice. The nullability change on the fallback is valid but low impact.

Low
Possible issue
Preserve COALESCE flag for nested function calls

When a non-COALESCE function is nested inside a COALESCE (e.g.,
COALESCE(some_func(field), 42)), the flag is cleared before processing some_func's
arguments. This means unresolved fields inside some_func that are direct arguments
of COALESCE won't get the null-replacement behavior. The flag should only be cleared
when entering a non-COALESCE function, but the wasInCoalesce state should still be
restored in the finally block, which it is — so the real issue is that clearing the
flag here prevents COALESCE's direct non-function arguments from being treated
correctly if they appear after a nested function call. Consider only clearing the
flag when !isCoalesce && !wasInCoalesce to avoid incorrectly suppressing the flag
for sibling arguments.

core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java [505-512]

 boolean wasInCoalesce = context.isInCoalesceFunction();
 if (isCoalesce) {
   context.setInCoalesceFunction(true);
-} else {
-  // Clear the flag for non-COALESCE nested functions so that unresolved fields
-  // inside e.g. array_compact(does_not_exist) still throw errors properly.
+} else if (!wasInCoalesce) {
+  // Only clear if we weren't already inside a COALESCE context
   context.setInCoalesceFunction(false);
 }
Suggestion importance[1-10]: 2

__

Why: The suggestion misunderstands the PR's intent. The PR deliberately clears inCoalesceFunction when entering a non-COALESCE nested function to prevent unresolved fields inside nested functions from silently becoming null. The finally block restores wasInCoalesce, so sibling arguments are handled correctly. The suggested change else if (!wasInCoalesce) would be a no-op since setInCoalesceFunction(false) when wasInCoalesce is already false changes nothing.

Low

@qianheng-aws qianheng-aws deleted the fix/coalesce-null-type-5175 branch April 4, 2026 06:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] COALESCE(null, 42) returns wrong type string instead of int due to leastRestrictive type inference

1 participant