Commit 65552a8

yaooqinn authored and cloud-fan committed
[SPARK-30083][SQL] visitArithmeticUnary should wrap PLUS case with UnaryPositive for type checking
### What changes were proposed in this pull request?

`UnaryPositive` only accepts numeric and interval types, as defined, but `AstBuilder.visitArithmeticUnary` simply bypasses it. The wrapper should not be omitted, because type checking depends on it.

### Why are the changes needed?

Bug fix. A prior discussion can be found at apache#26578 (comment).

### Does this PR introduce any user-facing change?

Yes, `+` applied to a value that is neither numeric nor interval is now invalid.

```
-- !query 14
select +date '1900-01-01'
-- !query 14 schema
struct<DATE '1900-01-01':date>
-- !query 14 output
1900-01-01

-- !query 15
select +timestamp '1900-01-01'
-- !query 15 schema
struct<TIMESTAMP '1900-01-01 00:00:00':timestamp>
-- !query 15 output
1900-01-01 00:00:00

-- !query 16
select +map(1, 2)
-- !query 16 schema
struct<map(1, 2):map<int,int>>
-- !query 16 output
{1:2}

-- !query 17
select +array(1,2)
-- !query 17 schema
struct<array(1, 2):array<int>>
-- !query 17 output
[1,2]

-- !query 18
select -'1'
-- !query 18 schema
struct<(- CAST(1 AS DOUBLE)):double>
-- !query 18 output
-1.0

-- !query 19
select -X'1'
-- !query 19 schema
struct<>
-- !query 19 output
org.apache.spark.sql.AnalysisException
cannot resolve '(- X'01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'X'01'' is of binary type.; line 1 pos 7

-- !query 20
select +X'1'
-- !query 20 schema
struct<X'01':binary>
-- !query 20 output
```

### How was this patch tested?

Added unit test checks.

Closes apache#26716 from yaooqinn/SPARK-30083.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
1 parent 39291cf commit 65552a8

11 files changed: +145 -45 lines changed


docs/sql-migration-guide.md

Lines changed: 2 additions & 0 deletions
@@ -254,6 +254,8 @@ license: |
 </tr>
 </table>

+- Since Spark 3.0, the unary arithmetic operator plus(`+`) only accepts string, numeric and interval type values as inputs. Besides, `+` with a integral string representation will be coerced to double value, e.g. `+'1'` results `1.0`. In Spark version 2.4 and earlier, this operator is ignored. There is no type checking for it, thus, all type values with a `+` prefix are valid, e.g. `+ array(1, 2)` is valid and results `[1, 2]`. Besides, there is no type coercion for it at all, e.g. in Spark 2.4, the result of `+'1'` is string `1`.
+
 ## Upgrading from Spark SQL 2.4 to 2.4.1

 - The value of `spark.executor.heartbeatInterval`, when specified without units like "30" rather than "30s", was
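
The migration note reads as a small decision table, so a toy model may help when reviewing it. The sketch below is plain Scala with no Spark dependency; `SqlType`, `UnaryPlusModel`, `resultType24` and `resultType30` are hypothetical names chosen for illustration and are not part of Spark's API. It only encodes what the note states: 2.4 passes every operand through, 3.0 keeps numeric and interval operands, coerces strings to double, and rejects the rest.

```scala
// Toy model of the migration note; none of these names exist in Spark.
sealed trait SqlType
case object IntType extends SqlType
case object DoubleType extends SqlType
case object StringType extends SqlType
case object IntervalType extends SqlType
case object DateType extends SqlType
final case class ArrayType(element: SqlType) extends SqlType

object UnaryPlusModel {
  // Spark 2.4 and earlier: `+` is ignored, so the operand type passes through.
  def resultType24(operand: SqlType): Either[String, SqlType] = Right(operand)

  // Spark 3.0: numeric and interval operands pass, strings are coerced to
  // double, and every other type is rejected.
  def resultType30(operand: SqlType): Either[String, SqlType] = operand match {
    case IntType | DoubleType | IntervalType => Right(operand)
    case StringType => Right(DoubleType)
    case other => Left(s"unary plus does not accept $other")
  }
}
```

Under this model, `resultType24(ArrayType(IntType))` succeeds while `resultType30(ArrayType(IntType))` returns a `Left`, which mirrors the `+ array(1, 2)` example in the note.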

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala

Lines changed: 1 addition & 0 deletions
@@ -64,6 +64,7 @@ package object dsl {
   trait ImplicitOperators {
     def expr: Expression

+    def unary_+ : Expression = UnaryPositive(expr)
     def unary_- : Expression = UnaryMinus(expr)
     def unary_! : Predicate = Not(expr)
     def unary_~ : Expression = BitwiseNot(expr)
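
For readers less familiar with Scala: a method named `unary_+` is what the compiler invokes for a prefix `+`, which is why this one-line addition lets the test DSL write `+'a`. The standalone sketch below shows the mechanism; `Expr`, `Attr`, `Plus` and `Minus` are hypothetical stand-ins for the Catalyst classes, not the real ones.

```scala
// Stand-ins for Catalyst expressions; hypothetical names, not Spark classes.
sealed trait Expr {
  // Scala rewrites the prefix form `+e` to `e.unary_+` and `-e` to `e.unary_-`.
  def unary_+ : Expr = Plus(this)
  def unary_- : Expr = Minus(this)
}
final case class Attr(name: String) extends Expr
final case class Plus(child: Expr) extends Expr
final case class Minus(child: Expr) extends Expr

object PrefixDemo {
  val a: Expr = Attr("a")
  // Prefix plus now builds a node instead of returning the operand.
  val positive: Expr = +a
  // Parentheses are needed: `-+a` would be lexed as a single operator `-+`.
  val nested: Expr = -(+a)
}
```

Here `PrefixDemo.positive` is `Plus(Attr("a"))` and `PrefixDemo.nested` is `Minus(Plus(Attr("a")))`.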

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

Lines changed: 1 addition & 1 deletion
@@ -1461,7 +1461,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
     val value = expression(ctx.valueExpression)
     ctx.operator.getType match {
       case SqlBaseParser.PLUS =>
-        value
+        UnaryPositive(value)
       case SqlBaseParser.MINUS =>
         UnaryMinus(value)
       case SqlBaseParser.TILDE =>
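
Why this one-line change matters can be sketched with a simplified tree: if the parser returns the operand untouched, no node records the `+`, so a per-node type check has nothing to inspect. The model below is an assumption-laden illustration, not Catalyst itself; `DataType`, `Literal`, `UnaryPositive`, `visitPlusOld`, `visitPlusNew` and `check` are all hypothetical definitions local to this example, and only integers count as numeric here.

```scala
// Simplified model; all names are hypothetical and only mimic the shape of the change.
sealed trait DataType
case object IntegerType extends DataType
case object BinaryType extends DataType

sealed trait Expression { def dataType: DataType }
final case class Literal(dataType: DataType) extends Expression
final case class UnaryPositive(child: Expression) extends Expression {
  def dataType: DataType = child.dataType
}

object UnaryPlusParsing {
  // Old behaviour: PLUS returns the operand untouched, so no node records the operator.
  def visitPlusOld(value: Expression): Expression = value

  // New behaviour: PLUS wraps the operand, leaving a node that analysis can check.
  def visitPlusNew(value: Expression): Expression = UnaryPositive(value)

  // Per-node check: only UnaryPositive nodes constrain the type of their child.
  def check(e: Expression): Either[String, Unit] = e match {
    case UnaryPositive(child) if child.dataType != IntegerType =>
      Left(s"unary plus requires a numeric child, found ${child.dataType}")
    case UnaryPositive(child) => check(child)
    case _ => Right(())
  }
}
```

With a binary literal, `check(visitPlusOld(lit))` passes because the tree contains no operator node, while `check(visitPlusNew(lit))` fails, which matches the new `select +X'1'` result in the test output.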

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala

Lines changed: 2 additions & 2 deletions
@@ -226,10 +226,10 @@ class ExpressionParserSuite extends AnalysisTest {
   }

   test("unary arithmetic expressions") {
-    assertEqual("+a", 'a)
+    assertEqual("+a", +'a)
     assertEqual("-a", -'a)
     assertEqual("~a", ~'a)
-    assertEqual("-+~~a", -(~(~'a)))
+    assertEqual("-+~~a", -( +(~(~'a))))
   }

   test("cast expressions") {
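
The updated expectation for `-+~~a` says that every prefix operator, including `+`, now contributes one node, outermost first. A minimal sketch of that shape, assuming a toy grammar of prefix operators followed by a single identifier (`Node`, `Name`, `Pos`, `Neg`, `BitNot` and `PrefixParser` are hypothetical and unrelated to Spark's ANTLR-based parser):

```scala
// Toy prefix-chain parser; hypothetical names, not Spark's parser.
sealed trait Node
final case class Name(id: String) extends Node
final case class Pos(child: Node) extends Node
final case class Neg(child: Node) extends Node
final case class BitNot(child: Node) extends Node

object PrefixParser {
  // Consumes one prefix operator at a time and wraps the parse of the remainder,
  // so the first operator in the text becomes the outermost node.
  def parse(input: String): Node = input.headOption match {
    case Some('+') => Pos(parse(input.tail))
    case Some('-') => Neg(parse(input.tail))
    case Some('~') => BitNot(parse(input.tail))
    case _ => Name(input)
  }
}
```

`PrefixParser.parse("-+~~a")` yields `Neg(Pos(BitNot(BitNot(Name("a")))))`, the same nesting as `-( +(~(~'a)))` in the test.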

sql/core/src/test/resources/sql-tests/inputs/literals.sql

Lines changed: 5 additions & 0 deletions
@@ -110,6 +110,11 @@ select -integer '7';
 select +integer '7';
 select +date '1999-01-01';
 select +timestamp '1999-01-01';
+select +interval '1 day';
+select +map(1, 2);
+select +array(1,2);
+select +named_struct('a', 1, 'b', 'spark');
+select +X'1';
 -- can't negate date/timestamp/binary
 select -date '1999-01-01';
 select -timestamp '1999-01-01';

sql/core/src/test/resources/sql-tests/results/ansi/interval.sql.out

Lines changed: 2 additions & 2 deletions
@@ -255,15 +255,15 @@ struct<(- INTERVAL '-1 months 1 days -1 seconds'):interval>
 -- !query 31
 select +interval '-1 month 1 day -1 second'
 -- !query 31 schema
-struct<INTERVAL '-1 months 1 days -1 seconds':interval>
+struct<(+ INTERVAL '-1 months 1 days -1 seconds'):interval>
 -- !query 31 output
 -1 months 1 days -1 seconds


 -- !query 32
 select +interval -1 month 1 day -1 second
 -- !query 32 schema
-struct<INTERVAL '-1 months 1 days -1 seconds':interval>
+struct<(+ INTERVAL '-1 months 1 days -1 seconds'):interval>
 -- !query 32 output
 -1 months 1 days -1 seconds

sql/core/src/test/resources/sql-tests/results/ansi/literals.sql.out

Lines changed: 59 additions & 13 deletions
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 50
+-- Number of queries: 55


 -- !query 0
@@ -433,49 +433,95 @@ struct<(- 7):int>
 -- !query 44
 select +integer '7'
 -- !query 44 schema
-struct<7:int>
+struct<(+ 7):int>
 -- !query 44 output
 7


 -- !query 45
 select +date '1999-01-01'
 -- !query 45 schema
-struct<DATE '1999-01-01':date>
+struct<>
 -- !query 45 output
-1999-01-01
+org.apache.spark.sql.AnalysisException
+cannot resolve '(+ DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7


 -- !query 46
 select +timestamp '1999-01-01'
 -- !query 46 schema
-struct<TIMESTAMP '1999-01-01 00:00:00':timestamp>
+struct<>
 -- !query 46 output
-1999-01-01 00:00:00
+org.apache.spark.sql.AnalysisException
+cannot resolve '(+ TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7


 -- !query 47
-select -date '1999-01-01'
+select +interval '1 day'
 -- !query 47 schema
-struct<>
+struct<(+ INTERVAL '1 days'):interval>
 -- !query 47 output
-org.apache.spark.sql.AnalysisException
-cannot resolve '(- DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7
+1 days


 -- !query 48
-select -timestamp '1999-01-01'
+select +map(1, 2)
 -- !query 48 schema
 struct<>
 -- !query 48 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '(- TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7
+cannot resolve '(+ map(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'map(1, 2)' is of map<int,int> type.; line 1 pos 7


 -- !query 49
-select -x'2379ACFe'
+select +array(1,2)
 -- !query 49 schema
 struct<>
 -- !query 49 output
 org.apache.spark.sql.AnalysisException
+cannot resolve '(+ array(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'array(1, 2)' is of array<int> type.; line 1 pos 7
+
+
+-- !query 50
+select +named_struct('a', 1, 'b', 'spark')
+-- !query 50 schema
+struct<>
+-- !query 50 output
+org.apache.spark.sql.AnalysisException
+cannot resolve '(+ named_struct('a', 1, 'b', 'spark'))' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'named_struct('a', 1, 'b', 'spark')' is of struct<a:int,b:string> type.; line 1 pos 7
+
+
+-- !query 51
+select +X'1'
+-- !query 51 schema
+struct<>
+-- !query 51 output
+org.apache.spark.sql.AnalysisException
+cannot resolve '(+ X'01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'X'01'' is of binary type.; line 1 pos 7
+
+
+-- !query 52
+select -date '1999-01-01'
+-- !query 52 schema
+struct<>
+-- !query 52 output
+org.apache.spark.sql.AnalysisException
+cannot resolve '(- DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7
+
+
+-- !query 53
+select -timestamp '1999-01-01'
+-- !query 53 schema
+struct<>
+-- !query 53 output
+org.apache.spark.sql.AnalysisException
+cannot resolve '(- TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7
+
+
+-- !query 54
+select -x'2379ACFe'
+-- !query 54 schema
+struct<>
+-- !query 54 output
+org.apache.spark.sql.AnalysisException
 cannot resolve '(- X'2379ACFE')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'X'2379ACFE'' is of binary type.; line 1 pos 7

sql/core/src/test/resources/sql-tests/results/interval.sql.out

Lines changed: 2 additions & 2 deletions
@@ -255,15 +255,15 @@ struct<(- INTERVAL '-1 months 1 days -1 seconds'):interval>
 -- !query 31
 select +interval '-1 month 1 day -1 second'
 -- !query 31 schema
-struct<INTERVAL '-1 months 1 days -1 seconds':interval>
+struct<(+ INTERVAL '-1 months 1 days -1 seconds'):interval>
 -- !query 31 output
 -1 months 1 days -1 seconds


 -- !query 32
 select +interval -1 month 1 day -1 second
 -- !query 32 schema
-struct<INTERVAL '-1 months 1 days -1 seconds':interval>
+struct<(+ INTERVAL '-1 months 1 days -1 seconds'):interval>
 -- !query 32 output
 -1 months 1 days -1 seconds

sql/core/src/test/resources/sql-tests/results/literals.sql.out

Lines changed: 59 additions & 13 deletions
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 50
+-- Number of queries: 55


 -- !query 0
@@ -433,49 +433,95 @@ struct<(- 7):int>
 -- !query 44
 select +integer '7'
 -- !query 44 schema
-struct<7:int>
+struct<(+ 7):int>
 -- !query 44 output
 7


 -- !query 45
 select +date '1999-01-01'
 -- !query 45 schema
-struct<DATE '1999-01-01':date>
+struct<>
 -- !query 45 output
-1999-01-01
+org.apache.spark.sql.AnalysisException
+cannot resolve '(+ DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7


 -- !query 46
 select +timestamp '1999-01-01'
 -- !query 46 schema
-struct<TIMESTAMP '1999-01-01 00:00:00':timestamp>
+struct<>
 -- !query 46 output
-1999-01-01 00:00:00
+org.apache.spark.sql.AnalysisException
+cannot resolve '(+ TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7


 -- !query 47
-select -date '1999-01-01'
+select +interval '1 day'
 -- !query 47 schema
-struct<>
+struct<(+ INTERVAL '1 days'):interval>
 -- !query 47 output
-org.apache.spark.sql.AnalysisException
-cannot resolve '(- DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7
+1 days


 -- !query 48
-select -timestamp '1999-01-01'
+select +map(1, 2)
 -- !query 48 schema
 struct<>
 -- !query 48 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '(- TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7
+cannot resolve '(+ map(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'map(1, 2)' is of map<int,int> type.; line 1 pos 7


 -- !query 49
-select -x'2379ACFe'
+select +array(1,2)
 -- !query 49 schema
 struct<>
 -- !query 49 output
 org.apache.spark.sql.AnalysisException
+cannot resolve '(+ array(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'array(1, 2)' is of array<int> type.; line 1 pos 7
+
+
+-- !query 50
+select +named_struct('a', 1, 'b', 'spark')
+-- !query 50 schema
+struct<>
+-- !query 50 output
+org.apache.spark.sql.AnalysisException
+cannot resolve '(+ named_struct('a', 1, 'b', 'spark'))' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'named_struct('a', 1, 'b', 'spark')' is of struct<a:int,b:string> type.; line 1 pos 7
+
+
+-- !query 51
+select +X'1'
+-- !query 51 schema
+struct<>
+-- !query 51 output
+org.apache.spark.sql.AnalysisException
+cannot resolve '(+ X'01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'X'01'' is of binary type.; line 1 pos 7
+
+
+-- !query 52
+select -date '1999-01-01'
+-- !query 52 schema
+struct<>
+-- !query 52 output
+org.apache.spark.sql.AnalysisException
+cannot resolve '(- DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7
+
+
+-- !query 53
+select -timestamp '1999-01-01'
+-- !query 53 schema
+struct<>
+-- !query 53 output
+org.apache.spark.sql.AnalysisException
+cannot resolve '(- TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7
+
+
+-- !query 54
+select -x'2379ACFe'
+-- !query 54 schema
+struct<>
+-- !query 54 output
+org.apache.spark.sql.AnalysisException
 cannot resolve '(- X'2379ACFE')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'X'2379ACFE'' is of binary type.; line 1 pos 7
