Commit e5743a1

sundy-li, BohuTANG, and zhang2014 authored
fix(query): fix rule_eager_aggregation replace column index bug (#18079)
* add h
* test h r1
* chore(query): report case when then bug
* update
* update
* update
* update
* update

---------

Co-authored-by: BohuTANG <[email protected]>
Co-authored-by: zhang2014 <[email protected]>
1 parent 2901bc3 commit e5743a1

26 files changed: +1677 -216 lines changed

src/query/service/tests/it/sql/planner/optimizer/data/cases/Q03.txt

Lines changed: 0 additions & 79 deletions
This file was deleted.
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+name: "Q04"
+description: "Cross test"
+
+sql: |
+  SELECT SUM(i1.i), MIN(i1.i), MAX(i2.i) FROM integers i1, integers i2;
+
+
+auto_statistics: true
+
+good_plan: |
+  Result [output: DT.D_YEAR, ITEM.I_BRAND_ID, ITEM.I_BRAND, SUM(...)]
+  └── SortWithLimit [limit: 100]
+      ├── sort keys: [DT.D_YEAR ASC NULLS LAST, SUM(SS_EXT_SALES_PRICE) DESC NULLS FIRST, ITEM.I_BRAND_ID ASC NULLS LAST]
+      └── Aggregate [group by: DT.D_YEAR, ITEM.I_BRAND, ITEM.I_BRAND_ID]
+          └── Aggregate [group by: DT.D_YEAR, ITEM.I_BRAND, ITEM.I_BRAND_ID]
+              └── InnerJoin [join key: (DT.D_DATE_SK = STORE_SALES.SS_SOLD_DATE_SK)]
+                  ├── Filter [condition: DT.D_MOY = 11]
+                  │   └── TableScan (DATE_DIM as DT) [partitions: 1/1, bytes: 2,138,624]
+                  │       └── columns: [D_DATE_SK, D_YEAR, D_MOY]
+                  └── Aggregate [group by: ITEM.I_BRAND_ID, ITEM.I_BRAND, STORE_SALES.SS_SOLD_DATE_SK]
+                      └── InnerJoin [join key: (ITEM.I_ITEM_SK = STORE_SALES.SS_ITEM_SK)]
+                          ├── Aggregate [group by: ITEM.I_ITEM_SK, ITEM.I_BRAND_ID, ITEM.I_BRAND]
+                          │   └── Filter [condition: ITEM.I_MANUFACT_ID = 128]
+                          │       └── TableScan (ITEM) [partitions: 2/2, bytes: 23,811,584]
+                          │           └── columns: [I_ITEM_SK, I_BRAND_ID, I_BRAND, I_MANUFACT_ID]
+                          └── Aggregate [group by: STORE_SALES.SS_SOLD_DATE_SK, STORE_SALES.SS_ITEM_SK]
+                              └── Filter [condition: STORE_SALES.SS_SOLD_DATE_SK IS NOT NULL]
+                                  └── JoinFilter [join key: (DT.D_DATE_SK = STORE_SALES.SS_SOLD_DATE_SK)]
+                                      └── TableScan (STORE_SALES) [partitions: 70,412/72,718, bytes: 1,212,628,258,304]
+                                          └── columns: [SS_SOLD_DATE_SK, SS_ITEM_SK, SS_EXT_SALES_PRICE]
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+name: "Q99"
+description: "TPC-DS Query 99 optimizer test"
+
+sql: |
+  SELECT t.sell_mnt = 0 FROM (SELECT a.a0d, a.a0k, a.a0m, c.a5m, avg(CASE WHEN d.a1v = '603020' THEN 1 ELSE 0 END) + 3 AS sell_mnt FROM a0c AS a LEFT OUTER JOIN a1z AS b ON a.a0k = b.a0k AND a.a0n = b.a0n AND b.a2c <= a.a0d AND b.a2k > a.a0d LEFT OUTER JOIN a2x AS c ON a.a0m = c.a0m LEFT OUTER JOIN a5r AS d ON a.a0l = d.a5t WHERE a.a0d BETWEEN '20240526' AND '20250525' AND b.a2t = '624100' AND SUBSTRING(c.a4m FROM 20 FOR 1) = '1' AND SUBSTRING(d.a5w FROM 1 FOR 1) = '1' GROUP BY a.a0d, a.a0k, a.a0m, c.a5m) AS t;
+
+# Reference to external statistics file
+statistics_file: tpcds_obfuscated.yaml
+
+# Expected good plan after optimization
+good_plan: |
+  Result
+  └── Project [t.sell_mnt = 0]
+      └── SubqueryAlias [t]
+          └── Aggregate [aggExprs: [AVG(CASE WHEN d.a1v = '603020' THEN 1 ELSE 0 END) + 3 AS sell_mnt], groupKeys: [a.a0d, a.a0k, a.a0m, c.a5m]]
+              └── Filter [a.a0d BETWEEN '20240526' AND '20250525' AND b.a2t = '624100' AND SUBSTRING(c.a4m FROM 20 FOR 1) = '1' AND SUBSTRING(d.a5w FROM 1 FOR 1) = '1']
+                  └── LeftJoin [joinKey: (a.a0l = d.a5t)]
+                      ├── LeftJoin [joinKey: (a.a0m = c.a0m)]
+                      │   ├── LeftJoin [joinKey: (a.a0k = b.a0k AND a.a0n = b.a0n), joinFilter: (b.a2c <= a.a0d AND b.a2k > a.a0d)]
+                      │   │   ├── TableScan [a0c] [a0d, a0k, a0m, a0n, a0l] [partitions: 35/35, bytes: 5,772,964,979,745]
+                      │   │   └── TableScan [a1z] [a0k, a0n, a2c, a2k, a2t] [partitions: 1/1, bytes: 43,826,881,850]
+                      │   └── TableScan [a2x] [a0m, a4m, a5m] [partitions: 1/1, bytes: 375,779,508]
+                      └── TableScan [a5r] [a5t, a1v, a5w] [partitions: 1/1, bytes: 1,017,281]
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+name: "Q99"
+description: "TPC-DS Query 99 optimizer test"
+
+sql: |
+  SELECT t.sell_mnt = 0 FROM (SELECT a.a0d, a.a0k, a.a0m, c.a5m, sum(CASE WHEN d.a1v = '603020' THEN 1 ELSE 0 END) AS sell_mnt FROM a0c AS a LEFT OUTER JOIN a1z AS b ON a.a0k = b.a0k AND a.a0n = b.a0n AND b.a2c <= a.a0d AND b.a2k > a.a0d LEFT OUTER JOIN a2x AS c ON a.a0m = c.a0m LEFT OUTER JOIN a5r AS d ON a.a0l = d.a5t WHERE a.a0d BETWEEN '20240526' AND '20250525' AND b.a2t = '624100' AND SUBSTRING(c.a4m FROM 20 FOR 1) = '1' AND SUBSTRING(d.a5w FROM 1 FOR 1) = '1' GROUP BY a.a0d, a.a0k, a.a0m, c.a5m) AS t;
+
+# Reference to external statistics file
+statistics_file: tpcds_obfuscated.yaml
+
+# Expected good plan after optimization
+good_plan: |
+  Result
+  └── Project [t.sell_mnt = 0]
+      └── SubqueryAlias [t]
+          └── Aggregate [aggExprs: [SUM(CASE WHEN d.a1v = '603020' THEN 1 ELSE 0 END) AS sell_mnt], groupKeys: [a.a0d, a.a0k, a.a0m, c.a5m]]
+              └── Filter [a.a0d BETWEEN '20240526' AND '20250525' AND b.a2t = '624100' AND SUBSTRING(c.a4m FROM 20 FOR 1) = '1' AND SUBSTRING(d.a5w FROM 1 FOR 1) = '1']
+                  └── LeftJoin [joinKey: (a.a0l = d.a5t)]
+                      ├── LeftJoin [joinKey: (a.a0m = c.a0m)]
+                      │   ├── LeftJoin [joinKey: (a.a0k = b.a0k AND a.a0n = b.a0n), joinFilter: (b.a2c <= a.a0d AND b.a2k > a.a0d)]
+                      │   │   ├── TableScan [a0c] [a0d, a0k, a0m, a0n, a0l] [partitions: 35/35, bytes: 5,772,964,979,745]
+                      │   │   └── TableScan [a1z] [a0k, a0n, a2c, a2k, a2t] [partitions: 1/1, bytes: 43,826,881,850]
+                      │   └── TableScan [a2x] [a0m, a4m, a5m] [partitions: 1/1, bytes: 375,779,508]
+                      └── TableScan [a5r] [a5t, a1v, a5w] [partitions: 1/1, bytes: 1,017,281]
Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+Limit
+├── limit: [100]
+├── offset: [0]
+└── Sort
+    ├── sort keys: [default.customer.c_customer_id (#79) ASC NULLS LAST]
+    ├── limit: [100]
+    └── Exchange(MergeSort)
+        └── Sort
+            ├── sort keys: [default.customer.c_customer_id (#79) ASC NULLS LAST]
+            ├── limit: [100]
+            └── EvalScalar
+                ├── scalars: [customer.c_customer_id (#79) AS (#79), ctr1.ctr_total_return (#48) AS (#154), scalar_subquery_147 (#147) AS (#155), store.s_store_sk (#49) AS (#156), ctr1.ctr_store_sk (#7) AS (#157), store.s_state (#73) AS (#158), ctr1.ctr_customer_sk (#3) AS (#159), customer.c_customer_sk (#78) AS (#160)]
+                └── Join(Inner)
+                    ├── build keys: [ctr1.ctr_customer_sk (#3)]
+                    ├── probe keys: [customer.c_customer_sk (#78)]
+                    ├── other filters: []
+                    ├── Scan
+                    │   ├── table: default.customer (#3)
+                    │   ├── filters: []
+                    │   ├── order by: []
+                    │   └── limit: NONE
+                    └── Exchange(Broadcast)
+                        └── Join(Inner)
+                            ├── build keys: [sr_store_sk (#103)]
+                            ├── probe keys: [sr_store_sk (#7)]
+                            ├── other filters: [gt(ctr1.ctr_total_return (#48), scalar_subquery_147 (#147))]
+                            ├── Aggregate(Final)
+                            │   ├── group items: [store_returns.sr_customer_sk (#3) AS (#3), store_returns.sr_store_sk (#7) AS (#7)]
+                            │   ├── aggregate functions: [Sum(sr_return_amt) AS (#48)]
+                            │   └── Aggregate(Partial)
+                            │       ├── group items: [store_returns.sr_customer_sk (#3) AS (#3), store_returns.sr_store_sk (#7) AS (#7)]
+                            │       ├── aggregate functions: [Sum(sr_return_amt) AS (#48)]
+                            │       └── Exchange(Hash)
+                            │           ├── Exchange(Hash): keys: [store_returns.sr_customer_sk (#3)]
+                            │           └── EvalScalar
+                            │               ├── scalars: [store_returns.sr_customer_sk (#3) AS (#3), store_returns.sr_store_sk (#7) AS (#7), store_returns.sr_return_amt (#11) AS (#11), store_returns.sr_returned_date_sk (#0) AS (#148), date_dim.d_date_sk (#20) AS (#149), date_dim.d_year (#26) AS (#150)]
+                            │               └── Join(Inner)
+                            │                   ├── build keys: [date_dim.d_date_sk (#20)]
+                            │                   ├── probe keys: [store_returns.sr_returned_date_sk (#0)]
+                            │                   ├── other filters: []
+                            │                   ├── Scan
+                            │                   │   ├── table: default.store_returns (#0)
+                            │                   │   ├── filters: []
+                            │                   │   ├── order by: []
+                            │                   │   └── limit: NONE
+                            │                   └── Exchange(Broadcast)
+                            │                       └── Scan
+                            │                           ├── table: default.date_dim (#1)
+                            │                           ├── filters: [eq(date_dim.d_year (#26), 2001)]
+                            │                           ├── order by: []
+                            │                           └── limit: NONE
+                            └── Exchange(Broadcast)
+                                └── Join(Inner)
+                                    ├── build keys: [sr_store_sk (#103)]
+                                    ├── probe keys: [store.s_store_sk (#49)]
+                                    ├── other filters: []
+                                    ├── Scan
+                                    │   ├── table: default.store (#2)
+                                    │   ├── filters: [eq(store.s_state (#73), 'TN')]
+                                    │   ├── order by: []
+                                    │   └── limit: NONE
+                                    └── Exchange(Broadcast)
+                                        └── EvalScalar
+                                            ├── scalars: [outer.sr_store_sk (#103) AS (#103), multiply(divide(sum(ctr_total_return) (#145), if(eq(count(ctr_total_return) (#146), 0), 1, count(ctr_total_return) (#146))), 1.2) AS (#147)]
+                                            └── Aggregate(Final)
+                                                ├── group items: [outer.sr_store_sk (#103) AS (#103)]
+                                                ├── aggregate functions: [sum(ctr_total_return) AS (#145), count(ctr_total_return) AS (#146)]
+                                                └── Aggregate(Partial)
+                                                    ├── group items: [outer.sr_store_sk (#103) AS (#103)]
+                                                    ├── aggregate functions: [sum(ctr_total_return) AS (#145), count(ctr_total_return) AS (#146)]
+                                                    └── Exchange(Hash)
+                                                        ├── Exchange(Hash): keys: [outer.sr_store_sk (#103)]
+                                                        └── Aggregate(Final)
+                                                            ├── group items: [store_returns.sr_customer_sk (#99) AS (#99), store_returns.sr_store_sk (#103) AS (#103)]
+                                                            ├── aggregate functions: [Sum(sr_return_amt) AS (#144)]
+                                                            └── Aggregate(Partial)
+                                                                ├── group items: [store_returns.sr_customer_sk (#99) AS (#99), store_returns.sr_store_sk (#103) AS (#103)]
+                                                                ├── aggregate functions: [Sum(sr_return_amt) AS (#144)]
+                                                                └── Exchange(Hash)
+                                                                    ├── Exchange(Hash): keys: [store_returns.sr_customer_sk (#99)]
+                                                                    └── EvalScalar
+                                                                        ├── scalars: [store_returns.sr_customer_sk (#99) AS (#99), store_returns.sr_store_sk (#103) AS (#103), store_returns.sr_return_amt (#107) AS (#107), store_returns.sr_returned_date_sk (#96) AS (#151), date_dim.d_date_sk (#116) AS (#152), date_dim.d_year (#122) AS (#153)]
+                                                                        └── Join(Inner)
+                                                                            ├── build keys: [date_dim.d_date_sk (#116)]
+                                                                            ├── probe keys: [store_returns.sr_returned_date_sk (#96)]
+                                                                            ├── other filters: []
+                                                                            ├── Scan
+                                                                            │   ├── table: default.store_returns (#4)
+                                                                            │   ├── filters: []
+                                                                            │   ├── order by: []
+                                                                            │   └── limit: NONE
+                                                                            └── Exchange(Broadcast)
+                                                                                └── Scan
+                                                                                    ├── table: default.date_dim (#5)
+                                                                                    ├── filters: [eq(date_dim.d_year (#122), 2001)]
+                                                                                    ├── order by: []
+                                                                                    └── limit: NONE
+
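
Reviewer note: the `EvalScalar` over `outer.sr_store_sk` in the plan above computes `multiply(divide(sum(...), if(eq(count(...), 0), 1, count(...))), 1.2)`, which reads like an average scaled by 1.2, rewritten as a sum divided by a count with a guard against division by zero. The Python sketch below only illustrates that arithmetic. The function name `guarded_avg_times` and the use of `None` for SQL NULL are assumptions for illustration and are not taken from the optimizer code.

```python
from typing import Optional, Sequence


def guarded_avg_times(values: Sequence[Optional[float]],
                      factor: float = 1.2) -> Optional[float]:
    """Hypothetical model of the plan's scalar expression.

    Mirrors multiply(divide(sum, if(eq(count, 0), 1, count)), factor).
    None stands in for SQL NULL (an assumption of this sketch).
    """
    # SQL aggregates ignore NULL inputs.
    non_null = [v for v in values if v is not None]

    # count(...) over the non-NULL inputs.
    count = len(non_null)

    # sum(...) over zero non-NULL rows is NULL in SQL, so the result is NULL too.
    if count == 0:
        return None
    total = sum(non_null)

    # The if(eq(count, 0), 1, count) guard: never divide by zero.
    safe_count = 1 if count == 0 else count

    return (total / safe_count) * factor
```

The guard is redundant in this Python model because the empty case returns early. In the plan it matters because the sum and count arrive as separate aggregate columns, so the division has to be safe on its own.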
