Commit 75ec86f

docs: SQL API joins through __cubeJoinField
1 parent 291977b commit 75ec86f

1 file changed

+60
-98
lines changed

docs/content/SQL-API/Joins.mdx

Lines changed: 60 additions & 98 deletions
Original file line number | Diff line number | Diff line change
@@ -6,70 +6,68 @@ subCategory: Reference
66
menuOrder: 6
77
---
88

9-
<WarningBox>
10-
11-
The Cube SQL API currently does not support `INNER JOIN`, `LEFT JOIN`,
12-
`RIGHT JOIN` and `FULL OUTER JOIN`. We plan to support these types of joins in
13-
future releases.
14-
15-
</WarningBox>
16-
17-
The SQL API supports `CROSS JOIN`s between cubes. When a `CROSS JOIN` is being
18-
processed by Cube, it generates the correct joining conditions for the
19-
underlying data source.
9+
The SQL API supports joins through the `__cubeJoinField` virtual column, which is available in every cube table.
10+
Joins can also be done through `CROSS JOIN`.
11+
Using `__cubeJoinField` in a join instructs Cube to perform the join as it's defined in the data schema.
12+
Cube generates the correct joining conditions for the underlying data source.
2013

2114
For example, the following query joins the `Orders` and `Products` tables under
2215
the hood with `Orders.product_id = Products.id`, exactly the same way as the
2316
REST API query does:
2417

2518
```sql
26-
cube=> SELECT * FROM Orders CROSS JOIN Products LIMIT 5;
27-
count | avgValue | totalValue | number | value | status | createdAt | completedAt | __user | count | name | description | createdAt | __user
28-
-------+----------+------------+--------+-------+-----------+----------------------------+----------------------------+--------+-------+---------------------------+------------------------------------+----------------------------+--------
29-
1 | 20 | 20 | 40 | 20 | completed | 2020-10-26 00:00:00.000000 | 2020-11-07 00:00:00.000000 | | 1 | Incredible Fresh Chicken | Electronics Generic Fresh Computer | 2020-06-29 00:00:00.000000 |
30-
1 | 20 | 20 | 14 | 20 | completed | 2021-02-07 00:00:00.000000 | 2021-03-02 00:00:00.000000 | | 1 | Unbranded Wooden Mouse | Outdoors Incredible Rubber Car | 2019-07-16 00:00:00.000000 |
31-
1 | 20 | 20 | 23 | 20 | completed | 2022-07-23 00:00:00.000000 | 2022-08-11 00:00:00.000000 | | 1 | Handcrafted Plastic Chair | Electronics Sleek Rubber Tuna | 2021-02-27 00:00:00.000000 |
32-
1 | 20 | 20 | 86 | 20 | completed | 2023-04-19 00:00:00.000000 | 2023-04-25 00:00:00.000000 | | 1 | Practical Metal Chicken | Toys Awesome Frozen Chips | 2020-07-24 00:00:00.000000 |
33-
1 | 20 | 20 | 27 | 20 | completed | 2019-06-27 00:00:00.000000 | 2019-07-21 00:00:00.000000 | | 1 | Sleek Rubber Chair | Computers Refined Cotton Shirt | 2021-09-26 00:00:00.000000 |
19+
cube=> SELECT p.name, SUM(o.count) FROM Orders o LEFT JOIN Products p ON o.__cubeJoinField = p.__cubeJoinField GROUP BY 1 LIMIT 5;
20+
name | SUM(o.count)
21+
--------------------------+--------------
22+
Tasty Plastic Mouse | 121
23+
Intelligent Cotton Ball | 119
24+
Ergonomic Steel Tuna | 116
25+
Intelligent Rubber Pants | 116
26+
Generic Wooden Gloves | 116
27+
(5 rows)
28+
```
29+
30+
Or through `CROSS JOIN`:
31+
32+
```sql
33+
cube=> SELECT p.name, sum(o.count) FROM Orders o CROSS JOIN Products p GROUP BY 1 LIMIT 5;
34+
name | SUM(o.count)
35+
--------------------------+--------------
36+
Tasty Plastic Mouse | 121
37+
Intelligent Cotton Ball | 119
38+
Ergonomic Steel Tuna | 116
39+
Intelligent Rubber Pants | 116
40+
Generic Wooden Gloves | 116
3441
(5 rows)
3542
```
3643

3744
In the resulting query plan, you won't see any joins as you can't see those for
3845
REST API queries either:
3946

4047
```sql
41-
cube=> EXPLAIN SELECT * FROM Orders CROSS JOIN Products LIMIT 5;
42-
plan_type | plan
43-
---------------+-----------------------------
44-
logical_plan | CubeScan: request={ +
45-
| "measures": [ +
46-
| "Orders.count", +
47-
| "Orders.avgValue", +
48-
| "Orders.totalValue", +
49-
| "Orders.number", +
50-
| "Products.count" +
51-
| ], +
52-
| "dimensions": [ +
53-
| "Orders.value", +
54-
| "Orders.status", +
55-
| "Orders.createdAt", +
56-
| "Orders.completedAt", +
57-
| "Products.name", +
58-
| "Products.description",+
59-
| "Products.createdAt" +
60-
| ], +
61-
| "segments": [], +
62-
| "limit": 5 +
48+
cube=> EXPLAIN SELECT p.name, sum(o.count) FROM Orders o LEFT JOIN Products p ON o.__cubeJoinField = p.__cubeJoinField GROUP BY 1 LIMIT 5;
49+
plan_type | plan
50+
---------------+-----------------------
51+
logical_plan | CubeScan: request={ +
52+
| "measures": [ +
53+
| "Orders.count" +
54+
| ], +
55+
| "dimensions": [ +
56+
| "Products.name" +
57+
| ], +
58+
| "segments": [], +
59+
| "limit": 5 +
6360
| }
64-
physical_plan | CubeScanExecutionPlan +
65-
|
61+
physical_plan | CubeScanExecutionPlan+
62+
|
6663
(2 rows)
6764
```
6865

69-
This feature allows you to `CROSS JOIN` cubes even with transitive joins only.
66+
This feature allows you to join cubes even if they are only joined transitively.
7067

71-
Typically, in tools that allow defining custom SQL datasets, you'd use joined
72-
tables as a dataset SQL. For example:
68+
In most BI tools, you'd use `__cubeJoinField` to define joins between cube tables.
69+
In tools that allow defining custom SQL datasets, you can use joined tables as a dataset SQL.
70+
For example:
7371

7472
```sql
7573
SELECT o.count as count, p.name as product_name, p.description as product_description
@@ -137,74 +135,38 @@ LIMIT 5;
137135
Please note even if `product_description` is in the inner selection, it isn't
138136
evaluated in the final query as it isn't used in any way.
139137

138+
## Proxy Dimensions and Views
139+
140140
As an alternative way to achieve joins, it is also possible to define a proxy dimension
141-
or measure inside the Cube.
141+
or measure inside a cube or a view.
142+
This is the preferred way of joining, as it gives you control over the join path in complex use cases.
142143

143144
```javascript
144-
cube(`Orders`, {
145-
sql: `SELECT * FROM public.orders`,
146-
147-
joins: {
148-
Users: {
149-
relationship: `belongsTo`,
150-
sql: `${CUBE}.user_id = ${Users}.id`,
151-
},
152-
},
153-
154-
measures: {
155-
count: {
156-
type: `count`,
157-
},
158-
},
159-
145+
view(`OrdersUsers`, {
146+
includes: [Orders],
160147
dimensions: {
161-
id: {
162-
sql: `id`,
163-
type: `number`,
164-
primaryKey: true,
165-
},
166-
167148
// this is a proxy dimension
168149
user_city: {
169150
sql: `${Users.city}`,
170151
type: `string`,
171152
},
172153
},
173154
});
174-
175-
cube(`Users`, {
176-
sql: `SELECT * FROM public.users`,
177-
178-
measures: {},
179-
180-
dimensions: {
181-
id: {
182-
sql: `id`,
183-
type: `number`,
184-
primaryKey: true,
185-
},
186-
187-
city: {
188-
sql: `city`,
189-
type: `string`,
190-
},
191-
},
192-
});
193155
```
194156

195157
Now it is possible to get the orders count by user city with the following query:
196158

197159
```
198-
cube=> SELECT count, user_city FROM Orders;
199-
count | user_city
160+
cube=> SELECT count, user_city FROM OrdersUsers;
161+
count | user_city
200162
-------+---------------
201-
9524 | New York
202-
9408 | San Francisco
203-
6360 | Mountain View
204-
6262 | Seattle
205-
4393 | Los Angeles
206-
3183 | Chicago
207-
3060 | Austin
208-
1804 | Palo Alto
163+
1416 | Los Angeles
164+
1412 | Seattle
165+
1365 | Mountain View
166+
1263 | New York
167+
1220 | Austin
168+
1164 | Chicago
169+
1101 | San Francisco
170+
1059 | Palo Alto
209171
(8 rows)
210172
```
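
The diff states that `__cubeJoinField` also works for cubes that are only joined transitively. A minimal sketch of such a query follows. It assumes a hypothetical `Suppliers` cube with a hypothetical `company` dimension, related to `Orders` only through `Products` in the data schema. Neither name appears in the commit, and no output is shown because the query was not run.

```sql
-- Hypothetical: Suppliers is not joined to Orders directly in the data schema,
-- only through Products (Orders -> Products -> Suppliers).
-- Cube is expected to resolve the join path from the data schema.
SELECT s.company, SUM(o.count)
FROM Orders o
LEFT JOIN Suppliers s ON o.__cubeJoinField = s.__cubeJoinField
GROUP BY 1
LIMIT 5;
```

As with the `Orders`/`Products` example in the diff, the join condition in the SQL only signals that the two cubes should be joined. The actual join keys come from the data schema.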

0 commit comments

Comments
 (0)