data-explorer/kusto/query/join-rightouter.md
Lines changed: 3 additions & 1 deletion
@@ -3,7 +3,7 @@ title: rightouter join
3
3
description: Learn how to use the rightouter join flavor to merge the rows of two tables.
4
4
ms.reviewer: alexans
5
5
ms.topic: reference
6
-
ms.date: 08/11/2024
6
+
ms.date: 01/21/2025
7
7
---
8
8
9
9
# rightouter join
@@ -29,6 +29,8 @@ The `rightouter` join flavor returns all the records from the right side and onl
29
29
30
30
## Example
31
31
32
+
This query returns all rows from table Y and any matching rows from table X, filling in NULL values where there is no match from X.
33
+
32
34
:::moniker range="azure-data-explorer"
33
35
> [!div class="nextstepaction"]
34
36
> <a href="https://dataexplorer.azure.com/clusters/help/databases/Samples?query=H4sIAAAAAAAAA8tJLVGIULBVSEksAcKknFQN79RKq+KSosy8dB2FsMSc0lRDq5z8vHRNrmguBSBQT1TXMdSBMJPUdYwQTGMoM1ldx4Qr1porB2h0JH6jjVCNBhpiaIAwxQiJbQxjpwBNNwAZH6FQo5CVn5mnkJ2Zl2JblJmeUZJfWpJaBLQzP08BaBUAPvRgAtsAAAA=" target="_blank">Run the query</a>
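
As a minimal sketch of the behavior described above, the following query joins two small inline tables. The table contents and the column names `Value1` and `Value2` are illustrative assumptions, not taken from this diff.

```kusto
// Illustrative tables: names and values are assumptions for this sketch.
let X = datatable(Key:string, Value1:long)
[
    'a', 1,
    'b', 2,
    'b', 3,
    'c', 4
];
let Y = datatable(Key:string, Value2:long)
[
    'b', 10,
    'c', 20,
    'c', 30,
    'd', 40
];
// Keep every row of Y (right side) plus the matching rows of X (left side).
X | join kind=rightouter Y on Key
```

With this data, key `d` exists only in `Y`, so its row should still appear in the result with the columns that come from `X` left empty. Key `a` exists only in `X` and is dropped.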
data-explorer/kusto/query/join-rightsemi.md
Lines changed: 7 additions & 1 deletion
@@ -3,7 +3,7 @@ title: rightsemi join
3
3
description: Learn how to use the rightsemi join flavor to merge the rows of two tables.
4
4
ms.reviewer: alexans
5
5
ms.topic: reference
6
-
ms.date: 08/11/2024
6
+
ms.date: 01/21/2025
7
7
---
8
8
9
9
# rightsemi join
@@ -29,6 +29,8 @@ The `rightsemi` join flavor returns all records from the right side that match a
29
29
30
30
## Example
31
31
32
+
This query filters and returns only those rows from table Y that have a matching key in table X.
33
+
32
34
:::moniker range="azure-data-explorer"
33
35
> [!div class="nextstepaction"]
34
36
> <a href="https://dataexplorer.azure.com/clusters/help/databases/Samples?query=H4sIAAAAAAAAA8tJLVGIULBVSEksAcKknFQN79RKq+KSosy8dB2FsMSc0lRDq5z8vHRNrmguBSBQT1TXMdSBMJPUdYwQTGMoM1ldx4Qr1porB2h0JH6jjVCNBhpiaIAwxQiJbQxjpwBNNwAZH6FQo5CVn5mnkJ2Zl2JblJmeUVKcmpsJtDI/TwFoEwCXFUWa2gAAAA==" target="_blank">Run the query</a>
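
The filtering behavior can be sketched with two small inline tables. The table contents and the column names `Value1` and `Value2` are illustrative assumptions chosen to be consistent with the output rows shown in this diff.

```kusto
// Illustrative tables: names and values are assumptions for this sketch.
let X = datatable(Key:string, Value1:long)
[
    'a', 1,
    'b', 2,
    'b', 3,
    'c', 4
];
let Y = datatable(Key:string, Value2:long)
[
    'b', 10,
    'c', 20,
    'c', 30,
    'd', 40
];
// Return only the rows of Y whose Key also appears in X.
// The result contains Y's columns only; X acts purely as a filter.
X | join kind=rightsemi Y on Key
```

Under these assumptions, the row with key `d` is filtered out because `X` has no matching key, and each remaining row of `Y` appears once even when `X` has several matches for it.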
@@ -59,3 +61,7 @@ X | join kind=rightsemi Y on Key
59
61
| b | 10 |
60
62
| c | 20 |
61
63
| c | 30 |
64
+
65
+
## Related content
66
+
67
+
* Learn about other [join flavors](join-operator.md#returns)
It's often useful to join between two large datasets on some high-cardinality key, such as an operation ID or a session ID, and further limit the right-hand-side ($right) records that need to match up with each left-hand-side ($left) record by adding a restriction on the "time-distance" between `datetime` columns on the left and on the right.
13
13
14
-
The above operation differs from the usual Kusto join operation, since for the `equi-join` part of matching the high-cardinality key between the left and right datasets, the system can also apply a distance function and use it to considerably speed up the join.
14
+
The above operation differs from the usual join operation, since for the `equi-join` part of matching the high-cardinality key between the left and right datasets, the system can also apply a distance function and use it to considerably speed up the join.
15
15
16
16
> [!NOTE]
17
-
> A distance function doesn't behave like equality (that is, when both dist(x,y) and dist(y,z) are true it doesn't follow that dist(x,z) is also true.) Internally, we sometimes refer to this as "diagonal join".
17
+
> A distance function doesn't behave like equality because it isn't transitive: when both dist(x,y) and dist(y,z) are true, it doesn't follow that dist(x,z) is also true. This is sometimes referred to as a "diagonal join".
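
To make the non-transitivity concrete, here is a small sketch that treats "within one minute after" as the distance test. The three timestamps and the names `window`, `x`, `y`, and `z` are hypothetical values chosen for illustration.

```kusto
// Hypothetical values: x and y are 45 seconds apart, y and z are 45 seconds apart,
// but x and z are 90 seconds apart.
let window = 1min;
let x = datetime(2017-10-01 00:00:00);
let y = datetime(2017-10-01 00:00:45);
let z = datetime(2017-10-01 00:01:30);
print
    x_near_y = (y - x) between (0min .. window),
    y_near_z = (z - y) between (0min .. window),
    x_near_z = (z - x) between (0min .. window)
```

The first two columns evaluate to `true` and the third to `false`, which is why a time-distance condition can't be treated as an ordinary equality key.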
18
18
19
-
For example, if you want to identify event sequences within a relatively small time window, assume that you have a table `T` with the following schema:
19
+
## Example to identify event sequences without time window
20
+
21
+
To identify event sequences within a relatively small time window, this example uses a table `T` with the following schema:
20
22
21
23
* `SessionId`: A column of type `string` with correlation IDs.
22
24
* `EventType`: A column of type `string` that identifies the event type of the record.
23
25
* `Timestamp`: A column of type `datetime` that indicates when the event described by the record happened.
24
26
27
+
| SessionId | EventType | Timestamp |
28
+
|--|--|--|
29
+
| 0 | A | 2017-10-01T00:00:00Z |
30
+
| 0 | B | 2017-10-01T00:01:00Z |
31
+
| 1 | B | 2017-10-01T00:02:00Z |
32
+
| 1 | A | 2017-10-01T00:03:00Z |
33
+
| 3 | A | 2017-10-01T00:04:00Z |
34
+
| 3 | B | 2017-10-01T00:10:00Z |
35
+
36
+
The following query creates the dataset and then identifies all the session IDs in which event type `A` was followed by an event type `B` within a `1min` time window.
37
+
25
38
:::moniker range="azure-data-explorer"
26
39
> [!div class="nextstepaction"]
27
-
> <a href="https://dataexplorer.azure.com/clusters/help/databases/Samples?query=H4sIAAAAAAAAA8tJLVEIUbBVSEksAcKknFSN4NTi4sz8PM8Uq+KSosy8dB0F17LUvJKQyoJUuEhIZm5qcUliboEVUF9qCZCnycsVzculAATqBuo6CuqOQAImp2FkYGiua2iga2CoYGBgBUaaOsiqnfCoNkRWbUhItRGGanwuMUZWbUxItQmGajwuMYT5MtaalysEAKb/JupnAQAA" target="_blank">Run the query</a>
40
+
> <a href="https://dataexplorer.azure.com/clusters/help/databases/Samples?query=H4sIAAAAAAAAA4WQTWvDMAyG74H8B91iQ1LsdjDI8GGFHnZubmOHdBGdu8YJjlgZ7MdPbsgHtKS2sbD12O8rnZGgAANVSTwPZxR77DrbuLcq78hbd0xh94OOit8Wx5vC1thRWbc5v0Pik4yj9zgCHolKUkheeRtyYq30c6ZVpjQolV+XTOf0doHWc1o/otc39JKTzZzePKKfbugFJ3qo8uMljgqIoz+4fKHHqZtgTNALmdY3J/wkGHufwp5KT2ZsdKBOjXXwbV1lrHPoeyOiD0EhxPsq22TI3lHa8YczncBJaNyETN4Fs5D13iQckC6IDoSq2dhqBZqjXKrnKvYPNlcRxHMCAAA=" target="_blank">Run the query</a>
28
41
::: moniker-end
29
42
30
43
```kusto
@@ -38,38 +51,6 @@ let T = datatable(SessionId:string, EventType:string, Timestamp:datetime)
38
51
'3', 'B', datetime(2017-10-01 00:10:00),
39
52
];
40
53
T
41
-
```
42
-
43
-
**Output**
44
-
45
-
|SessionId|EventType|Timestamp|
46
-
|---|---|---|
47
-
|0|A|2017-10-01 00:00:00.0000000|
48
-
|0|B|2017-10-01 00:01:00.0000000|
49
-
|1|B|2017-10-01 00:02:00.0000000|
50
-
|1|A|2017-10-01 00:03:00.0000000|
51
-
|3|A|2017-10-01 00:04:00.0000000|
52
-
|3|B|2017-10-01 00:10:00.0000000|
53
-
54
-
**Problem statement**
55
-
56
-
Our query should answer the following question:
57
-
58
-
Find all the session IDs in which event type `A` was followed by an
59
-
event type `B` within a `1min` time window.
60
-
61
-
> [!NOTE]
62
-
> In the sample data above, the only such session ID is `0`.
63
-
64
-
Semantically, the following query answers this question, albeit inefficiently.
65
-
66
-
:::moniker range="azure-data-explorer"
67
-
> [!div class="nextstepaction"]
68
-
> <a href="https://dataexplorer.azure.com/clusters/help/databases/Samples?query=H4sIAAAAAAAAA4WQTWvDMAyG74H8B91iQ1LsdjDI8GGFHnZubmOHdBGdu8YJjlgZ7MdPbsgHtKS2sbD12O8rnZGgAANVSTwPZxR77DrbuLcq78hbd0xh94OOit8Wx5vC1thRWbc5v0Pik4yj9zgCHolKUkheeRtyYq30c6ZVpjQolV+XTOf0doHWc1o/otc39JKTzZzePKKfbugFJ3qo8uMljgqIoz+4fKHHqZtgTNALmdY3J/wkGHufwp5KT2ZsdKBOjXXwbV1lrHPoeyOiD0EhxPsq22TI3lHa8YczncBJaNyETN4Fs5D13iQckC6IDoSq2dhqBZqjXKrnKvYPNlcRxHMCAAA=" target="_blank">Run the query</a>
To optimize this query, we can rewrite it as described below
92
-
so that the time window is expressed as a join key.
72
+
## Example optimized with time window
93
73
94
-
**Rewrite the query to account for the time window**
74
+
To optimize this query, we can rewrite it so that the time window is expressed as a join key. Discretize the `datetime` values into buckets whose size is half the size of the time window, and then use *`equi-join`* to compare the bucket IDs.
95
75
96
-
Rewrite the query so that the `datetime` values are "discretized" into buckets whose size is half the size of the time window. Use Kusto's *`equi-join`* to compare those bucket IDs.
97
-
98
-
```kusto
99
-
let lookupWindow = 1min;
100
-
let lookupBin = lookupWindow / 2.0; // lookup bin = equal to 1/2 of the lookup window
101
-
T
102
-
| where EventType == 'A'
103
-
| project SessionId, Start=Timestamp,
104
-
// TimeKey on the left side of the join is mapped to a discrete time axis for the join purpose
105
-
TimeKey = bin(Timestamp, lookupBin)
106
-
| join kind=inner
107
-
(
108
-
T
109
-
| where EventType == 'B'
110
-
| project SessionId, End=Timestamp,
111
-
// TimeKey on the right side of the join - emulates event 'B' appearing several times
// 'mv-expand' translates the TimeKey array range into a column
117
-
| mv-expand TimeKey to typeof(datetime)
118
-
) on SessionId, TimeKey
119
-
| where (End - Start) between (0min .. lookupWindow)
120
-
| project SessionId, Start, End
121
-
```
122
-
123
-
**Runnable query reference (with table inlined)**
76
+
The query finds pairs of events within the same session (*SessionId*) where an 'A' event is followed by a 'B' event within 1 minute. It projects the session ID, the start time of the 'A' event, and the end time of the 'B' event.
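
The bucketing approach can be sketched end to end as follows, using the sample rows from the table earlier in this section. The `range(...)` expression that builds the right-side bucket list is not visible in this diff, so treat it as an assumption about how the buckets are generated rather than the exact text of the published query.

```kusto
let T = datatable(SessionId:string, EventType:string, Timestamp:datetime)
[
    '0', 'A', datetime(2017-10-01 00:00:00),
    '0', 'B', datetime(2017-10-01 00:01:00),
    '1', 'B', datetime(2017-10-01 00:02:00),
    '1', 'A', datetime(2017-10-01 00:03:00),
    '3', 'A', datetime(2017-10-01 00:04:00),
    '3', 'B', datetime(2017-10-01 00:10:00),
];
let lookupWindow = 1min;
let lookupBin = lookupWindow / 2.0; // bucket size is half the lookup window
T
| where EventType == 'A'
// Left side: map each 'A' timestamp to a single bucket.
| project SessionId, Start = Timestamp, TimeKey = bin(Timestamp, lookupBin)
| join kind=inner
    (
    T
    | where EventType == 'B'
    // Right side (assumed): list every bucket an earlier 'A' could fall into.
    | project SessionId, End = Timestamp,
              TimeKey = range(bin(Timestamp - lookupWindow, lookupBin),
                              bin(Timestamp, lookupBin),
                              lookupBin)
    // Turn the bucket array into one row per bucket.
    | mv-expand TimeKey to typeof(datetime)
    ) on SessionId, TimeKey
// The buckets only narrow the candidates; this filter enforces the exact window.
| where (End - Start) between (0min .. lookupWindow)
| project SessionId, Start, End
```

With the sample data, only session `0` has an `A` event followed by a `B` event within one minute, so that is the only session expected in the result. The final `where` clause is still required because sharing a bucket doesn't guarantee that two events fall within the window.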
data-explorer/kusto/query/lookup-operator.md
Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ title: lookup operator
3
3
description: Learn how to use the lookup operator to extend columns of a fact table.
4
4
ms.reviewer: alexans
5
5
ms.topic: reference
6
-
ms.date: 12/04/2024
6
+
ms.date: 01/20/2025
7
7
---
8
8
# lookup operator
9
9
@@ -75,7 +75,7 @@ A table with:
75
75
* If `kind` is unspecified or `kind=leftouter`, then in addition to the inner matches, there's a row for every row on the left (and/or right), even if it has no match. In that case, the unmatched output cells contain nulls.
76
76
* If `kind=inner`, then there's a row in the output for every combination of matching rows from left and right.
77
77
78
-
## Examples
78
+
## Example
79
79
80
80
The following example shows how to perform a left outer join between the `FactTable` and `DimTable`, based on matching values in the `Personal` and `Family` columns.
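
A minimal sketch of such a lookup, assuming small inline versions of `FactTable` and `DimTable`. The row values and the `Row` and `Alias` columns are illustrative assumptions, not taken from this diff.

```kusto
// Illustrative data: row values and the Row/Alias columns are assumptions.
let FactTable = datatable(Row:string, Personal:string, Family:string)
[
    "1", "Rowan", "Murphy",
    "2", "Ellis", "Turner",
    "3", "Maya", "Robinson",
    "4", "Quinn", "Campbell"
];
let DimTable = datatable(Personal:string, Family:string, Alias:string)
[
    "Rowan", "Murphy", "rowanm",
    "Ellis", "Turner", "ellist",
    "Maya", "Robinson", "mayar"
];
// Extend each fact row with the Alias column from the matching dimension row.
FactTable
| lookup kind=leftouter DimTable on Personal, Family
```

In this sketch, the fact row for `Quinn Campbell` has no match in `DimTable`, so it is kept in the output with an empty `Alias` cell. With `kind=inner`, that row would be dropped instead.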